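OCR 1.0: Image Preprocessing
Overview:
This came out of a recent course project. The dataset our instructor provided was rather large and awkward to work with, so here are a few ways of dealing with it. Questions are welcome in the comments.
Main text
Dataset structure
Nearly 8 GB of black-and-white images of handwritten digits, roughly (50,50) to (100,100) pixels in size, with a tian-zi-ge (田字格) practice grid in the background, salt-and-pepper noise, and so on.
Processing steps
Denoising
Gaussian filtering would probably work too, but since I am honestly not familiar with how Gaussian filtering works in Python, I settled for dilating first and then eroding to remove the noise.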
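def erzhihua(path, path1):
    # Read the image as grayscale (flag 0)
    img = cv2.imread(path + path1, 0)
    r = 1
    # 3x3 elliptical structuring element
    s = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2*r+1, 2*r+1))
    # Dilate, then erode, to remove small dark specks
    img = cv2.dilate(img, s)
    img = cv2.erode(img, s)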
Resizing
A plain resize brings every image to a uniform 50*50.
    dim = (50, 50)
    img1 = cv2.resize(img, dim)
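Binarization
Compute the mean gray level, multiply it by 0.8 (tune to taste), and use the result as the threshold: pixels darker than the threshold become 0, all others become 255.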
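    row, col = img1.shape
    # Float accumulator for the pixel sum (the 0.1 offset is negligible)
    deep = 0.1
    for x in range(row):
        for y in range(col):
            deep += img1[x][y]
    # Mean gray level, scaled by 0.8 to get the threshold
    deep /= row*col
    deep *= 0.8
    # Pixels below the threshold become black, the rest white
    for x in range(row):
        for y in range(col):
            if deep > img1[x][y]:
                img1[x][y] = 0
            else:
                img1[x][y] = 255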
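The two nested loops visit every pixel from Python, which tends to be slow on a dataset of this size. The same rule can be written with NumPy array operations. The sketch below is an assumed equivalent of the loops above (it drops the negligible 0.1 starting offset), and binarize_by_mean is a hypothetical helper name.

```python
import numpy as np

def binarize_by_mean(img, factor=0.8):
    """Pixels darker than factor * mean become 0, all others become 255."""
    threshold = img.mean() * factor
    return np.where(img < threshold, 0, 255).astype(np.uint8)

# Tiny example: the mean is 138.75, so the threshold is 111.0.
demo = np.array([[0, 100], [200, 255]], dtype=np.uint8)
result = binarize_by_mean(demo)
print(result.tolist())  # [[0, 0], [255, 255]]
```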
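Finally, here is the full code. If you need the processed dataset, leave a comment (though running this code for about half an hour will produce it as well).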
import cv2
import numpy as np
import os

path = "/home/jojo/jason/jason_opencv/char/test/"    # input root
patha = "/home/jojo/jason/jason_opencv/char1/test/"  # output root

def erzhihua(path, path1):
    # Read the image as grayscale (flag 0)
    img = cv2.imread(path + path1, 0)
    # Denoise: dilate, then erode, with a 3x3 elliptical kernel
    r = 1
    s = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2*r+1, 2*r+1))
    img = cv2.dilate(img, s)
    img = cv2.erode(img, s)
    # Resize to 50x50
    dim = (50, 50)
    img1 = cv2.resize(img, dim)
    # Binarize at 0.8 times the mean gray level
    row, col = img1.shape
    deep = 0.1
    for x in range(row):
        for y in range(col):
            deep += img1[x][y]
    deep /= row*col
    deep *= 0.8
    for x in range(row):
        for y in range(col):
            if deep > img1[x][y]:
                img1[x][y] = 0
            else:
                img1[x][y] = 255
    cv2.imshow("1", img1)
    #cv2.waitKey(0)
    cv2.imwrite(patha + path1, img1)
    return 0

for child_dir in os.listdir(path):
    # cv2.imwrite does not create missing directories, so make sure the target exists
    os.makedirs(patha + child_dir, exist_ok=True)
    for child_dir2 in os.listdir(path + child_dir + '/'):
        erzhihua(path, child_dir + '/' + child_dir2)
    print(child_dir)
#print(img1.shape)
cv2.waitKey(0)
Comparison of results
Before processing
After processing