Building an OCR Dataset

海盗pk武龙

Article link: https://blog.csdn.net/weixin_38076506/article/details/85317027

Reference: https://blog.csdn.net/meyh0x5vDTk48P2/article/details/79848753

Handwriting dataset: http://www.nlpr.ia.ac.cn/databases/handwriting/Offline_database.html

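1. Decide how many characters to generate, and build a table that maps each Chinese character to its label.
2. Choose and collect the font files you need.
3. Render the character images and save them in the designated directories.
4. Apply appropriate data augmentation.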
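The character-to-label table from step 1 can be sketched as below. This is a minimal illustration, not the original author's code: the function name `build_label_table` and the ten-character sample string are hypothetical, and a real project would load its own character list.

```python
import json


def build_label_table(chars):
    """Map each distinct character to an integer label, in first-seen order."""
    table = {}
    for ch in chars:
        if ch not in table:
            table[ch] = len(table)
    return table


# Hypothetical character set used only for illustration.
label_table = build_label_table("的一是了我不人在他有")

# ensure_ascii=False keeps the characters themselves readable in the JSON text.
serialized = json.dumps(label_table, ensure_ascii=False)
```

Storing the table as JSON makes it easy to reload the same mapping at training and inference time, so a label always refers to the same character.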

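Step 3, rendering the character images, matters most. If you render only clean, regular glyphs and train a model on them, two problems arise: first, there are too few images; second, the model generalizes poorly. We therefore need to apply extensive image processing to the rendered characters in order to enlarge the printed-text dataset.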

In summary, the image augmentations we can apply include:

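Text distortion
Background noise (salt-and-pepper)
Text position (set the center point of the text)
Stroke adhesion (simulated with dilation)
Stroke breakage (simulated with erosion)
Text slant (rotate the text)
Multiple fonts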
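Of the items above, varying the text position is perhaps the simplest to sketch. The example below uses only NumPy and assumes a 2-D grayscale image; the names `shift_image` and `random_shift` and the `max_shift` parameter are illustrative and do not come from the original post.

```python
import numpy as np


def shift_image(img, dx, dy, fill=0):
    """Translate a 2-D grayscale image by (dx, dy) pixels, padding with `fill`."""
    h, w = img.shape
    out = np.full_like(img, fill)
    # Part of the source image that stays inside the frame after shifting.
    src_x0, src_x1 = max(0, -dx), min(w, w - dx)
    src_y0, src_y1 = max(0, -dy), min(h, h - dy)
    if src_x0 < src_x1 and src_y0 < src_y1:
        out[src_y0 + dy:src_y1 + dy, src_x0 + dx:src_x1 + dx] = \
            img[src_y0:src_y1, src_x0:src_x1]
    return out


def random_shift(img, max_shift=3, rng=None):
    """Shift the image by a random offset of at most `max_shift` pixels per axis."""
    if rng is None:
        rng = np.random.default_rng()
    dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
    return shift_image(img, int(dx), int(dy))
```

Keeping `max_shift` small relative to the glyph size avoids pushing strokes out of the frame; `fill` should match the background value of the rendered images.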

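Noise:

import numpy as np

# Method of the augmentation class; sets 20 random pixels to white (255) in place.
def add_noise(cls, img):
    for i in range(20):
        temp_x = np.random.randint(0, img.shape[0])
        temp_y = np.random.randint(0, img.shape[1])
        img[temp_x][temp_y] = 255
    return img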
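The list of augmentations names salt-and-pepper noise, which conventionally includes dark ("pepper") pixels as well as bright ("salt") ones. A vectorized variant might look like the sketch below; the function name `add_salt_pepper` and the `amount` parameter are assumptions for illustration, and the input is assumed to be a grayscale `uint8` image.

```python
import numpy as np


def add_salt_pepper(img, amount=0.02, rng=None):
    """Return a copy of a grayscale uint8 image with salt-and-pepper noise.

    Roughly `amount` of the pixels are replaced, half with 0 and half with 255.
    """
    if rng is None:
        rng = np.random.default_rng()
    noisy = img.copy()
    r = rng.random(img.shape)          # one uniform sample in [0, 1) per pixel
    noisy[r < amount / 2] = 0          # pepper
    noisy[r > 1 - amount / 2] = 255    # salt
    return noisy
```

Unlike the in-place version, this returns a new array, so the clean image remains available for further augmentations.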

Erosion

import cv2

# Erode with a 3x3 rectangular kernel (shrinks bright regions).
def add_erode(cls, img):
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    img = cv2.erode(img, kernel)
    return img

Dilation

# Dilate with a 3x3 rectangular kernel (grows bright regions).
def add_dilate(cls, img):
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    img = cv2.dilate(img, kernel)
    return img

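Random perturbation

import copy
import random

# Returns the original images followed by one randomly augmented copy of each.
def do(self, img_list=None):
    img_list = [] if img_list is None else img_list  # avoid a mutable default
    aug_list = copy.deepcopy(img_list)
    for i in range(len(img_list)):
        # Work on a copy so in-place noise does not alter the caller's image.
        im = copy.deepcopy(img_list[i])
        if self.noise and random.random() < 0.5:
            im = self.add_noise(im)
        if self.dilate and random.random() < 0.25:
            im = self.add_dilate(im)
        if self.erode and random.random() < 0.25:
            im = self.add_erode(im)
        aug_list.append(im)
    return aug_list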
