TACo: A Data Augmentation Technique for Text Recognition

1. Introduction

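TACo is a data augmentation technique that cuts the original image into vertical or horizontal tiles and corrupts some of them, in order to improve the model's ability to generalize. There are four corruption types, [random, black, white, mean], and two corruption orientations, [vertical, horizontal].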

Source code: https://github.com/kartikgill/taco-box
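
The four corruption types can be illustrated on a tiny array. This is only a minimal sketch assuming NumPy: `fill_tile` is a hypothetical helper written for this illustration, not a function of taco-box, although it follows the same fill rules as the source listing in section 4.

```python
import numpy as np


def fill_tile(tile, corruption_type, rng=None):
    """Return a replacement tile with the same shape as `tile` (hypothetical helper)."""
    if rng is None:
        rng = np.random.default_rng()
    if corruption_type == "random":
        return rng.random(tile.shape) * 255       # uniform noise in [0, 255)
    if corruption_type == "black":
        return np.zeros(tile.shape)               # all pixels 0
    if corruption_type == "white":
        return np.full(tile.shape, 255.0)         # all pixels 255
    if corruption_type == "mean":
        return np.full(tile.shape, tile.mean())   # all pixels set to the tile's mean
    raise ValueError(f"unknown corruption type: {corruption_type!r}")


tile = np.array([[0, 100], [200, 100]], dtype=np.uint8)
for kind in ("random", "black", "white", "mean"):
    print(kind, fill_tile(tile, kind).tolist())
```

Every type except `mean` ignores the pixel values of the tile it replaces; only the shape is kept.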

2. Illustration

(1) Original image:
[image not available]
(2) Image after corruption:
[image not available]

3. Corruption steps (using the vertical orientation and the random type as the example)

Step 1: Check that the input is a 2-D grayscale image, because corruption is only applied to 2-D grayscale images.

        if len(image.shape) < 2 or len(image.shape) > 3:    # only a 2-D grayscale image passes both checks
            raise Exception("Input image with Invalid Shape!")

        if len(image.shape) == 3:
            raise Exception("Only Gray Scale Images are supported!")
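
Because a 3-channel image is rejected, a color image has to be converted before it is passed in. The following is a sketch of one way to do that, not part of taco-box: `to_gray` is a hypothetical helper, the channel order is assumed to be RGB, and the weights are the common ITU-R BT.601 luma coefficients.

```python
import numpy as np


def to_gray(image):
    """Convert an H x W x 3 RGB array to an H x W grayscale array (hypothetical helper)."""
    image = np.asarray(image)
    if image.ndim == 2:
        return image                                   # already grayscale
    if image.ndim == 3 and image.shape[2] >= 3:
        weights = np.array([0.299, 0.587, 0.114])      # BT.601 luma weights, RGB order assumed
        return image[..., :3] @ weights                # weighted sum over the channel axis
    raise ValueError("expected an H x W or H x W x 3 array")


rgb = np.zeros((4, 6, 3), dtype=np.uint8)
rgb[..., 1] = 255                                      # pure green image
gray = to_gray(rgb)
print(gray.shape)                                      # (4, 6)
```

The result is a float array, so it passes the 2-D check above.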

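Step 2: Randomly pick an integer between the preset minimum and maximum tile widths and use it as the tile width. The width is drawn once, so every tile in this pass has the same width (except possibly the last one).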

        if orientation == 'vertical':
            tiles = []
            start = 0
            tile_width = random.randint(min_tw, max_tw)    # both endpoints are inclusive

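Step 3: Slice the original image into tiles of that width and, for each tile, decide according to the preset corruption probability whether to corrupt it.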

            # Note: because of the `- 1` bounds, the last column is never included in a tile.
            while start < (img_w - 1):
                tile = image[:, start:start+min(img_w-start-1, tile_width)]
                if random.random() <= self.corruption_probability_vertical:     # corrupt the tile if the random draw is <= the preset probability
                    tile = self._corrupted_tile(tile, corruption_type)
                tiles.append(tile)
                start = start + tile_width
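
To see which columns each tile covers, the slicing arithmetic of the loop can be reproduced without an image. In this sketch `tile_bounds` is a hypothetical helper that copies the loop's bounds exactly, including the `- 1` terms.

```python
def tile_bounds(img_w, tile_width):
    """Return the (start, end) column range of every tile, using the same bounds as the loop."""
    bounds = []
    start = 0
    while start < img_w - 1:
        end = start + min(img_w - start - 1, tile_width)
        bounds.append((start, end))
        start += tile_width
    return bounds


print(tile_bounds(10, 4))   # [(0, 4), (4, 8), (8, 9)]
```

For a 10-pixel-wide image the tiles cover columns 0 to 8, so the stitched result is 9 pixels wide, one less than the input.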

Step 4: Concatenate the tiles and return the composite image (the augmented image).

            augmented_image = np.hstack(tiles)
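
The four steps can also be condensed into one standalone function. This is a simplified variant written for illustration, not the author's implementation: `vertical_taco`, `fill`, and `seed` are hypothetical names, only a constant fill value is supported, the full image width is covered, the input dtype is kept, and the probability test uses `<` rather than `<=`.

```python
import random

import numpy as np


def vertical_taco(image, min_tw=20, max_tw=100, prob=0.25, fill=0, seed=None):
    """Overwrite random vertical tiles of a 2-D grayscale image with a constant value."""
    if image.ndim != 2:                                # Step 1: 2-D grayscale only
        raise ValueError("expected a 2-D grayscale image")
    rng = random.Random(seed)
    tile_width = rng.randint(min_tw, max_tw)           # Step 2: one width per pass, endpoints inclusive
    out = image.copy()                                 # leave the caller's array untouched
    for start in range(0, image.shape[1], tile_width): # Step 3: walk over the tiles
        if rng.random() < prob:
            out[:, start:start + tile_width] = fill    # slicing clips at the right edge
    return out                                         # Step 4: no stitching needed, edited in place


img = np.full((32, 200), 255, dtype=np.uint8)
aug = vertical_taco(img, seed=0)
print(aug.shape, aug.dtype)                            # (32, 200) uint8
```

Writing into a copy instead of collecting tiles and calling `np.hstack` avoids both the dtype change and the width change, at the cost of not matching the original code line for line.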

4. Source code

import matplotlib.pyplot as plt
import random
import numpy as np


class Taco:
    def __init__(self,
                cp_vertical=0.25,
                cp_horizontal=0.25,
                max_tw_vertical=100,
                min_tw_vertical=20,
                max_tw_horizontal=50,
                min_tw_horizontal=10
                ):
        """
        -: Creating Taco object and setting up parameters:-

        -------Arguments--------
        :cp_vertical:        corruption probability for vertical tiles
        :cp_horizontal:      corruption probability for horizontal tiles
        :max_tw_vertical:    maximum possible tile width for vertical tiles in pixels
        :min_tw_vertical:    minimum tile width for vertical tiles in pixels
        :max_tw_horizontal:  maximum possible tile width for horizontal tiles in pixels
        :min_tw_horizontal:  minimum tile width for horizontal tiles in pixels

        """
        self.corruption_probability_vertical = cp_vertical
        self.corruption_probability_horizontal = cp_horizontal
        self.max_tile_width_vertical = max_tw_vertical
        self.min_tile_width_vertical = min_tw_vertical
        self.max_tile_width_horizontal = max_tw_horizontal
        self.min_tile_width_horizontal = min_tw_horizontal

    def apply_vertical_taco(self, image, corruption_type='random'):
        """
        Only applies taco augmentations in vertical direction.
        Default corruption type is 'random', other supported types are [black, white, mean].

        -------Arguments-------
        :image:            A grayscale input image that needs to be augmented.
        :corruption_type:  Type of corruption needs to be applied [one of- black, white, random or mean]

        -------Returns--------
        A TACO augmented image.

        """
        if len(image.shape) < 2 or len(image.shape) > 3:    # only a 2-D grayscale image passes both checks
            raise Exception("Input image with Invalid Shape!")

        if len(image.shape) == 3:
            raise Exception("Only Gray Scale Images are supported!")

        img_h, img_w = image.shape[0], image.shape[1]

        image = self._do_taco(image, img_h, img_w,
                                        self.min_tile_width_vertical,
                                        self.max_tile_width_vertical,
                                        orientation='vertical',
                                        corruption_type=corruption_type)

        return image

    def apply_horizontal_taco(self, image, corruption_type='random'):
        """
        Only applies taco augmentations in horizontal direction.
        Default corruption type is 'random', other supported types are [black, white, mean].

        -------Arguments-------
        :image:            A gray scaled input image that needs to be augmented.
        :corruption_type:  Type of corruption needs to be applied [one of- black, white, random or mean]

        -------Returns--------
        A TACO augmented image.

        """
        if len(image.shape) < 2 or len(image.shape) > 3:
            raise Exception("Input image with Invalid Shape!")

        if len(image.shape) == 3:
            raise Exception("Only Gray Scale Images are supported!")

        img_h, img_w = image.shape[0], image.shape[1]

        image = self._do_taco(image, img_h, img_w,
                                        self.min_tile_width_horizontal,
                                        self.max_tile_width_horizontal,
                                        orientation='horizontal',
                                        corruption_type=corruption_type)

        return image

    def apply_taco(self, image, corruption_type='random'):
        """
        Applies taco augmentations in both directions (vertical and horizontal).
        Default corruption type is 'random', other supported types are [black, white, mean].

        -------Arguments-------
        :image:            A gray scaled input image that needs to be augmented.
        :corruption_type:  Type of corruption needs to be applied [one of- black, white, random or mean]

        -------Returns--------
        A TACO augmented image.

        """
        image = self.apply_vertical_taco(image, corruption_type)
        image = self.apply_horizontal_taco(image, corruption_type)

        return image

    def visualize(self, image, title='example_image'):
        """
        A function to display images with given title.
        """
        plt.figure(figsize=(5, 2))
        plt.imshow(image, cmap='gray')
        plt.title(title)
        plt.tight_layout()
        plt.show()

    def _do_taco(self, image, img_h, img_w, min_tw, max_tw, orientation, corruption_type):
        """
        apply taco algorithm on image and return augmented image.
        """
        if orientation =='vertical':
            tiles = []
            start = 0
            tile_width = random.randint(min_tw, max_tw)
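            # Note: because of the `- 1` bounds, the last column is never included in a tile.
            while start < (img_w - 1):
                tile = image[:, start:start+min(img_w-start-1, tile_width)]
                if random.random() <= self.corruption_probability_vertical:     # corrupt the tile if the random draw is <= the preset probability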
                    tile = self._corrupted_tile(tile, corruption_type)
                tiles.append(tile)
                start = start + tile_width
            augmented_image = np.hstack(tiles)
        else:
            tiles = []
            start = 0
            tile_width = random.randint(min_tw, max_tw)
            while start < (img_h - 1):
                tile = image[start:start+min(img_h-start-1,tile_width), :]
                if random.random() <= self.corruption_probability_horizontal:   # horizontal tiles use the horizontal probability
                    tile = self._corrupted_tile(tile, corruption_type)
                tiles.append(tile)
                start = start + tile_width
            augmented_image = np.vstack(tiles)
        return augmented_image

    def _corrupted_tile(self, tile, corruption_type):
        """
        Return a corrupted tile with given shape and corruption type.
        """
        tile_shape = tile.shape
        if corruption_type == 'random':
            corrupted_tile = np.random.random(tile_shape)*255
        elif corruption_type == 'white':
            corrupted_tile = np.ones(tile_shape)*255
        elif corruption_type == 'black':
            corrupted_tile = np.zeros(tile_shape)
        elif corruption_type == 'mean':
            corrupted_tile = np.ones(tile_shape)*np.mean(tile)
        else:
            # Fail clearly instead of hitting an unbound local variable below
            raise ValueError("Unsupported corruption type: " + str(corruption_type))
        return corrupted_tile
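
One detail of the listing above is worth knowing when the result is fed to code that expects `uint8`: the corrupted tiles are float64 arrays, so stacking them with untouched `uint8` tiles yields a float64 image. The sketch below shows the effect with plain NumPy; the conversion back to `uint8` is just one possible approach and is not part of the class.

```python
import numpy as np

clean = np.full((2, 3), 200, dtype=np.uint8)   # an untouched uint8 tile
noise = np.random.random((2, 2)) * 255         # a 'random' tile, float64

merged = np.hstack([clean, noise])             # NumPy promotes to the wider dtype
print(merged.dtype)                            # float64

# One possible way back to uint8 (an assumption, not part of the class above)
restored = np.clip(np.rint(merged), 0, 255).astype(np.uint8)
print(restored.dtype, restored.shape)          # uint8 (2, 5)
```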
