Reposted from: https://bbs.huaweicloud.com/forum/thread-128287-1-1.html
Author: 御坂8080号
Hardware platform: x64
Operating system: Windows 10, 64-bit
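Tutorial tried: Automatic Data Augmentation
Tutorial link: Automatic Data Augmentation
The most important steps in the data augmentation workflow are the following.
Define the mapping from MindSpore operators to AutoAugment operators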
import mindspore.dataset.vision.c_transforms as c_vision
import mindspore.dataset.transforms.c_transforms as c_transforms

# Magnitude levels run from 0 to PARAMETER_MAX.
PARAMETER_MAX = 10

def float_parameter(level, maxval):
    # Scale a level linearly to a float between 0 and maxval.
    return float(level) * maxval / PARAMETER_MAX

def int_parameter(level, maxval):
    # Scale a level linearly to an int between 0 and maxval (truncated).
    return int(level * maxval / PARAMETER_MAX)

def shear_x(level):
    v = float_parameter(level, 0.3)
    # Randomly choose between the negative and the positive shear along x.
    return c_transforms.RandomChoice([c_vision.RandomAffine(degrees=0, shear=(-v, -v)), c_vision.RandomAffine(degrees=0, shear=(v, v))])

def shear_y(level):
    v = float_parameter(level, 0.3)
    # 4-tuple: x range first (fixed at 0), then y range.
    return c_transforms.RandomChoice([c_vision.RandomAffine(degrees=0, shear=(0, 0, -v, -v)), c_vision.RandomAffine(degrees=0, shear=(0, 0, v, v))])

def translate_x(level):
    v = float_parameter(level, 150 / 331)
    return c_transforms.RandomChoice([c_vision.RandomAffine(degrees=0, translate=(-v, -v)), c_vision.RandomAffine(degrees=0, translate=(v, v))])

def translate_y(level):
    v = float_parameter(level, 150 / 331)
    return c_transforms.RandomChoice([c_vision.RandomAffine(degrees=0, translate=(0, 0, -v, -v)), c_vision.RandomAffine(degrees=0, translate=(0, 0, v, v))])

def color_impl(level):
    v = float_parameter(level, 1.8) + 0.1
    return c_vision.RandomColor(degrees=(v, v))

def rotate_impl(level):
    v = int_parameter(level, 30)
    return c_transforms.RandomChoice([c_vision.RandomRotation(degrees=(-v, -v)), c_vision.RandomRotation(degrees=(v, v))])

def solarize_impl(level):
    level = int_parameter(level, 256)
    # A higher level lowers the upper end of the threshold range.
    v = 256 - level
    return c_vision.RandomSolarize(threshold=(0, v))

def posterize_impl(level):
    level = int_parameter(level, 4)
    # A higher level keeps fewer bits.
    v = 4 - level
    return c_vision.RandomPosterize(bits=(v, v))

def contrast_impl(level):
    v = float_parameter(level, 1.8) + 0.1
    return c_vision.RandomColorAdjust(contrast=(v, v))

def autocontrast_impl(level):
    # level is unused; it is kept so that all builders share one signature.
    return c_vision.AutoContrast()

def sharpness_impl(level):
    v = float_parameter(level, 1.8) + 0.1
    return c_vision.RandomSharpness(degrees=(v, v))

def brightness_impl(level):
    v = float_parameter(level, 1.8) + 0.1
    return c_vision.RandomColorAdjust(brightness=(v, v))
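To make the level-to-magnitude mapping concrete, the two scaling helpers can be run on their own, without MindSpore installed. The sketch below copies `float_parameter` and `int_parameter` from the code above and evaluates the arithmetic that a few of the operator builders perform. The variable names are made up for illustration, and no MindSpore operator is actually constructed.

```python
PARAMETER_MAX = 10

def float_parameter(level, maxval):
    return float(level) * maxval / PARAMETER_MAX

def int_parameter(level, maxval):
    return int(level * maxval / PARAMETER_MAX)

# Magnitudes the builders would compute for a given level
# (illustrative variable names, not part of the tutorial code).
rotate_degrees = int_parameter(9, 30)          # as in rotate_impl(9)
posterize_bits = 4 - int_parameter(8, 4)       # as in posterize_impl(8)
solarize_upper = 256 - int_parameter(5, 256)   # as in solarize_impl(5)
color_degree = float_parameter(0, 1.8) + 0.1   # as in color_impl(0)
shear_value = float_parameter(5, 0.3)          # as in shear_x(5)

print(rotate_degrees, posterize_bits, solarize_upper)
print(color_degree, shear_value)
```

So level 9 for rotation means 27 degrees, level 8 for posterize leaves 1 bit, and level 5 for solarize gives an upper threshold of 128. Level 0 for color still yields a small non-zero factor because of the `+ 0.1` offset.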
Define the AutoAugment policy for the ImageNet dataset
imagenet_policy = [
    [(posterize_impl(8), 0.4), (rotate_impl(9), 0.6)],
    [(solarize_impl(5), 0.6), (autocontrast_impl(5), 0.6)],
    [(c_vision.Equalize(), 0.8), (c_vision.Equalize(), 0.6)],
    [(posterize_impl(7), 0.6), (posterize_impl(6), 0.6)],
    [(c_vision.Equalize(), 0.4), (solarize_impl(4), 0.2)],
    [(c_vision.Equalize(), 0.4), (rotate_impl(8), 0.8)],
    [(solarize_impl(3), 0.6), (c_vision.Equalize(), 0.6)],
    [(posterize_impl(5), 0.8), (c_vision.Equalize(), 1.0)],
    [(rotate_impl(3), 0.2), (solarize_impl(8), 0.6)],
    [(c_vision.Equalize(), 0.6), (posterize_impl(6), 0.4)],
    [(rotate_impl(8), 0.8), (color_impl(0), 0.4)],
    [(rotate_impl(9), 0.4), (c_vision.Equalize(), 0.6)],
    [(c_vision.Equalize(), 0.0), (c_vision.Equalize(), 0.8)],
    [(c_vision.Invert(), 0.6), (c_vision.Equalize(), 1.0)],
    [(color_impl(4), 0.6), (contrast_impl(8), 1.0)],
    [(rotate_impl(8), 0.8), (color_impl(2), 1.0)],
    [(color_impl(8), 0.8), (solarize_impl(7), 0.8)],
    [(sharpness_impl(7), 0.4), (c_vision.Invert(), 0.6)],
    [(shear_x(5), 0.6), (c_vision.Equalize(), 1.0)],
    [(color_impl(0), 0.4), (c_vision.Equalize(), 0.6)],
    [(c_vision.Equalize(), 0.4), (solarize_impl(4), 0.2)],
    [(solarize_impl(5), 0.6), (autocontrast_impl(5), 0.6)],
    [(c_vision.Invert(), 0.6), (c_vision.Equalize(), 1.0)],
    [(color_impl(4), 0.6), (contrast_impl(8), 1.0)],
    [(c_vision.Equalize(), 0.8), (c_vision.Equalize(), 0.6)],
]
Insert the AutoAugment transform into the data loading pipeline
if do_train:
    dataset = dataset.map(operations=c_vision.RandomSelectSubpolicy(imagenet_policy), input_columns=["image"])
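The policy is a list of sub-policies, and each sub-policy is a list of `(op, prob)` pairs. As I understand the documented behavior of `RandomSelectSubpolicy`, one sub-policy is picked at random per image, and then each operator in it is applied in order with its own probability. The following is a minimal pure-Python sketch of that selection logic, not MindSpore's implementation. `apply_random_subpolicy`, `make_op` and `toy_policy` are hypothetical names, and the "image" is just a list that records which operators ran.

```python
import random

def apply_random_subpolicy(policy, image, rng=random):
    """Pick one sub-policy uniformly at random, then apply each (op, prob)
    pair in order, running op only with probability prob."""
    subpolicy = rng.choice(policy)
    for op, prob in subpolicy:
        if rng.random() < prob:
            image = op(image)
    return image

def make_op(name):
    # Toy stand-in for an image operator: appends its name to a history list.
    return lambda history: history + [name]

toy_policy = [
    [(make_op("posterize"), 1.0), (make_op("rotate"), 1.0)],
    [(make_op("equalize"), 0.0), (make_op("solarize"), 1.0)],
]

rng = random.Random(0)
results = [apply_random_subpolicy(toy_policy, [], rng) for _ in range(20)]
print(results[:5])
```

With probabilities of only 0.0 and 1.0 the outcome per sub-policy is fixed: the first always runs both operators, and the second always skips `equalize` and runs `solarize`. This is why an entry such as `(c_vision.Equalize(), 0.0)` in the ImageNet policy never fires.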
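Comparison
I have recently been going back over some Kaggle code, so I took the chance to look at how common data augmentation techniques are implemented in different frameworks.
Preprocessing with PyTorch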
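import torchvision
from torchvision import transforms as T

IMAGE_SIZE = 224  # assumed value; the original post does not define it

trans = T.Compose([
    T.Resize((IMAGE_SIZE, IMAGE_SIZE)),  # size is a single int or an (h, w) pair
    T.RandomHorizontalFlip(p=0.5),
    T.RandomVerticalFlip(p=0.5),
    T.RandomRotation(90),  # random angle between -90 and 90 degrees
])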
This mainly relies on the torchvision library.
API reference:
https://pytorch.org/vision/stable/transforms.html
The same functionality implemented with MindSpore:
import mindspore.dataset.vision.c_transforms as c_vision
import mindspore.dataset.transforms.c_transforms as c_transforms
from mindspore.dataset.vision import Inter

# IMAGE_SIZE is the target side length in pixels, defined by the caller.
trans = [
    c_vision.Resize((IMAGE_SIZE, IMAGE_SIZE)),  # size is an int or an (h, w) pair
    c_vision.RandomHorizontalFlip(0.5),
    c_vision.RandomVerticalFlip(0.5),
    c_vision.RandomRotation(degrees=90, resample=Inter.NEAREST, expand=True),
]
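As you can see, the Compose API is not needed here: the transforms are collected in a plain Python list, which makes the code a little more concise.
API reference: https://www.mindspore.cn/doc/api_python/zh-CN/r1.1/mindspore/mindspore.dataset.vision.html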
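The remark about Compose is easier to see once you notice that a Compose-style wrapper does nothing more than apply a list of transforms in order. In the MindSpore case the list itself would be handed to `dataset.map(operations=trans, input_columns=["image"])`, as in the AutoAugment snippet earlier, and the pipeline applies the entries one after another. The sketch below shows that idea in plain Python. `compose` is a hypothetical helper, not an API of either library, and integers stand in for images.

```python
def compose(ops):
    """Return a function that applies the given ops from left to right."""
    def run(x):
        for op in ops:
            x = op(x)
        return x
    return run

# Toy operators on integers stand in for image transforms.
ops = [lambda x: x + 1, lambda x: x * 2]

pipeline = compose(ops)   # explicit wrapper, in the style of T.Compose
result = pipeline(3)      # (3 + 1) * 2
print(result)
```

Whether the wrapper is written explicitly or the framework loops over the list for you, the order of the list is the order in which the transforms run.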
In CV competitions we also often use Albumentations, an OpenCV-based library that is fast and easy to use.
import albumentations as A
trfm = A.Compose([
    A.Resize(IMAGE_SIZE, IMAGE_SIZE),
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.6),
    A.RandomRotate90(),
])
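All three pipelines are built from the same few geometric primitives, each applied with some probability. The sketch below spells those primitives out on a tiny nested-list "image" so the behavior can be checked by hand. The function names are made up for illustration, and real libraries work on arrays or tensors instead of lists. One difference worth keeping in mind: as far as I know, `A.RandomRotate90` rotates by a random multiple of 90 degrees, while `RandomRotation(90)` in torchvision draws an arbitrary angle between -90 and 90 degrees, so the third pipeline is similar to the other two but not identical.

```python
import random

def hflip(img):
    # Mirror left-right: reverse every row.
    return [row[::-1] for row in img]

def vflip(img):
    # Mirror top-bottom: reverse the order of the rows.
    return img[::-1]

def rot90(img):
    # Rotate 90 degrees counter-clockwise: transpose, then reverse the rows.
    return [list(col) for col in zip(*img)][::-1]

def maybe(op, p, rng):
    # Wrap op so that it is applied with probability p, else the input is returned.
    return lambda img: op(img) if rng.random() < p else img

img = [[1, 2],
       [3, 4]]

rng = random.Random(0)
pipeline = [maybe(hflip, 0.5, rng), maybe(vflip, 0.5, rng), maybe(rot90, 0.5, rng)]

out = img
for step in pipeline:
    out = step(out)
print(out)
```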
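API reference:
https://albumentations.readthedocs.io/en/latest/api/augmentations.html#transforms
As these examples show, data augmentation is implemented in much the same way across frameworks, and the basic functionality is largely equivalent. What really matters is choosing transforms that suit the task, since that is what improves scores in CV competitions.
Email: hellonexus@qq.com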