Data Augmentation in Deep Learning

Latest recommended article published on 2024-08-15 13:53:48

浩瀚之水_csdn

Question 1: Why do we need a large amount of data?

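When you train a machine learning model, what you are really doing is tuning its parameters so that it fits a particular input to the expected output. The optimization goal is to chase that sweet spot where the model's loss is low. State-of-the-art neural networks have parameters on the order of millions, and with that many parameters you need a proportional amount of examples to learn them.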

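In addition, using augmentation to increase the amount of relevant data in the dataset keeps the network from learning irrelevant features. The network learns more of the features that actually matter in the data, which significantly improves overall performance.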

Question 2: Where do we augment the data?

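offline augmentation: suited to relatively small datasets. All transformations are applied beforehand, so the augmented dataset grows in proportion to the number of transformations used (for example, flipping every image doubles the dataset).
online augmentation: suited to larger datasets, where you cannot afford that multiplicative growth in size. The transformations are applied on the mini-batches instead. Some machine learning frameworks also support GPU-accelerated online augmentation.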
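
To make the online option concrete, here is a minimal NumPy sketch of augmenting on the mini-batches. The helper name `augmented_batches`, its parameters, and the choice of a random horizontal flip are illustrative assumptions, not the API of any particular framework.

```python
import numpy as np

def augmented_batches(images, labels, batch_size, rng):
    """Yield mini-batches, flipping each image horizontally with probability 0.5.

    Hypothetical helper: 'images' has shape [N, height, width, channels].
    The stored dataset is never modified or enlarged.
    """
    order = rng.permutation(len(images))          # shuffle once per epoch
    for start in range(0, len(images), batch_size):
        idx = order[start:start + batch_size]
        batch = images[idx].copy()                # work on a copy of the selected images
        flip = rng.random(len(idx)) < 0.5         # one coin toss per image
        batch[flip] = batch[flip][:, :, ::-1, :]  # reverse the width axis
        yield batch, labels[idx]

rng = np.random.default_rng(0)
images = rng.random((10, 4, 6, 3)).astype(np.float32)
labels = np.arange(10)                            # label i belongs to images[i]
batches = list(augmented_batches(images, labels, batch_size=4, rng=rng))
```

Because the flip is drawn again on every pass, each epoch sees a different mix of original and flipped images while the dataset on disk stays the same size.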

Popular augmentation techniques

1. Flip

Flip the image horizontally or vertically.

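# NumPy. 'img' = a single image as an array of shape [height, width, channels].
import numpy as np
flip_1 = np.fliplr(img)  # horizontal flip

# TensorFlow 1.x graph API (in TensorFlow 2.x, placeholder lives under tf.compat.v1).
# 'x' = a placeholder for an image.
import tensorflow as tf
shape = [height, width, channels]
x = tf.placeholder(dtype=tf.float32, shape=shape)
flip_2 = tf.image.flip_up_down(x)            # vertical flip
flip_3 = tf.image.flip_left_right(x)         # horizontal flip
flip_4 = tf.image.random_flip_up_down(x)     # vertical flip with probability 0.5
flip_5 = tf.image.random_flip_left_right(x)  # horizontal flip with probability 0.5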

2. Rotation

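One thing to watch with this operation: the image dimensions may not be preserved after rotation. A square image keeps its size when rotated by a multiple of 90 degrees, while a rectangular image only keeps its size when rotated by 180 degrees.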
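
The effect on dimensions can be checked with a small NumPy sketch. The toy array sizes below are assumptions chosen for illustration, and only right-angle rotations are shown; arbitrary angles need interpolation and are not covered here.

```python
import numpy as np

# Toy images: a 4x4 square and a 2x3 rectangle, 3 channels each.
square = np.zeros((4, 4, 3), dtype=np.float32)
rect = np.zeros((2, 3, 3), dtype=np.float32)

# Rotate in the (height, width) plane; the channel axis is left alone.
square_90 = np.rot90(square, k=1, axes=(0, 1))  # square: shape is unchanged
rect_90 = np.rot90(rect, k=1, axes=(0, 1))      # rectangle: height and width swap
rect_180 = np.rot90(rect, k=2, axes=(0, 1))     # rectangle: shape is unchanged
```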

3. Scale

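The image can be scaled inward or outward. When scaling outward, the resulting image is larger than the original, and many frameworks then crop out a section the same size as the original.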

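# Scikit Image. 'img' = input image, 'scale' = scale factor.
# For details about 'mode', check out the interpolation section below.
import skimage.transform
# channel_axis=-1 assumes a color image of shape [height, width, channels] and
# scikit-image >= 0.19; drop it for a 2-D grayscale image.
scale_out = skimage.transform.rescale(img, scale=2.0, mode='constant', channel_axis=-1)
scale_in = skimage.transform.rescale(img, scale=0.5, mode='constant', channel_axis=-1)
# Don't forget to crop the images back to the original size (for scale_out).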
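
The crop back to the original size could look like the following NumPy sketch. `center_crop` is a hypothetical helper, cropping at the center is one possible choice, and the nearest-neighbour `repeat` only stands in for a real 2x rescale so the example has no extra dependencies.

```python
import numpy as np

def center_crop(img, out_height, out_width):
    """Return the central out_height x out_width window of 'img' (hypothetical helper)."""
    height, width = img.shape[:2]
    if out_height > height or out_width > width:
        raise ValueError("crop size must not exceed the image size")
    top = (height - out_height) // 2
    left = (width - out_width) // 2
    return img[top:top + out_height, left:left + out_width]

original = np.arange(4 * 6 * 3, dtype=np.float32).reshape(4, 6, 3)
# Stand-in for a 2x upscale: repeat every pixel along height and width.
scaled_out = original.repeat(2, axis=0).repeat(2, axis=1)
# Crop the enlarged image back to the original height and width.
restored = center_crop(scaled_out, *original.shape[:2])
```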

4. Crop

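Unlike scaling, cropping just randomly samples a section from the original image and then resizes that section to the original image size.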

# TensorFlow 1.x graph API. 'x' = a placeholder for an image.
import numpy as np
import tensorflow as tf
original_size = [height, width, channels]
x = tf.placeholder(dtype=tf.float32, shape=original_size)
# Use the following commands to perform random crops.
crop_size = [new_height, new_width, channels]
seed = np.random.randint(1234)
x = tf.random_crop(x, size=crop_size, seed=seed)
# resize_images expects size = [height, width], without the channel dimension.
output = tf.image.resize_images(x, size=original_size[:2])

5. Translation

Translation just involves moving the image along the X or Y direction.
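
A minimal NumPy sketch of translation follows. The helper name `translate`, the sign convention (positive `dx` moves right, positive `dy` moves down), and filling the uncovered area with zeros are all assumptions made for illustration.

```python
import numpy as np

def translate(img, dx, dy):
    """Shift 'img' by dx pixels along X and dy pixels along Y, zero-filling the gap.

    Hypothetical helper; negative values shift left / up.
    """
    height, width = img.shape[:2]
    out = np.zeros_like(img)
    if abs(dx) >= width or abs(dy) >= height:
        return out                                # shifted entirely out of frame
    # Source and destination windows that still overlap after the shift.
    src_y = slice(max(0, -dy), min(height, height - dy))
    dst_y = slice(max(0, dy), min(height, height + dy))
    src_x = slice(max(0, -dx), min(width, width - dx))
    dst_x = slice(max(0, dx), min(width, width + dx))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out

img = np.arange(1, 13, dtype=np.float32).reshape(3, 4)
shifted_right = translate(img, dx=1, dy=0)
shifted_up = translate(img, dx=0, dy=-1)
```

The helper only slices the first two axes, so it works for grayscale arrays as in the example and for arrays of shape [height, width, channels].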

6. Gaussian Noise

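A network overfits easily when it tries to learn high-frequency features. Zero-mean Gaussian noise effectively distorts those high-frequency features. It also distorts the low-frequency components (usually the part you actually want), but the network can still learn the target information from them. Adding just the right amount of noise can enhance the learning capability.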

You can also add salt-and-pepper noise. Its visual effect is similar to that of Gaussian noise, but it loses less information.
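
Salt-and-pepper noise can be sketched in NumPy as follows. The helper name `salt_and_pepper`, the `amount` parameter, and the assumption that pixel values lie in [0, 1] are illustrative choices, not something the text above prescribes.

```python
import numpy as np

def salt_and_pepper(img, amount, rng):
    """Set roughly a fraction 'amount' of pixels to black (0) or white (1).

    Hypothetical helper; assumes 'img' is a float array of shape
    [height, width, channels] with values in [0, 1].
    """
    out = img.copy()
    draw = rng.random(img.shape[:2])      # one draw per pixel, shared by all channels
    out[draw < amount / 2] = 0.0          # pepper: black pixels
    out[draw > 1 - amount / 2] = 1.0      # salt: white pixels
    return out

rng = np.random.default_rng(0)
img = np.full((32, 32, 3), 0.5, dtype=np.float32)
noisy = salt_and_pepper(img, amount=0.1, rng=rng)
```

Only the selected pixels change and every other pixel keeps its exact value, which is one way to read the claim that less information is lost than with Gaussian noise.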

# TensorFlow 1.x graph API. 'x' = a placeholder for an image.
import tensorflow as tf
shape = [height, width, channels]
x = tf.placeholder(dtype=tf.float32, shape=shape)
# Adding Gaussian noise
noise = tf.random_normal(shape=tf.shape(x), mean=0.0, stddev=1.0,
                         dtype=tf.float32)
output = tf.add(x, noise)

Keep in mind: when augmenting data, make sure you do not add irrelevant data.