Image Preprocessing Before Deep Learning

(1) Resizing

Images fed into the neural network should be resized to the input size the network is designed to take, usually a square such as 100x100 or 250x250.
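
As a minimal sketch of the resizing step, the snippet below rescales an image array to a fixed square size with nearest-neighbour sampling in plain NumPy. The function name `resize_nearest`, the random test image and the 100x100 target are illustrative assumptions, not part of the original post.

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Resize an H x W (x C) image array with nearest-neighbour sampling."""
    in_h, in_w = image.shape[:2]
    # For each output row/column, pick the index of the matching source row/column.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    # Broadcasting the two index arrays selects an out_h x out_w grid of pixels.
    return image[rows[:, None], cols]

# Hypothetical 480x640 RGB image filled with random pixel values.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
resized = resize_nearest(image, 100, 100)
```

In practice an image library's resize routine with proper interpolation would normally be used; the sketch only shows what "resizing to the network's input size" means for the array shape.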


(2) Feature Standardization

  • [References]
    • https://blog.csdn.net/xjp_xujiping/article/details/102981133
    • https://blog.csdn.net/kane7csdn/article/details/86475918
    • https://blog.csdn.net/dengheng4891/article/details/101446368

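The goal of feature standardization is to give every feature in the dataset zero mean and unit variance, i.e. each dimension of the data has zero mean and unit variance. This is one of the more common normalization methods; similar processing is also needed when using an SVM, for example. In practice, first compute the mean of the data in each dimension (using the whole dataset) and subtract that mean from each dimension. Then divide each dimension by the standard deviation of the data in that dimension.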

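For example, let X be a training set containing m training samples, each of dimension n. Here X is taken to be stored as an n x m matrix, so each row holds one feature dimension across all samples. To apply feature standardization, first compute the mean of each row, then subtract it from X to obtain the zero-mean sample set X'. Dividing each row of X' by the standard deviation of that row then gives the standardized samples.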
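
The procedure above can be sketched in NumPy as follows. The n x m layout (one feature per row), the sizes and the random data are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 1000                           # n feature dimensions, m samples (illustrative)
X = rng.normal(loc=5.0, scale=3.0, size=(n, m))

mean = X.mean(axis=1, keepdims=True)     # per-row (per-feature) mean, shape (n, 1)
X_centered = X - mean                    # zero-mean sample set X'
std = X_centered.std(axis=1, keepdims=True)
X_std = X_centered / std                 # each feature now has unit variance

print(X_std.mean(axis=1))                # values very close to 0
print(X_std.std(axis=1))                 # values very close to 1
```

If a feature is constant its standard deviation is zero, so a small constant is often added to `std` before dividing.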

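If the input is natural color images, stationarity does not hold across the color channels, so the data is usually feature-scaled first (so that pixel values lie in [0,1]). PCA/ZCA whitening or similar operations are applied afterwards. Before whitening, the feature components must be zero-centered (i.e. every feature dimension is given zero mean; dividing by the per-dimension standard deviation is usually unnecessary because the standard deviations are very close to each other). The UFLDL tutorial exercise (linear decoder) uses this approach. Some papers use the second approach instead (per-example mean subtraction and division by the standard deviation), for example "An Analysis of Single-Layer Networks in Unsupervised Feature Learning". That raises the question of whether each dimension still needs to be zero-centered again before the subsequent whitening (since computing the covariance matrix requires this step).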
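
The first approach (scale to [0,1], zero-center each feature, then whiten) might look like the sketch below. The random 8x8 patches, the column-per-sample layout and the regularization constant `epsilon` are assumptions chosen for illustration, not values taken from the UFLDL exercise.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 500 RGB patches of 8x8 pixels with values in 0..255,
# flattened so each column is one sample (n = 192 features, m = 500 samples).
patches = rng.integers(0, 256, size=(500, 8, 8, 3)).astype(np.float64)
X = patches.reshape(500, -1).T / 255.0         # feature scaling to [0, 1]

X = X - X.mean(axis=1, keepdims=True)          # zero mean for every feature dimension

sigma = X @ X.T / X.shape[1]                   # covariance matrix, shape (n, n)
U, S, _ = np.linalg.svd(sigma)                 # eigenvectors U, eigenvalues S
epsilon = 1e-5                                 # assumed small regularization constant
X_pca_white = np.diag(1.0 / np.sqrt(S + epsilon)) @ U.T @ X
X_zca_white = U @ X_pca_white                  # rotate back to the original axes

cov_white = X_zca_white @ X_zca_white.T / X.shape[1]   # close to the identity matrix
```

The covariance is computed as `X @ X.T / m` without subtracting a mean, which is only valid because every feature dimension was zero-centered in the previous step.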

Here is an explanation from the Stanford CS231n 2016 lecture notes [https://cs231n.github.io/neural-networks-2/].

Normalization refers to normalizing the data dimensions so that they are of approximately the same scale. For image data, there are two common ways of achieving this normalization.

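(a) Mean subtraction and division by the standard deviation

One is to zero-center the data by subtracting the per-dimension mean (X -= np.mean(X, axis = 0)) and then divide each dimension by its standard deviation:
(X /= np.std(X, axis = 0)).
Compute the mean on the training set first, then subtract that same mean from both the training data and the test data.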
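
A minimal sketch of this step, assuming the CS231n layout (rows are samples, columns are features) and random stand-in data in place of real images:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: each row is one flattened 32x32x3 image.
X_train = rng.normal(loc=120.0, scale=40.0, size=(800, 32 * 32 * 3))
X_test = rng.normal(loc=120.0, scale=40.0, size=(200, 32 * 32 * 3))

mean = np.mean(X_train, axis=0)   # statistics come from the training set only
X_train -= mean
X_test -= mean                    # reuse the training mean on the test data

std = np.std(X_train, axis=0)     # computed after zero-centering
X_train /= std
X_test /= std                     # reuse the training std as well
```

The test data ends up only approximately zero-mean, which is expected: its own statistics are never used.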

  • [References]:
    • https://blog.csdn.net/Miss_yuki/article/details/80662017
    • https://niuyuanyuanna.github.io/2018/11/08/deep_learning/data-normalization/#%E9%80%90%E6%A0%B7%E6%9C%AC%E5%9D%87%E5%80%BC%E6%B6%88%E5%87%8Fper-example-mean-subtraction

(b) Normalization: feature scaling to [-1,1]

Another form of this preprocessing normalizes each dimension so that the min and max along the dimension is -1 and 1 respectively. It only makes sense to apply this preprocessing if you have a reason to believe that different input features have different scales (or units), but they should be of approximately equal importance to the learning algorithm. In case of images, the relative scales of pixels are already approximately equal (and in range from 0 to 255), so it is not strictly necessary to perform this additional preprocessing step.
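
As a rough illustration, min-max scaling of each dimension to [-1,1] can be written as below. The two synthetic features and their ranges are made up to show inputs on very different scales.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical features on very different scales (rows = samples, columns = features).
X = np.column_stack([
    rng.uniform(0.0, 1.0, size=100),
    rng.uniform(-500.0, 2000.0, size=100),
])

x_min = X.min(axis=0)             # per-feature minimum
x_max = X.max(axis=0)             # per-feature maximum
# Map [x_min, x_max] linearly onto [-1, 1] for every feature.
X_scaled = 2.0 * (X - x_min) / (x_max - x_min) - 1.0
```

As with the mean and standard deviation, `x_min` and `x_max` would normally be taken from the training set and reused for test data.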


(3) Dimensionality reduction: PCA and whitening

Dimensionality reduction, for example converting an RGB image to grayscale, is used when the network's performance is allowed to be invariant to that dimension (here, color), or to make the training problem more tractable.
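
A small sketch of the RGB-to-grayscale reduction follows. The luma weights (0.299, 0.587, 0.114) are the common ITU-R BT.601 coefficients, chosen here as an assumption since the post does not specify a conversion, and the image is random stand-in data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 100x100 RGB image with values in 0..255.
rgb = rng.integers(0, 256, size=(100, 100, 3)).astype(np.float64)

weights = np.array([0.299, 0.587, 0.114])   # assumed BT.601 luma weights for R, G, B
gray = rgb @ weights                        # weighted sum over the channel axis

print(rgb.size, gray.size)                  # the input has three times as many values
```

Each pixel goes from three values to one, so the input dimensionality of the network drops by a factor of three.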

