Normalization Principles
- BatchNorm: normalizes each layer over the mini-batch, computing the mean and std over (N, H, W). Its main drawback is sensitivity to batch size: since the mean and variance are computed per batch, a batch that is too small yields statistics that do not represent the full data distribution, so it performs poorly with small batch sizes.
- LayerNorm: normalizes each sample in the mini-batch, computing the mean and std over (C, H, W); it is especially effective for RNNs.
- InstanceNorm: normalizes each channel of each sample in the mini-batch, computing the mean and std over (H, W); it is used in style transfer. In image stylization the generated result depends mainly on an individual image instance, so normalizing over the whole batch is inappropriate; normalizing over (H, W) instead speeds up convergence while keeping each image instance independent.
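The difference between the three boils down to which dimensions the statistics are reduced over. A minimal sketch on an (N, C, H, W) tensor, using plain `mean` reductions to show the shape of the resulting statistics:

```python
import torch

x = torch.rand(2, 3, 4, 4)  # (N, C, H, W)

# BatchNorm: statistics over (N, H, W) -> one mean/std per channel C
bn_mean = x.mean(dim=(0, 2, 3))   # shape (C,)

# LayerNorm: statistics over (C, H, W) -> one mean/std per sample N
ln_mean = x.mean(dim=(1, 2, 3))   # shape (N,)

# InstanceNorm: statistics over (H, W) -> one mean/std per (N, C) pair
in_mean = x.mean(dim=(2, 3))      # shape (N, C)
```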
bn = torch.nn.BatchNorm1d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
#input: (N, C, L) num_features=C or (N, L) num_features=L
#output:(N, C, L) or (N, L)
bn = torch.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
#input: (N, C, H,W) num_features=C
#output:(N, C, H, W)
bn = torch.nn.BatchNorm3d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
#input: (N, C, D, H, W)
#output:(N, C, D, H, W)
ln = torch.nn.LayerNorm(normalized_shape, eps=1e-05, elementwise_affine=True)
#input: (N, *), normalized_shape = input.size()[1:]
#output:(N, *)
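To make the `normalized_shape` argument concrete, here is a minimal sketch that applies `LayerNorm` to a 3-D input and reproduces its output by hand over the trailing dimensions (assuming the default `elementwise_affine=True`, whose gamma/beta initialize to 1/0):

```python
import torch

x = torch.rand(2, 3, 4)
ln = torch.nn.LayerNorm(x.size()[1:])  # normalized_shape = (3, 4)
out = ln(x)

# each sample is normalized over its own C*L elements
manual = (x - x.mean(dim=(1, 2), keepdim=True)) / torch.sqrt(
    x.var(dim=(1, 2), unbiased=False, keepdim=True) + ln.eps)
```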
BN, LN, and IN all normalize the input with the same formula below; the key difference among the three is which dimensions the mean and std are computed over.
y = \frac{x - mean}{std} * gamma + beta
- gamma and beta are the scale and shift parameters. Intuitively, if gamma and beta were set to the std and mean respectively, the input would be restored to its pre-BN state. They are learned via gradients during training; at test time they are used as-is and not updated.
- momentum controls the moving-average update of the mean and std; in torch it is typically 0.1, 0.01, or 0.001. The running statistics are not used during training; they are kept for test time and updated as follows:
running_mean = (1 - momentum) * running_mean + momentum * batch_mean
running_std = (1 - momentum) * running_std + momentum * batch_std
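This update rule can be verified directly against `running_mean` after one training-mode forward pass. A minimal sketch, starting from the initial running mean of zero:

```python
import torch

momentum = 0.1
bn = torch.nn.BatchNorm1d(3, momentum=momentum)
x = torch.rand(4, 3)

prev_mean = bn.running_mean.clone()  # initialized to zeros
bn.train()
bn(x)  # training-mode forward pass updates the running statistics

batch_mean = x.mean(dim=0)
expected = (1 - momentum) * prev_mean + momentum * batch_mean
# bn.running_mean now matches the moving-average formula above
```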
Implementing normalization in Python
BatchNorm
PyTorch implementation
import torch
inp = torch.rand(2,3,4)
bn = torch.nn.BatchNorm1d(3, affine=False)  # affine=False makes gamma and beta None
bn_out = bn(inp)
>>inp
>>tensor([[[0.1319, 0.1956, 0.2887, 0.1659],
[0.2575, 0.5717, 0.8141, 0.3247],
[0.2323, 0.4682, 0.6605, 0.5281]],
[[0.6290, 0.2745, 0.4449, 0.2501],
[0.0568, 0.1737, 0.3457, 0.9583],
[0.8578, 0.0243, 0.5147, 0.2992]]])
>>bn_out
>>tensor([[[-1.0761, -0.6623, -0.0576, -0.8551],
[-0.6099, 0.4529, 1.2728, -0.3827],
[-0.8881, 0.0827, 0.8736, 0.3289]],
[[ 2.1523, -0.1501, 0.9568, -0.3080],
[-1.2888, -0.8934, -0.3115, 1.7605],
[ 1.6855, -1.7436, 0.2738, -0.6128]]])
Python implementation
inp1 = inp.permute(1,2,0).reshape(inp.size()[1],-1)
mean = inp1.mean(-1).reshape(1, inp.size()[1], 1)
std = inp1.std(-1, unbiased=False).reshape(1, inp.size()[1], 1)
#unbiased=True gives the unbiased estimate: [(x1-x)^2+(x2-x)^2+...+(xn-x)^2]/(n-1)
#unbiased=False gives the biased estimate:  [(x1-x)^2+(x2-x)^2+...+(xn-x)^2]/n
mybn_out = (inp-mean)/std
>>mybn_out
>>tensor([[[-1.0762, -0.6624, -0.0576, -0.8553],
[-0.6099, 0.4529, 1.2729, -0.3826],
[-0.8881, 0.0825, 0.8738, 0.3290]],
[[ 2.1528, -0.1499, 0.9570, -0.3084],
[-1.2888, -0.8934, -0.3116, 1.7606],
[ 1.6856, -1.7439, 0.2739, -0.6128]]])