BatchNorm and LayerNorm: Detailed Process and PyTorch Examples

PuJiang-

Article link: https://blog.csdn.net/jump882/article/details/119795466

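I. BatchNorm

1. Normalize each feature dimension across the batch

$y=\frac{x-E[x]}{\sqrt{Var[x]}}$, where $E[x]=\frac{1}{n}\sum_{i=1}^nx_i$ and $Var[x]=\frac{1}{n}\sum_{i=1}^n(x_i-E[x])^2$. Here $n$ is the batch size: the mean and variance are computed separately for each feature dimension, over the $n$ samples in the batch. Note that the variance is the biased one (divided by $n$, not $n-1$). In practice, nn.BatchNorm1d also adds a small constant eps (default 1e-5) to the variance inside the square root and then applies a learnable scale and shift. These are initialized to 1 and 0, so they do not change the example in this section.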
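
As a quick hand check of the formula, take the first feature column of the PyTorch example in this section, $x=(10,2,3,5)$ with $n=4$. The small constant eps that PyTorch adds to the variance is ignored here, since it is negligible at this scale:

$E[x]=\frac{10+2+3+5}{4}=5$
$Var[x]=\frac{(10-5)^2+(2-5)^2+(3-5)^2+(5-5)^2}{4}=\frac{38}{4}=9.5$
$y_1=\frac{10-5}{\sqrt{9.5}}\approx 1.6222$

This agrees with the first entry of the printed output.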

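2. PyTorch example

import torch
from torch import nn

# BatchNorm1d over 3 features. A freshly created module is in training mode,
# so it normalizes with the statistics of the current batch.
m = nn.BatchNorm1d(3)
# 4 samples (rows) x 3 features (columns)
input = torch.tensor([[10., 2., 100.], 
                      [2., 3., 100.], 
                      [3., 40., 400.], 
                      [5., 3., 200.]], dtype=torch.float32)
print(input)
# Each column is normalized with its own mean and variance over the 4 rows
output = m(input)
print(output)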

tensor([[ 10.,   2., 100.],
        [  2.,   3., 100.],
        [  3.,  40., 400.],
        [  5.,   3., 200.]])
tensor([[ 1.6222, -0.6184, -0.8165],
        [-0.9733, -0.5566, -0.8165],
        [-0.6489,  1.7315,  1.6330],
        [ 0.0000, -0.5566,  0.0000]], grad_fn=<NativeBatchNormBackward>)
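
To connect the printed output back to the formula, the following is a minimal sketch that recomputes the normalization by hand and compares it with nn.BatchNorm1d. The names x, manual and reference are introduced here for illustration only, and the sketch assumes the module is freshly created (training mode, scale 1, shift 0) with the default eps of 1e-5.

```python
import torch
from torch import nn

x = torch.tensor([[10., 2., 100.],
                  [2., 3., 100.],
                  [3., 40., 400.],
                  [5., 3., 200.]])

# Statistics are taken over the batch dimension (dim=0): one mean and
# one variance per feature column.
mean = x.mean(dim=0)
var = x.var(dim=0, unbiased=False)  # biased variance: divide by n, not n - 1
eps = 1e-5                          # assumed equal to the module's default eps
manual = (x - mean) / torch.sqrt(var + eps)

# Fresh module: training mode, weight initialized to 1, bias to 0
bn = nn.BatchNorm1d(3)
reference = bn(x)

print(manual)
print(torch.allclose(manual, reference, atol=1e-5))
```

If the two results agree, the module is doing nothing more than the per-column formula above at initialization. After training, or in eval mode, the learned scale and shift and the running statistics would make the outputs differ from this hand computation.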

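II. LayerNorm

1. Normalize the d dimensions of a single sample

$y=\frac{x-E[x]}{\sqrt{Var[x]}}$, where $E[x]=\frac{1}{n}\sum_{i=1}^nx_i$ and $Var[x]=\frac{1}{n}\sum_{i=1}^n(x_i-E[x])^2$. Here $n=d$ is the number of feature dimensions of one sample: the mean and variance are computed separately for each sample over its own $d$ values, independently of the other samples in the batch. In practice, nn.LayerNorm adds a small constant eps (default 1e-5) to the variance inside the square root, which is why a constant row such as [1, 1, 1] in the example yields zeros rather than a division by zero. It then applies a learnable scale and shift, initialized to 1 and 0.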
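
As a quick hand check of the formula, take the first sample (row) of the PyTorch example in this section, $x=(10,2,12)$ with $n=d=3$. The small constant eps that PyTorch adds to the variance is ignored here, since it is negligible at this scale:

$E[x]=\frac{10+2+12}{3}=8$
$Var[x]=\frac{(10-8)^2+(2-8)^2+(12-8)^2}{3}=\frac{56}{3}\approx 18.667$
$y_1=\frac{10-8}{\sqrt{18.667}}\approx 0.4629$

This agrees with the first entry of the printed output.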

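2. PyTorch example

import torch
from torch import nn

# LayerNorm over the last dimension, which has 3 features
n = nn.LayerNorm(3)
# 4 samples (rows) x 3 features (columns)
input = torch.tensor([[10., 2., 12.],
                      [2., 3., 7.],
                      [100., 10., 1.],
                      [1., 1., 1.]], dtype=torch.float32)
print(input)
# Each row is normalized with its own mean and variance over its 3 values
output = n(input)
print(output)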

tensor([[ 10.,   2.,  12.],
        [  2.,   3.,   7.],
        [100.,  10.,   1.],
        [  1.,   1.,   1.]])
tensor([[ 0.4629, -1.3887,  0.9258],
        [-0.9258, -0.4629,  1.3887],
        [ 1.4094, -0.6040, -0.8054],
        [ 0.0000,  0.0000,  0.0000]], grad_fn=<NativeLayerNormBackward>)
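
The same kind of check works for LayerNorm. The sketch below recomputes the normalization by hand, this time along the last dimension, and compares it with nn.LayerNorm. The names x, manual and reference are introduced here for illustration only, and the sketch assumes a freshly created module (scale 1, shift 0) with the default eps of 1e-5.

```python
import torch
from torch import nn

x = torch.tensor([[10., 2., 12.],
                  [2., 3., 7.],
                  [100., 10., 1.],
                  [1., 1., 1.]])

# Statistics are taken over the feature dimension (dim=-1): one mean and
# one variance per sample (row). keepdim=True keeps shape (4, 1) so the
# result broadcasts against x.
mean = x.mean(dim=-1, keepdim=True)
var = x.var(dim=-1, unbiased=False, keepdim=True)  # biased variance
eps = 1e-5                                         # assumed equal to the module's default eps
manual = (x - mean) / torch.sqrt(var + eps)

# Fresh module: weight initialized to 1, bias to 0
ln = nn.LayerNorm(3)
reference = ln(x)

print(manual)
print(torch.allclose(manual, reference, atol=1e-5))
```

The only difference from the BatchNorm check is the axis: dim=0 (down each column, across the batch) for BatchNorm versus dim=-1 (along each row, within one sample) for LayerNorm. The last row is constant, so its variance is 0 and eps alone keeps the denominator nonzero, giving all zeros.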
