The call signature for layer normalization is:
torch.nn.LayerNorm(normalized_shape, eps=1e-05, elementwise_affine=True, device=None, dtype=None)
The corresponding operation is expressed mathematically as:
$$y=\frac{x-\mathrm{E}[x]}{\sqrt{\mathrm{Var}[x]+\epsilon}}*\gamma+\beta$$
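where $\mathrm{E}[x]$ denotes the expectation (mean) of $x$ and $\mathrm{Var}[x]$ its variance, both computed over the dimensions given by normalized_shape; $\gamma$ and $\beta$ are learnable parameters; and $\epsilon>0$ is a small constant (eps, 1e-05 by default) added to the denominator for numerical stability.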
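The formula can be checked numerically against the built-in module. The following is a minimal sketch, assuming a small random tensor and default settings; the tensor shape and the names `x`, `manual`, `reference`, and `matches` are illustrative and not from the original. PyTorch uses the biased variance estimator here, hence `unbiased=False`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

x = torch.randn(4, 5, 8)       # (batch, sequence, features); shape chosen for illustration
layer_norm = nn.LayerNorm(8)   # normalize over the last dimension only
eps = layer_norm.eps           # 1e-05 by default

# E[x] and Var[x] are taken over the normalized dimensions (here: the last one).
mean = x.mean(dim=-1, keepdim=True)
var = x.var(dim=-1, unbiased=False, keepdim=True)  # biased variance

# gamma and beta are stored as `weight` and `bias` (initialized to ones and zeros).
manual = (x - mean) / torch.sqrt(var + eps) * layer_norm.weight + layer_norm.bias
reference = layer_norm(x)

# Compare the hand-written formula with the module's output.
matches = torch.allclose(manual, reference, atol=1e-5)
print(matches)
```

Because $\gamma$ starts at one and $\beta$ at zero, a freshly constructed layer performs plain standardization; the two parameters only change the result once training updates them.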
Application example in computer vision (CV)
# Image processing example
import torch
import torch.nn as nn

N, C, H, W = 12, 3, 256, 256
input = torch.randn(N, C, H, W)  # input data
# Normalize over the last three dimensions (i.e. the channel and spatial dimensions)
# as shown in the image below
layer_norm = nn.LayerNorm([C, H, W])
output = layer_norm(input)
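To see what normalizing over `[C, H, W]` means in practice, the statistics of the output can be inspected per sample. This is a minimal sketch under stated assumptions: the tensor sizes are smaller than in the example above to keep it quick, and the names `images`, `normalized`, `per_sample_mean`, and `per_sample_var` are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

N, C, H, W = 2, 3, 16, 16   # smaller sizes than the example above, for a fast check
images = torch.randn(N, C, H, W)
layer_norm = nn.LayerNorm([C, H, W])
normalized = layer_norm(images)

# One mean and one variance per sample, computed over all C*H*W values.
per_sample_mean = normalized.mean(dim=(1, 2, 3))
per_sample_var = normalized.var(dim=(1, 2, 3), unbiased=False)

# gamma (weight) and beta (bias) have the same shape as normalized_shape.
weight_shape = tuple(layer_norm.weight.shape)
bias_shape = tuple(layer_norm.bias.shape)

print(per_sample_mean, per_sample_var)
print(weight_shape, bias_shape)
```

Each sample is standardized independently of the others in the batch, so the per-sample mean should be close to 0 and the per-sample variance close to 1 while the affine parameters are still at their initial values.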