torch.nn.BatchNorm3d

Latest recommended article published 2025-04-25 20:33:30

Guan19

Tags: pytorch

Original link: https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm3d.html?highlight=torch%20nn%20batchnorm3d#torch.nn.BatchNorm3d

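This article describes what nn.BatchNorm3d in PyTorch does and how to use it. The layer is applied to 5D inputs; it reduces internal covariate shift during training and so speeds up the training of deep networks. Examples show how to create batch normalization layers with and without learnable parameters.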

CLASS torch.nn.BatchNorm3d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

Applies Batch Normalization over a 5D input, as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.

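The mean and standard deviation are calculated per channel over the mini-batch. $\gamma$ and $\beta$ are learnable parameter vectors of size C (where C is the number of features, i.e. channels, of the input). By default, the elements of $\gamma$ are set to 1 and the elements of $\beta$ are set to 0. The standard deviation is calculated with the biased estimator, equivalent to torch.var(input, unbiased=False).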
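The per-channel computation described above can be checked by hand. The sketch below assumes the standard batch normalization formula from the paper, $y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta$, which this post does not spell out; the variable names (`manual`, `matches`) and the small input shape are illustrative only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3, 5, 6, 7)   # (N, C, D, H, W)
m = nn.BatchNorm3d(3)            # a new module starts in training mode

# Per-channel statistics: reduce over N, D, H, W and keep C.
dims = (0, 2, 3, 4)
mean = x.mean(dim=dims, keepdim=True)
var = x.var(dim=dims, unbiased=False, keepdim=True)  # biased estimator

# gamma (m.weight) and beta (m.bias) have shape (C,); reshape to broadcast.
gamma = m.weight.view(1, -1, 1, 1, 1)
beta = m.bias.view(1, -1, 1, 1, 1)
manual = (x - mean) / torch.sqrt(var + m.eps) * gamma + beta

# True if the hand-written version agrees with the layer's output.
matches = torch.allclose(m(x), manual, atol=1e-5)
```

If `matches` is true, the layer normalizes with the biased variance of the current batch, as the paragraph above states.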

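Also by default, during training this layer keeps running estimates of its computed mean and variance, and these estimates are used for normalization during evaluation. The running estimates are kept with a default momentum of 0.1.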
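A minimal sketch of this behaviour: after a few training-mode passes, switching to evaluation mode should normalize with the stored estimates rather than with the statistics of the current batch. The input shapes, the number of passes, and the name `eval_uses_running` are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
m = nn.BatchNorm3d(3, affine=False)

# A few training-mode forward passes update the running estimates.
m.train()
for _ in range(5):
    m(torch.randn(8, 3, 4, 4, 4) * 2.0 + 1.0)

# In evaluation mode the stored estimates are used for normalization.
m.eval()
x = torch.randn(2, 3, 4, 4, 4)
rm = m.running_mean.view(1, -1, 1, 1, 1)
rv = m.running_var.view(1, -1, 1, 1, 1)
expected = (x - rm) / torch.sqrt(rv + m.eps)

eval_uses_running = torch.allclose(m(x), expected, atol=1e-5)
```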

If track_running_stats is set to False, this layer does not keep running estimates, and batch statistics are used during evaluation as well.
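As a quick illustration, the sketch below builds a layer with `track_running_stats=False` and compares its output in training and evaluation mode on the same batch. The names `no_buffers` and `same` are made up for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
m = nn.BatchNorm3d(3, track_running_stats=False)

# Without tracking, the running-statistics buffers are None.
no_buffers = m.running_mean is None and m.running_var is None

x = torch.randn(4, 3, 5, 5, 5)
out_train = m(x)   # training mode: batch statistics
m.eval()
out_eval = m(x)    # evaluation mode: still batch statistics

same = torch.allclose(out_train, out_eval)
```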

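Note: the momentum argument here is different from the momentum used in optimizer classes and from the conventional notion of momentum. Mathematically, the update rule for the running statistics is $\hat{x}_{\text{new}} = (1 - \text{momentum}) \times \hat{x} + \text{momentum} \times x_t$, where $\hat{x}$ is the estimated statistic and $x_t$ is the new observed value.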
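The update rule can be reproduced for a single training step, as in the sketch below. One detail is not stated in this post and is added here as a labelled assumption about PyTorch's behaviour: the observed value $x_t$ fed into the running variance is the unbiased batch variance, even though the forward pass normalizes with the biased one.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
m = nn.BatchNorm3d(3, momentum=0.1)
x = torch.randn(4, 3, 5, 5, 5) + 2.0

old_mean = m.running_mean.clone()   # initialised to zeros
old_var = m.running_var.clone()     # initialised to ones
m(x)                                # one training-mode forward pass

dims = (0, 2, 3, 4)
batch_mean = x.mean(dim=dims)
# Assumption (not from the post): the running variance is updated
# with the unbiased batch variance.
batch_var = x.var(dim=dims, unbiased=True)

# x_new = (1 - momentum) * x_hat + momentum * x_t
expected_mean = (1 - m.momentum) * old_mean + m.momentum * batch_mean
expected_var = (1 - m.momentum) * old_var + m.momentum * batch_var

mean_ok = torch.allclose(m.running_mean, expected_mean, atol=1e-5)
var_ok = torch.allclose(m.running_var, expected_var, atol=1e-5)
```

With momentum 0.1, each new batch contributes only 10% to the estimate, so the running statistics change slowly.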

Because the Batch Normalization is done over the channel dimension C, with statistics computed over the (N, D, H, W) slices, it is commonly called Volumetric Batch Normalization or Spatio-temporal Batch Normalization.
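To make the reduction axes concrete, the sketch below normalizes a batch with a non-zero mean and non-unit variance, then reduces the output over (N, D, H, W). Each channel should come out with a mean close to 0 and a variance close to 1. The shapes and the scale and shift of the input are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(20, 100, 8, 9, 10) * 3.0 + 5.0   # (N, C, D, H, W)
m = nn.BatchNorm3d(100, affine=False)
y = m(x)

# Reducing over N, D, H, W leaves one statistic per channel.
ch_mean = y.mean(dim=(0, 2, 3, 4))
ch_var = y.var(dim=(0, 2, 3, 4), unbiased=False)
```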

Example:

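>>> import torch
>>> import torch.nn as nn
>>> # With Learnable Parameters
>>> m = nn.BatchNorm3d(100)
>>> # Without Learnable Parameters
>>> m = nn.BatchNorm3d(100, affine=False)
>>> # Input of shape (N, C, D, H, W); C must equal num_features
>>> input = torch.randn(20, 100, 35, 45, 10)
>>> output = m(input)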