DropPath Regularization

Latest recommended article published on 2024-09-12 21:57:03

烟雨行舟#

Category: Machine Learning. Tags: neural networks, artificial intelligence, deep learning

Article link: https://blog.csdn.net/weixin_47119529/article/details/126549300

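While studying ViT-pytorch I came across drop_path and did not really understand it. After reading the blog posts listed below I gained a basic understanding, which I summarize here: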

1. "DropPath or drop_path regularization (easy to understand)", 惊鸿落-Capricorn's blog on CSDN

2. "[Regularization] DropPath/drop_path usage", 风巽·剑染春水's blog on CSDN

3. "Dropout and Drop Path", LemonShy在搬砖's blog on CSDN

How it is written in ViT:

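import torch
import torch.nn as nn


def drop_path(x, drop_prob: float = 0., training: bool = False):
    """Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).

    This is the same as the DropConnect impl I created for EfficientNet, etc networks, however,
    the original name is misleading as 'Drop Connect' is a different form of dropout in a separate paper...
    See discussion: https://github.com/tensorflow/tpu/issues/494#issuecomment-532968956 ... I've opted for
    changing the layer and argument names to 'drop path' rather than mix DropConnect as a layer name and use
    'survival rate' as the argument.

    Note: DropPath/drop_path is a regularization technique similar in spirit to Dropout. It randomly
    "deletes" sub-paths of the multi-branch structures in a deep learning model, which can prevent
    overfitting, improve model performance, and counteract the network degradation problem.
    As summarized from the paper that proposed drop-path: ResNet lacks a regularization method, so
    drop-path randomly discards sub-paths. Its experiments indicate that residual structure is not
    essential for deep networks; path length, rather than residual blocks alone, is the basic
    component needed to train deep networks.
    """
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    # One mask value per sample, broadcast over all remaining dimensions
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)  # work with diff dim tensors, not just 2D ConvNets
    # keep_prob + U[0, 1) floors to 1 with probability keep_prob, otherwise to 0
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize
    # Rescale the kept samples so that the expected output equals x
    output = x.div(keep_prob) * random_tensor
    return output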


class DropPath(nn.Module):
    """Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks)."""

    def __init__(self, drop_prob: float = 0.):
        # A default of None would fail at `1 - drop_prob` in training mode, so default to 0.
        super().__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        # self.training is switched by model.train() / model.eval()
        return drop_path(x, self.drop_prob, self.training)
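
To make the per-sample behaviour concrete, here is a small sketch that is not part of the original post. It repeats a compact copy of drop_path so that it runs on its own, then applies it to a toy batch. The batch shape, the seed, and drop_prob = 0.25 are arbitrary assumptions chosen for illustration. Each sample is either zeroed as a whole or kept and scaled by 1 / keep_prob, which is what keeps the expected output equal to the input.

```python
import torch


def drop_path(x, drop_prob: float = 0., training: bool = False):
    # Compact copy of the function above, repeated so this example is self-contained.
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)  # one mask value per sample
    mask = torch.floor(keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device))
    return x.div(keep_prob) * mask


torch.manual_seed(0)  # arbitrary seed, only for repeatability
x = torch.ones(8, 3, 4)  # hypothetical batch: 8 samples, 3 tokens, 4 features
drop_prob = 0.25
out = drop_path(x, drop_prob, training=True)

# Each sample is either dropped as a whole (all zeros) or kept and rescaled.
for i, row in enumerate(out.flatten(1)):
    state = "dropped" if row.abs().sum() == 0 else "kept"
    print(f"sample {i}: {state}, value = {row[0].item():.4f}")

# In evaluation mode the very same tensor is returned unchanged.
assert drop_path(x, drop_prob, training=False) is x
```

Note that the mask has shape (batch, 1, 1), so the decision is made once per sample and not per element, which is the difference from ordinary Dropout.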

Usage is shown below:

# In the transformer block's __init__:
# NOTE: drop path for stochastic depth, we shall see if this is better than dropout here
self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()

# In the same block, each residual branch is wrapped with drop_path:
def forward(self, x):
    x = x + self.drop_path(self.attn(self.norm1(x)))
    x = x + self.drop_path(self.mlp(self.norm2(x)))
    return x

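Drop path cannot be used when the network has only a single path. There must be at least a residual (skip) connection, so that a sample whose branch is dropped still passes through the identity path instead of becoming all zeros.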
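
The point about the residual connection can be sketched as follows. This is an illustrative example under stated assumptions, not the ViT block itself: ToyResidualBlock, its single Linear branch, the tensor sizes, and drop_path = 0.5 are all hypothetical, and DropPath is repeated in compact form so the snippet runs standalone.

```python
import torch
import torch.nn as nn


class DropPath(nn.Module):
    """Compact per-sample drop path, repeated here so the example is standalone."""

    def __init__(self, drop_prob: float = 0.):
        super().__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        if self.drop_prob == 0. or not self.training:
            return x
        keep_prob = 1 - self.drop_prob
        shape = (x.shape[0],) + (1,) * (x.ndim - 1)
        mask = torch.floor(keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device))
        return x.div(keep_prob) * mask


class ToyResidualBlock(nn.Module):
    """Hypothetical block computing x + drop_path(branch(norm(x)))."""

    def __init__(self, dim: int, drop_path: float = 0.):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.branch = nn.Linear(dim, dim)
        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()

    def forward(self, x):
        return x + self.drop_path(self.branch(self.norm(x)))


torch.manual_seed(0)  # arbitrary seed
x = torch.randn(6, 4)  # hypothetical batch: 6 samples, 4 features
block = ToyResidualBlock(dim=4, drop_path=0.5)

block.train()
y = block(x)
# Samples whose branch was dropped equal the input: only the skip path remains.
branch_dropped = (y == x).all(dim=1)
print("branch dropped per sample:", branch_dropped.tolist())

block.eval()
y_eval = block(x)  # in eval mode the branch is always active and not rescaled
```

Without the `x +` term, a dropped sample would be an all-zero tensor with nothing left to propagate, which is why the branch being dropped must sit next to a skip path.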