Adding an SE Module and Extra Data Augmentation to the EfDeRain Model

Asker_CXQ

Last modified 2023-06-05 16:27:14

Category: paper. Tags: machine learning, pytorch

First published 2022-08-04 09:39:07

Article link: https://blog.csdn.net/MZ_CXQ/article/details/126153270

Paper: EfficientDeRain: Learning Pixel-wise Dilation Filtering for High-Efficiency Single-Image Deraining | Proceedings of the AAAI Conference on Artificial Intelligence

Code: code

Inserting the SE module

The EfDeRain model:

[Figure: EfDeRain model architecture (image unavailable)]

SE block:

[Figure: SE block structure (image unavailable)]

SE block code example:

# ----------------------------------------
#       SE block (channel attention)
# ----------------------------------------
import torch.nn as nn


class SE_block(nn.Module):
    # SE block doesn't change the shape of the feature map
    def __init__(self, channel):
        super().__init__()
        # The reduction ratio 16 is the value set in the paper; it cuts the parameter count.
        # When channel is small (< 64) the width is not reduced.
        hidden = channel // 16 if channel >= 64 else channel
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d((1, 1)),  # global average pooling -> (N, C, 1, 1)
            # kernel_size=1 on a 1x1 map is equivalent to a fully connected layer
            nn.Conv2d(channel, hidden, kernel_size=1),
            nn.ReLU(),
            nn.Conv2d(hidden, channel, kernel_size=1),
            nn.Sigmoid()
        )

    def forward(self, x):
        # x: feature map of shape (N, C, H, W)
        channel_weight = self.se(x)  # per-channel weights in (0, 1)
        x = x * channel_weight  # scale each channel
        return x
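
To make the squeeze, excite and scale steps concrete, here is a minimal NumPy-only sketch of the same computation. It is not the module above: se_forward is a hypothetical helper, the weights are random placeholders, and the two 1x1 convolutions are written as plain matrix products, which is what they reduce to on a 1x1 map.

```python
import numpy as np


def se_forward(x, w1, b1, w2, b2):
    """Squeeze-and-excitation on x of shape (N, C, H, W), NumPy only."""
    squeezed = x.mean(axis=(2, 3))                      # global average pool -> (N, C)
    hidden = np.maximum(squeezed @ w1.T + b1, 0.0)      # FC + ReLU -> (N, C_hidden)
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2.T + b2)))  # FC + sigmoid -> (N, C)
    return x * gate[:, :, None, None], gate             # rescale every channel


# Hypothetical random weights, only to exercise the shapes.
rng = np.random.default_rng(0)
channel = 64
hidden_channels = channel // 16 if channel >= 64 else channel  # 4, same rule as SE_block
x = rng.standard_normal((2, channel, 8, 8))
w1 = rng.standard_normal((hidden_channels, channel))
b1 = np.zeros(hidden_channels)
w2 = rng.standard_normal((channel, hidden_channels))
b2 = np.zeros(channel)

out, gate = se_forward(x, w1, b1, w2, b2)
print(out.shape, gate.shape)  # (2, 64, 8, 8) (2, 64)
```

The output has the same shape as the input, and every channel is multiplied by a single weight between 0 and 1.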

The SE module was added at the following position:

[Figure: position of the SE module in the network (image unavailable)]

Data augmentation

I made one small addition (I have not verified whether it helps).

A scale (shrink) operation was added in augment_and_mix.py:

import numpy as np
from PIL import Image
from torchvision import transforms


# cxq: randomly shrink the image by a factor of 0.4 to 0.8
def scale(image):
    image = np.clip(image * 255., 0, 255).astype(np.uint8)
    pil_img = Image.fromarray(image)  # Convert to PIL.Image
    pil_img = transforms.RandomAffine(degrees=0, scale=(0.4, 0.8))(pil_img)
    return np.asarray(pil_img) / 255.
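
For intuition about what this shrink does to a sample, here is a rough NumPy-only sketch of a shrink about the image center on a canvas of unchanged size. It assumes nearest-neighbor sampling and zero fill, which I believe are the RandomAffine defaults. shrink_about_center is a hypothetical helper, not part of torchvision or the EfDeRain code, and its pixel-center convention may differ slightly from torchvision's.

```python
import numpy as np


def shrink_about_center(image, factor):
    """Shrink an (h, w, c) image by `factor` about its center; canvas size is kept."""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Inverse mapping: for every output pixel, find the source pixel it comes from.
    src_y = np.rint((ys - cy) / factor + cy).astype(int)
    src_x = np.rint((xs - cx) / factor + cx).astype(int)
    valid = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)
    out = np.zeros_like(image)  # pixels outside the shrunken image stay 0
    out[valid] = image[src_y[valid], src_x[valid]]
    return out


image = np.ones((10, 10, 3), dtype=np.float32)
# The factor is fixed here; scale() draws it at random from (0.4, 0.8).
small = shrink_about_center(image, factor=0.5)
print(small.shape)
```

The result keeps the (h, w, c) shape, with the shrunken content in the middle and a zero border around it.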

----------------------------------------------------------------------------
def augment_and_mix(image, severity=3, width=3, depth=-1, alpha=1.):
    """Perform AugMix augmentations and compute mixture.

    Args:
      image: Raw input image as float32 np.ndarray of shape (h, w, c)
      severity: Severity of underlying augmentation operators (between 1 to 10).
      width: Width of augmentation chain
      depth: Depth of augmentation chain. -1 enables stochastic depth, drawn
             uniformly from {2, 3} (np.random.randint(2, 4) below)
      alpha: Probability coefficient for Beta and Dirichlet distributions.

    Returns:
      mixed: Augmented and mixed image.
    """
    width += 1  # cxq

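    # ws holds the w_i of the paper, e.g. [0.01519703 0.3264288 0.6583742].
    # That example has only 3 weights, while the paper uses 4; with the
    # width += 1 above, the default width=3 now yields 4.
    ws = np.float32(
        np.random.dirichlet(
            [alpha] * width))
    m = np.float32(np.random.beta(alpha, alpha))  # m is a scalar between 0 and 1, the w of the paper

    mix = np.zeros_like(image)  # R_mix in the paper

    for i in range(width-1):  # i = 0, 1, 2
        image_aug = image.copy()
        # randint(2, 4) samples from [2, 4), so depth=-1 picks 2 or 3 at random.
        # depth is overwritten here, so the value drawn for the first chain is reused by the later ones.
        depth = depth if depth > 0 else np.random.randint(2, 4)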
        for _ in range(depth):
            op = np.random.choice(augmentations.augmentations)
            # print(op)
            image_aug = apply_op(image_aug, op, severity)  # apply op to image_aug; severity sets the strength (level) of the operation
        # Preprocessing commutes since all coefficients are convex
        mix += ws[i] * normalize(image_aug)

    # cxq--------------------------------------------------------------------
    image_aug = mix.copy()
    mix += ws[width-1] * normalize(scale(image_aug))
    # cxq--------------------------------------------------------------------

    max_ws = max(ws)
    rate = 1.0 / max_ws
    # print(rate)

    # mixed = (random.randint(5000, 9000)/10000) * normalize(image) + (random.randint((int)(rate*3000), (int)(rate*10000))/10000) * mix
    mixed = max((1 - m), 0.7) * normalize(image) + max(m, rate * 0.5) * mix
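    # The formula actually used. It seems to control the share of R_mix; I am not sure what rate is for.
    # Is rate there because the levels differ? The ops also have a rate.

    # mixed = (1 - m) * normalize(image) + m * mix  # the formula used in the paper's algorithm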
    return mixed
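
The open question about rate can be partly answered from the coefficients alone. ws is drawn from a Dirichlet distribution and sums to 1, so max(ws) lies in [1/width, 1] and rate = 1 / max(ws) lies in [1, width]; rate * 0.5 is therefore never below 0.5. The small sketch below compares the paper formula with the one actually used. mixing_coefficients is a hypothetical helper written for this comparison only.

```python
def mixing_coefficients(ws, m):
    """Return (image_coef, mix_coef) for the paper formula and for the modified one."""
    rate = 1.0 / max(ws)                      # in [1, len(ws)] because sum(ws) == 1
    paper = (1 - m, m)                        # convex combination, sums to 1
    modified = (max(1 - m, 0.7), max(m, rate * 0.5))
    return paper, modified


# Uniform weights give the largest possible rate (4 for four weights).
paper, modified = mixing_coefficients([0.25, 0.25, 0.25, 0.25], m=0.5)
print(paper)     # (0.5, 0.5)
print(modified)  # (0.7, 2.0)
```

With the modified formula the clean image always gets a weight of at least 0.7 and R_mix a weight of at least 0.5, so the two coefficients no longer sum to 1 and the result is not a convex combination. The more evenly the Dirichlet weights are spread, the larger rate becomes and the more R_mix is amplified.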