TensorFlow 2.x Keras implementation of CBAM

Theoretical derivation

Overall model architecture

The paper's ablation experiments show that connecting the two attention modules in series works better than connecting them in parallel, and that placing channel attention before spatial attention works better than the reverse order.
The final design therefore chains a channel attention module followed by a spatial attention module.
The expressions are:

$$F' = M_{c}(F) \otimes F$$
$$F'' = M_{s}(F') \otimes F'$$

where
$F \in \mathbb{R}^{C \times H \times W}$ is the input feature map of the module;
$M_{c} \in \mathbb{R}^{C \times 1 \times 1}$ is the 1D channel attention map;
$M_{s} \in \mathbb{R}^{1 \times H \times W}$ is the 2D spatial attention map;
$\otimes$ denotes element-wise multiplication (broadcast over the missing dimensions).
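A minimal numeric sketch of these two equations (an illustration only; the channels-last layout and the random tensors F, M_c, M_s are assumptions for demonstration, since Keras uses NHWC rather than the paper's C×H×W notation):

import tensorflow as tf

F   = tf.random.normal([1, 56, 56, 256])    # input feature map, (batch, H, W, C)
M_c = tf.random.uniform([1, 1, 1, 256])     # channel attention map, broadcast over H and W
M_s = tf.random.uniform([1, 56, 56, 1])     # spatial attention map, broadcast over channels

F1 = M_c * F     # F'  = Mc(F) * F  (element-wise)
F2 = M_s * F1    # F'' = Ms(F') * F' (element-wise)
print(F2.shape)  # (1, 56, 56, 256): attention only rescales features, shapes are unchanged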

Channel attention module

1. Apply global max pooling and global average pooling to the input feature map in parallel;
2. Feed both pooled vectors through a shared two-layer fully connected bottleneck (reduction ratio $r$);
3. Add the two outputs element-wise and apply a sigmoid activation to obtain the channel attention map.

The corresponding expression for $M_{c}(F)$ is:

$$
\begin{aligned}
M_{c}(F) &= \sigma\big(MLP(AvgPool(F)) + MLP(MaxPool(F))\big) \\
         &= \sigma\big(W_{1}(W_{0}(F_{avg}^{c})) + W_{1}(W_{0}(F_{max}^{c}))\big)
\end{aligned}
$$

where
$\sigma$ denotes the sigmoid activation;
$F_{avg}^{c}$ and $F_{max}^{c}$ are the global average-pooled and global max-pooled features, respectively;
$W_{0} \in \mathbb{R}^{\frac{C}{r} \times C}$ and $W_{1} \in \mathbb{R}^{C \times \frac{C}{r}}$ are the weights of the two shared fully connected layers ($r$ is the reduction ratio).
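For reference, here is a minimal standalone sketch of this channel attention module (my own illustration assuming channels-last input; `channel_attention` is a hypothetical helper name, and the version actually integrated into the ResNet bottleneck appears in the code section below):

import tensorflow as tf
from tensorflow.keras import layers

def channel_attention(x, reduction_ratio=16):
    # x: feature map of shape (batch, H, W, C)
    channels = x.shape[-1]
    # shared two-layer MLP: bottleneck W0 with ReLU, then W1 with no activation
    fc1 = layers.Dense(channels // reduction_ratio, activation='relu')
    fc2 = layers.Dense(channels)
    avg_out = fc2(fc1(layers.GlobalAveragePooling2D()(x)))    # MLP(AvgPool(F))
    max_out = fc2(fc1(layers.GlobalMaxPooling2D()(x)))        # MLP(MaxPool(F))
    scale = layers.Activation('sigmoid')(avg_out + max_out)   # sigmoid after the element-wise sum
    scale = layers.Reshape((1, 1, channels))(scale)           # (batch, 1, 1, C) so it broadcasts over H, W
    return x * scale                                          # F' = Mc(F) * F

# example: y = channel_attention(tf.keras.Input(shape=(56, 56, 256)))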

Spatial attention module

1. Apply max pooling and average pooling to the input feature map along the channel dimension and concatenate the two resulting maps;
2. Convolve the concatenated map with a 7×7 kernel;
3. Apply a sigmoid activation to obtain the spatial attention map.
$$
\begin{aligned}
M_{s}(F) &= \sigma\big(f^{7 \times 7}([AvgPool(F); MaxPool(F)])\big) \\
         &= \sigma\big(f^{7 \times 7}([F_{avg}^{s}; F_{max}^{s}])\big)
\end{aligned}
$$

where
$\sigma$ denotes the sigmoid activation;
$F_{avg}^{s} \in \mathbb{R}^{1 \times H \times W}$ and $F_{max}^{s} \in \mathbb{R}^{1 \times H \times W}$ are the features obtained by average pooling and max pooling along the channel dimension, respectively;
$f^{7 \times 7}$ denotes a convolution with a 7×7 kernel.
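Similarly, a minimal standalone sketch of the spatial attention module (again an illustration with a hypothetical helper name `spatial_attention`, assuming channels-last input):

import tensorflow as tf
from tensorflow.keras import layers

def spatial_attention(x, kernel_size=7):
    # x: feature map of shape (batch, H, W, C)
    avg_out = tf.reduce_mean(x, axis=-1, keepdims=True)   # average pooling along the channel axis -> (batch, H, W, 1)
    max_out = tf.reduce_max(x, axis=-1, keepdims=True)    # max pooling along the channel axis    -> (batch, H, W, 1)
    concat = layers.Concatenate(axis=-1)([avg_out, max_out])
    # 7x7 convolution followed by sigmoid gives the 2D attention map Ms
    scale = layers.Conv2D(1, kernel_size, padding='same', activation='sigmoid')(concat)
    return x * scale                                       # F'' = Ms(F') * F'

# example: y = spatial_attention(tf.keras.Input(shape=(56, 56, 256)))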

Comparison with SENet

Compared with SENet, CBAM's innovation in the channel attention is to perform global max pooling in addition to global average pooling. As the original paper puts it:
we show that those are suboptimal features in order to infer fine channel attention, and we suggest to use max-pooled features as well;
max-pooling gathers another important clue about distinctive object features to infer finer channel-wise attention;
max-pooled features are as meaningful as average-pooled features, comparing the accuracy improvement from the baseline;
channel pooling produces better accuracy, indicating that explicitly modeled pooling leads to finer attention inference rather than learnable weighted channel pooling
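In code, the difference boils down to the channel descriptor fed to the shared MLP (a sketch for illustration only, channels-last layout assumed):

import tensorflow as tf

x = tf.random.normal([1, 56, 56, 256])            # feature map, (batch, H, W, C)
se_descriptor = tf.reduce_mean(x, axis=[1, 2])    # SENet: global average pooling only
cbam_avg      = tf.reduce_mean(x, axis=[1, 2])    # CBAM: global average pooling ...
cbam_max      = tf.reduce_max(x, axis=[1, 2])     # ... plus global max pooling
# In CBAM both descriptors pass through the same shared MLP and are summed before the sigmoid.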

Code implementation

The code below is built on top of the TensorFlow 2.0 Keras ResNet-50/101/152 implementation from an earlier post. For details on the model construction, see:
Tensorflow 2.0 keras.models.Sequential() Model() — several ways to create a network, and the shared-weights issue

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models, Sequential
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Dense, Flatten, Reshape, Dropout, BatchNormalization, Activation, GlobalAveragePooling2D
from tensorflow.keras.layers import GlobalMaxPool2D, Concatenate

# Bottleneck convolution block of ResNet-50/101/152, with a CBAM module
# (channel attention followed by spatial attention) inserted before the residual addition
def conv_block(inputs, filter_num, reduction_ratio, stride=1, name=None):
    
    x = inputs
    x = Conv2D(filter_num[0], (1,1), strides=stride, padding='same', name=name+'_conv1')(x)
    x = BatchNormalization(axis=3, name=name+'_bn1')(x)
    x = Activation('relu', name=name+'_relu1')(x)

    x = Conv2D(filter_num[1], (3,3), strides=1, padding='same', name=name+'_conv2')(x)
    x = BatchNormalization(axis=3, name=name+'_bn2')(x)
    x = Activation('relu', name=name+'_relu2')(x)

    x = Conv2D(filter_num[2], (1,1), strides=1, padding='same', name=name+'_conv3')(x)
    x = BatchNormalization(axis=3, name=name+'_bn3')(x)

    # Channel Attention
    avgpool = GlobalAveragePooling2D(name=name+'_channel_avgpool')(x)
    maxpool = GlobalMaxPool2D(name=name+'_channel_maxpool')(x)
    # Shared MLP
    Dense_layer1 = Dense(filter_num[2]//reduction_ratio, activation='relu', name=name+'_channel_fc1')
    Dense_layer2 = Dense(filter_num[2], name=name+'_channel_fc2')  # no activation: the sigmoid is applied after the two branches are summed
    avg_out = Dense_layer2(Dense_layer1(avgpool))
    max_out = Dense_layer2(Dense_layer1(maxpool))

    channel = layers.add([avg_out, max_out])
    channel = Activation('sigmoid', name=name+'_channel_sigmoid')(channel)
    channel = Reshape((1,1,filter_num[2]), name=name+'_channel_reshape')(channel)
    channel_out = tf.multiply(x, channel)
    
    # Spatial Attention
    avgpool = tf.reduce_mean(channel_out, axis=3, keepdims=True, name=name+'_spatial_avgpool')
    maxpool = tf.reduce_max(channel_out, axis=3, keepdims=True, name=name+'_spatial_maxpool')
    spatial = Concatenate(axis=3)([avgpool, maxpool])

    spatial = Conv2D(1, (7,7), strides=1, padding='same',name=name+'_spatial_conv2d')(spatial)
    spatial_out = Activation('sigmoid', name=name+'_spatial_sigmoid')(spatial)

    CBAM_out = tf.multiply(channel_out, spatial_out)

    # residual connection
    r = Conv2D(filter_num[2], (1,1), strides=stride, padding='same', name=name+'_residual')(inputs)
    x = layers.add([CBAM_out, r])
    x = Activation('relu', name=name+'_relu3')(x)

    return x

def build_block(x, filter_num, blocks, reduction_ratio=16, stride=1, name=None):

    x = conv_block(x, filter_num, reduction_ratio, stride, name=name)

    for i in range(1, blocks):
        x = conv_block(x, filter_num, reduction_ratio, stride=1, name=name+'_block'+str(i))

    return x


# Build ResNet-50/101/152 with CBAM blocks
def SE_ResNet(Netname, nb_classes):

    ResNet_Config = {'ResNet50':[3,4,6,3],
                    'ResNet101':[3,4,23,3],
                    'ResNet152':[3,8,36,3]}
    layers_dims=ResNet_Config[Netname]

    filter_block1=[64, 64, 256]
    filter_block2=[128,128,512]
    filter_block3=[256,256,1024]
    filter_block4=[512,512,2048]

    # Reduction ratio in four blocks
    SE_reduction=[16,16,16,16]

    img_input = Input(shape=(224,224,3))
    # stem block 
    x = Conv2D(64, (7,7), strides=(2,2),padding='same', name='stem_conv')(img_input)
    x = BatchNormalization(axis=3, name='stem_bn')(x)
    x = Activation('relu', name='stem_relu')(x)
    x = MaxPooling2D((3,3), strides=(2,2), padding='same', name='stem_pool')(x)
    # convolution block
    x = build_block(x, filter_block1, layers_dims[0], SE_reduction[0], name='conv1')
    x = build_block(x, filter_block2, layers_dims[1], SE_reduction[1], stride=2, name='conv2')
    x = build_block(x, filter_block3, layers_dims[2], SE_reduction[2], stride=2, name='conv3')
    x = build_block(x, filter_block4, layers_dims[3], SE_reduction[3], stride=2, name='conv4')
    # top layer
    x = GlobalAveragePooling2D(name='top_layer_pool')(x)
    x = Dense(nb_classes, activation='softmax', name='fc')(x)

    model = models.Model(img_input, x, name=Netname)

    return model
    

if __name__=='__main__':
    model = SE_ResNet('ResNet50', 1000)
    model.summary()
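    # Quick sanity check (illustrative only): a forward pass on a random batch
    # should produce class probabilities of shape (batch, nb_classes).
    y = model(tf.random.normal([2, 224, 224, 3]))
    print(y.shape)  # expected: (2, 1000)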