Attention Mechanisms: SENet and CBAM

あずにゃん

Categories: Artificial Intelligence, TensorFlow. Tags: TensorFlow 2.0

Article link: https://blog.csdn.net/zimiao552147572/article/details/104201845

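1. SENet module
	import tensorflow as tf

	def SE_module(input_xs, reduction_ratio=16):
		#input_xs has the shape [batch, height, width, channel]
		shape = input_xs.get_shape().as_list()
		#Squeeze: global average pooling over height and width, giving [batch, channel]
		se_module = tf.reduce_mean(input_xs, [1, 2])
		#First Dense: shape[-1] // reduction_ratio divides the input channel count by reduction_ratio (units must be an integer)
		se_module = tf.keras.layers.Dense(shape[-1] // reduction_ratio, activation=tf.nn.relu)(se_module)
		#Second Dense: restores the original input channel count. No ReLU here, because sigmoid(relu(x)) could never fall below 0.5
		se_module = tf.keras.layers.Dense(shape[-1])(se_module)
		se_module = tf.nn.sigmoid(se_module)
		#Reshape [batch, channel] to [batch, 1, 1, channel] so that the weights broadcast over height and width
		se_module = tf.reshape(se_module, [-1, 1, 1, shape[-1]])
		out_ys = tf.multiply(input_xs, se_module)
		return out_ys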
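
The squeeze, excitation and scale steps of the module can also be traced with plain NumPy, which makes the shapes easy to inspect. This is only a minimal sketch: `se_forward`, the random weights `w1`/`w2`, the zero biases and the tensor sizes are illustrative assumptions, not part of the original code.

```python
import numpy as np

def se_forward(x, w1, b1, w2, b2):
    """Squeeze-and-excitation on an NHWC array x, using hypothetical weights."""
    # Squeeze: average over height and width -> [batch, channel]
    z = x.mean(axis=(1, 2))
    # Excitation: bottleneck layer with ReLU, then a layer followed by sigmoid
    h = np.maximum(z @ w1 + b1, 0.0)
    s = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))
    # Scale: broadcast the [batch, 1, 1, channel] weights over height and width
    return x * s[:, None, None, :], s

rng = np.random.default_rng(0)
batch, height, width, channel, ratio = 2, 4, 4, 32, 16
x = rng.normal(size=(batch, height, width, channel))
w1 = rng.normal(size=(channel, channel // ratio)) * 0.1
b1 = np.zeros(channel // ratio)
w2 = rng.normal(size=(channel // ratio, channel)) * 0.1
b2 = np.zeros(channel)

y, s = se_forward(x, w1, b1, w2, b2)
print(y.shape, s.shape)  # (2, 4, 4, 32) (2, 32)
```

Each channel of `y` is the matching channel of `x` multiplied by a single weight between 0 and 1, which is what the reshape to [batch, 1, 1, channel] achieves in the TensorFlow version.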

2. ResNet block with the SENet module
	import tensorflow as tf

	def identity_block(input_xs, out_dim, with_shortcut_conv_BN=False):
		#Number of channels of the input
		input_channel = input_xs.get_shape().as_list()[-1]
		if with_shortcut_conv_BN:
			#The original left this branch as "pass", which left shortcut undefined.
			#A 1x1 Conv2D + BatchNormalization projection is assumed here from the flag name.
			shortcut = tf.keras.layers.Conv2D(filters=out_dim, kernel_size=1, padding="SAME")(input_xs)
			shortcut = tf.keras.layers.BatchNormalization()(shortcut)
		else:
			#Returns a tensor with the same shape and contents as the input, i.e. shortcut is equivalent to input_xs
			shortcut = tf.identity(input_xs)
			#If the input has fewer channels than the output, zero-pad the channel dimension
			if input_channel < out_dim:
				#Number of channels that have to be added
				pad_shape = out_dim - input_channel
				#name="padding" names the padding op. The defaults mode='CONSTANT' and constant_values=0 pad with zeros.
				#The second argument is the paddings: nothing on the batch, height and width dimensions; the channel dimension
				#gets pad_shape // 2 before and the remainder after, so the total is exact even when pad_shape is odd.
				shortcut = tf.pad(shortcut, [[0, 0], [0, 0], [0, 0], [pad_shape // 2, pad_shape - pad_shape // 2]], name="padding")
		#The three Conv2D layers of the residual block use 1x1, 3x3 and 1x1 kernels
		conv = tf.keras.layers.Conv2D(filters=out_dim // 4, kernel_size=1, padding="SAME", activation=tf.nn.relu)(input_xs)
		conv = tf.keras.layers.BatchNormalization()(conv)
		conv = tf.keras.layers.Conv2D(filters=out_dim // 4, kernel_size=3, padding="SAME", activation=tf.nn.relu)(conv)
		conv = tf.keras.layers.BatchNormalization()(conv)
		#The last 1x1 convolution must output out_dim channels, otherwise it cannot be added to the shortcut
		conv = tf.keras.layers.Conv2D(filters=out_dim, kernel_size=1, padding="SAME", activation=tf.nn.relu)(conv)
		conv = tf.keras.layers.BatchNormalization()(conv)
		#The SENet module starts here
		#shape is [batch, height, width, channel]
		shape = conv.get_shape().as_list()
		#With the default keepdims=False the reduced dimensions are dropped; with keepdims=True they are kept with size 1.
		#reduce_mean turns [batch, height, width, channel] into [batch, channel]
		se_module = tf.reduce_mean(conv, [1, 2])
		#First Dense: shape[-1] // 16 divides the channel count by a reduction ratio of 16 (units must be an integer)
		se_module = tf.keras.layers.Dense(shape[-1] // 16, activation=tf.nn.relu)(se_module)
		#Second Dense: restores the original channel count. No ReLU here, because sigmoid(relu(x)) could never fall below 0.5
		se_module = tf.keras.layers.Dense(shape[-1])(se_module)
		se_module = tf.nn.sigmoid(se_module)
		#Reshape [batch, channel] back to four dimensions, i.e. [batch, 1, 1, channel]
		se_module = tf.reshape(se_module, [-1, 1, 1, shape[-1]])
		#Element-wise multiplication of the SENet channel weights se_module and the residual branch output conv (the SENet input)
		se_module = tf.multiply(conv, se_module)
		#Residual connection: add the shortcut (derived from input_xs) and the SENet output se_module
		output_ys = tf.add(shortcut, se_module)
		output_ys = tf.nn.relu(output_ys)
		return output_ys
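
The zero-padding of the shortcut's channel dimension can be checked in isolation with `np.pad`, which takes the same per-dimension (before, after) pairs as `tf.pad`. The helper name `pad_channels` and the example sizes are assumptions for illustration. The sketch splits the padding so that the two sides always add up to the channel difference, including the case where that difference is odd.

```python
import numpy as np

def pad_channels(shortcut, out_dim):
    """Zero-pad the last (channel) axis of an NHWC array up to out_dim channels."""
    input_channel = shortcut.shape[-1]
    pad = out_dim - input_channel
    if pad < 0:
        raise ValueError("out_dim must not be smaller than the input channel count")
    before = pad // 2
    after = pad - before  # keeps before + after == pad even when pad is odd
    return np.pad(shortcut, [(0, 0), (0, 0), (0, 0), (before, after)])

x = np.ones((1, 2, 2, 3))
padded = pad_channels(x, 8)
print(padded.shape)     # (1, 2, 2, 8)
print(padded[0, 0, 0])  # the three original channels sit between the zero channels
```

Padding `pad // 2` on both sides, by contrast, yields one channel too few whenever the difference is odd, and the later addition with the residual branch then fails on a shape mismatch.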

3. CBAM module
	import tensorflow as tf

	def cbam_module(input_xs, reduction_ratio=0.5):
		#The channel count of the input is used as the number of units of the second fully connected layer
		hidden_num = input_xs.get_shape().as_list()[3]
		### Step 1: Channel Attention module ###
		#1. With the default keepdims=False the reduced dimension is dropped; with keepdims=True it is kept with size 1.
		#   The two chained reduce_max/reduce_mean calls first turn [batch, height, width, channel] into [batch, 1, width, channel]
		#   and then into [batch, 1, 1, channel].
		#2. The input of the Channel Attention module is pooled in two ways: global max pooling (reduce_max) and global average pooling (reduce_mean).
		maxpool_channel = tf.reduce_max(tf.reduce_max(input_xs, axis=1, keepdims=True), axis=2, keepdims=True)
		avgpool_channel = tf.reduce_mean(tf.reduce_mean(input_xs, axis=1, keepdims=True), axis=2, keepdims=True)
		maxpool_channel = tf.keras.layers.Flatten()(maxpool_channel)
		avgpool_channel = tf.keras.layers.Flatten()(avgpool_channel)
		#Two consecutive fully connected layers extract features from each pooled vector. The same two layers are shared by both
		#branches, as in the CBAM paper; the original built separate layers and applied ReLU to only one of the two outputs.
		#reduction_ratio is a multiplier here: with 0.5 the first layer has half as many units as the input has channels,
		#and the second layer restores the input channel count.
		mlp_1 = tf.keras.layers.Dense(units=int(hidden_num * reduction_ratio), activation=tf.nn.relu)
		mlp_2 = tf.keras.layers.Dense(units=hidden_num)
		#Features of the max-pooled branch, reshaped to [batch, 1, 1, channel]; hidden_num equals the input channel count
		mlp_2_max = mlp_2(mlp_1(maxpool_channel))
		mlp_2_max = tf.reshape(mlp_2_max, [-1, 1, 1, hidden_num])
		#Features of the average-pooled branch, reshaped to [batch, 1, 1, channel]
		mlp_2_avg = mlp_2(mlp_1(avgpool_channel))
		mlp_2_avg = tf.reshape(mlp_2_avg, [-1, 1, 1, hidden_num])
		#Add the features of the two branches, then squash the sum into the range 0 to 1 with sigmoid
		channel_attention = tf.nn.sigmoid(mlp_2_max + mlp_2_avg)
		#Multiply the input element-wise (with broadcasting) by the channel weights; the result is the input of the Spatial Attention module
		channel_refined_feature = input_xs * channel_attention

		### Step 2: Spatial Attention module ###
		#1. The output of the Channel Attention module is the input of the Spatial Attention module; it is again pooled with reduce_max and reduce_mean.
		#2. With the default keepdims=False the reduced dimension is dropped; with keepdims=True it is kept with size 1.
		#3. Max pooling over channels (reduce_max): [batch, height, width, channel] becomes [batch, height, width, 1]
		maxpool_spatial = tf.reduce_max(channel_refined_feature, axis=3, keepdims=True)
		#Average pooling over channels (reduce_mean): [batch, height, width, channel] becomes [batch, height, width, 1]
		avgpool_spatial = tf.reduce_mean(channel_refined_feature, axis=3, keepdims=True)
		#Concatenate the two pooled maps along the channel dimension, giving [batch, height, width, 2]
		max_avg_pool_spatial = tf.concat([maxpool_spatial, avgpool_spatial], axis=3)
		#Reduce the channel dimension to 1 so that sigmoid yields one weight per spatial position
		conv_layer = tf.keras.layers.Conv2D(filters=1, kernel_size=(3, 3), padding="same", activation=None)(max_avg_pool_spatial)
		#Squash into the range 0 to 1 with sigmoid
		spatial_attention = tf.nn.sigmoid(conv_layer)
		#Multiply the Spatial Attention input channel_refined_feature element-wise by the Spatial Attention weights spatial_attention;
		#the result is the output of the CBAM attention module.
		refined_feature = channel_refined_feature * spatial_attention
		#Add the CBAM input input_xs to the CBAM output refined_feature to obtain the final result
		output_layer = refined_feature + input_xs
		return output_layer
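
The two CBAM stages can be followed shape by shape in NumPy. This is a hedged sketch rather than the article's implementation: the function names, the random weights `w1`, `w2` and `kernel`, the shared bias-free MLP and the explicit convolution loop are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def channel_attention(x, w1, w2):
    """x: [batch, h, w, c]; w1: [c, hidden]; w2: [hidden, c], shared by both branches."""
    def mlp(v):
        return np.maximum(v @ w1, 0.0) @ w2
    max_pool = x.max(axis=(1, 2))    # [batch, c]
    avg_pool = x.mean(axis=(1, 2))   # [batch, c]
    return sigmoid(mlp(max_pool) + mlp(avg_pool))[:, None, None, :]

def spatial_attention(x, kernel):
    """x: [batch, h, w, c]; kernel: [k, k, 2], one filter applied with 'same' zero padding."""
    pooled = np.stack([x.max(axis=3), x.mean(axis=3)], axis=3)  # [batch, h, w, 2]
    k = kernel.shape[0]
    p = k // 2
    padded = np.pad(pooled, [(0, 0), (p, p), (p, p), (0, 0)])
    batch, h, w, _ = pooled.shape
    out = np.zeros((batch, h, w, 1))
    for i in range(h):
        for j in range(w):
            window = padded[:, i:i + k, j:j + k, :]  # [batch, k, k, 2]
            out[:, i, j, 0] = (window * kernel).sum(axis=(1, 2, 3))
    return sigmoid(out)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 5, 5, 8))
w1 = rng.normal(size=(8, 4)) * 0.1
w2 = rng.normal(size=(4, 8)) * 0.1
kernel = rng.normal(size=(3, 3, 2)) * 0.1

ca = channel_attention(x, w1, w2)      # one weight per channel
refined = x * ca
sa = spatial_attention(refined, kernel)  # one weight per spatial position
out = refined * sa + x
print(ca.shape, sa.shape, out.shape)  # (2, 1, 1, 8) (2, 5, 5, 1) (2, 5, 5, 8)
```

The channel weights have size 1 on the spatial axes and the spatial weights have size 1 on the channel axis, so both multiplications are broadcast element-wise products, not inner products.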

4. ResNet block with the CBAM module
	#CBAM module: cbam_module is the same function as in section 3
	import tensorflow as tf

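	# ResNet block with the CBAM module
	def identity_block(input_xs, out_dim, with_shortcut_conv_BN=False):
		#Number of channels of the input
		input_channel = input_xs.get_shape().as_list()[-1]
		if with_shortcut_conv_BN:
			#The original left this branch as "pass", which left shortcut undefined.
			#A 1x1 Conv2D + BatchNormalization projection is assumed here from the flag name.
			shortcut = tf.keras.layers.Conv2D(filters=out_dim, kernel_size=1, padding="SAME")(input_xs)
			shortcut = tf.keras.layers.BatchNormalization()(shortcut)
		else:
			#Returns a tensor with the same shape and contents as the input, i.e. shortcut is equivalent to input_xs
			shortcut = tf.identity(input_xs)
			#If the input has fewer channels than the output, zero-pad the channel dimension
			if input_channel < out_dim:
				#Number of channels that have to be added
				pad_shape = out_dim - input_channel
				#name="padding" names the padding op. The defaults mode='CONSTANT' and constant_values=0 pad with zeros.
				#The second argument is the paddings: nothing on the batch, height and width dimensions; the channel dimension
				#gets pad_shape // 2 before and the remainder after, so the total is exact even when pad_shape is odd.
				shortcut = tf.pad(shortcut, [[0, 0], [0, 0], [0, 0], [pad_shape // 2, pad_shape - pad_shape // 2]], name="padding")
		#The first three Conv2D layers of the residual block use 1x1, 3x3 and 1x1 kernels
		conv = tf.keras.layers.Conv2D(filters=out_dim // 4, kernel_size=1, padding="SAME", activation=tf.nn.relu)(input_xs)
		conv = tf.keras.layers.BatchNormalization()(conv)
		conv = tf.keras.layers.Conv2D(filters=out_dim // 4, kernel_size=3, padding="SAME", activation=tf.nn.relu)(conv)
		conv = tf.keras.layers.BatchNormalization()(conv)
		conv = tf.keras.layers.Conv2D(filters=out_dim // 4, kernel_size=1, padding="SAME", activation=tf.nn.relu)(conv)
		conv = tf.keras.layers.BatchNormalization()(conv)
		#A final 1x1 convolution expands to out_dim channels. The original used the TF1 calls tf.layers.conv2d and
		#tf.layers.batch_normalization and a name built from an undefined layer_depth; the name is omitted here.
		conv = tf.keras.layers.Conv2D(filters=out_dim, kernel_size=1, strides=1,
				      kernel_initializer=tf.keras.initializers.VarianceScaling(), bias_initializer="zeros")(conv)
		conv = tf.keras.layers.BatchNormalization()(conv)
		# The CBAM module used inside the ResNet block
		conv = cbam_module(conv)
		#Residual connection: add the shortcut (derived from input_xs) and the CBAM output conv
		output_ys = shortcut + conv
		output_ys = tf.nn.relu(output_ys)
		return output_ys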

ResNet-50 with the SENet module
	from tensorflow.keras import backend as K
	from tensorflow.keras.layers import (BatchNormalization, Concatenate, Conv1D, Dense, GlobalAveragePooling1D,
		GlobalMaxPooling1D, Lambda, MaxPooling1D, ZeroPadding1D)

	#The methods below belong to a model class (note the self parameter)
	def expand_dim_backend(self, x):
		#Reshape the squeezed [batch, 256] tensor to [batch, 1, 256] so that Conv1D can be applied to it
		x1 = K.reshape(x, (-1, 1, 256))
		print('x1:', x1)
		return x1

	def multiply(self, a):
		#Element-wise multiplication with broadcasting: [batch, steps, 256] * [batch, 1, 256].
		#The tensor * operator replaces np.multiply, which is a NumPy function rather than a Keras/TensorFlow op.
		x = a[0] * a[1]
		print('x:', x)
		return x

	def make_net_Res(self, encoding):
		# Input
		x = ZeroPadding1D(padding=3)(encoding)
		x = Conv1D(filters=64, kernel_size=7, strides=2, padding='valid', activation='relu')(x)
		#Note: Conv1D outputs [batch, steps, channels], so axis=1 keeps separate statistics per step; axis=-1 would normalize per channel
		x = BatchNormalization(axis=1, scale=True)(x)
		x_pool = MaxPooling1D(pool_size=3, strides=2, padding='same')(x)

		#RESNet_1
		x = Conv1D(filters=128, kernel_size=1, strides=1, padding='valid', activation='relu')(x_pool)
		x = BatchNormalization(axis=1, scale=True)(x)
		x = Conv1D(filters=128, kernel_size=3, strides=1, padding='valid', activation='relu')(x)
		x = BatchNormalization(axis=1, scale=True)(x)
		RES_1 = Conv1D(filters=256, kernel_size=1, strides=1, padding='valid', activation='relu')(x)
		x = BatchNormalization(axis=1, scale=True)(RES_1)

		# SENet
		squeeze = GlobalAveragePooling1D()(x)
		squeeze = Lambda(self.expand_dim_backend)(squeeze)
		excitation = Conv1D(filters=16, kernel_size=1, strides=1, padding='valid', activation='relu')(squeeze)
		excitation = Conv1D(filters=256, kernel_size=1, strides=1, padding='valid', activation='sigmoid')(excitation)
		x_pool_1 = Conv1D(filters=256, kernel_size=1, strides=1, padding='valid', activation='relu')(x_pool)
		x_pool_1 = BatchNormalization(axis=1, scale=True)(x_pool_1)

		#Element-wise multiplication of the SENet weights excitation and the residual branch output RES_1
		scale = Lambda(self.multiply)([RES_1, excitation])
		#The two tensors are joined along the step axis (axis=1), not added, so their lengths may differ
		res_1 = Concatenate(axis=1)([x_pool_1, scale])

		#RESNet_2
		x = Conv1D(filters=128, kernel_size=1, activation='relu')(res_1)
		x = BatchNormalization(axis=1, scale=True)(x)
		x = Conv1D(filters=128, kernel_size=3, activation='relu')(x)
		x = BatchNormalization(axis=1, scale=True)(x)
		RES_2 = Conv1D(filters=256, kernel_size=1)(x)

		# SENet
		squeeze = GlobalAveragePooling1D()(RES_2)
		squeeze = Lambda(self.expand_dim_backend)(squeeze)
		excitation = Conv1D(filters=16, kernel_size=1, strides=1, padding='valid', activation='relu')(squeeze)
		excitation = Conv1D(filters=256, kernel_size=1, strides=1, padding='valid', activation='sigmoid')(excitation)

		#Element-wise multiplication of the SENet weights excitation and the residual branch output RES_2 (the SENet input)
		scale = Lambda(self.multiply)([RES_2, excitation])
		x = Concatenate(axis=1)([res_1, scale])
		x = GlobalMaxPooling1D()(x)
		print('x:', x)
		output = Dense(1, activation='sigmoid')(x)
		return output
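
Because this network joins its branches with Concatenate(axis=1) instead of adding them, it helps to trace how the sequence length changes from layer to layer. The sketch below uses the usual output-length rules for 'valid' convolutions and 'same' pooling; the helper names and the input length of 100 are assumptions chosen for illustration.

```python
import math

def conv1d_valid(length, kernel_size, strides=1):
    """Output length of a Conv1D with padding='valid'."""
    return (length - kernel_size) // strides + 1

def pool1d_same(length, strides):
    """Output length of a pooling layer with padding='same'."""
    return math.ceil(length / strides)

def trace_lengths(input_length):
    padded = input_length + 2 * 3           # ZeroPadding1D(padding=3)
    stem = conv1d_valid(padded, 7, 2)       # Conv1D(kernel_size=7, strides=2)
    x_pool = pool1d_same(stem, 2)           # MaxPooling1D(pool_size=3, strides=2, padding='same')
    # Block 1: kernels 1, 3, 1; only the kernel of size 3 shortens the sequence
    branch_1 = conv1d_valid(conv1d_valid(conv1d_valid(x_pool, 1), 3), 1)
    res_1 = x_pool + branch_1               # Concatenate(axis=1)([x_pool_1, scale])
    # Block 2: the same kernels applied to res_1
    branch_2 = conv1d_valid(conv1d_valid(conv1d_valid(res_1, 1), 3), 1)
    final = res_1 + branch_2                # Concatenate(axis=1)([res_1, scale])
    return {"padded": padded, "stem": stem, "x_pool": x_pool, "res_1": res_1, "final": final}

print(trace_lengths(100))
# {'padded': 106, 'stem': 50, 'x_pool': 25, 'res_1': 48, 'final': 94}
```

The concatenated tensors only need to agree on the channel dimension (256 in both cases), and GlobalMaxPooling1D then collapses the step axis, whatever its length, to a [batch, 256] vector.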
