This work draws on the Position Attention Module (PAM) and the Channel Attention Module (CAM) from Dual Attention Network for Scene Segmentation.
arXiv: (link)
official code: (link)
model code: (link)
Below I have collected some questions that came up while studying and using these modules, along with the answers I found.
(Figure from the paper: detailed diagrams of the two modules.)
Regarding the Channel Attention Module, the paper contains this passage:
Noted that we do not employ convolution layers to embed features before computing relationships of two channels, since it can maintain relationship between different channel maps.
With that in mind, pay particular attention in the code below to where the PAM and CAM implementations differ.
As for the final output of the two modules, the paper states that it is produced as follows:
PAM:

$$E_j = \alpha\sum_{i=1}^{N}(s_{ji}D_i) + A_j$$
CAM:

$$E_j = \beta\sum_{i=1}^{C}(x_{ji}A_i) + A_j$$
Here $\alpha$ and $\beta$, in the paper's words, each
"is initialized as 0 and gradually learns to assign more weight".
When I first read the paper I had no idea how this was supposed to be implemented; after looking at the code it suddenly became clear.
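In short, $\alpha$ and $\beta$ are each a single learnable scalar registered as an nn.Parameter and initialized to zero. A minimal sketch of the pattern (LearnableScale is a hypothetical name of my own; the real modules below inline this logic):

import torch
import torch.nn as nn

class LearnableScale(nn.Module):
    """Hypothetical distillation of how DANet handles alpha/beta."""
    def __init__(self):
        super(LearnableScale, self).__init__()
        # initialized as 0, then updated by backprop like any other weight
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, attn_out, x):
        # at initialization gamma is 0, so the output is just the residual x;
        # training gradually increases gamma, giving the attention branch more weight
        return self.gamma * attn_out + x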
The Position Attention Module in DANet
import torch
import torch.nn as nn

class PAM_Module(nn.Module):
    """Position attention module"""
    # Ref from SAGAN
    def __init__(self, in_dim):
        super(PAM_Module, self).__init__()
        self.channel_in = in_dim
        self.query_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1)
        self.key_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1)
        self.value_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # note how $\alpha$ is initialized and used
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):
        """
        inputs:
            x : input feature maps (B * C * H * W)
        returns:
            out : attention value + input feature
            attention : B * (H*W) * (H*W)
        """
        m_batchsize, C, height, width = x.size()
        proj_query = self.query_conv(x).view(m_batchsize, -1, width*height).permute(0, 2, 1)
        proj_key = self.key_conv(x).view(m_batchsize, -1, width*height)
        energy = torch.bmm(proj_query, proj_key)  # pixel-to-pixel similarity, B * (H*W) * (H*W)
        attention = self.softmax(energy)
        proj_value = self.value_conv(x).view(m_batchsize, -1, width*height)
        out = torch.bmm(proj_value, attention.permute(0, 2, 1))
        out = out.view(m_batchsize, C, height, width)
        out = self.gamma*out + x
        return out
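A quick smoke test of the module (the batch size, channel count, and spatial size below are arbitrary choices of mine; in_dim just needs to be divisible by 8 for the query/key convolutions):

# assumes the PAM_Module definition above has been run
pam = PAM_Module(in_dim=64)
x = torch.randn(2, 64, 16, 16)   # B * C * H * W
out = pam(x)
print(out.shape)                 # torch.Size([2, 64, 16, 16]), same shape as the input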
The Channel Attention Module in DANet
class CAM_Module(nn.Module):
    """Channel attention module"""
    def __init__(self, in_dim):
        super(CAM_Module, self).__init__()
        self.channel_in = in_dim
        # unlike PAM, there are no Conv2d embedding layers here
        self.gamma = nn.Parameter(torch.zeros(1))  # note how $\beta$ is initialized and used
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):
        """
        inputs:
            x : input feature maps (B * C * H * W)
        returns:
            out : attention value + input feature
            attention : B * C * C
        """
        m_batchsize, C, height, width = x.size()
        proj_query = x.view(m_batchsize, C, -1)
        proj_key = x.view(m_batchsize, C, -1).permute(0, 2, 1)
        energy = torch.bmm(proj_query, proj_key)  # channel-to-channel similarity, B * C * C
        energy_new = torch.max(energy, -1, keepdim=True)[0].expand_as(energy) - energy
        attention = self.softmax(energy_new)
        proj_value = x.view(m_batchsize, C, -1)
        out = torch.bmm(attention, proj_value)
        out = out.view(m_batchsize, C, height, width)
        out = self.gamma*out + x
        return out
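The same kind of smoke test works for CAM, and it also confirms the "initialized as 0" behavior: before any training gamma is 0, so the module reduces to the identity (shapes below are again my own arbitrary choices):

# assumes the CAM_Module definition above has been run
cam = CAM_Module(in_dim=64)
x = torch.randn(2, 64, 16, 16)
out = cam(x)
print(out.shape)            # torch.Size([2, 64, 16, 16])
print(torch.equal(out, x))  # True: gamma starts at 0, so out == 0*attn + x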
Understanding nn.Softmax(dim=-1)
The first thing in the code I did not understand was dim=-1 in the softmax. I had come across dim=0 or dim=1 before, but this was the first time I saw a negative dim. After some searching, the explanation below was the clearest one I found:
Understanding the dim parameter of softmax(x, dim=-1) in PyTorch
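The gist, as a tiny toy example of my own: negative dims count from the end, so for a 3-D tensor dim=-1 is the same as dim=2, and the softmax normalizes over the last axis.

import torch
import torch.nn as nn

x = torch.randn(2, 3, 4)
y = nn.Softmax(dim=-1)(x)                        # dim=-1 == dim=2 here
print(y.sum(dim=-1))                             # a 2x3 tensor of (approximately) all ones
print(torch.allclose(y, nn.Softmax(dim=2)(x)))   # True: same normalization axis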
Understanding torch.bmm
The second thing I did not understand was torch.bmm. Again I searched online, and I am linking the explanations I found best here for reference:
torch.bmm() vs. torch.matmul()
An explanation of the torch.bmm() function
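In short (a small example of my own): torch.bmm does a batched matrix multiply on two strictly 3-D tensors of shapes (b, n, m) and (b, m, p), with no broadcasting, whereas torch.matmul also accepts and broadcasts other shapes.

import torch

a = torch.randn(10, 3, 4)   # a batch of 10 matrices, each 3x4
b = torch.randn(10, 4, 5)   # a batch of 10 matrices, each 4x5
c = torch.bmm(a, b)         # multiplies matching pairs along the batch dim
print(c.shape)              # torch.Size([10, 3, 5])
print(torch.allclose(c, torch.matmul(a, b)))   # True for 3-D inputs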