日萌社
1.4 model
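Learning objectives:
- Understand the network architecture of SiamMask
- Know how each module is built
The SiamMask implementation comes in two variants: SiamMask_base and SiamMask_sharp. SiamMask_base is the base model; compared with it, SiamMask_sharp adds a refine module that refines the mask. The network structure is shown in the figure below:
The mask refinement module is shown in the figure below: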
The custom.py file under experiments carries the network structure. Let's look at its contents.
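1. custom.py
custom.py carries the whole network. There is one version for the siammask base network and one for siammask_sharp; the latter adds the functions that process the mask.
The structure of custom.py is shown below:
1.1 ResDownS
ResDownS is the adjust operation applied after ResNet-50 feature extraction. It is only called from ResDown, as marked by the red box in the figure below:
The code is as follows: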
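import torch
import torch.nn as nn
import torch.nn.functional as F


class ResDownS(nn.Module):
    """
    Corresponds to the adjust layer in the network.
    """
    # inplane: number of input channels; outplane: number of output channels
    def __init__(self, inplane, outplane):
        super(ResDownS, self).__init__()
        # adjust: 1x1 convolution followed by batch normalization
        self.downsample = nn.Sequential(
            nn.Conv2d(inplane, outplane, kernel_size=1, bias=False),
            nn.BatchNorm2d(outplane))

    def forward(self, x):
        # Apply adjust
        x = self.downsample(x)
        # If the feature map is narrower than 20, keep only the central part
        # (4 rows and columns are dropped on every side)
        if x.size(3) < 20:
            l = 4
            r = -4
            x = x[:, :, l:r, l:r]
        return x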
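The effect of the center crop in forward can be checked with plain slicing. The sketch below is illustrative only: it works on nested Python lists rather than tensors, the helper names `center_crop` and `make_map` are hypothetical, and the sizes 15 and 31 are assumptions standing in for a template-sized and a search-sized feature map (the code above does not state them).

```python
def center_crop(feature, threshold=20, margin=4):
    """Mimic the crop in ResDownS.forward on a 2D list (hypothetical helper).

    Maps narrower than `threshold` lose `margin` rows and columns on every
    side; wider maps are returned unchanged.
    """
    width = len(feature[0])
    if width < threshold:
        return [row[margin:-margin] for row in feature[margin:-margin]]
    return feature


def make_map(size):
    # Each cell stores its own (row, col) so the crop is easy to inspect.
    return [[(r, c) for c in range(size)] for r in range(size)]


small = center_crop(make_map(15))   # assumed template-sized map
large = center_crop(make_map(31))   # assumed search-sized map
print(len(small), len(small[0]))    # 7 7
print(small[0][0])                  # (4, 4)
print(len(large), len(large[0]))    # 31 31
```

Under these assumed sizes only the narrower map is cropped, which is why the same module can serve both branches of the Siamese network.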
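1.2 ResDown
ResDown is the feature extraction layer of the network. It corresponds to ResNet50 and adjust in the figure below:
The structural parameters of this module are shown below:
The module is implemented with resnet50 and adjust. The code is as follows: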
class ResDown(MultiStageFeature):
    """
    Feature extraction layer of the Siamese network: ResNet-50 plus the adjust operation.
    """
    def __init__(self, pretrain=False):
        super(ResDown, self).__init__()
        # Extract features with resnet50
        self.features = resnet50(layer3=True, layer4=False)
        # If pretrain is set, load the pretrained weights into self.features
        if pretrain:
            load_pretrain(self.features, 'resnet.model')
        # adjust
        self.downsample = ResDownS(1024, 256)
        # Network layers
        self.layers = [self.downsample, self.features.layer2, self.features.layer3]
        self.train_nums = [1, 3]
        self.change_point = [0, 0.5]
        self.unfix(0.0)

    def param_groups(self, start_lr, feature_mult=1):
        lr = start_lr * feature_mult

        def _params(module, mult=1):
            # Keep only the parameters that require gradients
            params = list(filter(lambda x: x.requires_grad, module.parameters()))
            if len(params):
                return [{'params': params, 'lr': lr * mult}]
            else:
                return []

        groups = []
        groups += _params(self.downsample)
        # The backbone is trained with 0.1 times the learning rate of adjust
        groups += _params(self.features, 0.1)
        return groups

    def forward(self, x):
        """
        Forward pass; returns the adjust output.
        :param x: input image
        :return: adjusted feature p3
        """
        output = self.features(x)
        p3 = self.downsample(output[-1])
        return p3

    def forward_all(self, x):
        """
        Forward pass; returns the backbone features and the adjust output.
        :param x: input image
        :return: backbone features and adjusted feature p3
        """
        output = self.features(x)
        p3 = self.downsample(output[-1])
        return output, p3
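To make the learning-rate split in param_groups concrete, here is a minimal sketch of the arithmetic only. The function name `feature_learning_rates` and the value 0.01 for start_lr are hypothetical; only the 0.1 multiplier for the backbone comes from the code above.

```python
import math


def feature_learning_rates(start_lr, feature_mult=1.0, backbone_mult=0.1):
    """Return the learning rates param_groups would assign (illustrative only)."""
    lr = start_lr * feature_mult
    return {
        "downsample": lr,                # adjust layer: full rate
        "features": lr * backbone_mult,  # ResNet-50 backbone: scaled down
    }


rates = feature_learning_rates(start_lr=0.01)  # 0.01 is an assumed value
print(rates)
```

The adjust layer is new and trains at the full rate, while the backbone, which may start from pretrained weights, is updated more gently.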
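1.3 UP
UP (the RPN) is the bounding-box regression and classification network. It is built from DepthCorr objects, which are implemented in rpn.py. DepthCorr computes the correlation channel by channel to obtain a response, and the classification result and detected location of the target are derived from that response.
This part corresponds to the boxed region in the figure below:
The network parameters are:
The code is as follows: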
class UP(RPN):
    """
    Bounding-box regression and classification network.
    """
    def __init__(self, anchor_num=5, feature_in=256, feature_out=256):
        super(UP, self).__init__()
        # Parameter settings
        self.anchor_num = anchor_num
        self.feature_in = feature_in
        self.feature_out = feature_out
        # 2 class scores and 4 box offsets per anchor
        self.cls_output = 2 * self.anchor_num
        self.loc_output = 4 * self.anchor_num
        # Classification and regression branches
        self.cls = DepthCorr(feature_in, feature_out, self.cls_output)
        self.loc = DepthCorr(feature_in, feature_out, self.loc_output)

    def forward(self, z_f, x_f):
        """
        Return the classification and regression results.
        :param z_f: template features
        :param x_f: search features
        :return: classification and regression outputs
        """
        cls = self.cls(z_f, x_f)
        loc = self.loc(z_f, x_f)
        return cls, loc
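The channel-by-channel correlation that DepthCorr relies on can be sketched in a few lines. This is a plain-Python illustration on nested lists, not the rpn.py implementation: the function name `depthwise_xcorr` is hypothetical, there is no batch dimension, and any layers DepthCorr applies around the correlation (such as the head used later in track_mask) are left out.

```python
def depthwise_xcorr(search, kernel):
    """Correlate each channel of `search` with the same channel of `kernel`.

    search: [C][Hs][Ws] nested lists, kernel: [C][Hk][Wk] nested lists.
    Returns a response of shape [C][Hs - Hk + 1][Ws - Wk + 1].
    """
    response = []
    for s_ch, k_ch in zip(search, kernel):
        hs, ws = len(s_ch), len(s_ch[0])
        hk, wk = len(k_ch), len(k_ch[0])
        channel = [
            [
                # Slide the kernel over the search map; no padding, stride 1.
                sum(s_ch[i + di][j + dj] * k_ch[di][dj]
                    for di in range(hk) for dj in range(wk))
                for j in range(ws - wk + 1)
            ]
            for i in range(hs - hk + 1)
        ]
        response.append(channel)
    return response


search = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],   # channel 0
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],   # channel 1
]
kernel = [
    [[1, 0], [0, 1]],                    # channel 0
    [[1, 1], [1, 1]],                    # channel 1
]
print(depthwise_xcorr(search, kernel))
# [[[6, 8], [12, 14]], [[4, 4], [4, 4]]]
```

Because channels never mix, the response keeps the channel count of its inputs. With the default anchor_num of 5, UP then produces 10 classification channels and 20 regression channels.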
1.4 MaskCorr
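The mask branch also uses a DepthCorr object, with 256 input channels and 63*63 output channels:
The network structure is marked by the red box below:
The code is as follows: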
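class MaskCorr(Mask):
    """
    Target segmentation, implemented with DepthCorr.
    """
    def __init__(self, oSz=63):
        super(MaskCorr, self).__init__()
        # oSz is the side length of the output mask
        self.oSz = oSz
        # 256 input channels, 256 hidden channels, oSz * oSz output channels
        self.mask = DepthCorr(256, 256, self.oSz**2)

    def forward(self, z, x):
        return self.mask(z, x)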
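The 63*63 output channels mean that each spatial position of the response carries one flattened mask. A minimal sketch of turning such a vector back into a square mask follows; the helper name `unflatten_mask` is hypothetical and row-major ordering is an assumption, since the layout is not stated in this section.

```python
def unflatten_mask(values, o_sz=63):
    """Reshape a flat list of o_sz * o_sz values into o_sz rows (row-major, assumed)."""
    if len(values) != o_sz * o_sz:
        raise ValueError("expected %d values, got %d" % (o_sz * o_sz, len(values)))
    return [values[r * o_sz:(r + 1) * o_sz] for r in range(o_sz)]


flat = list(range(63 * 63))        # stand-in for one position's channel vector
mask = unflatten_mask(flat)
print(len(flat))                   # 3969
print(len(mask), len(mask[0]))     # 63 63
print(mask[1][0])                  # 63
```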
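1.5 Refine
The refine module refines the mask. It is used in siammask_sharp and not in siammask_base, and it is marked by the red box in the figure below:
Looking at U2, U3 and U4 in detail, we take U3 as an example. The part with the blue background is the mask refinement module:
The code is as follows:
First comes the module initialization, where the network is built: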
class Refine(nn.Module):
    def __init__(self):
        """
        Mask refinement (feature fusion) module.
        """
        super(Refine, self).__init__()
        # self.v2, self.v1, self.v0 are the vertical branches; they compress the channels
        self.v0 = nn.Sequential(nn.Conv2d(64, 16, 3, padding=1), nn.ReLU(),
                                nn.Conv2d(16, 4, 3, padding=1), nn.ReLU())
        self.v1 = nn.Sequential(nn.Conv2d(256, 64, 3, padding=1), nn.ReLU(),
                                nn.Conv2d(64, 16, 3, padding=1), nn.ReLU())
        self.v2 = nn.Sequential(nn.Conv2d(512, 128, 3, padding=1), nn.ReLU(),
                                nn.Conv2d(128, 32, 3, padding=1), nn.ReLU())
        # self.h2, self.h1, self.h0 are the horizontal branches; they process the fused result
        self.h2 = nn.Sequential(nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                                nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.h1 = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
                                nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
        self.h0 = nn.Sequential(nn.Conv2d(4, 4, 3, padding=1), nn.ReLU(),
                                nn.Conv2d(4, 4, 3, padding=1), nn.ReLU())
        # 2D transposed convolution over a multi-channel input. It can be seen as the
        # gradient of Conv2d with respect to its input, and is also known as
        # fractionally-strided convolution or deconvolution.
        # Here: 256 -> 32 channels, kernel size 15, stride 15.
        self.deconv = nn.ConvTranspose2d(256, 32, 15, 15)
        # post0, post1, post2 correspond to U2, U3, U4 respectively
        self.post0 = nn.Conv2d(32, 16, 3, padding=1)
        self.post1 = nn.Conv2d(16, 4, 3, padding=1)
        self.post2 = nn.Conv2d(4, 1, 3, padding=1)
        # Initialize the weights of the Conv2d layers with Kaiming uniform initialization
        for modules in [self.v0, self.v1, self.v2, self.h2, self.h1, self.h0, self.deconv, self.post0, self.post1, self.post2,]:
            for l in modules.modules():
                if isinstance(l, nn.Conv2d):
                    nn.init.kaiming_uniform_(l.weight, a=1)
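The layer sizes above fix how the mask grows from a single feature vector to full resolution. The sketch below only does the size arithmetic, using the standard output-size formulas for convolution and transposed convolution (dilation 1, no output padding). The function names are hypothetical, and the upsample targets 31, 61 and 127 are the ones used in the forward pass of this module.

```python
def conv_transpose_out(size, kernel, stride, padding=0):
    """Output size of a transposed convolution (dilation 1, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def conv_out(size, kernel, stride=1, padding=0):
    """Output size of an ordinary convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def refine_shape_trace():
    """Return (layer, channels, spatial size) after the deconv and each post conv."""
    size = conv_transpose_out(1, kernel=15, stride=15)  # deconv on a 1x1 vector
    trace = [("deconv", 32, size)]
    # (post layer, its output channels, size the fused map is upsampled to)
    for name, channels, target in [("post0", 16, 31), ("post1", 4, 61), ("post2", 1, 127)]:
        # The h*/v* branches are 3x3 convs with padding 1, so they keep the size.
        assert conv_out(size, kernel=3, padding=1) == size
        # Upsample to `target`, then apply the 3x3 post conv (size preserved).
        size = conv_out(target, kernel=3, padding=1)
        trace.append((name, channels, size))
    return trace


print(refine_shape_trace())
# [('deconv', 32, 15), ('post0', 16, 31), ('post1', 4, 61), ('post2', 1, 127)]
```

The final 1 x 127 x 127 map matches the 127*127 values per mask that the forward pass returns.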
Next come the forward pass and the parameter groups:
    def forward(self, f, corr_feature, pos=None, test=False):
        if test:
            # Test mode:
            # f is a tuple of ResNet feature maps.
            # f[0] has shape [1, 64, 125, 125],
            # f[1] has shape [1, 256, 63, 63],
            # f[2] has shape [1, 512, 31, 31].
            # p0, p1, p2 are the feature maps cut out at the target position after zero padding
            p0 = torch.nn.functional.pad(f[0], [16, 16, 16, 16])[:, :, 4*pos[0]:4*pos[0]+61, 4*pos[1]:4*pos[1]+61]
            p1 = torch.nn.functional.pad(f[1], [8, 8, 8, 8])[:, :, 2 * pos[0]:2 * pos[0] + 31, 2 * pos[1]:2 * pos[1] + 31]
            p2 = torch.nn.functional.pad(f[2], [4, 4, 4, 4])[:, :, pos[0]:pos[0] + 15, pos[1]:pos[1] + 15]
        else:
            # Training mode:
            # extract the feature patches with a sliding window
            p0 = F.unfold(f[0], (61, 61), padding=0, stride=4).permute(0, 2, 1).contiguous().view(-1, 64, 61, 61)
            if not (pos is None): p0 = torch.index_select(p0, 0, pos)
            p1 = F.unfold(f[1], (31, 31), padding=0, stride=2).permute(0, 2, 1).contiguous().view(-1, 256, 31, 31)
            if not (pos is None): p1 = torch.index_select(p1, 0, pos)
            p2 = F.unfold(f[2], (15, 15), padding=0, stride=1).permute(0, 2, 1).contiguous().view(-1, 512, 15, 15)
            if not (pos is None): p2 = torch.index_select(p2, 0, pos)
        if not(pos is None):
            # pos given: take the feature vector at that position of the correlation feature
            p3 = corr_feature[:, :, pos[0], pos[1]].view(-1, 256, 1, 1)
        else:
            # pos not given: use the feature vectors at all positions
            p3 = corr_feature.permute(0, 2, 3, 1).contiguous().view(-1, 256, 1, 1)
        # Transposed convolution (deconvolution)
        out = self.deconv(p3)
        # Fuse the features stage by stage.
        # F.upsample is deprecated in recent PyTorch releases; F.interpolate takes the same arguments.
        out = self.post0(F.upsample(self.h2(out) + self.v2(p2), size=(31, 31)))
        out = self.post1(F.upsample(self.h1(out) + self.v1(p1), size=(61, 61)))
        out = self.post2(F.upsample(self.h0(out) + self.v0(p0), size=(127, 127)))
        # Flatten each 127x127 mask into a vector
        out = out.view(-1, 127*127)
        return out

    def param_groups(self, start_lr, feature_mult=1):
        """
        Parameter groups for the optimizer.
        :param start_lr: base learning rate
        :param feature_mult: learning-rate multiplier
        :return: list with one parameter group
        """
        params = filter(lambda x: x.requires_grad, self.parameters())
        params = [{'params': params, 'lr': start_lr * feature_mult}]
        return params
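The padding and window sizes in the test branch fit together in a way that is easy to verify by hand. The sketch below redoes that arithmetic with the feature sizes given in the comments (125, 63, 31); the function names are hypothetical and nothing here touches tensors.

```python
def refine_windows(pos):
    """Crop windows (row_start, row_end, col_start, col_end) used in test mode."""
    row, col = pos
    # (name, scale applied to pos, window size)
    specs = [("p0", 4, 61), ("p1", 2, 31), ("p2", 1, 15)]
    return {name: (s * row, s * row + w, s * col, s * col + w)
            for name, s, w in specs}


def max_position(size, pad, scale, window):
    """Largest pos index whose window still fits inside the padded feature map."""
    return (size + 2 * pad - window) // scale


print(refine_windows((24, 24))["p0"])   # (96, 157, 96, 157)
print(max_position(125, 16, 4, 61))     # 24
print(max_position(63, 8, 2, 31))       # 24
print(max_position(31, 4, 1, 15))       # 24
```

All three levels allow positions 0 through 24, so the three crops stay aligned for any position on a 25 x 25 grid.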
1.6 Custom
Custom builds the whole network framework. It holds the individual network modules and provides the tracking and segmentation methods.
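class Custom(SiamMask):
    def __init__(self, pretrain=False, **kwargs):
        super(Custom, self).__init__(**kwargs)
        self.features = ResDown(pretrain=pretrain)
        self.rpn_model = UP(anchor_num=self.anchor_num, feature_in=256, feature_out=256)
        self.mask_model = MaskCorr()
        self.refine_model = Refine()

    def refine(self, f, pos=None):
        """
        Feature fusion.
        :param f: feature maps
        :param pos: target position
        :return: output of the refine module
        """
        # Note: Refine.forward is declared as (f, corr_feature, pos, test),
        # so pos is passed here in the corr_feature slot.
        return self.refine_model(f, pos)

    def template(self, template):
        """
        Extract features from the template.
        :param template: template image patch
        :return: None; the features are stored in self.zf
        """
        self.zf = self.features(template)

    def track(self, search):
        """
        Target tracking.
        :param search: image patch to track in
        :return: target classification and regression results
        """
        # Extract features from the search patch
        search = self.features(search)
        # Regression and classification with the RPN: target class and location
        rpn_pred_cls, rpn_pred_loc = self.rpn(self.zf, search)
        return rpn_pred_cls, rpn_pred_loc

    def track_mask(self, search):
        """
        Target tracking with segmentation.
        :param search: image patch to track in
        :return: target classification, regression and segmentation results
        """
        # Extract the backbone features and the adjusted search features
        self.feature, self.search = self.features.forward_all(search)
        # Classification and regression
        rpn_pred_cls, rpn_pred_loc = self.rpn(self.zf, self.search)
        # Correlation feature between template and search features
        self.corr_feature = self.mask_model.mask.forward_corr(self.zf, self.search)
        # Mask prediction
        pred_mask = self.mask_model.mask.head(self.corr_feature)
        return rpn_pred_cls, rpn_pred_loc, pred_mask

    def track_refine(self, pos):
        # Fuse the features at position pos to obtain the refined mask
        pred_mask = self.refine_model(self.feature, self.corr_feature, pos=pos, test=True)
        return pred_mask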
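track_refine expects pos as a (row, col) index into the correlation feature. How that position is chosen is not shown in this section. One plausible approach, sketched below under loudly stated assumptions, is to take the highest classification score and unravel its flat index. The function name `best_position`, the score layout (anchor, row, col) and the 25 x 25 grid size are all assumptions, not taken from the code above.

```python
def best_position(scores, anchor_num, size):
    """Return (anchor, (row, col)) of the highest score.

    scores: flat list of length anchor_num * size * size, assumed to be laid
    out as (anchor, row, col) in row-major order.
    """
    if len(scores) != anchor_num * size * size:
        raise ValueError("unexpected number of scores")
    best = max(range(len(scores)), key=scores.__getitem__)
    anchor, rest = divmod(best, size * size)
    row, col = divmod(rest, size)
    return anchor, (row, col)


anchor_num, size = 5, 25                      # assumed values
scores = [0.0] * (anchor_num * size * size)
scores[2 * size * size + 3 * size + 7] = 1.0  # plant a peak at anchor 2, row 3, col 7
print(best_position(scores, anchor_num, size))
# (2, (3, 7))
```

The resulting (row, col) pair is the kind of value that would be passed as pos, after template and track_mask have been called so that self.feature and self.corr_feature are populated.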
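Summary:
- The siammask framework consists of the feature extraction, RPN and Mask modules; siammask_sharp adds the refine module
- In siammask, custom carries the whole network, and feature extraction uses resnet50 as the backbone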