Keypoint Model Algorithm:
Overview: the basic idea of a clothing keypoint detection algorithm is simple: feed a clothing image into the network and get back a set of keypoints.
First, some questions to think about:
- If we can mark an item of clothing (or any object) precisely, can we then segment it accurately by drawing lines through the marked points?
- Could we also easily cut the clothing (object) out of the image?
- And with the cut-out region, do other things: deformation, dilation, stretching, and so on.
- So how does a model actually derive the keypoints? Put differently, what does the model have to learn before it can predict keypoint locations on its own?
- Model structure: what should we consider when designing the input, the output, and the intermediate layers?
The following sections briefly describe the model design, the network structure, and the test results.
Data IO:
Input: labeled image x
Output: annotation result Y
The keypoint detection model is trained from these (x, Y) pairs.
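To make the IO concrete, here is a minimal training-loop sketch built around the CoordRegressionNetwork class defined below. It uses the loss functions from the dsntnn package (euclidean_losses plus js_reg_losses, averaged with average_loss); the data loader, learning rate, and sigma_t value are assumptions for illustration, not part of the original project.

import torch
import dsntnn

model = CoordRegressionNetwork(n_locations=24)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# `loader` is a hypothetical DataLoader yielding (images, target_coords),
# with target_coords already normalized to dsntnn's (-1, 1) coordinate range.
for images, target_coords in loader:
    coords, heatmaps = model(images)
    # Distance between predicted and ground-truth coordinates, per keypoint
    euc_losses = dsntnn.euclidean_losses(coords, target_coords)
    # Encourage each heatmap to look like a Gaussian around the target point
    reg_losses = dsntnn.js_reg_losses(heatmaps, target_coords, sigma_t=1.0)
    loss = dsntnn.average_loss(euc_losses + reg_losses)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()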
Model Creation:
Overall flow of the algorithm:
model = CoordRegressionNetwork(n_locations=24)
Here the whole algorithm is wrapped in a numerical-coordinate regression class, CoordRegressionNetwork. Inside the class:
from torch import nn
import dsntnn

class CoordRegressionNetwork(nn.Module):
    def __init__(self, n_locations):
        super().__init__()
        self.fcn = KeyPointsModel()
        self.hm_conv = nn.Conv2d(24, n_locations, kernel_size=1, bias=False)

    def forward(self, images):
        # Run the images through the fully convolutional feature extractor
        fcn_out = self.fcn(images)
        # Use a 1x1 conv to get one unnormalized heatmap per keypoint location
        unnormalized_heatmaps = self.hm_conv(fcn_out)
        # Normalize each heatmap so it sums to 1 (a spatial probability map)
        heatmaps = dsntnn.flat_softmax(unnormalized_heatmaps)
        # Regress numerical coordinates from the normalized heatmaps
        coords = dsntnn.dsnt(heatmaps)
        return coords, heatmaps
From the forward(self, images) method we can see that the model takes images as input and returns two outputs: the coordinates coords and the heatmaps heatmaps.
The CoordRegressionNetwork class wraps and calls two components: KeyPointsModel, a feature-extraction network for keypoint information that generates the heatmaps, and dsntnn, a coordinate-regression module used to produce the final set of keypoint coordinates along with a normalized heatmap for each detected point.
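As a quick sanity check, the sketch below pushes a random batch through the network; the 256x256 input size and batch size are assumptions for illustration. Because every pool in KeyPointsModel is immediately unpooled and all its convolutions preserve spatial size, the heatmaps come out at the input resolution, and dsnt returns one (x, y) pair per keypoint in the normalized range (-1, 1).

import torch

model = CoordRegressionNetwork(n_locations=24)
images = torch.randn(2, 3, 256, 256)   # hypothetical batch of 2 RGB images
coords, heatmaps = model(images)
print(coords.shape)    # torch.Size([2, 24, 2])
print(heatmaps.shape)  # torch.Size([2, 24, 256, 256])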
The keypoint feature-extraction network KeyPointsModel
subroutine1: the sub-module
import torch
from torch import nn
from collections import OrderedDict

class KeyPointsModel(nn.Module):
    def __init__(self):
        super(KeyPointsModel, self).__init__()
        # These layers have no ReLU after them
        no_relu_layers = ['conv6_2_CPM', 'Mconv7_stage2', 'Mconv7_stage3',
                          'Mconv7_stage4', 'Mconv7_stage5', 'Mconv7_stage6']
        self.maxpool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0, return_indices=True)
        self.maxunpool = nn.MaxUnpool2d(2, stride=2)
        # stage 1
        # Each entry is: name -> [in_channels, out_channels, kernel_size, stride, padding]
        block1_0_0 = OrderedDict([
            ('conv1_1', [3, 64, 3, 1, 1]),
            ('conv1_2', [64, 64, 3, 1, 1]),
        ])
        block1_0_1 = OrderedDict([
            ('conv2_1', [64, 128, 3, 1, 1]),
            ('conv2_2', [128, 128, 3, 1, 1]),
        ])
        block1_0_2 = OrderedDict([
            ('conv3_1', [128, 256, 3, 1, 1]),
            ('conv3_2', [256, 256, 3, 1, 1]),
            ('conv3_3', [256, 256, 3, 1, 1]),
            ('conv3_4', [256, 256, 3, 1, 1]),
        ])
        block1_0_3 = OrderedDict([
            ('conv4_1', [256, 512, 3, 1, 1]),
            ('conv4_2', [512, 512, 3, 1, 1]),
            ('conv4_3', [512, 512, 3, 1, 1]),
            ('conv4_4', [512, 512, 3, 1, 1]),
            ('conv5_1', [512, 512, 3, 1, 1]),
            ('conv5_2', [512, 512, 3, 1, 1]),
            ('conv5_3_CPM', [512, 128, 3, 1, 1])
        ])
        block1_1 = OrderedDict([
            ('conv6_1_CPM', [128, 512, 1, 1, 0]),
            ('conv6_2_CPM', [512, 24, 1, 1, 0])
        ])
        blocks = {}
        blocks['block1_0_0'] = block1_0_0
        blocks['block1_0_1'] = block1_0_1
        blocks['block1_0_2'] = block1_0_2
        blocks['block1_0_3'] = block1_0_3
        blocks['block1_1'] = block1_1
        # stages 2-6: each refinement stage sees the 24 stage-1 maps
        # concatenated with the 128 shared feature maps (24 + 128 = 152 channels)
        for i in range(2, 7):
            blocks['block%d' % i] = OrderedDict([
                ('Mconv1_stage%d' % i, [152, 128, 7, 1, 3]),
                ('Mconv2_stage%d' % i, [128, 128, 7, 1, 3]),
                ('Mconv3_stage%d' % i, [128, 128, 7, 1, 3]),
                ('Mconv4_stage%d' % i, [128, 128, 7, 1, 3]),
                ('Mconv5_stage%d' % i, [128, 128, 7, 1, 3]),
                ('Mconv6_stage%d' % i, [128, 128, 1, 1, 0]),
                ('Mconv7_stage%d' % i, [128, 24, 1, 1, 0])
            ])
        for k in blocks.keys():
            blocks[k] = make_layers(blocks[k], no_relu_layers)
        self.model1_0_0 = blocks['block1_0_0']
        self.model1_0_1 = blocks['block1_0_1']
        self.model1_0_2 = blocks['block1_0_2']
        self.model1_0_3 = blocks['block1_0_3']
        self.model1_1 = blocks['block1_1']
        self.model2 = blocks['block2']
        self.model3 = blocks['block3']
        self.model4 = blocks['block4']
        self.model5 = blocks['block5']
        self.model6 = blocks['block6']

    def forward(self, x):
        # block0: shared feature extraction; each pool is immediately
        # unpooled, so the spatial resolution of the input is preserved
        out1_0_0 = self.model1_0_0(x)
        output, indices = self.maxpool(out1_0_0)
        output_un_pool = self.maxunpool(output, indices)
        out1_0_1 = self.model1_0_1(output_un_pool)
        output, indices = self.maxpool(out1_0_1)
        output_un_pool = self.maxunpool(output, indices)
        out1_0_2 = self.model1_0_2(output_un_pool)
        output, indices = self.maxpool(out1_0_2)
        output_un_pool = self.maxunpool(output, indices)
        out1_0 = self.model1_0_3(output_un_pool)
        # block1: first keypoint prediction (24 feature maps)
        out1_1 = self.model1_1(out1_0)
        # stages 2-6: each stage refines the previous prediction,
        # conditioned on the shared features out1_0
        concat_stage2 = torch.cat([out1_1, out1_0], 1)
        out_stage2 = self.model2(concat_stage2)
        concat_stage3 = torch.cat([out_stage2, out1_0], 1)
        out_stage3 = self.model3(concat_stage3)
        concat_stage4 = torch.cat([out_stage3, out1_0], 1)
        out_stage4 = self.model4(concat_stage4)
        concat_stage5 = torch.cat([out_stage4, out1_0], 1)
        out_stage5 = self.model5(concat_stage5)
        concat_stage6 = torch.cat([out_stage5, out1_0], 1)
        out_stage6 = self.model6(concat_stage6)
        return out_stage6
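Note that the constructor calls a make_layers helper that is not shown in this excerpt. A minimal sketch consistent with the specs above (each value being [in_channels, out_channels, kernel_size, stride, padding], with a ReLU after every convolution except those in no_relu_layers) might look like the following; the actual helper in the project may differ.

from collections import OrderedDict
from torch import nn

def make_layers(block, no_relu_layers):
    # Assumed spec format: name -> [in_ch, out_ch, kernel_size, stride, padding]
    layers = []
    for name, (in_ch, out_ch, kernel, stride, padding) in block.items():
        layers.append((name, nn.Conv2d(in_ch, out_ch, kernel_size=kernel,
                                       stride=stride, padding=padding)))
        if name not in no_relu_layers:
            layers.append(('relu_' + name, nn.ReLU(inplace=True)))
    return nn.Sequential(OrderedDict(layers))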
The number of layers in the network can be adjusted to trade accuracy against efficiency. If the code feels abstract, compare it against the flow diagram below; the input, output, and channel counts of each layer are annotated there as well.
A detailed diagram of the subroutine1 network follows.
I include the detailed layer design to reinforce the design ideas for myself: it makes the feature-extraction process easier to picture, and raises the question of how the AI actually learns. Which units act as the learned memory, and where is that memory stored?
Numerical coordinate regression network: DSNTNN
def forward(self, images):
    # Run the images through the fully convolutional feature extractor
    fcn_out = self.fcn(images)
    # Use a 1x1 conv to get one unnormalized heatmap per keypoint location
    unnormalized_heatmaps = self.hm_conv(fcn_out)
    # Normalize each heatmap so it sums to 1
    heatmaps = dsntnn.flat_softmax(unnormalized_heatmaps)
    # Regress numerical coordinates from the normalized heatmaps
    coords = dsntnn.dsnt(heatmaps)
    return coords, heatmaps
The dsntnn package must be installed in advance (pip install dsntnn).
In unnormalized_heatmaps = self.hm_conv(fcn_out), the hm_conv convolution layer is the one defined as self.hm_conv = nn.Conv2d(24, n_locations, kernel_size=1, bias=False), where n_locations is the number of keypoints this clothing model outputs.
After the 1x1 convolution (unnormalized_heatmaps = self.hm_conv(fcn_out)) produces one coarse heatmap per keypoint, dsntnn.flat_softmax normalizes each map into a standard probability heatmap. The heatmaps are then regressed to numerical coordinates with coords = dsntnn.dsnt(heatmaps), yielding the keypoint set coords with n_locations points.
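To see what these two calls do, here is a small standalone example; the 5x5 heatmap size and the peak position are arbitrary choices for illustration. dsnt returns coordinates in the normalized range (-1, 1), so a peak toward the bottom-right yields positive x and y.

import torch
import dsntnn

# One image, one keypoint: a 5x5 unnormalized map peaked at row 3, column 4
raw = torch.zeros(1, 1, 5, 5)
raw[0, 0, 3, 4] = 10.0
heatmaps = dsntnn.flat_softmax(raw)  # softmax over the flattened 5x5 grid
print(heatmaps[0, 0].sum())          # ~1.0: each map is now a probability map
coords = dsntnn.dsnt(heatmaps)       # expected (x, y) under that distribution
print(coords)                        # approximately [[[0.8, 0.4]]]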
Test Results:
Note: the keypoint counts to be learned for each category are tallied below.
Test categories (a per-category configuration sketch follows the list):
- trousers: 7 keypoints
- skirt: 4 keypoints
- outwear: 15 keypoints
- dress: 15 keypoints
- blouse: 13 keypoints
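Since each category needs a different n_locations, one simple way to organize this (a sketch, not code from the original project) is a category-to-count mapping used to build one model per garment type:

# Hypothetical per-category configuration based on the counts listed above
KEYPOINT_COUNTS = {
    'trousers': 7,
    'skirt': 4,
    'outwear': 15,
    'dress': 15,
    'blouse': 13,
}
models = {name: CoordRegressionNetwork(n_locations=n)
          for name, n in KEYPOINT_COUNTS.items()}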
Resources:
Paper: Key Points Detection Algorithm of Object Based on Full Convolution Network
Project code (my Gitee): https://gitee.com/rpr/key-points-detection-method-with-torch
The project requires the dsntnn package.
Training and test datasets: https://pan.baidu.com/s/1ZdmzHcJA9FbNDpZF1bm4Hg
Extraction code: ps4v
The pre-trained KPDEM_model.zip contains the keypoint detection models obtained by training the keypoint detection (Keypoints Detection) algorithm on the clothing data: trousers, skirt, outwear, coat, and dress.
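To load one of the pre-trained models, a sketch along these lines should work, assuming the checkpoints are saved as state dicts; the file name 'trousers.pth' is hypothetical, so substitute the actual name after extracting the zip.

import torch

model = CoordRegressionNetwork(n_locations=7)  # 7 keypoints for trousers
state_dict = torch.load('trousers.pth', map_location='cpu')  # hypothetical name
model.load_state_dict(state_dict)
model.eval()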
If you need the datasets or any other information, please leave me a message with your email address.