2D Image Pose Estimation Based on Part Affinity Fields (PAF) (OpenPose)

 

This paper, Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, appeared at CVPR 2017. It is work from CMU, and the results are truly amazing.

The highlights of the paper are its cascaded network structure, which fuses PCMs (Part Confidence Maps) and PAFs and whose design philosophy is quite similar to that of RefineNet, together with a bipartite matching algorithm under corresponding constraints.
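
Concretely, a PAF is a 2D vector field over the image: on the pixels belonging to a limb it stores the unit vector pointing from one joint toward the other, and zero elsewhere. Two candidate part detections $d_{j_1}$ and $d_{j_2}$ are then scored by the line integral of the field $L_c$ along the segment connecting them (notation as in the paper):

$$E = \int_0^1 L_c\big(p(u)\big) \cdot \frac{d_{j_2} - d_{j_1}}{\lVert d_{j_2} - d_{j_1} \rVert_2}\, du, \qquad p(u) = (1-u)\, d_{j_1} + u\, d_{j_2}$$

A high E means the predicted field consistently points along the candidate limb, which is what makes the bipartite matching reliable.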

 

The whole detection pipeline is shown in the figure above: an input image passes through seven stages to produce the PCMs and PAFs. A set of bipartite matchings is then generated from the PAFs; because the PAFs are vector fields, the resulting matchings are largely correct, and they are finally merged into each person's complete skeleton.
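
To make the matching step concrete, here is a minimal NumPy sketch of the idea (my own illustration, not the repository's code). It assumes paf_x and paf_y are the two channels of one limb's predicted PAF and that the candidates are (x, y) peak coordinates extracted from the confidence maps; it approximates the line integral above by sampling, then greedily accepts pairs in decreasing score order, which is the relaxation the paper uses instead of solving the full assignment problem.

import numpy as np

def paf_score(paf_x, paf_y, p1, p2, n_samples=10):
    # Approximate the line integral: sample the PAF along the segment p1->p2
    # and average the projection of the field onto the segment's direction.
    d = np.asarray(p2, dtype=float) - np.asarray(p1, dtype=float)
    norm = np.linalg.norm(d)
    if norm < 1e-6:
        return 0.0
    v = d / norm
    xs = np.linspace(p1[0], p2[0], n_samples).astype(int)  # points are (x, y)
    ys = np.linspace(p1[1], p2[1], n_samples).astype(int)  # arrays indexed [row=y, col=x]
    samples = np.stack([paf_x[ys, xs], paf_y[ys, xs]], axis=1)
    return float(np.mean(samples @ v))

def match_limbs(cands_a, cands_b, paf_x, paf_y, thresh=0.05):
    # Score every candidate pair, then greedily accept pairs in decreasing
    # score order as long as both endpoints are still unassigned.
    scored = []
    for i, a in enumerate(cands_a):
        for j, b in enumerate(cands_b):
            s = paf_score(paf_x, paf_y, a, b)
            if s > thresh:
                scored.append((s, i, j))
    scored.sort(reverse=True)
    used_a, used_b, matches = set(), set(), []
    for s, i, j in scored:
        if i not in used_a and j not in used_b:
            matches.append((i, j, s))
            used_a.add(i)
            used_b.add(j)
    return matches

The real implementation adds further checks, e.g. requiring most sampled points to individually exceed the threshold before a pair is considered at all.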

 

Under models/pose/mpi/ there are two models: pose_deploy_linevec.prototxt is the more accurate one, while pose_deploy_linevec_faster_4_stages.prototxt loses roughly 2 points of accuracy but runs about 30% faster.

The difference between the two is that the latter removes the stage 5 and stage 6 convolution modules (as shown in the figure below, each stage consists of a series of convolution layers; Branch 1 is made up of one column and Branch 2 of two columns, and before entering the next stage the three columns of the previous stage are fused).

models/pose/coco/pose_deploy_linevec.prototxt has the same network structure as models/pose/mpi/pose_deploy_linevec.prototxt; the difference is that the COCO model's convolution layers have more filters. It is the most accurate of the three models.

The model can be selected in examples/openpose/openpose.cpp; the COCO model is used by default:

DEFINE_string(model_pose, "COCO", "Model to be used (e.g. COCO, MPI, MPI_4_layers).");
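
For example, to run the faster four-stage MPI model instead, pass the flag on the command line (flag name and value taken from the definition above):

./build/examples/openpose/openpose.bin --model_pose MPI_4_layers --image_dir examples/media/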

 

Installation steps (environment: CentOS):

git clone https://github.com/CMU-Perceptual-Computing-Lab/openpose.git
cd openpose/3rdparty/caffe
cp Makefile.config.Ubuntu14.example Makefile.config  # edit the paths inside to match your machine
make all -j8

cd ../../models/
./getModels.sh
cd ..
cp Makefile.config.Ubuntu14.example Makefile.config  # edit the paths inside to match your machine
make -j8

 

Possible error:

/tmp/cciLUahT.s:1660: Error: no such instruction: `vextracti128 $0x1,%ymm0,%xmm0'
make: *** [.build_release/src/openpose/gui/guiInfoAdder.o] Error 1
make: *** Waiting for unfinished jobs....
/tmp/ccpTDNgT.s:3892: Error: no such instruction: `vextracti128 $0x1,%ymm0,%xmm0'
make: *** [.build_release/src/openpose/gui/gui.o] Error 1

Fix: comment out the assembly optimization in the Makefile (line 204): CXXFLAGS += -march=native
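
vextracti128 is an AVX2 instruction: with -march=native on a recent CPU, GCC emits AVX2 code that the older assembler shipped with CentOS cannot handle. After the edit, line 204 is simply left disabled:

# CXXFLAGS += -march=native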

Possible error:

ERROR: something wrong with flag 'tab_completion_word' in file

Fix:

Remove gflags from line 144 of the Makefile:

LIBRARIES += glog gflags boost_system boost_filesystem m hdf5_hl hdf5 caffe
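
This error usually indicates that gflags ends up linked twice (directly and again through Caffe), so each flag gets registered twice. After the edit the line reads:

LIBRARIES += glog boost_system boost_filesystem m hdf5_hl hdf5 caffe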

 

 

Testing:

Run on a video:

./build/examples/openpose/openpose.bin --video examples/media/video.avi

Run on a webcam:

./build/examples/openpose/openpose.bin

Run on images:

./build/examples/openpose/openpose.bin --image_dir examples/media/
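
The demo can also save its output instead of only displaying it; for example, to write the rendered video to disk (flag name as provided by the OpenPose demo; check the repository's flag list if it has changed):

./build/examples/openpose/openpose.bin --video examples/media/video.avi --write_video output/result.avi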

 

Test results on the PETS dataset:

 

OpenPose training

Training code:

https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation

 

The Caffe fork required for training:

https://github.com/CMU-Perceptual-Computing-Lab/caffe_train

The following modifications are needed:

(1) Delete opencv_contrib from the Makefile:
LIBRARIES += opencv_core opencv_highgui opencv_imgproc   (opencv_contrib removed)

(2) Remove the include of the contrib header from ./src/caffe/cpm_data_transformer.cpp:
#include <opencv2/contrib/contrib.hpp>

 

MATLAB COCO toolbox

Note that a Linux environment is required; Windows lacks the corresponding mex support.

https://github.com/cocodataset/cocoapi

 

Training:

cd Realtime_Multi-Person_Pose_Estimation-master/training

bash getData.sh

/usr/local/MATLAB/R2017a/bin/matlab -nojvm -nodesktop -nodisplay -r getANNO

/usr/local/MATLAB/R2017a/bin/matlab -nojvm -nodesktop -nodisplay -r genCOCOMask

/usr/local/MATLAB/R2017a/bin/matlab -nojvm -nodesktop -nodisplay -r genJSON

python genLMDB.py

python setLayers.py --exp 1



cd dataset/COCO/COCO_kpt/pose56/exp22

bash train_pose.sh 0,1
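
Before launching train_pose.sh, it is worth confirming that genLMDB.py actually wrote the database; a minimal sketch (the LMDB path is an assumption, use whatever path genLMDB.py produced):

import lmdb  # pip install lmdb

env = lmdb.open('dataset/COCO/lmdb', readonly=True, lock=False)  # hypothetical path
with env.begin() as txn:
    print(txn.stat()['entries'], 'samples in the LMDB')  # should be non-zero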

Training network structure diagram:

JSON format generated by the genJSON script:

{
	"root": [{
		"dataset": "COCO",#数据集,string
		"isValidation": 0.000,#是否可见,0可见,1不可见,float
		"img_paths": "train2014/COCO_train2014_000000000308.jpg",#图像路径,string
		"img_width": 640.000,#图像宽度,float
		"img_height": 426.000,#图像高度,float
		"objpos": [201.540, 226.370],#物体中心坐标,float
		"image_id": 308.000,#图片ID,float
		"bbox": [134.680, 28.650, 133.720, 395.440],#物体边框,float
		"segment_area": 23904.367,#物体区域面积,float
		"num_keypoints": 15.000,#可见的关键点个数,float
		"joint_self": [#中心物体的关键点坐标,x,y,label,float
#label=0:可见,正常,不需裁剪
#label=1:存在不可见部分,有遮挡occluded
#label=2:身体部分被裁剪掉,cropped
			[209.000, 82.000, 1.000],
			[217.000, 74.000, 1.000],
			[198.000, 73.000, 1.000],
			[220.000, 64.000, 1.000],
			[180.000, 68.000, 1.000],
			[227.000, 118.000, 1.000],
			[152.000, 123.000, 1.000],
			[0.000, 0.000, 2.000],
			[159.000, 195.000, 0.000],
			[0.000, 0.000, 2.000],
			[208.000, 196.000, 1.000],
			[235.000, 245.000, 1.000],
			[190.000, 254.000, 1.000],
			[252.000, 320.000, 0.000],
			[195.000, 346.000, 1.000],
			[250.000, 381.000, 1.000],
			[200.000, 422.000, 1.000]
		],
		"scale_provided": 1.075,#物体边框的高度/368,float
		"joint_others": [#其他物体,不在图片中心的物体的关键点坐标
			[
				[52.000, 135.000, 1.000],
				[63.000, 123.000, 1.000],
				[34.000, 119.000, 1.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[11.000, 209.000, 0.000],
				[0.000, 0.000, 2.000],
				[43.000, 332.000, 1.000],
				[0.000, 0.000, 2.000],
				[129.000, 251.000, 1.000],
				[0.000, 0.000, 2.000],
				[59.000, 404.000, 1.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000]
			],
			[
				[115.000, 75.000, 1.000],
				[117.000, 64.000, 1.000],
				[106.000, 63.000, 1.000],
				[0.000, 0.000, 2.000],
				[71.000, 58.000, 1.000],
				[0.000, 0.000, 2.000],
				[52.000, 141.000, 0.000],
				[0.000, 0.000, 2.000],
				[75.000, 226.000, 0.000],
				[0.000, 0.000, 2.000],
				[135.000, 265.000, 0.000],
				[0.000, 0.000, 2.000],
				[90.000, 274.000, 0.000],
				[0.000, 0.000, 2.000],
				[88.000, 390.000, 0.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000]
			],
			[
				[142.000, 98.000, 1.000],
				[142.000, 96.000, 1.000],
				[138.000, 96.000, 1.000],
				[0.000, 0.000, 2.000],
				[122.000, 99.000, 1.000],
				[141.000, 122.000, 0.000],
				[116.000, 127.000, 0.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000]
			],
			[
				[448.000, 98.000, 1.000],
				[453.000, 84.000, 1.000],
				[0.000, 0.000, 2.000],
				[489.000, 74.000, 1.000],
				[0.000, 0.000, 2.000],
				[523.000, 143.000, 1.000],
				[480.000, 126.000, 1.000],
				[517.000, 231.000, 1.000],
				[456.000, 178.000, 1.000],
				[442.000, 248.000, 1.000],
				[409.000, 191.000, 1.000],
				[503.000, 295.000, 1.000],
				[472.000, 278.000, 1.000],
				[481.000, 392.000, 1.000],
				[461.000, 375.000, 1.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000]
			],
			[
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[61.000, 360.000, 1.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000]
			],
			[
				[361.000, 92.000, 1.000],
				[365.000, 86.000, 1.000],
				[0.000, 0.000, 2.000],
				[378.000, 86.000, 0.000],
				[0.000, 0.000, 2.000],
				[383.000, 107.000, 0.000],
				[360.000, 105.000, 1.000],
				[0.000, 0.000, 2.000],
				[351.000, 130.000, 1.000],
				[0.000, 0.000, 2.000],
				[349.000, 147.000, 1.000],
				[380.000, 157.000, 1.000],
				[361.000, 158.000, 1.000],
				[381.000, 204.000, 0.000],
				[363.000, 206.000, 1.000],
				[382.000, 241.000, 0.000],
				[366.000, 240.000, 0.000]
			]
		],
		"annolist_index": 15.000,#json的索引,float
		"people_index": 1.000,#人物的索引,float
		"numOtherPeople": 6.000,#除去中心的人,其他人的个数,float
		"scale_provided_other": [1.029, 0.723, 0.239, 1.076, 0.502, 0.479],#其他人的高度/368,float
		"objpos_other": [#其他人的中心坐标,float
			[92.540, 236.655],
			[127.225, 148.285],
			[128.435, 122.660],
			[468.990, 221.455],
			[59.830, 328.830],
			[370.745, 161.240]
		],
		"bbox_other": [#其他人的边框坐标,x,y,width,height,float
			[0.000, 47.310, 185.080, 378.690],
			[45.740, 15.340, 162.970, 265.890],
			[111.180, 78.760, 34.510, 87.800],
			[381.090, 23.470, 175.800, 395.970],
			[0.000, 236.450, 119.660, 184.760],
			[345.800, 73.100, 49.890, 176.280]
		],
		"segment_area_other": [28055.014, 15191.540, 1602.758, 36639.449, 8737.961, 3366.046],#其他人的分割的区域面积,float
		"num_keypoints_other": [7.000, 9.000, 6.000, 13.000, 1.000, 13.000]#其他人的可见关键点个数,float
	}]
}
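
A quick consistency check on the fields above (a minimal sketch; it assumes the annotation was saved as coco1.json, the file name also used by the visualization script below):

import json

with open('coco1.json', 'r') as f:
    ann = json.load(f)['root'][0]

# scale_provided is the bbox height divided by the 368-pixel training size:
print(ann['bbox'][3] / 368, ann['scale_provided'])  # 1.0745..., 1.075

# In this example, num_keypoints matches the joints that were not cropped away (label != 2):
print(sum(1 for j in ann['joint_self'] if j[2] != 2), ann['num_keypoints'])  # 15, 15.0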

Visualization script:

import json
import cv2

with open('coco1.json', 'r') as f:
    coco_json = json.load(f)

image = cv2.imread("COCO_train2014_000000000308.jpg", 1)

for ann in coco_json["root"]:
    # Bounding box of the center person (yellow) and of the others (red); x, y, width, height.
    box = ann["bbox"]
    cv2.rectangle(image, (int(box[0]), int(box[1])),
                  (int(box[0] + box[2]), int(box[1] + box[3])), (0, 255, 255), 1)
    for b_o in ann["bbox_other"]:
        cv2.rectangle(image, (int(b_o[0]), int(b_o[1])),
                      (int(b_o[0] + b_o[2]), int(b_o[1] + b_o[3])), (0, 0, 255), 1)

    # Keypoints: label 0 (visible) in blue, label 1 (occluded) in green; label 2 (cropped) is skipped.
    for joint in ann["joint_self"]:
        if joint[2] == 0:
            cv2.circle(image, (int(joint[0]), int(joint[1])), 1, (255, 0, 0), 1)
        elif joint[2] == 1:
            cv2.circle(image, (int(joint[0]), int(joint[1])), 1, (0, 255, 0), 2)
    for other in ann["joint_others"]:
        for oth in other:
            if oth[2] == 0:
                cv2.circle(image, (int(oth[0]), int(oth[1])), 1, (255, 0, 0), 1)
            elif oth[2] == 1:
                cv2.circle(image, (int(oth[0]), int(oth[1])), 1, (0, 255, 0), 2)

cv2.imshow("image", image)
cv2.waitKey()

Result:

 

Converting labelme JSON annotations to masks:

# -*- coding: UTF-8 -*-
import os
import json

import numpy as np
import cv2


def points2cont(point_lists):
    # Convert lists of [x, y] points into OpenCV contour arrays.
    new_cnts = []
    for point_list in point_lists:
        cont = np.array(point_list, dtype=np.int32)
        cont = cont.reshape((-1, 1, 2))
        new_cnts.append(cont)
    return new_cnts


def create_mask(cnts, size=(3024, 4032)):
    # Draw all contours filled (thickness -1) into a single binary mask.
    mask = np.zeros(size, dtype=np.uint8)
    for cnt in cnts:
        cv2.drawContours(mask, cnt, -1, [255], -1)
    return mask


def load_point_list(json_file):
    # labelme stores each polygon under the "shapes" key.
    with open(json_file, 'r') as f:
        return json.load(f)['shapes']


def select_one_class_cnt(point_lists, label):
    # Collect the polygons for one class. The label filter is disabled here
    # (all polygons are kept); uncomment the check to filter by class name.
    plists = []
    for p in point_lists:
        # if p['label'] != label:
        #     continue
        plists.append(p['points'])
    if plists:
        return points2cont(plists)
    return []


def visual_contour(img, cnts):
    # Debug helper: draw the contour outlines on a copy of the image.
    imshow_img = img.copy()
    cv2.drawContours(imshow_img, cnts, -1, [0, 0, 255], 2)
    return imshow_img


def main(imgs_dir, labels_dir, masks_dir, labels):
    for file in os.listdir(imgs_dir):
        name, ext = os.path.splitext(file)
        if ext.lower() not in ('.jpg', '.png'):
            continue

        img_file = os.path.join(imgs_dir, file)
        print(img_file)

        img = cv2.imread(img_file, cv2.IMREAD_GRAYSCALE)  # only the size is needed
        h, w = img.shape

        json_file = os.path.join(labels_dir, name + '.json')
        if not os.path.exists(json_file):
            print(json_file, "does not exist")
            continue

        point_list = load_point_list(json_file)

        cnts = [select_one_class_cnt(point_list, lab) for lab in labels]
        if all(len(c) == 0 for c in cnts):
            print(0)
            continue

        mask = create_mask(cnts, (h, w))
        cv2.imwrite(os.path.join(masks_dir, name + '.png'), mask,
                    [cv2.IMWRITE_PNG_COMPRESSION, 0])


if __name__ == '__main__':
    imgs_dir = r"/images/train1"        # directory of the original images
    labels_dir = r"/images/segments1"   # directory of the labelme JSON files
    masks_dir = r"/images/masks"        # output directory for the masks
    labels = ["yy"]                     # label names
    main(imgs_dir, labels_dir, masks_dir, labels)
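
To eyeball the result, a generated mask can be overlaid on its source image (a minimal sketch; the file name example.jpg is hypothetical):

import cv2

img = cv2.imread('/images/train1/example.jpg')   # hypothetical file name
mask = cv2.imread('/images/masks/example.png', 0)
overlay = img.copy()
overlay[mask > 0] = (0, 0, 255)                  # paint masked pixels red
blended = cv2.addWeighted(img, 0.6, overlay, 0.4, 0)
cv2.imshow('mask check', blended)
cv2.waitKey()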

 

 

Pretrained model link:

https://download.csdn.net/download/qq_14845119/11616965

 

Reference:

https://github.com/CMU-Perceptual-Computing-Lab/openpose

 
