Introduction
Both person_keypoints_train2017.json and person_keypoints_val2017.json consist of five top-level sections, stored under the keys info, licenses, images, annotations, and categories.
{
    "info" : info,
    "licenses" : [license1, license2, license3, ...],
    "images" : [image1, image2, image3, ...],
    "annotations" : [annotation1, annotation2, annotation3, ...],
    "categories" : [category1, category2, category3, ...]
}
The images section holds per-image metadata as a list; every entry has the same fields:
images
{
    "license": int; the id of the entry in the licenses section that applies to this image
    "file_name": string; the image file name, e.g. 000000000001.jpg
    "coco_url": string; URL of the image on the COCO site
    "height": int; image height
    "width": int; image width
    "date_captured": string; date the image was captured
    "flickr_url": string; URL of the image on Flickr
    "id": int; image id, referenced by image_id in annotations
}
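The annotations section mainly carries the bbox and keypoints labels. keypoints is an array of length 3*k, where k is the total number of keypoints defined for the category. Each keypoint occupies three consecutive values: the first two are the x and y coordinates, and the third is a flag v. v = 0 means the keypoint is not labeled (in which case x = y = v = 0), v = 1 means it is labeled but not visible (occluded), and v = 2 means it is labeled and visible.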
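As a minimal sketch of that layout, the flat array can be regrouped into (x, y, v) triples and the flags counted. The helper name `split_keypoints` is an assumption made for this illustration; the array is the keypoints value from the sample annotation in this article.

```python
def split_keypoints(flat):
    """Group a flat [x1, y1, v1, x2, y2, v2, ...] list into (x, y, v) triples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


sample = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
          142, 309, 1, 177, 320, 2, 191, 398, 2, 237, 317, 2,
          233, 426, 2, 306, 233, 2, 92, 452, 2, 123, 468, 2,
          0, 0, 0, 251, 469, 2, 0, 0, 0, 162, 551, 2]

triples = split_keypoints(sample)
labeled = [t for t in triples if t[2] > 0]    # v = 1 or v = 2
visible = [t for t in triples if t[2] == 2]   # labeled and visible
occluded = [t for t in triples if t[2] == 1]  # labeled but occluded

print(len(triples), len(labeled), len(visible), len(occluded))  # 17 10 9 1
```

The count of labeled triples (10) agrees with the num_keypoints field of that sample annotation.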
{
    "segmentation": [[125.12,539.69,140.94,522.43,100.67,496.54,84.85,469.21,73.35,450.52,104.99,342.65,168.27,290.88,179.78,288,189.84,286.56,191.28,260.67,202.79,240.54,221.48,237.66,248.81,243.42,257.44,256.36,253.12,262.11,253.12,275.06,299.15,233.35,329.35,207.46,355.24,206.02,363.87,206.02,365.3,210.34,373.93,221.84,363.87,226.16,363.87,237.66,350.92,237.66,332.22,234.79,314.97,249.17,271.82,313.89,253.12,326.83,227.24,352.72,214.29,357.03,212.85,372.85,208.54,395.87,228.67,414.56,245.93,421.75,266.07,424.63,276.13,437.57,266.07,450.52,284.76,464.9,286.2,479.28,291.96,489.35,310.65,512.36,284.76,549.75,244.49,522.43,215.73,546.88,199.91,558.38,204.22,565.57,189.84,568.45,184.09,575.64,172.58,578.52,145.26,567.01,117.93,551.19,133.75,532.49]],
    "num_keypoints": 10, # number of labeled keypoints (v > 0)
    "area": 47803.27955,
    "iscrowd": 0, # 0: segmentation is a polygon list; 1: segmentation is RLE
    "keypoints": [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,142,309,1,177,320,2,191,398,2,237,317,2,233,426,2,306,233,2,92,452,2,123,468,2,0,0,0,251,469,2,0,0,0,162,551,2], # person keypoints as flat x,y,v triples
    "image_id": 425226, # matches the id field of an entry in images
    "bbox": [73.35,206.02,300.58,372.5], # [x, y, width, height]; x,y is the top-left corner of the box
    "category_id": 1,
    "id": 183126
},
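Because bbox stores the top-left corner plus width and height, the opposite corner and the center have to be derived. The sketch below does this for the sample bbox; `xywh_to_corners` and `xywh_to_center` are hypothetical helper names, not part of any COCO tooling.

```python
def xywh_to_corners(bbox):
    """Convert [x, y, w, h] (top-left origin) to (x1, y1, x2, y2)."""
    x, y, w, h = bbox
    return x, y, x + w, y + h


def xywh_to_center(bbox):
    """Return the center point (cx, cy) of an [x, y, w, h] box."""
    x, y, w, h = bbox
    return x + w / 2, y + h / 2


bbox = [73.35, 206.02, 300.58, 372.5]
x1, y1, x2, y2 = xywh_to_corners(bbox)
cx, cy = xywh_to_center(bbox)

print(round(x2, 2), round(y2, 2))  # 373.93 578.52
print(round(cx, 2), round(cy, 2))  # 223.64 392.27
```

In this sample the derived corners (73.35, 206.02) and (373.93, 578.52) coincide with the smallest and largest x and y values in the segmentation polygon, which is a quick way to sanity-check the reading of the bbox field.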
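The categories section:
keypoints is an array of length k holding the name of each keypoint; skeleton defines which keypoints are connected (for example, a person's left wrist is connected to the left elbow, but the left wrist is not connected to the right wrist). At present, COCO keypoints are annotated only for the person category.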
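To make the keypoints/skeleton relationship concrete, here is a small sketch. The name list and edge list below are the values commonly found in the COCO person category, written from memory; treat them as an assumption and compare them with json_labels['categories'][0] in your own file. The `connected` helper is hypothetical.

```python
# Assumed values for the person category; verify against your annotation file.
keypoint_names = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]
# Each edge is a pair of 1-based indices into keypoint_names.
skeleton = [
    [16, 14], [14, 12], [17, 15], [15, 13], [12, 13], [6, 12], [7, 13],
    [6, 7], [6, 8], [7, 9], [8, 10], [9, 11], [2, 3], [1, 2], [1, 3],
    [2, 4], [3, 5], [4, 6], [5, 7],
]


def connected(a, b):
    """Return True if keypoints a and b share a skeleton edge."""
    ia = keypoint_names.index(a) + 1  # skeleton indices start at 1
    ib = keypoint_names.index(b) + 1
    return [ia, ib] in skeleton or [ib, ia] in skeleton


print(connected("left_wrist", "left_elbow"))   # True
print(connected("left_wrist", "right_wrist"))  # False
```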
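Extracting the information
The extraction is shown for person_keypoints_val2017.json.
The initial directory layout is:
annotations/person_keypoints_val2017.json holds the annotations for the 5,000 images
images/ holds the 5,000 images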
import json
import os

json_path = "annotations/person_keypoints_val2017.json"
# Use a context manager so the file handle is closed after loading
with open(json_path, "r") as f:
    json_labels = json.load(f)
annotations = json_labels['annotations']  # list
images = json_labels['images']  # list
categories = json_labels['categories']  # list
Build a mapping from image id to file name and image size:
idtoimage = {}
for image in images:
    file_name = image['file_name']
    image_id = image['id']
    height = image['height']
    width = image['width']
    idtoimage[image_id] = [file_name, height, width]
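A sketch of how this mapping gets used, with made-up records standing in for the real lists (the names `sample_images`, `sample_annotations`, `demo_idtoimage`, and `per_image` are assumptions for this illustration). It groups annotations by image_id and looks up each image's file name and size, which also shows that a single image can carry several annotations.

```python
from collections import defaultdict

# Made-up records; real entries come from the images and annotations lists.
sample_images = [
    {"id": 1, "file_name": "000000000001.jpg", "height": 480, "width": 640},
    {"id": 2, "file_name": "000000000002.jpg", "height": 427, "width": 640},
]
sample_annotations = [
    {"image_id": 1, "num_keypoints": 10},
    {"image_id": 1, "num_keypoints": 0},
    {"image_id": 2, "num_keypoints": 5},
]

# Same structure as idtoimage: id -> [file_name, height, width]
demo_idtoimage = {
    im["id"]: [im["file_name"], im["height"], im["width"]]
    for im in sample_images
}

# Group annotations by the image they belong to.
per_image = defaultdict(list)
for ann in sample_annotations:
    per_image[ann["image_id"]].append(ann)

for image_id, anns in per_image.items():
    file_name, height, width = demo_idtoimage[image_id]
    print(file_name, height, width, len(anns))
```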
Build a mapping from category id to category name. person has id 1 here, so it is converted to a zero-based class index through CLASSES:
idtoclss = {}
for category in categories:
    cat_id = category['id']  # avoid shadowing the built-in id()
    name = category['name']  # category name
    idtoclss[cat_id] = name
CLASSES = ['person']
Parse the person keypoints and normalize them. Each output line uses one of two layouts, selected by Dim:
2d: <class-index> <x> <y> <width> <height> <px1> <py1> <px2> <py2> ... <pxn> <pyn>
3d: <class-index> <x> <y> <width> <height> <px1> <py1> <p1-visibility> <px2> <py2> <p2-visibility> ... <pxn> <pyn> <pn-visibility>
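The two layouts can be sketched as a small formatting function. `format_label_line` is a hypothetical helper and the numbers are made up; it assumes the box and keypoints are already normalized to the 0..1 range.

```python
def format_label_line(class_index, box, keypoints, dim=2):
    """Build one label line.

    box is (cx, cy, w, h) and keypoints is a list of (x, y, v) triples,
    all already normalized. dim=2 writes x y per keypoint, dim=3 writes x y v.
    """
    if dim not in (2, 3):
        raise ValueError("dim must be 2 or 3")
    fields = [class_index, *box]
    for x, y, v in keypoints:
        fields.extend([x, y] if dim == 2 else [x, y, v])
    return " ".join(str(f) for f in fields)


box = (0.5, 0.5, 0.25, 0.75)
points = [(0.1, 0.2, 2), (0.0, 0.0, 0)]

print(format_label_line(0, box, points, dim=2))
# 0 0.5 0.5 0.25 0.75 0.1 0.2 0.0 0.0
print(format_label_line(0, box, points, dim=3))
# 0 0.5 0.5 0.25 0.75 0.1 0.2 2 0.0 0.0 0
```

Note that in the 2d layout an unlabeled keypoint is written as 0.0 0.0 with no flag, so it cannot be told apart from a labeled point at the image origin; the 3d layout keeps that distinction.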
Dim = 2  # 2: write x y per keypoint; 3: write x y v per keypoint
out_dir = 'labels_person_keypoints'
os.makedirs(out_dir, exist_ok=True)  # the output directory must exist before writing
for annotation in annotations:
    try:
        count = annotation['num_keypoints']  # number of labeled keypoints
        if count > 0:
            image_id = annotation['image_id']
            category_id = annotation['category_id']  # category id
            bbox = annotation['bbox']  # top-left x, y, then w, h
            # segmentation = annotation['segmentation'][0]  # polygon points
            keypoints = annotation['keypoints']  # person keypoints
            classname = idtoclss[category_id]  # category name
            category_id = CLASSES.index(classname)  # convert to zero-based class index
            filename, h, w = idtoimage[image_id]
            # normalize bbox to center x, center y, width, height
            cx = (bbox[0] + bbox[2]/2) / w
            cy = (bbox[1] + bbox[3]/2) / h
            box_w, box_h = bbox[2]/w, bbox[3]/h
            line = [str(i) for i in [category_id, cx, cy, box_w, box_h]]
            line = ' '.join(line)
            # normalize keypoints
            x = [i/w for i in keypoints[0::3]]  # normalized x coordinates
            y = [i/h for i in keypoints[1::3]]  # normalized y coordinates
            v = keypoints[2::3]  # v=0 not labeled, v=1 occluded, v=2 visible
            xy = ''
            if Dim == 2:
                for i in range(len(x)):
                    xy += str(x[i]) + ' ' + str(y[i]) + ' '
            if Dim == 3:
                for i in range(len(x)):
                    xy += str(x[i]) + ' ' + str(y[i]) + ' ' + str(v[i]) + ' '
            line = line + ' ' + xy + '\n'
            outfile = filename.split('.')[0] + '.txt'
            outfile = os.path.join(out_dir, outfile)
            # append mode: one image can contain several people, one line each
            with open(outfile, 'a') as f:
                f.write(line)
    except (KeyError, ValueError):
        # skip annotations with missing fields or a category not in CLASSES
        continue
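To check the written labels, a line can be read back and scaled to pixel coordinates again. This is a minimal sketch under the assumption that the line follows one of the two layouts above; `parse_label_line` is a hypothetical helper and the sample line is made up.

```python
def parse_label_line(line, width, height, dim=2):
    """Parse one label line and convert it back to pixel units.

    Returns (class_index, [x, y, w, h], keypoints), where x, y is the
    top-left corner of the box and keypoints holds (x, y) tuples for
    dim=2 or (x, y, v) tuples for dim=3.
    """
    values = line.split()
    class_index = int(values[0])
    cx, cy, bw, bh = (float(t) for t in values[1:5])
    # undo the center-based normalization
    box = [(cx - bw / 2) * width, (cy - bh / 2) * height,
           bw * width, bh * height]
    rest = [float(t) for t in values[5:]]
    if len(rest) % dim != 0:
        raise ValueError("keypoint values do not match dim")
    keypoints = []
    for i in range(0, len(rest), dim):
        point = [rest[i] * width, rest[i + 1] * height]
        if dim == 3:
            point.append(int(rest[i + 2]))  # visibility flag is not scaled
        keypoints.append(tuple(point))
    return class_index, box, keypoints


cls, box_px, kps = parse_label_line("0 0.5 0.5 0.25 0.5 0.25 0.75", 640, 480)
print(cls, box_px, kps)
# 0 [240.0, 120.0, 160.0, 240.0] [(160.0, 360.0)]
```

Because the files are opened in append mode, running the conversion twice writes every line twice; clearing the output directory before a rerun avoids duplicated labels.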