This article presents a method for extracting the gender attribute from the Market-1501-attribute dataset, together with the implementation code.
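1. Dataset Introduction
Market-1501 is annotated with 27 attributes. The original dataset contains 751 identities for training and 750 identities for testing. Attributes are annotated at the identity level, so the file contains 28 x 751 entries for training and 28 x 750 entries for testing: the 27 attributes plus the label "image_index", which denotes the identity.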
The dataset layout and the meaning of each attribute are as follows:
attribute | representation in file | label |
---|---|---|
gender | gender | male(1), female(2) |
hair length | hair | short hair(1), long hair(2) |
sleeve length | up | long sleeve(1), short sleeve(2) |
length of lower-body clothing | down | long lower body clothing(1), short(2) |
type of lower-body clothing | clothes | dress(1), pants(2) |
wearing hat | hat | no(1), yes(2) |
carrying backpack | backpack | no(1), yes(2) |
carrying bag | bag | no(1), yes(2) |
carrying handbag | handbag | no(1), yes(2) |
age | age | young(1), teenager(2), adult(3), old(4) |
color of upper-body clothing (8 colors) | upblack, upwhite, upred, uppurple, upyellow, upgray, upblue, upgreen | no(1), yes(2) |
color of lower-body clothing (9 colors) | downblack, downwhite, downpink, downpurple, downyellow, downgray, downblue, downgreen, downbrown | no(1), yes(2) |
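
The encodings in the table above can be written down as a small lookup. This is only a sketch: `ATTRIBUTE_LABELS` and `decode_attribute` are hypothetical names introduced for illustration, only a subset of the attributes is included, and whether a given annotation file uses these same values should be checked against the file itself.

```python
# Label encodings copied from the attribute table (subset, for illustration).
# ATTRIBUTE_LABELS and decode_attribute are hypothetical names, not part of the dataset.
ATTRIBUTE_LABELS = {
    "gender": {1: "male", 2: "female"},
    "hair": {1: "short hair", 2: "long hair"},
    "up": {1: "long sleeve", 2: "short sleeve"},
    "hat": {1: "no", 2: "yes"},
    "age": {1: "young", 2: "teenager", 3: "adult", 4: "old"},
}


def decode_attribute(name, value):
    """Return the readable label for an attribute value; raises KeyError if unknown."""
    return ATTRIBUTE_LABELS[name][int(value)]


print(decode_attribute("gender", 2))  # female
print(decode_attribute("age", 3))     # adult
```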
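The annotations can be obtained from the download link provided at the end of this article. Each annotation consists of the file name followed by the attribute values, for example: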
Market-1501/market1501/bounding_box_test/0001_c1s1_001051_03.jpg 1 0 0 0 0 1 1 1 0 1 2 2
Here, the third-from-last value is the gender information we need.
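
As a minimal sketch of this parsing step, the snippet below splits the sample annotation above and reads the third-from-last field. The helper name `parse_annotation_line` is hypothetical, and it assumes the fields are separated by whitespace. The numeric value is returned as-is rather than decoded, since its meaning depends on the encoding used in the label file.

```python
import os


def parse_annotation_line(line):
    """Split one annotation line into (file name, attribute values, gender field).

    Assumes whitespace-separated fields: an image path followed by numeric
    attribute values, with the gender field stored third from the end.
    """
    fields = line.split()
    img_path = fields[0]
    values = [int(float(v)) for v in fields[1:]]
    gender_value = values[-3]
    return os.path.basename(img_path), values, gender_value


sample = ("Market-1501/market1501/bounding_box_test/0001_c1s1_001051_03.jpg "
          "1 0 0 0 0 1 1 1 0 1 2 2")
file_name, values, gender_value = parse_annotation_line(sample)
print(file_name, len(values), gender_value)  # 0001_c1s1_001051_03.jpg 12 1
```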
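2. Code Implementation
The following code parses the attributes of each image from the dataset and copies the images into separate folders according to gender: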
"""
parse person's gender from Market-1501-attribute
"""
import glob
import os
import shutil


def parse_market1501(labels_path, imgs_path, new_dataset_path):
    """Copy every annotated image into a per-gender folder under new_dataset_path."""
    classes = ['0_Female', '1_Male']
    for class_name in classes:
        os.makedirs(os.path.join(new_dataset_path, class_name), exist_ok=True)

    # Every annotation file whose name ends in "label_final.txt" is processed.
    labels = sorted(glob.glob(labels_path + '/*label_final.txt'))
    for label in labels:
        print("processing {}".format(label))
        with open(label, 'r') as f:
            lines = f.readlines()
        for line in lines:
            line_split = line.split()
            if not line_split:  # skip blank lines
                continue
            # Drop the 'Market-1501/market1501/' prefix so the path is relative to imgs_path.
            img_path = os.path.join(imgs_path, line_split[0].split('Market-1501/market1501/')[-1])
            # The third-from-last field holds gender: a value of 1 selects classes[0]
            # ('0_Female'), any other value selects classes[1] ('1_Male').
            # Confirm this mapping against the encoding used in your label file.
            cls_id = 0 if int(float(line_split[-3])) == 1 else 1
            class_name = classes[cls_id]
            shutil.copy(img_path, os.path.join(new_dataset_path, class_name, 'market1501_' + line_split[0].split('/')[-1]))


if __name__ == '__main__':
    labels_path = "../market1501-attribute"
    imgs_path = labels_path + '/Market-1501-v15.09.15'
    new_dataset_path = 'new_dataset_path/from_Market1501'
    parse_market1501(labels_path=labels_path, imgs_path=imgs_path, new_dataset_path=new_dataset_path)
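
After running the script above, it may be useful to check how many images ended up in each gender folder. The following is a small sketch for that check. `count_images_per_class` is a hypothetical helper, not part of the original code, and it assumes the output folder contains one subfolder per class with `.jpg` images inside.

```python
import os


def count_images_per_class(dataset_path):
    """Count .jpg files in each immediate subfolder of dataset_path."""
    counts = {}
    for entry in sorted(os.listdir(dataset_path)):
        class_dir = os.path.join(dataset_path, entry)
        if os.path.isdir(class_dir):
            counts[entry] = sum(
                1 for name in os.listdir(class_dir)
                if name.lower().endswith('.jpg')
            )
    return counts


if __name__ == '__main__':
    # Same output folder as in the script above; nothing is printed if it does not exist yet.
    path = 'new_dataset_path/from_Market1501'
    if os.path.isdir(path):
        print(count_images_per_class(path))
```

A heavily skewed count, or an empty folder, would be a hint to re-check the gender encoding in the label file.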
Appendix:
Link: https://pan.baidu.com/s/12C3tIMjfZai-0jj5eroXww?pwd=a8wb Extraction code: a8wb
References:
GitHub - vana77/Market-1501_Attribute: 27 hand-annotated attributes of Market-1501