[Animal Pose Dataset] Converting the Animal Pose Dataset to YOLO Format

Latest recommended article published on 2025-02-28 17:30:00

F-821

Tags: YOLO, artificial intelligence, python, data warehouse

Article link: https://blog.csdn.net/qq_56818923/article/details/135294147


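Dataset contents: the ANIMAL-POSE dataset provides pose annotations for five animal categories: dog, cat, cow, horse, and sheep. A total of 20 keypoints are annotated: two eyes, throat, nose, withers, two ear bases, tail base, four elbows, four knees, and four paws.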

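Dataset size:

(1) Keypoint dataset: more than 4,000 images, 5 categories

(2) Animal detection dataset: animal images with bounding boxes only, 7 categories in total

Official download link: GitHub - noahcao/animal-pose-dataset

The code below converts the original annotations into YOLO format (class + box + points), followed by notes on using the resulting dataset: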

import os
import json
from tqdm import tqdm
import argparse
from PIL import Image

parser = argparse.ArgumentParser()

# Path to the dataset's keypoints.json; replace the default with your own location
parser.add_argument('--json_path', default='D:/work/datasetbag/test_image/keypoints.json', type=str, help="input: coco format(json)")

# Directory where the generated .txt label files are saved
parser.add_argument('--save_path', default='D:/work/datasetbag/test_image/animal_pose_visualized/', type=str, help="specify where to save the output dir of labels")

arg = parser.parse_args()

def convert_box(size, box):
    # size is (image width, image height); box is (xmin, ymin, xmax, ymax) in pixels
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[2]) / 2.0  # box center x
    y = (box[1] + box[3]) / 2.0  # box center y
    w = box[2] - box[0]
    h = box[3] - box[1]
    # Normalize by the image size and round (x_center, y_center, width, height) to 6 decimal places
    x = str(round(x * dw, 6))
    w = str(round(w * dw, 6))
    y = str(round(y * dh, 6))
    h = str(round(h * dh, 6))

    return " ".join([x, y, w, h])

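def convert_point(size, point):
    # size is (image width, image height); point is a list of [x, y, visibility] keypoints.
    # A new list is built so the annotation data passed in is not modified.
    result = []
    for kpt in point:
        fields = []
        for index, value in enumerate(kpt):
            if index != 2:
                # Oddly, some annotated points fall outside the image,
                # so the normalized coordinate is clamped to [0, 1]
                x = min(1.0, float(value) / float(size[index]))
                x = max(0.0, x)
                fields.append(str(round(x, 6)))
            else:
                # The visibility flag is written unchanged
                fields.append(str(value))
        # print(fields)  # uncomment to inspect the keypoint
        result.append(" ".join(fields))

    return result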


if __name__ == '__main__':
    json_file = arg.json_path  # annotation file (COCO-style JSON)
    ana_txt_save_path = arg.save_path  # output directory

    with open(json_file, 'r') as f:
        data = json.load(f)
    os.makedirs(ana_txt_save_path, exist_ok=True)

    id_map = {}  # dataset category id -> 0-based class index

    with open(os.path.join(ana_txt_save_path, 'classes.txt'), 'w') as f:
        # Write classes.txt
        for i, category in enumerate(data['categories']):
            f.write(f"{category['name']}\n")
            id_map[category['id']] = i

    # Group annotations by image, so an image with several animals gets one
    # label line per animal instead of each annotation overwriting the last
    anns_by_image = {}
    for ann in data['annotations']:
        anns_by_image.setdefault(ann['image_id'], []).append(ann)

    # file_path is the image folder; extract the downloaded images first
    file_path = 'D:/work/datasetbag/test_image/animalpose_images'

    # train.txt lists the relative image paths used for training. It is optional;
    # adjust the path prefix written below to match your own layout.
    with open(os.path.join(ana_txt_save_path, 'train.txt'), 'w') as list_file:
        for img in tqdm(data['images']):
            filename = data['images'][img]
            anns = anns_by_image.get(int(img), [])
            if not anns:
                continue

            # The dataset does not record image sizes, so read them from the files
            with Image.open(os.path.join(file_path, filename)) as imge:
                img_width, img_height = imge.size

            head, tail = os.path.splitext(filename)
            ana_txt_name = head + ".txt"  # label file shares the image's base name
            with open(os.path.join(ana_txt_save_path, ana_txt_name), 'w') as f_txt:
                for ann in anns:
                    box = convert_box((img_width, img_height), ann["bbox"])
                    points = convert_point((img_width, img_height), ann["keypoints"])
                    # print(points)  # inspect the keypoints
                    # print(ann["bbox"])  # inspect the box
                    f_txt.write("%s %s %s\n" % (id_map[ann["category_id"]], box, " ".join(points)))

            # Written once per image; assumes the images end in .jpg (see the training issue below)
            list_file.write('./images/train/%s.jpg\n' % (head))
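
As a quick check on the arithmetic in convert_box and convert_point above, here is a minimal standalone sketch with made-up numbers. The 500x400 image, the box, and the two keypoints are invented for illustration, and the helper names normalize_box and normalize_keypoint are hypothetical rather than part of the script.

```python
def normalize_box(img_w, img_h, xmin, ymin, xmax, ymax):
    """Corner box in pixels -> (x_center, y_center, width, height) as fractions of the image."""
    x_c = (xmin + xmax) / 2.0 / img_w
    y_c = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return tuple(round(v, 6) for v in (x_c, y_c, w, h))


def normalize_keypoint(img_w, img_h, x, y, v):
    """Pixel keypoint -> normalized (x, y, visibility), clamped to [0, 1]."""
    nx = min(1.0, max(0.0, x / img_w))
    ny = min(1.0, max(0.0, y / img_h))
    return (round(nx, 6), round(ny, 6), v)


# Hypothetical 500x400 image with a box from (100, 80) to (300, 280):
# x_center = 200 / 500 = 0.4, y_center = 180 / 400 = 0.45,
# width = 200 / 500 = 0.4, height = 200 / 400 = 0.5
box = normalize_box(500, 400, 100, 80, 300, 280)

# A keypoint inside the image, and one slightly outside that gets clamped
kpt_inside = normalize_keypoint(500, 400, 250, 100, 1)   # (0.5, 0.25, 1)
kpt_outside = normalize_keypoint(500, 400, 520, -4, 1)   # (1.0, 0.0, 1)

print(box, kpt_inside, kpt_outside)
```

The second keypoint shows why the clamp matters: without it the label would contain 1.04 and -0.01, which fall outside the normalized range.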

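After running the script, the output folder contains one .txt file per image, plus two extra text files: classes.txt and train.txt.

Each line of a .txt file contains the class, the bounding box, and the keypoints (x, y, visibility), 20 keypoints in total. That completes the conversion.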
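
With 20 keypoints, a label line should hold 1 + 4 + 20 × 3 = 65 space-separated fields. The sketch below is one way to check that and map a line back to pixel coordinates, for example before drawing it on the image. The function name parse_label_line and the sample line are hypothetical, and the box layout (x_center, y_center, width, height) is assumed from the conversion code above.

```python
def parse_label_line(line, img_w, img_h, num_kpts=20):
    """Parse one label line into (class index, pixel box corners, pixel keypoints)."""
    fields = line.split()
    expected = 1 + 4 + num_kpts * 3
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")

    cls = int(fields[0])
    xc, yc, w, h = (float(v) for v in fields[1:5])
    # Undo the normalization: center/size fractions -> corner coordinates in pixels
    box = ((xc - w / 2) * img_w, (yc - h / 2) * img_h,
           (xc + w / 2) * img_w, (yc + h / 2) * img_h)

    kpts = []
    for i in range(num_kpts):
        x, y, v = fields[5 + 3 * i: 8 + 3 * i]
        kpts.append((float(x) * img_w, float(y) * img_h, int(float(v))))
    return cls, box, kpts


# Made-up line: class 0, one box, and the same keypoint repeated 20 times
line = "0 0.4 0.45 0.4 0.5" + " 0.5 0.25 1" * 20
cls, box, kpts = parse_label_line(line, 500, 400)
print(cls, box, kpts[0])  # box is approximately (100, 80, 300, 280)
```

When looping over the output folder with a check like this, skip classes.txt and train.txt, since they sit next to the label files but do not follow the label layout.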

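The two extra files, train.txt and classes.txt, contain the following: classes.txt holds one category name per line, in class-index order, and train.txt holds one relative image path per line, of the form ./images/train/<name>.jpg.

If the label folder needs any further processing, be sure to delete these two files first, since they sit alongside the per-image label files.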

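Training issue:

The original images are not all .jpg: there are also .jpeg files and a single .png (rather annoying).

Solution: copy the image folder into your dataset directory, create a new .txt file in the directory that holds the images, and enter the following:

Change the file's extension to .bat and run it.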
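
For readers who would rather stay in Python, the sketch below is an alternative to the batch file, not a reproduction of it. It assumes the goal is simply for every image in the copied folder to end in .jpg. The function name normalize_extensions is hypothetical, and the .png branch relies on Pillow, which the conversion script already uses.

```python
import os


def normalize_extensions(image_dir):
    """Give every .jpeg/.png image in image_dir a .jpg name; returns the new file names."""
    changed = []
    for name in sorted(os.listdir(image_dir)):
        head, ext = os.path.splitext(name)
        src = os.path.join(image_dir, name)
        dst = os.path.join(image_dir, head + '.jpg')
        if ext.lower() == '.jpeg':
            # Same format, different suffix: renaming is enough and avoids recompression
            os.rename(src, dst)
        elif ext.lower() == '.png':
            # Different format: re-encode as JPEG with Pillow
            # (imported here because only this branch needs it)
            from PIL import Image
            with Image.open(src) as im:
                # JPEG has no alpha channel, so convert to RGB first
                im.convert('RGB').save(dst, 'JPEG', quality=95)
            os.remove(src)
        else:
            continue
        changed.append(head + '.jpg')
    return changed
```

Run it on the copied folder rather than the original download, and note that it assumes no two images share a base name (for example a.jpg next to a.jpeg), since one would replace the other or the rename would fail.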