Build-Your-Own Dataset Series: Statistical Analysis of a COCO JSON Dataset

Latest recommended article published on 2024-01-18 17:12:12

星空•物语

Category: Self-Built AI Datasets. Tags: COCO statistical analysis

Article link: https://blog.csdn.net/qq_40265247/article/details/122070521

Preface

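[Figure: per-category instance counts, from the earlier article]
An earlier article in this series included the statistics chart above. It gives the number of instances for each category, and one look is enough to see that the classes are extremely imbalanced, which is bad for classification. What the chart does not show is how those instances are distributed across images, and without that information it is hard to augment the under-represented classes later on.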

Implementation

cocoJsonStat.py

import json
import os

json_file = "COD10K_CAM_coco/annotations/instances_train2017.json"

with open(json_file) as f:
    data = json.load(f)

# Index images and categories by id for fast lookup
images = {x["id"]: x for x in data["images"]}
categories = {x["id"]: x for x in data["categories"]}

# cnt_dict[supercategory][category name] -> class id, instance count, image file names
cnt_dict = {}
for x in data["categories"]:
    cnt_dict.setdefault(x["supercategory"], {})[x["name"]] = {
        "class_id": x["id"], "cnt": 0, "imgs": []}

# Count the instances of each category and record which image each one belongs to
for x in data["annotations"]:
    if x.get("iscrowd"):
        print("Skipping crowd annotation")
        continue

    cate = categories[x["category_id"]]
    img = images[x["image_id"]]

    cur_obj = cnt_dict[cate["supercategory"]][cate["name"]]
    cur_obj["cnt"] += 1
    # One entry per instance, so a file name repeats when an image
    # contains several instances of the same category
    cur_obj["imgs"].append(img["file_name"])

print(cnt_dict)

# Save the statistics (the original "%s.json" formatting produced a doubled ".json.json" suffix)
save_json_path = os.path.join("./", "stat_CAM_coco_train.json")
with open(save_json_path, "w") as f:
    json.dump(cnt_dict, f, indent=4)
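As one possible next step, here is a small sketch of how the saved statistics could be condensed into a per-class table of instance counts versus distinct images, which is the distribution the original chart was missing. It only assumes the nested layout written by the script above (supercategory, then class name, then "class_id" / "cnt" / "imgs"). The function name summarize, the "max_per_image" field, and the toy class names are made up for illustration and are not taken from the dataset.

```python
from collections import Counter


def summarize(cnt_dict):
    """Flatten cnt_dict into one row per class, rarest class first."""
    rows = []
    for supercategory, classes in cnt_dict.items():
        for name, info in classes.items():
            # "imgs" holds one file name per instance, so counting the
            # names gives the number of instances in each image
            per_image = Counter(info["imgs"])
            rows.append({
                "supercategory": supercategory,
                "name": name,
                "instances": info["cnt"],
                "images": len(per_image),
                "max_per_image": max(per_image.values(), default=0),
            })
    return sorted(rows, key=lambda r: r["instances"])


# Toy input with hypothetical classes, in the same layout as the saved JSON
toy = {
    "Aquatic": {
        "Fish": {"class_id": 1, "cnt": 3, "imgs": ["a.jpg", "a.jpg", "b.jpg"]},
        "Crab": {"class_id": 2, "cnt": 1, "imgs": ["c.jpg"]},
    },
    "Flying": {
        "Owl": {"class_id": 3, "cnt": 0, "imgs": []},
    },
}

for row in summarize(toy):
    print(row)
```

To run it on the real statistics, load stat_CAM_coco_train.json with json.load and pass the result to summarize. Classes near the top of the list, with few instances spread over few images, are the natural candidates for extra augmentation.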

That is as far as this post goes. How to process the data from here is up to you.
