COCO2017 Dataset Category Statistics

liguiyuan112

Category: AI. Tags: artificial intelligence, coco2017, object detection dataset

Article link: https://blog.csdn.net/u012505617/article/details/106517073

I recently used the COCO2017 dataset for object detection, so I took the opportunity to tabulate what it contains.

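The COCO dataset comes with a dedicated Python API that makes it easy to read the image data; see https://github.com/cocodataset/cocoapi for details.

Here we mainly count the dataset's categories, so that it is clear whether there is enough training data and whether the classes are evenly distributed.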
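To make concrete what is being counted, here is a minimal sketch on a toy dictionary laid out like the instances annotation file, which holds images, annotations, and categories lists, with each annotation pointing at an image and a category by ID. The toy IDs and the helper count_per_category are made up for illustration and are not part of the COCO API.

```python
from collections import defaultdict

# Toy data laid out like a COCO instances file (only the keys used here).
# All IDs below are made up for illustration.
dataset = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "categories": [{"id": 1, "name": "person"}, {"id": 3, "name": "car"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 1, "category_id": 1},
        {"id": 12, "image_id": 1, "category_id": 3},
        {"id": 13, "image_id": 2, "category_id": 3},
    ],
}


def count_per_category(data):
    """Return {category name: (image count, box count)}."""
    images = defaultdict(set)   # category_id -> set of image ids
    boxes = defaultdict(int)    # category_id -> number of annotations
    for ann in data["annotations"]:
        images[ann["category_id"]].add(ann["image_id"])
        boxes[ann["category_id"]] += 1
    return {
        cat["name"]: (len(images[cat["id"]]), boxes[cat["id"]])
        for cat in data["categories"]
    }


stats = count_per_category(dataset)
for name, (n_images, n_boxes) in stats.items():
    print("{:<15} {:<6d}     {:<10d}".format(name, n_images, n_boxes))
```

An image is counted once per category no matter how many boxes of that category it holds, which is why the image count never exceeds the box count.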

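The following code uses the COCO API to count the categories, the images per category, and the bounding boxes per category: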

from pycocotools.coco import COCO

dataDir='./COCO'
dataType='val2017'
#dataType='train2017'
annFile='{}/annotations/instances_{}.json'.format(dataDir, dataType)

# initialize COCO api for instance annotations
coco=COCO(annFile)

# display COCO categories and supercategories
cats = coco.loadCats(coco.getCatIds())
cat_nms=[cat['name'] for cat in cats]
print('number of categories: ', len(cat_nms))
print('COCO categories: \n', cat_nms)

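# Count the images and bounding boxes for each category
for cat_name in cat_nms:
    # Pass the name in a list: a bare string is matched by substring, so 'carrot' would also select 'car'
    catId = coco.getCatIds(catNms=[cat_name])   # category ID, in the range 1~90
    imgId = coco.getImgIds(catIds=catId)        # IDs of images containing this category
    annId = coco.getAnnIds(catIds=catId)        # IDs of this category's bounding-box annotations

    print("{:<15} {:<6d}     {:<10d}".format(cat_name, len(imgId), len(annId)))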
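One detail of the catNms argument deserves a sketch. As far as I can tell, pycocotools treats any object that is iterable and has a length as a list of names, and a Python string qualifies, so the membership test on names turns into a substring test. The helper match_names below is a hypothetical stand-in written for illustration, not the library's code.

```python
# Hypothetical stand-in for a name filter; not pycocotools code.
def match_names(all_names, wanted):
    """Keep the names for which `name in wanted` is true."""
    return [name for name in all_names if name in wanted]


names = ["car", "carrot", "dog", "hot dog", "bear", "teddy bear"]

# A bare string turns `in` into a substring test.
as_string = match_names(names, "carrot")
# A one-element list turns `in` into an exact-match test.
as_list = match_names(names, ["carrot"])

print(as_string)
print(as_list)
```

Wrapping each name in a list is the safer habit, because it keeps one category per query.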

Validation set (val2017) output:

Category	Images	Bounding boxes
person	2693	11004
bicycle	149	316
car	535	1932
motorcycle	159	371
airplane	97	143
bus	189	285
train	157	190
truck	250	415
boat	121	430
traffic light	191	637
fire hydrant	86	101
stop sign	69	75
parking meter	37	60
bench	235	413
bird	125	440
cat	184	202
dog	177	218
horse	128	273
sheep	65	361
cow	87	380
elephant	89	255
bear	49	71
zebra	85	268
giraffe	101	232
backpack	228	371
umbrella	174	413
handbag	292	540
tie	145	254
suitcase	105	303
frisbee	84	115
skis	120	241
snowboard	49	69
sports ball	169	263
kite	91	336
baseball bat	97	146
baseball glove	100	148
skateboard	127	179
surfboard	149	269
tennis racket	167	225
bottle	379	1025
wine glass	110	343
cup	390	899
fork	155	215
knife	181	326
spoon	153	253
bowl	314	626
banana	103	379
apple	76	239
sandwich	98	177
orange	85	287
broccoli	71	316
carrot	3	2303
hot dog	0	345
pizza	153	285
donut	62	338
cake	124	316
chair	580	1791
couch	195	261
potted plant	172	343
bed	149	163
dining table	501	697
toilet	149	179
tv	207	288
laptop	183	231
mouse	88	106
remote	145	283
keyboard	106	153
cell phone	214	262
microwave	54	55
oven	115	143
toaster	8	9
sink	187	225
refrigerator	101	126
book	230	1161
clock	204	267
vase	137	277
scissors	28	36
teddy bear	0	262
hair drier	9	11
toothbrush	34	57

Note on the carrot, hot dog, and teddy bear rows, both in the table above and in the training table below: they are not per-category counts. When catNms receives a bare string, pycocotools matches category names by substring, so carrot also selects car, hot dog also selects dog, and teddy bear also selects bear. With two category IDs, getImgIds returns only the images that contain both categories, while getAnnIds returns the boxes of either one. Subtracting the car, dog, and bear rows therefore gives the actual box counts: 371, 127, and 191 on val2017, and 7852, 2918, and 4793 on train2017. The image counts for these three categories cannot be recovered from the tables.

Training set (train2017) output:

Category	Images	Bounding boxes
person	64115	262465
bicycle	3252	7113
car	12251	43867
motorcycle	3502	8725
airplane	2986	5135
bus	3952	6069
train	3588	4571
truck	6127	9973
boat	3025	10759
traffic light	4139	12884
fire hydrant	1711	1865
stop sign	1734	1983
parking meter	705	1285
bench	5570	9838
bird	3237	10806
cat	4114	4768
dog	4385	5508
horse	2941	6587
sheep	1529	9509
cow	1968	8147
elephant	2143	5513
bear	960	1294
zebra	1916	5303
giraffe	2546	5131
backpack	5528	8720
umbrella	3968	11431
handbag	6841	12354
tie	3810	6496
suitcase	2402	6192
frisbee	2184	2682
skis	3082	6646
snowboard	1654	2685
sports ball	4262	6347
kite	2261	9076
baseball bat	2506	3276
baseball glove	2629	3747
skateboard	3476	5543
surfboard	3486	6126
tennis racket	3394	4812
bottle	8501	24342
wine glass	2533	7913
cup	9189	20650
fork	3555	5479
knife	4326	7770
spoon	3529	6165
bowl	7111	14358
banana	2243	9458
apple	1586	5851
sandwich	2365	4373
orange	1699	6399
broccoli	1939	7308
carrot	24	51719
hot dog	11	8426
pizza	3166	5821
donut	1523	7179
cake	2925	6353
chair	12774	38491
couch	4423	5779
potted plant	4452	8652
bed	3682	4192
dining table	11837	15714
toilet	3353	4157
tv	4561	5805
laptop	3524	4970
mouse	1876	2262
remote	3076	5703
keyboard	2115	2855
cell phone	4803	6434
microwave	1547	1673
oven	2877	3334
toaster	217	225
sink	4678	5610
refrigerator	2360	2637
book	5332	24715
clock	4659	6334
vase	3593	6613
scissors	947	1481
teddy bear	16	6087
hair drier	189	198
toothbrush	1007	1954
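
To put a number on how balanced the classes are, the box counts can be compared directly. The sketch below copies a few rows from the training table, including the largest and smallest box counts among the categories it lists reliably. The helper names and the threshold of 1000 boxes are assumptions chosen for illustration.

```python
# Box counts copied from a few rows of the training table.
train_boxes = {
    "person": 262465,
    "car": 43867,
    "chair": 38491,
    "toaster": 225,
    "hair drier": 198,
}


def imbalance_ratio(counts):
    """Ratio of the largest class count to the smallest."""
    return max(counts.values()) / min(counts.values())


def rare_classes(counts, threshold):
    """Names of classes with fewer than `threshold` boxes, rarest first."""
    rare = [name for name, count in counts.items() if count < threshold]
    return sorted(rare, key=counts.get)


ratio = imbalance_ratio(train_boxes)
rare = rare_classes(train_boxes, threshold=1000)
print("max/min ratio: {:.1f}".format(ratio))
print("classes under 1000 boxes:", rare)
```

With person at 262465 boxes and hair drier at 198, the largest class has more than a thousand times as many boxes as the smallest, so rare categories may need resampling or extra data.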