Downloading the TACO Dataset

田土豆

Last modified 2022-06-10 19:56:14

First published 2021-05-05 21:43:33

Article link: https://blog.csdn.net/weixin_42216109/article/details/116430960

Samples

Overview

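TACO is an image dataset for litter detection. It contains photos of litter taken in diverse environments, from tropical beaches to London streets, with 1,500 images and 4,784 2D bounding-box annotations in total. The litter falls into 28 top-level categories: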

'Aluminium foil', 'Battery', 'Blister pack', 'Bottle', 'Bottle cap', 'Broken glass', 'Can', 'Carton', 'Cup', 'Food waste', 'Glass jar', 'Lid', 'Other plastic', 'Paper', 'Paper bag', 'Plastic bag & wrapper', 'Plastic container', 'Plastic glooves', 'Plastic utensils', 'Pop tab', 'Rope & strings', 'Scrap metal', 'Shoe', 'Squeezable tube', 'Straw', 'Styrofoam piece', 'Unlabeled litter', 'Cigarette'

These are further divided into 60 subcategories:

'Aluminium foil', 'Battery', 'Aluminium blister pack', 'Carded blister pack', 'Other plastic bottle', 'Clear plastic bottle', 'Glass bottle', 'Plastic bottle cap', 'Metal bottle cap', 'Broken glass', 'Food Can', 'Aerosol', 'Drink can', 'Toilet tube', 'Other carton', 'Egg carton', 'Drink carton', 'Corrugated carton', 'Meal carton', 'Pizza box', 'Paper cup', 'Disposable plastic cup', 'Foam cup', 'Glass cup', 'Other plastic cup', 'Food waste', 'Glass jar', 'Plastic lid', 'Metal lid', 'Other plastic', 'Magazine paper', 'Tissues', 'Wrapping paper', 'Normal paper', 'Paper bag', 'Plastified paper bag', 'Plastic film', 'Six pack rings', 'Garbage bag', 'Other plastic wrapper', 'Single-use carrier bag', 'Polypropylene bag', 'Crisp packet', 'Spread tub', 'Tupperware', 'Disposable food container', 'Foam food container', 'Other plastic container', 'Plastic glooves', 'Plastic utensils', 'Pop tab', 'Rope & strings', 'Scrap metal', 'Shoe', 'Squeezable tube', 'Plastic straw', 'Paper straw', 'Styrofoam piece', 'Unlabeled litter', 'Cigarette'
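The relationship between the 28 top-level categories and the 60 subcategories can be recovered programmatically. The following is a minimal sketch, assuming each entry in the annotation file's `categories` list carries a `supercategory` field in addition to `name` (as in the COCO convention); the helper name `group_by_supercategory` and the toy input are illustrative, not taken from the dataset's API.

```python
from collections import defaultdict


def group_by_supercategory(categories):
    """Map each supercategory name to the list of its subcategory names."""
    groups = defaultdict(list)
    for cat in categories:
        # Assumption: entries have a "supercategory" field; if it is missing,
        # fall back to the category's own name.
        groups[cat.get("supercategory", cat["name"])].append(cat["name"])
    return dict(groups)


# Toy input in the assumed format (hand-written, not read from the real file).
toy_categories = [
    {"id": 0, "name": "Glass bottle", "supercategory": "Bottle"},
    {"id": 1, "name": "Clear plastic bottle", "supercategory": "Bottle"},
    {"id": 2, "name": "Drink can", "supercategory": "Can"},
]
groups = group_by_supercategory(toy_categories)
print(groups)
```

With the real file, the argument would be the list stored under the `categories` key.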

Data Exploration

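The dataset is split into 15 batch folders. Each folder contains roughly 100 litter images and one annotation.json file. Once loaded, the file is a dictionary with the following keys: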

dict_keys(['info', 'images', 'annotations', 'scene_annotations', 'licenses', 'categories', 'scene_categories'])
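To get a quick feel for the file before building any label dictionary, it can help to count the entries under each key. This is a rough sketch: `load_annotations` and `summarize` are hypothetical helper names, and the toy dictionary below only imitates the key layout shown above.

```python
import json


def load_annotations(path):
    """Read an annotation file and return it as a dictionary."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize(data):
    """Return the number of entries under each list-valued top-level key."""
    return {key: len(value) for key, value in data.items() if isinstance(value, list)}


# Toy stand-in for a loaded file (hand-written, not real dataset content).
toy = {
    "info": {},
    "images": [{"id": 0}],
    "annotations": [{"id": 0}, {"id": 1}],
    "categories": [],
}
summary = summarize(toy)
print(summary)
```

For a real batch, call `summarize(load_annotations(path))` with `path` pointing at that batch's annotation.json.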

Data Initialization

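Set file_path to the location of annotation.json to build a dictionary of 2D bounding-box labels. Each key is an image id and each value is the list of labels for that image (an image may have several). Each list element has the form [xmin, ymin, w, h, category].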

import json

# Path to annotation.json; change this to match your local copy.
file_path = "annotation.json"

# Parse the whole file rather than only its first line.
with open(file_path, encoding="utf-8") as f:
    data = json.load(f)

# Map each image file name to its image id.
img_id = {}
for img in data["images"]:
    img_id.setdefault(img["file_name"], img["id"])

# Map each category id to its category name.
category_id = {}
for cat in data["categories"]:
    category_id.setdefault(cat["id"], cat["name"])

# Map each image id to its list of labels: [xmin, ymin, w, h, category].
labels_dict = {}
for ann in data["annotations"]:
    label = list(ann["bbox"]) + [category_id[ann["category_id"]]]
    labels_dict.setdefault(ann["image_id"], []).append(label)
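Many detection frameworks expect corner coordinates rather than the [xmin, ymin, w, h, category] layout described above. The conversion is simple arithmetic; the sketch below assumes a label dictionary in that layout, and the function name `to_corner_format` and the toy labels are illustrative only.

```python
def to_corner_format(labels_dict):
    """Convert [xmin, ymin, w, h, category] labels to [xmin, ymin, xmax, ymax, category]."""
    converted = {}
    for image_id, labels in labels_dict.items():
        converted[image_id] = [
            [xmin, ymin, xmin + w, ymin + h, category]
            for xmin, ymin, w, h, category in labels
        ]
    return converted


# Toy label dictionary (hand-written values, not real annotations).
toy_labels = {0: [[10.0, 20.0, 30.0, 40.0, "Drink can"]]}
corners = to_corner_format(toy_labels)
print(corners)
```

The input dictionary is left unchanged, so both formats remain available.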

Citation

If you use this dataset and API in a publication, please cite us using:

@article{taco2020,
    title={TACO: Trash Annotations in Context for Litter Detection},
    author={Pedro F Proença and Pedro Simões},
    journal={arXiv preprint arXiv:2003.06975},
    year={2020}
}

To get the dataset, follow the author's official account and reply "taco" in its message backend.