Introduction to the PASCAL VOC 2012 Dataset

一只tobey

Article link: https://blog.csdn.net/zz2230633069/article/details/84769339

The dataset can be downloaded from Baidu Cloud: https://pan.baidu.com/s/1FTjY-ISsDMu0vIypAQyDpg (extraction code: fyxt)

The cloud drive holds three items: VOC2012, VOC2012_test, and SBD.tgz (the SBD dataset; for details on SBD see https://blog.csdn.net/zz2230633069/article/details/89335205).

Further background is available at http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html (official) and https://blog.csdn.net/u013832707/article/details/80060327

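After extracting the VOC2012 folder, the folders that matter for semantic segmentation are:

JPEGImages (the original images used for segmentation), SegmentationClass (the label images used for segmentation), SegmentationClass_aug (the label images extended with the SBD dataset), and ImageSets/Segmentation (TXT files listing the names of the images in each split).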

JPEGImages: all 17,125 original images, with shape = h x w x 3, mode = RGB, format = JPEG. Image sizes vary, and pixel values range from 0 to 255.

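SegmentationClass: the 2,913 semantic segmentation label images before processing, with mode = P (palette), format = PNG, and varying sizes. Read as RGB, each image has shape h x w x 3 and its pixel values are the class colors given below. Other values also occur: some boundary pixels are (224, 224, 192), which is the palette color of the void label 255.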

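SegmentationClass_aug: all of the semantic label images after processing, 12,031 in total. They are grayscale images with shape = h x w and mode = L, and each pixel value is the class label (0 to 20, 21 classes in all, with background as 0). The processing is simple: start from an all-zero image, and wherever a pixel has the RGB value of an object class, set that position to the class's label value.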
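The conversion described above can be sketched in a few lines. This is only an illustration on nested lists of (R, G, B) tuples, not the script actually used to build SegmentationClass_aug; the name `rgb_to_label` is hypothetical, and `COLOR_TO_LABEL` lists just three entries of the standard VOC palette that would need to be extended to all 20 classes.

```python
# Partial class-color table taken from the standard VOC palette.
# Assumption: a real table would list all 20 object classes.
COLOR_TO_LABEL = {
    (128, 0, 0): 1,        # aeroplane
    (0, 128, 0): 2,        # bicycle
    (192, 128, 128): 15,   # person
}


def rgb_to_label(rgb_rows, color_to_label=COLOR_TO_LABEL):
    """Convert rows of (R, G, B) pixels into rows of class indices.

    Starts from an all-zero label image and writes a class index only
    where the pixel color matches a known class color.
    """
    height = len(rgb_rows)
    width = len(rgb_rows[0]) if height else 0
    label = [[0] * width for _ in range(height)]
    for y, row in enumerate(rgb_rows):
        for x, pixel in enumerate(row):
            cls = color_to_label.get(tuple(pixel))
            if cls is not None:
                label[y][x] = cls
    return label


if __name__ == "__main__":
    tiny = [
        [(0, 0, 0), (128, 0, 0)],
        [(224, 224, 192), (192, 128, 128)],
    ]
    print(rgb_to_label(tiny))
```

Under this procedure any color that is not in the table, including the (224, 224, 192) boundary color, stays 0. For full-size images a vectorized version (for example with NumPy) would be much faster than this pixel loop.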

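ImageSets/Segmentation/train.txt: 1,464 lines, i.e. the names of 1,464 training images

ImageSets/Segmentation/val.txt: 1,449 lines, i.e. the names of 1,449 validation images

ImageSets/Segmentation/trainval.txt: 2,913 lines, i.e. 2,913 training and validation images, the union of the two files above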

ImageSets/Segmentation/train_aug.txt = voc_train + sbd_train - duplicate images

8,829 lines, i.e. 8,829 training images

ImageSets/Segmentation/train_aug_val.txt = voc_val - sbd_train (that is, voc_val with the images already in train_aug removed)

904 lines, i.e. 904 validation images

ImageSets/Segmentation/val_aug.txt = voc_val + sbd_val - duplicate images - train_aug

3,202 lines, i.e. 3,202 validation images
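The three augmented lists are plain set arithmetic over image names. A minimal sketch, assuming each input is an iterable of names; the function name `build_aug_splits` and the toy names in the usage example are made up for illustration and are not real VOC or SBD entries.

```python
def build_aug_splits(voc_train, voc_val, sbd_train, sbd_val):
    """Return (train_aug, train_aug_val, val_aug) as sets of image names."""
    voc_train, voc_val = set(voc_train), set(voc_val)
    sbd_train, sbd_val = set(sbd_train), set(sbd_val)

    # voc_train + sbd_train, duplicates removed
    train_aug = voc_train | sbd_train
    # voc_val minus everything that already appears in sbd_train
    train_aug_val = voc_val - sbd_train
    # voc_val + sbd_val, duplicates removed, minus anything in train_aug
    val_aug = (voc_val | sbd_val) - train_aug
    return train_aug, train_aug_val, val_aug


if __name__ == "__main__":
    # Toy names only, chosen so that the lists overlap.
    train_aug, train_aug_val, val_aug = build_aug_splits(
        voc_train=["a", "b"],
        voc_val=["c", "d"],
        sbd_train=["b", "c", "e"],
        sbd_val=["d", "f"],
    )
    print(sorted(train_aug), sorted(train_aug_val), sorted(val_aug))
```

With these definitions train_aug and val_aug never share an image, which matches the counts above: 8,829 + 3,202 = 12,031, the number of label images in SegmentationClass_aug.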

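In short: to use the official dataset, take train.txt and val.txt; to use the augmented dataset, take train_aug.txt and val_aug.txt. The original images all come directly from JPEGImages, and the label images all come from SegmentationClass_aug.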
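Putting this together, a split file can be turned into (image, label) path pairs. The sketch below is one possible way to do it; `read_split` and `pair_paths` are hypothetical helper names, and the folder names and the .jpg/.png extensions are assumptions based on the layout described above (the official archive names the image folder JPEGImages).

```python
from pathlib import Path


def read_split(list_file):
    """Return the image names listed one per line in a split file."""
    with open(list_file, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def pair_paths(root, names,
               image_dir="JPEGImages",
               label_dir="SegmentationClass_aug"):
    """Pair each image name with its JPEG image path and PNG label path."""
    root = Path(root)
    return [
        (root / image_dir / f"{name}.jpg", root / label_dir / f"{name}.png")
        for name in names
    ]


if __name__ == "__main__":
    # Assumed location of the extracted dataset; adjust to your own path.
    root = Path("VOC2012")
    split_file = root / "ImageSets" / "Segmentation" / "train_aug.txt"
    if split_file.exists():
        pairs = pair_paths(root, read_split(split_file))
        print(len(pairs), pairs[0])
```

Keeping the folder names as parameters makes it easy to switch to SegmentationClass when working with the official train.txt and val.txt lists.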

The 20 classes are:

Person: person
Animal: bird, cat, cow, dog, horse, sheep
Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor

The distribution is as follows: [figure]

The correspondence between classes and colors is as follows. A label image can contain 22 distinct values (0-20 and 255). In the standard VOC palette, 0 (background) is black, RGB = (0, 0, 0), and 255 (void) is (224, 224, 192), so a label image shows at most 22 colors: the 20 classes, black, and the void color.

Below are training examples for the segmentation taster, each consisting of:

the training image
the object segmentation: pixel indices correspond to the first, second, third object etc.
the class segmentation: pixel indices correspond to classes in alphabetical order (0=background, 1=aeroplane, 2=bicycle, 3=bird, 4=boat, 5=bottle, 6=bus, 7=car, 8=cat, 9=chair, 10=cow, 11=diningtable, 12=dog, 13=horse, 14=motorbike, 15=person, 16=potted plant, 17=sheep, 18=sofa, 19=train, 20=tv/monitor, 255='void' or unlabelled)

For both types of segmentation image, index 0 corresponds to background and index 255 corresponds to 'void' or unlabelled.
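The color of each index does not need to be typed in by hand. The sketch below regenerates the palette with the bit-interleaving scheme that, to my understanding, the VOC devkit uses for its label colormap; the names `voc_colormap` and `VOC_CLASSES` are my own, and the class order is taken from the index list above.

```python
VOC_CLASSES = [
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
    "car", "cat", "chair", "cow", "diningtable", "dog", "horse",
    "motorbike", "person", "potted plant", "sheep", "sofa", "train",
    "tv/monitor",
]


def voc_colormap(n=256):
    """Return a list of n (R, G, B) tuples, one per label index."""
    palette = []
    for index in range(n):
        r = g = b = 0
        c = index
        # Spread the bits of the index over the three channels,
        # filling each channel from its most significant bit down.
        for shift in range(7, -1, -1):
            r |= (c & 1) << shift
            g |= ((c >> 1) & 1) << shift
            b |= ((c >> 2) & 1) << shift
            c >>= 3
        palette.append((r, g, b))
    return palette


if __name__ == "__main__":
    palette = voc_colormap()
    for index, name in enumerate(VOC_CLASSES):
        print(index, name, palette[index])
    print(255, "void", palette[255])
```

Inverting this list (color to index) gives the lookup table needed to turn a color label image into class indices.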
