Mu Li 46: Semantic Segmentation and Datasets [Dive into Deep Learning v2]

Latest recommended article published on 2024-07-22 20:09:49

Cai_CS_stu

Category: Mu Li - Dive into Deep Learning. Tags: deep learning, artificial intelligence

Article link: https://blog.csdn.net/ADDDDDDS/article/details/138423653

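In semantic segmentation we do not assign one label to the whole image. Instead, every pixel gets its own label. Suppose the input is a three-channel RGB image, so each pixel's color can be written as a triple (R, G, B). To label the pixels we need a mapping from colors to classes. For example, the two lists below define 21 colors and the 21 class names they stand for, aligned by position: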

#@save
VOC_COLORMAP = [[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0],
                [0, 0, 128], [128, 0, 128], [0, 128, 128], [128, 128, 128],
                [64, 0, 0], [192, 0, 0], [64, 128, 0], [192, 128, 0],
                [64, 0, 128], [192, 0, 128], [64, 128, 128], [192, 128, 128],
                [0, 64, 0], [128, 64, 0], [0, 192, 0], [128, 192, 0],
                [0, 64, 128]]

#@save
VOC_CLASSES = ['background', 'aeroplane', 'bicycle', 'bird', 'boat',
               'bottle', 'bus', 'car', 'cat', 'chair', 'cow',
               'diningtable', 'dog', 'horse', 'motorbike', 'person',
               'potted plant', 'sheep', 'sofa', 'train', 'tv/monitor']
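Since the two lists are aligned by position, entry i of VOC_COLORMAP is the color used for class VOC_CLASSES[i]. A minimal sketch of that pairing, using only the first four entries copied from the lists above. The dictionary name `color_to_class` is a hypothetical helper for illustration and is not part of d2l:

```python
# First four entries only, copied from the lists above, to keep the sketch short.
colormap = [[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0]]
classes = ['background', 'aeroplane', 'bicycle', 'bird']

# Hypothetical helper: RGB tuple -> (class index, class name).
color_to_class = {tuple(rgb): (i, name)
                  for i, (rgb, name) in enumerate(zip(colormap, classes))}

print(color_to_class[(128, 0, 0)])  # (1, 'aeroplane')
```

A dictionary like this is fine for looking up one color at a time, but it does not vectorize over a whole image, which is presumably why the post builds an integer-indexed table instead.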

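How do we find the class index of each pixel in a label image? We define two functions. voc_colormap2label builds a lookup table from the RGB color values above to class indices, and voc_label_indices uses that table to convert the RGB values of a Pascal VOC2012 label image into class indices. The two functions are defined as follows: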

import torch

#@save
def voc_colormap2label():
    """Build the mapping from RGB values to VOC class indices."""
    colormap2label = torch.zeros(256 ** 3, dtype=torch.long)
    for i, colormap in enumerate(VOC_COLORMAP):
        colormap2label[
            (colormap[0] * 256 + colormap[1]) * 256 + colormap[2]] = i
    return colormap2label

#@save
def voc_label_indices(colormap, colormap2label):
    """Map the RGB values in a VOC label image to their class indices."""
    colormap = colormap.permute(1, 2, 0).numpy().astype('int32')
    idx = ((colormap[:, :, 0] * 256 + colormap[:, :, 1]) * 256
           + colormap[:, :, 2])
    return colormap2label[idx]

The two functions work as follows:

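In voc_colormap2label(), the three channel values of a pixel are packed into a single unique index by the formula below, where R, G and B are the corresponding channel values:

$$\text{idx} = (R \times 256 + G) \times 256 + B$$

Because each channel lies in [0, 255], distinct colors always give distinct indices in [0, 256^3). The entry at that index is then set to the class number, which yields a lookup table that can be queried by color.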
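The index computation can be checked by hand with a dependency-free sketch. It assumes 8-bit channels and uses only the first four colors from VOC_COLORMAP. The function name `rgb_to_index` and the `bytearray` table are illustrative stand-ins, not the post's code, which stores the table in a torch tensor:

```python
def rgb_to_index(r, g, b):
    """Pack an 8-bit RGB triple into one integer in [0, 256**3)."""
    return (r * 256 + g) * 256 + b

# First four colors from VOC_COLORMAP; the list position is the class index.
colors = [[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0]]

# Zero-initialized table with one slot per possible RGB color.
table = bytearray(256 ** 3)
for class_idx, (r, g, b) in enumerate(colors):
    table[rgb_to_index(r, g, b)] = class_idx

print(rgb_to_index(128, 0, 0))         # 8388608
print(table[rgb_to_index(0, 128, 0)])  # 2
print(table[rgb_to_index(1, 2, 3)])    # 0
```

The last line shows a side effect of initializing the table with zeros: any color that is not in the list falls back to index 0, the same index as background.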

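In voc_label_indices(), the colormap argument is a three-channel label image in channel-first layout, for example of shape (3, 400, 500). permute(1, 2, 0) first moves the channel dimension to the end, giving shape (400, 500, 3), and then idx is computed from the three channels. Every pixel has its own label, so idx has shape 400×500, and the returned colormap2label[idx] also has shape 400×500. Each of its entries is the class number of that pixel, that is, its position in VOC_CLASSES.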