FCN Paper and Source Code Walkthrough (Part 1)

FCN paper: Fully Convolutional Networks for Semantic Segmentation
Reference code: GitHub.

Abstract:
A pioneering work: state-of-the-art segmentation
Our key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning. We define and detail the space of fully convolutional networks, explain their application to spatially dense prediction tasks, and draw connections to prior models. We adapt contemporary classification networks (AlexNet [20], the VGG net [31], and GoogLeNet [32]) into fully convolutional networks and transfer their learned representations by fine-tuning [3] to the segmentation task. We then define a skip architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations.

I. Data preprocessing

1. dataset.py

Contents: the LabelProcessor and Dataset classes
1.1 LabelProcessor
Purpose: encodes the image labels. Background knowledge: hash functions.

    def __init__(self, file_path):

        self.colormap = self.read_color_map(file_path)

        self.cm2lbl = self.encode_label_pix(self.colormap)

read_color_map: reads the RGB value of each label from the class file (contents shown below) and returns them as [[128, 128, 128], [...], ...]
name,r,g,b
Sky,128, 128, 128
Building,128, 0, 0
Pole,192, 192, 128
Road,128, 64, 128
Sidewalk,0,0,192
Tree,128,128,0
SignSymbol,192,128,128
Fence,64,64,128
Car,64,0,128
Pedestrian,64,64,0
Bicyclist,0,128,192
unlabelled,0,0,0
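
A minimal sketch of what read_color_map might look like, assuming the class file is a CSV with the header name,r,g,b as listed above. The original presumably uses pandas (it is imported in dataset.py); this sketch swaps in the standard csv module so it runs without extra dependencies, and it reads from a hypothetical in-memory string rather than a real file_path.

```python
import csv
import io

# Hypothetical in-memory copy of the first rows of the class file.
SAMPLE_CSV = """name,r,g,b
Sky,128, 128, 128
Building,128, 0, 0
Pole,192, 192, 128
"""

def read_color_map(file_obj):
    """Return [[r, g, b], ...] in file order; the row position is the class index."""
    # skipinitialspace drops the blank that follows some of the commas.
    reader = csv.DictReader(file_obj, skipinitialspace=True)
    return [[int(row["r"]), int(row["g"]), int(row["b"])] for row in reader]

colormap = read_color_map(io.StringIO(SAMPLE_CSV))
print(colormap)
```

The order of the rows matters: the position of a color in the returned list is what later becomes its class index.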

encode_label_pix: encodes the labels and returns a hash table. Purpose: fast lookup of the label that belongs to a given color.

cm2lbl[(cm[0] * 256 + cm[1]) * 256 + cm[2]] = i

Worked example: (128 * 256 + 128) * 256 + 128 = 8421504 ==> cm2lbl[8421504] = 0 ==> Sky class
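
The encoding step can be sketched as follows. Each RGB triple is packed into a single base-256 integer, and that integer indexes a lookup array with 256^3 entries. The array dtype and the three-color colormap are assumptions made for illustration; the post does not show how the table is allocated.

```python
import numpy as np

def encode_label_pix(colormap):
    """Build a lookup array that maps a packed RGB value to its class index."""
    # uint8 keeps the 256**3-entry table at about 16 MB (an assumption;
    # the dtype used in the original is not shown).
    cm2lbl = np.zeros(256 ** 3, dtype=np.uint8)
    for i, cm in enumerate(colormap):
        cm2lbl[(cm[0] * 256 + cm[1]) * 256 + cm[2]] = i
    return cm2lbl

colormap = [[128, 128, 128], [128, 0, 0], [192, 192, 128]]  # Sky, Building, Pole
cm2lbl = encode_label_pix(colormap)

sky_key = (128 * 256 + 128) * 256 + 128
print(sky_key, cm2lbl[sky_key])
```

One consequence of a zero-initialized table is that any color missing from the colormap also maps to index 0, the same index as the first class.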
encode_label_img: table lookup

    def encode_label_img(self, img):

        data = np.array(img, dtype='int32')
        idx = (data[:, :, 0] * 256 + data[:, :, 1]) * 256 + data[:, :, 2]
        return np.array(self.cm2lbl[idx], dtype='int64')
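
To see the lookup applied to a whole image, here is a small sketch on a hypothetical 1x3 RGB array, written as a free function instead of a method. The int32 cast widens the uint8 pixel values so that the multiplication by 256 cannot overflow. The two-class table is made up for the demonstration.

```python
import numpy as np

# Two-class table for illustration: Sky -> 0, Building -> 1.
cm2lbl = np.zeros(256 ** 3, dtype=np.uint8)
cm2lbl[(128 * 256 + 128) * 256 + 128] = 0
cm2lbl[(128 * 256 + 0) * 256 + 0] = 1

def encode_label_img(img, cm2lbl):
    """Map an H x W x 3 RGB array to an H x W array of class indices."""
    data = np.array(img, dtype='int32')  # widen before multiplying by 256
    idx = (data[:, :, 0] * 256 + data[:, :, 1]) * 256 + data[:, :, 2]
    return np.array(cm2lbl[idx], dtype='int64')

# One row of three pixels: Sky, Building, Sky.
img = np.array([[[128, 128, 128], [128, 0, 0], [128, 128, 128]]], dtype=np.uint8)
label = encode_label_img(img, cm2lbl)
print(label.shape, label.tolist())
```

Indexing cm2lbl with the integer array idx looks up every pixel at once, so no Python loop over pixels is needed.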

Imports:

import pandas as pd
import os
import torch as t
import numpy as np
import torchvision.transforms.functional as ff
from torch.utils.data import Dataset
from PIL import Image
import torchvision.transforms as transforms
import cfg

1.2 Dataset class

    def __init__(self, file_path=[], crop_size=None):
        # 1. Read in the image and label folder paths
        if len(file_path) != 2:
            raise ValueError("Both the image folder path and the label folder path are required")
        self.img_path = file_path[0]
        self.label_path = file_path[1]
        # 2. Collect the file names of the images and labels
        self.imgs = self.read_file(self.img_path)
        self.labels = self.read_file(self.label_path)
        # 3. Store the settings for the data-processing functions
        self.crop_size = crop_size

    def __getitem__(self, index):
        img = self.imgs[index]
        label = self.labels[index]
        # Convert the format, just to be safe
        img = Image.open(img)
        label = Image.open(label).convert('RGB')

        img, label = self.center_crop(img, label, self.crop_size)

        img, label = self.img_transform(img, label)
        # print('Image and label sizes after processing:', img.shape, label.shape)
        sample = {'img': img, 'label': label}

        return sample
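
__getitem__ relies on a center_crop helper that is not shown in this post. Here is a minimal sketch of the idea on NumPy arrays, assuming crop_size is a (height, width) pair no larger than the image. The original works on PIL images and may well use torchvision's center_crop (a guess based on the `ff` import), so treat this as a stand-in. The point it illustrates is that image and label must be cropped with the same window so that pixels and labels stay aligned.

```python
import numpy as np

def center_crop(img, label, crop_size):
    """Crop the same centered window from an image and its label."""
    crop_h, crop_w = crop_size
    h, w = img.shape[:2]
    top = (h - crop_h) // 2
    left = (w - crop_w) // 2
    window = (slice(top, top + crop_h), slice(left, left + crop_w))
    return img[window], label[window]

# Hypothetical 6x8 image with 3 channels and a matching 6x8 label map.
img = np.arange(6 * 8 * 3).reshape(6, 8, 3)
label = np.arange(6 * 8).reshape(6, 8)
img_c, label_c = center_crop(img, label, (4, 4))
print(img_c.shape, label_c.shape)
```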

Function read_file(self, path): returns the absolute or relative paths of the files under the given path.
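
A minimal sketch of read_file, written as a free function and assuming it lists a single directory. Sorting is an assumption here, but something like it is needed so that self.imgs[index] and self.labels[index] refer to the same sample, because os.listdir returns entries in arbitrary order. The temporary directory and file names are made up for the demonstration.

```python
import os
import tempfile

def read_file(path):
    """Return the sorted paths of all entries directly under `path`."""
    return sorted(os.path.join(path, name) for name in os.listdir(path))

with tempfile.TemporaryDirectory() as root:
    # Create two empty placeholder files, deliberately out of order.
    for name in ("0002.png", "0001.png"):
        open(os.path.join(root, name), "w").close()
    names = [os.path.basename(p) for p in read_file(root)]

print(names)
```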
Function img_transform(self, img, label):
"""Apply some numerical processing to the image and label""": normalizes the data and converts it to PyTorch tensors.
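
A sketch of the kind of processing img_transform describes, written with NumPy only and covering just the image half. The mean and standard deviation are the commonly used ImageNet statistics, which is an assumption since the post does not state the values. In the original, the label would presumably go through encode_label_img, and both results would then be converted to PyTorch tensors, for example with torch.from_numpy.

```python
import numpy as np

# Assumed normalization constants (ImageNet statistics), not taken from the post.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def img_transform(img):
    """Scale an H x W x 3 uint8 image to [0, 1], normalize it, and return C x H x W."""
    x = np.asarray(img, dtype=np.float32) / 255.0
    x = (x - MEAN) / STD                 # broadcast over the channel axis
    return np.ascontiguousarray(x.transpose(2, 0, 1))

img = np.full((4, 5, 3), 255, dtype=np.uint8)  # hypothetical all-white image
out = img_transform(img)
print(out.shape, out.dtype)
```

The transpose moves channels to the front because PyTorch convolution layers expect C x H x W input.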

Debugging:
Given the paths TRAIN_ROOT and TRAIN_LABEL (and likewise for val and test) and a crop_size, instantiate the class.

Model construction is covered in the next post.

Please point out any errors!
