[Study Notes] Training ICNet (TensorFlow Version) on Your Own Dataset

Latest recommended article published on 2019-11-02 09:22:49

Masec



Categories: TensorFlow Learning Log, Study Notes. Tags: ICNet, Tensorflow

Article link: https://blog.csdn.net/yourgreatfather/article/details/89154733


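This post walks through training the ICNet model on a custom dataset under TensorFlow: preparing the dataset, modifying the code, running training, and validating the results. Because the pretrained weights could not be downloaded, the author describes the steps for training without them. For validation, evaluate.py does not work with a custom dataset as shipped, so the author modified the code accordingly.

Summary generated automatically by CSDN.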

ICNet version used: hellochick (GitHub), 286 stars

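I wanted to train ICNet on my own dataset under TensorFlow, and found that there seems to be very little "official" TensorFlow code for semantic segmentation. Most of it has been reimplemented by experts from the original paper together with the original authors' Caffe code... The code used in this post is no exception: it was written by a master's student at National Tsing Hua University in Taiwan... respect.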

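This post records how to train on your own dataset without the pretrained weights (the weight files all have to be downloaded from Google Drive... emmm, the firewall is awfully high).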

P.S. Other blogs mention that this author's code has changed a lot over time. I am using the newer version of the code (the one without tool.py).

1. Preparing the dataset

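Take my own dataset as an example, and assume the annotation work is already done (I annotated with labelme). My dataset folder is LMC_2433_DATASET, a self-made dataset of road scenes in substation environments, containing 2433 images in total together with their labels.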

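Randomly split the images into three disjoint parts: train, val (optional), and test. I recommend giving train the larger share; my split is 1569, 0, 864. Then put the corresponding label files (8-bit grayscale images) into the matching annot folders, and prepare one txt file per part. Each txt file is written as: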

image_path label_path
image_path label_path
image_path label_path
...

Each line holds one image path and its label path, both absolute, separated by a single space. With that, the dataset is ready.
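The split and list files described above can be produced with a short script. The following is only a minimal sketch using the Python standard library, not part of the ICNet repository: the function names, the `.jpg`/`.png` extensions, and the rule that an image and its label share the same file stem are all assumptions made for illustration. Paths containing spaces would break the space-separated format, so they should be avoided.

```python
import os
import random


def collect_pairs(image_dir, label_dir, image_ext=".jpg", label_ext=".png"):
    """Pair each image with the label that shares its file stem (assumed layout)."""
    pairs = []
    for name in sorted(os.listdir(image_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() != image_ext:
            continue
        label_path = os.path.join(label_dir, stem + label_ext)
        # Skip images that have no matching label file.
        if os.path.isfile(label_path):
            image_path = os.path.join(image_dir, name)
            pairs.append((os.path.abspath(image_path), os.path.abspath(label_path)))
    return pairs


def split_pairs(pairs, n_train, n_val=0, seed=0):
    """Shuffle once, then cut into disjoint train / val / test parts."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def write_list_file(pairs, list_path):
    """Write one 'image_path label_path' pair per line, separated by a space."""
    with open(list_path, "w") as f:
        for image_path, label_path in pairs:
            f.write("{} {}\n".format(image_path, label_path))
```

For the split used in this post, one would call `split_pairs(pairs, n_train=1569, n_val=0)`, which leaves the remaining 864 pairs for test, and then call `write_list_file` once per part.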

2. Modifying the source code & training

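Since I could not download the pretrained weight files, I had to go through the code file by file so that training works without pretrained weights.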

For train.py:

"""
This code is based on DrSleep's framework: https://github.com/DrSleep/tensorflow-deeplab-resnet 
"""
import argparse
import os
import sys
import time

import tensorflow as tf
import numpy as np

from model import ICNet_BN
from utils.config import Config
from utils.visualize import decode_labels
from utils.image_reader import ImageReader, prepare_label

def get_arguments():
    parser = argparse.ArgumentParser(description="Reproduced ICNet")
    
    parser.add_argument("--random-mirror", action="store_true",
                        help="Whether to randomly mirror the inputs during the training.")
    parser.add_argument("--random-scale", action="store_true",
                        help="Whether to randomly scale the inputs during the training.")
    parser.add_argument("--update-mean-var", action="store_true",
                        help="whether to get update_op from tf.GraphKeys")
    parser.add_argument("--train-beta-gamma", action="store_true",
                        help="whether to train beta & gamma in bn layer")
    parser.add_argument("--dataset", required=True,
                        help="Which dataset to train with",
                        choices=['cityscapes', 'ade20k', 'others'])
    parser.add_argument("--filter-scale", type=int, default=2,
                        help="1 for using pruned model, while 2 for using non-pruned model.",
                        choices=[1, 2])
    return parser.parse_args()

def get_mask(gt, num_classes, ignore_label):
    # Keep only pixels whose label is a valid class id and is not the ignore label.
    less_equal_class = tf.less_equal(gt, num_classes-1)
    not_equal_ignore = tf.not_equal(gt, ignore_label)
    mask = tf.logical_and(less_equal_class, not_equal_ignore)
    # Flat indices of the pixels that pass the mask.
    indices = tf.squeeze(tf.where(mask), 1)

    return indices

def create_loss(output, label, num_classes, ignore_label):
    # Flatten the logits to [num_pixels, num_classes].
    raw_pred = tf.reshape(output, [-1, num_classes])
    # Resize the label to the spatial size of this output, then flatten it.
    label = prepare_label(label, tf.stack(output.get_shape()[1:3]), num_classes=num_classes, one_hot=False)
    label = tf.reshape(label, [-1,])

    # Drop ignored / out-of-range pixels from both labels and predictions.
    indices = get_mask(label, num_classes, ignore_label)
    gt = tf.cast(tf.gather(label, indices), tf.int32)
    pred = tf.gather(raw_pred, indices)

    # Mean per-pixel cross entropy over the remaining pixels.
    loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=pred, labels=gt)
    reduced_loss = tf.reduce_mean(loss)

    return reduced_loss

def create_losses(net, label, cfg):
    # Get output from different branches
    sub4_out = net.layers['sub4_out']
    sub24_out = net.layers['sub24_out']
    sub124_out = net.layers[&#