deeplab_v3: Building and Training on Your Own Dataset (Pitfalls I Ran Into)

酸辣土豆丝不要辣

Category: Deep Learning - Semantic Segmentation. Tags: deeplab v3, training on your own dataset

Article link: https://blog.csdn.net/xjtdw/article/details/92848032

Building and training your own dataset with deeplab_v3

1. Source code links
2. Environment test
   - 1) Test file 1
   - 2) Test file 2
3. Building the dataset
4. Training the model
   - PS: two things to note
5. Prediction
6. Converting the deeplab model to .pb format (frozen graph)
7. Inference with the .pb model in deeplabv3
8. Prediction results
9. Summary

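1. Source code links

https://github.com/tensorflow/models.git
https://github.com/tensorflow/models/tree/master/research/deeplab
Other blog posts usually give these two links. Look closely and you will see that both point to the same git project: the second is a subdirectory of the first. The deeplab_v3 trained here lives in /research/deeplab inside /tensorflow/models.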

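2. Environment test

My default Python on Ubuntu is python==3.5, and the environment was set up around Python 3.5. I don't know whether Python 2 causes errors.

I had already run many other networks on this Ubuntu machine, so I hit no missing third-party libraries. The TensorFlow version must not be too old, though: as I recall, tf==1.8 raised errors for me and upgrading fixed them. I ran into a few other problems while testing; details below.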

1) Test file 1:

From the …/research directory, run: python deeplab/model_test.py
PS: this may fail with:
from nets.mobilenet import mobilenet_v2
ImportError: No module named 'nets'
Fix:
Add slim to the Python path first by running:
export PYTHONPATH=$PYTHONPATH:/path-to/models/research/slim
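
The same fix can be applied from inside a Python session, which is handy in notebooks where the exported variable is not visible. This is only a minimal sketch: the helper names are made up for illustration, `/path-to/models/research` is the placeholder from this post, and adding the research directory itself (not just slim) is an assumption based on the official installation notes.

```python
import os
import sys


def deeplab_import_paths(research_dir):
    """Return the directories that should be importable for deeplab."""
    research_dir = os.path.abspath(research_dir)
    return [research_dir, os.path.join(research_dir, "slim")]


def ensure_on_sys_path(paths):
    """Prepend any missing directory to sys.path; return the ones added."""
    added = []
    for path in paths:
        if path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added


if __name__ == "__main__":
    # Placeholder path; replace it with your own checkout.
    ensure_on_sys_path(deeplab_import_paths("/path-to/models/research"))
    # After this, "from nets.mobilenet import mobilenet_v2" should resolve,
    # provided the checkout really contains research/slim/nets.
```

Unlike the export line, this only affects the current interpreter, so it has to run before the first deeplab import.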

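2) Test file 2:

Still under …/research, run sh local_test.sh (the script sits in research/deeplab in the repository, so run it from there if it is not found).
Open local_test.sh to see what it does: it downloads one model and one dataset, and the download is slow, so copying the links into a browser can be faster. Check local_test.sh for where the downloaded files need to go.
local_test.sh covers dataset creation, model training, mIoU evaluation, and prediction. If the script runs to completion, the environment is fine.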

3. Building the dataset

At this link:
https://github.com/tensorflow/models/tree/master/research/deeplab
the README.md lists the following:
Installation.
Running DeepLab on PASCAL VOC 2012 semantic segmentation dataset.
Running DeepLab on Cityscapes semantic segmentation dataset.
Running DeepLab on ADE20K semantic segmentation dataset.

So a custom dataset is usually laid out in either the PASCAL VOC or the Cityscapes directory format.

I followed the PASCAL VOC layout, but there is no need to reproduce the PASCAL VOC directory tree exactly. The steps are as follows:

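1) The dataset needs three folders:

images labels train_val_list
where:
images holds the original three-channel RGB images, usually in .jpg format;
labels holds the annotations as 8-bit single-channel grayscale images. Many datasets ship their labels as 8-bit RGB .png files, so you have to convert them to single-channel yourself;
train_val_list holds three text files, train.txt, val.txt, and train_val.txt, each listing image names (note: without the file extension). The train/val ratio is up to you. train_val.txt lists every image name.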
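
The list files can be generated rather than typed by hand. Below is a rough sketch, not the script I actually used: `split_names` and `write_lists` are hypothetical helpers, the 20% validation share and the seed are arbitrary, and the files are named after the split keys used later in the dataset descriptor (train, val, trainval), whereas the text above calls the last one train_val.txt.

```python
import os
import random


def split_names(file_names, val_fraction=0.2, seed=0):
    """Split image file names into train/val/trainval lists of base names."""
    # Strip the extension, because the list files hold names without it.
    names = sorted(os.path.splitext(name)[0] for name in file_names)
    shuffled = list(names)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return {
        "train": sorted(shuffled[n_val:]),
        "val": sorted(shuffled[:n_val]),
        "trainval": names,
    }


def write_lists(splits, list_dir):
    """Write one <split>.txt per entry, one base name per line."""
    os.makedirs(list_dir, exist_ok=True)
    for split, names in splits.items():
        with open(os.path.join(list_dir, split + ".txt"), "w") as f:
            f.write("\n".join(names) + "\n")


# Typical use (paths are placeholders):
#   splits = split_names(os.listdir("path/to/images"))
#   write_lists(splits, "path/to/train_val_list")
```

Fixing the seed keeps the split reproducible, so the tfrecords can be rebuilt later without shuffling images between train and val.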

2) Convert to tfrecords format

From the …research/ directory, run

python  deeplab/datasets/build_voc2012_data.py \
--image_folder="path/to/your_images_folder" \
--semantic_segmentation_folder="path/to/your_labels_folder" \
--list_folder="path/to/your_train_val_list_folder" \
--image_format="jpg" \
--label_format="png" \
--output_dir="path/to/tfrecords_output_folder"

Set --image_format="jpg" and --label_format="png" to match your own image formats.
When the conversion finishes, it produces a list of tfrecord files (the screenshot from the original post is not available).
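
Before running the conversion, it can save time to check that every name in a list file has both an image and a label. The sketch below is an illustration under assumptions: `find_unpaired` and `read_names` are hypothetical helpers, and the extensions default to the jpg/png pair used in the command above.

```python
import os


def find_unpaired(names, image_files, label_files,
                  image_ext=".jpg", label_ext=".png"):
    """Return (names lacking an image, names lacking a label)."""
    images, labels = set(image_files), set(label_files)
    no_image = [n for n in names if n + image_ext not in images]
    no_label = [n for n in names if n + label_ext not in labels]
    return no_image, no_label


def read_names(list_file):
    """Read one base name per line, skipping blank lines."""
    with open(list_file) as f:
        return [line.strip() for line in f if line.strip()]


# Typical use (paths are placeholders):
#   names = read_names("path/to/train_val_list/train.txt")
#   missing = find_unpaired(names,
#                           os.listdir("path/to/images"),
#                           os.listdir("path/to/labels"))
```

Two empty lists mean the split is complete; anything else names the files to fix before building the tfrecords.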

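3) Modify the scripts

a.
In …/research/deeplab/datasets, open data_generator.py and add your own dataset in the same format as the existing ones. Mine is the Mapillary dataset, with 15000 train images, 3000 val images, and 66 classes including background, so I wrote:

_MAPILLARY_VISTAS_INFORMATION = DatasetDescriptor(
    splits_to_sizes={
        'train': 15000,  # num of samples in images/training
        'val': 3000,  # num of samples in images/validation
        'trainval': 18000,
    },
    num_classes=67,  # I had to add 1 here; I still don't know why
    ignore_label=255,  # the ignored label, excluded from training; note that this is not the background class
)

Some posts say you must edit segmentation_dataset.py under datasets and train_utils.py under utils; so far that has not seemed necessary to me…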
b.
Edit deeplab/utils/train_utils.py; around line 160, change it to:

  # Variables that will not be restored.
  # exclude_list = ['global_step']
  exclude_list = ['global_step', 'logits']

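c. Edit train.py; around line 135, change it as follows. This makes the last layer train from scratch when the checkpoint is loaded, because our classes differ from those of the pretrained deeplab model.

# Set to False if one does not want to re-use the trained classifier weights.
flags.DEFINE_boolean('initialize_last_layer', False,
                     'Initialize the last layer.')

flags.DEFINE_boolean('last_layers_contain_logits_only', True,
                     'Only consider logits as last layers or not.')

Which values to pick depends on whether you want to reuse all of the pretrained weights, retrain only the last layer, or retrain everything.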
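
To make the effect of these two flags concrete, here is a simplified pure-Python model of how the restore list is filtered. It is not the real TensorFlow code: I am assuming that excluded scopes are matched as name prefixes and that the last-layer scope is called logits, and the variable names in the usage comment are only examples.

```python
def variables_to_restore(variable_names, initialize_last_layer,
                         last_layers=("logits",)):
    """Return the variable names that would be loaded from the checkpoint."""
    # global_step is always skipped so that training restarts at step 0.
    exclude = ["global_step"]
    if not initialize_last_layer:
        # Skip the classifier so it is trained from scratch.
        exclude.extend(last_layers)
    return [name for name in variable_names
            if not any(name.startswith(scope) for scope in exclude)]


# Example variable names (illustrative only):
#   names = ["global_step",
#            "xception_65/entry_flow/conv1_1/weights",
#            "logits/semantic/weights"]
#   variables_to_restore(names, initialize_last_layer=False)
#   keeps only the xception_65 entry.
```

If this reading is right, setting initialize_last_layer to False already excludes the logits scope, which would make the manual edit in step b redundant rather than harmful.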

4. Training the model

With the dataset built, you can train the model.
From the …research/ directory, run:

python deeplab/train.py \
--logtostderr \
--training_number_of_steps=50000 \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=385,1025 \
--train_batch_size=4 \
--dataset="mapillary_vistas" \
--num_clones=2 \
--tf_initial_checkpoint='path/to/pretrained_model' \
--train_logdir='path/to/trained_model_output' \
--dataset_dir='path/to/dataset'

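Parameters:
--training_number_of_steps=50000  # number of training iterations
--train_split="train"  # which dataset split to train on
--model_variant="xception_65"  # backbone of the pretrained model
--train_crop_size=385,1025  # crop size; if your images are (height, width), use (height+1, width+1). Do not swap the two.
--train_batch_size=4  # 4 was the largest batch size I could use; anything larger ran out of memory
--dataset="mapillary_vistas"  # the name you gave your dataset
--num_clones=2  # number of GPUs; 1, 2, or 4 work, 3 raised an error for me (with a batch size of 4, probably because the batch size has to be divisible by num_clones)
--tf_initial_checkpoint="…/research/deeplab/datasets/pascal_voc_seg/init_models/deeplabv3_pascal_train_aug/model.ckpt"  # location of the pretrained model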
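
The crop size and clone count can be sanity-checked before starting a long run. This is a small sketch with made-up helper names. The plus-one rule is the one described above; the stride check reflects the commonly quoted rule of thumb that crop_size should equal output_stride * k + 1, and the divisibility check is my guess at why 3 clones failed with a batch of 4.

```python
def crop_size_flag(image_height, image_width):
    """Build the 'height,width' crop value: each image side plus one."""
    return "{},{}".format(image_height + 1, image_width + 1)


def is_stride_aligned(crop_size, output_stride=16):
    """Check the (crop_size - 1) % output_stride == 0 rule of thumb."""
    return (crop_size - 1) % output_stride == 0


def batch_fits_clones(train_batch_size, num_clones):
    """True when the batch splits evenly across the GPU clones."""
    return train_batch_size % num_clones == 0


if __name__ == "__main__":
    print(crop_size_flag(384, 1024))   # 385,1025
    print(is_stride_aligned(385))      # True
    print(batch_fits_clones(4, 3))     # False
```

For my 1024x384 images this reproduces the 385,1025 value used in the command, with height first.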

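PS: two things to note

1. Some posts pass the crop size as
--train_crop_size=512
--train_crop_size=512
on two separate flags, which raises this error:

crop_width=self.crop_size[1],  IndexError: list index out of range

Passing it as
--train_crop_size='513,513' fixes it.

2. When passing --train_crop_size='513,513', remember that the order is

--train_crop_size='height,width'

Do not swap them. Swapping raises no error, but it hurts the quality of the trained model.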
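
The IndexError is easier to understand with a toy model of the flag. The sketch below assumes the crop size is parsed as a single comma-separated list flag for which a repeated occurrence replaces the earlier one; that matches the traceback but is my interpretation, and both function names are hypothetical.

```python
def parse_list_flag(occurrences):
    """Mimic a comma-separated list flag: the last occurrence wins."""
    return [int(part) for part in occurrences[-1].split(",")]


def crop_height_width(crop_size):
    """Read the crop size the way the traceback does: [0] then [1]."""
    return crop_size[0], crop_size[1]


if __name__ == "__main__":
    print(crop_height_width(parse_list_flag(["513,513"])))  # (513, 513)
    # Two separate flags leave a single-element list behind.
    print(parse_list_flag(["512", "512"]))                   # [512]
```

With only one element in the list, reading index 1 fails, which is the `list index out of range` shown above.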

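5. Prediction

Once the model is trained, you can run prediction:

python deeplab/vis.py \
--logtostderr \
--vis_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--dataset="mapillary_vistas" \
--vis_crop_size="385,1025" \
--checkpoint_dir="path/to/trained_model" \
--vis_logdir="path/to/prediction_output" \
--dataset_dir="path/to/dataset" \
--max_number_of_iterations=1

Notes on the flags (kept out of the command, because a comment after a trailing backslash breaks the line continuation):
--vis_split="val"  # pay attention to which split you pick here
--decoder_output_stride=4  # changing this raises an error
--vis_crop_size="385,1025"  # height,width; do not swap them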

PS:
1) --vis_crop_size="385,1025": if the two values are swapped, the predicted images come out with distorted dimensions.

Likewise, when I ran python deeplab/eval.py with the values swapped, I got this error:

assertion failed: [`predictions` out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency_1:0) = ] [28 28 28...] [y (mean_iou/ToInt64_1:0) = ] [21]

2) You may hit this error:

	Invalid argument: padded_shape[1]=69 is not divisible by block_shape[1]=2

Fix:
In deeplab/input_preprocess.py, find

if is_training and label is not None:

and add an else branch to it:

else:
  # Scale the image so that it fits inside the crop, keeping the aspect ratio.
  rr = tf.minimum(tf.cast(crop_height, tf.float32) / tf.cast(image_height, tf.float32),
                  tf.cast(crop_width, tf.float32) / tf.cast(image_width, tf.float32))
  # resize_images expects an integer size, so cast the new size to int32.
  newh = tf.cast(tf.cast(image_height, tf.float32) * rr, tf.int32)
  neww = tf.cast(tf.cast(image_width, tf.float32) * rr, tf.int32)
  processed_image = tf.image.resize_images(
      processed_image, (newh, neww), method=tf.image.ResizeMethod.BILINEAR, align_corners=True)
  # Pad the bottom and right edges up to the crop size.
  processed_image = preprocess_utils.pad_to_bounding_box(
      processed_image, 0, 0, crop_height, crop_width, mean_pixel)

Reference link:
https://github.com/tensorflow/models/issues/3695
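
The arithmetic of that else branch can be checked without TensorFlow. Here is a plain-Python sketch of the same computation; `fit_within` is a hypothetical name, and the 768x2048 input in the usage comment is only an example, not a size from my dataset.

```python
def fit_within(image_height, image_width, crop_height, crop_width):
    """Scale an image to fit inside the crop, then report the padding needed.

    Returns ((new_height, new_width), (pad_bottom, pad_right)).
    """
    # Use the smaller ratio so that both sides fit inside the crop.
    scale = min(crop_height / image_height, crop_width / image_width)
    new_height = int(image_height * scale)
    new_width = int(image_width * scale)
    # The image sits at offset (0, 0), so padding goes on the bottom and right.
    return ((new_height, new_width),
            (crop_height - new_height, crop_width - new_width))


# Example: a 768x2048 image with the 385,1025 crop used in this post.
#   fit_within(768, 2048, 385, 1025)
#   gives a resized image of 384x1025 with 1 row of bottom padding.
```

Every image ends up exactly crop_height by crop_width after padding, which is what the evaluation graph needs to avoid the `padded_shape` error.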

6. Converting the deeplab model to .pb format (frozen graph)

For convenience, the model is usually converted to .pb format by running export_model.py:

python deeplab/export_model.py \
  --logtostderr \
  --checkpoint_path=".../deeplab_model/model.ckpt-48214" \
  --export_path=".../frozen_pb/deeplab_1024x384_66.pb" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --num_classes=** \
  --crop_size=height \
  --crop_size=width \
  --inference_scales=1.0

Parameters:
--checkpoint_path="…/…/model.ckpt-48214"  # note that this is not a full file name; write it only up to the iteration number (for example model.ckpt-48214, without a suffix such as .index or .meta). The screenshot of my saved checkpoint files is not available.
--export_path="…/…/frozen_pb/deeplab_1024x384_66.pb"  # where to save the exported model
--num_classes=**  # the number of classes you set for training
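
TensorFlow 1.x stores each checkpoint as several files that share one prefix (.index, .meta, and .data-* files), and the export script wants that prefix. The sketch below picks the newest prefix from a directory listing. It is a hypothetical helper that assumes the files are named model.ckpt-<step>; inside TensorFlow, tf.train.latest_checkpoint does the same job.

```python
import re

# Matches e.g. "model.ckpt-48214.index" and captures prefix and step.
_INDEX_FILE = re.compile(r"^(model\.ckpt-(\d+))\.index$")


def latest_checkpoint_prefix(file_names):
    """Return the prefix with the highest step, or None if there is none."""
    best_step, best_prefix = -1, None
    for name in file_names:
        match = _INDEX_FILE.match(name)
        if match and int(match.group(2)) > best_step:
            best_step, best_prefix = int(match.group(2)), match.group(1)
    return best_prefix


# Typical use (path is a placeholder):
#   import os
#   prefix = latest_checkpoint_prefix(os.listdir("path/to/train_logdir"))
```

Joining the returned prefix with the training directory gives the value for --checkpoint_path.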

7. Inference with the .pb model in deeplabv3

Reference links:
https://blog.csdn.net/pnan222/article/details/88801933
https://zhuanlan.zhihu.com/p/50506840

import os

import cv2 as cv
import numpy as np
import tensorflow as tf
from keras.preprocessing.image import load_img, img_to_array


img_path = "..."  # folder containing the input images
graph_path = ".../.../.pb"  # path to the .pb model
pre_path = "..."  # folder where the predicted images are saved

INPUT_TENSOR_NAME = 'ImageTensor:0'
OUTPUT_TENSOR_NAME = 'SemanticPredictions:0'

# Load the frozen graph definition from disk.
with tf.gfile.FastGFile(graph_path, 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')

sess = tf.Session(graph=graph)

for filename in os.listdir(img_path):
    # Predictions are saved as .png under the same base name.
    prename = os.path.splitext(filename)[0] + ".png"
    file_path = os.path.join(img_path, filename)
    save_path = os.path.join(pre_path, prename)
    img = load_img(file_path)
    img = img_to_array(img)
    # The graph expects a uint8 batch of shape (1, height, width, 3).
    img = np.expand_dims(img, axis=0).astype(np.uint8)

    result = sess.run(
        OUTPUT_TENSOR_NAME,
        feed_dict={INPUT_TENSOR_NAME: img})

    # Move the batch axis last to get (height, width, 1), and cast to uint8,
    # which is enough for fewer than 256 classes.
    cv.imwrite(save_path, result.transpose((1, 2, 0)).astype(np.uint8))
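
The PNGs written by this loop hold raw class indices, so they look almost black in an image viewer. One common way to make them readable is to map each index to a color. The sketch below builds the PASCAL VOC style colormap with the usual bit-shuffling scheme, in plain Python on nested lists; it is an illustration, not what I used for the figures in this post, and a real script would apply it to the numpy array instead.

```python
def pascal_colormap(num_colors=256):
    """Build the PASCAL VOC style colormap as a list of (r, g, b) tuples."""
    colormap = []
    for index in range(num_colors):
        red = green = blue = 0
        value = index
        # Spread the bits of the index over the high bits of r, g, and b.
        for shift in range(7, -1, -1):
            red |= (value & 1) << shift
            green |= ((value >> 1) & 1) << shift
            blue |= ((value >> 2) & 1) << shift
            value >>= 3
        colormap.append((red, green, blue))
    return colormap


def colorize(index_rows, colormap):
    """Map a 2-D list of class indices to a 2-D list of (r, g, b) tuples."""
    return [[colormap[index] for index in row] for row in index_rows]


if __name__ == "__main__":
    cmap = pascal_colormap()
    print(cmap[1], cmap[2])  # (128, 0, 0) (0, 128, 0)
```

Neighboring class indices get clearly different colors, which makes it easier to compare a prediction with its label by eye.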

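8. Prediction results

I trained on the Mapillary dataset: 66 classes including background, plus ignore_label = 255, for 67 in total. I have seen posts saying that leaving ignore_label unset can cause errors, but I have not dug into why. The original post showed three images here: the original image, the label, and the prediction (images not available).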

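9. Summary

These are the pitfalls I have hit so far; I will add more later.
When reading blog posts, compare several of them, since this one is not detailed enough and many parts still need work.

Training commands (personal notes, no need to follow them):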

python deeplab/train.py \
    --logtostderr \
    --training_number_of_steps=500 \
    --train_split="train" \
    --model_variant="xception_65" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18  \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --train_crop_size=385,1025 \
    --train_batch_size=4 \
    --dataset="mapillary_vistas" \
    --num_clones=1 \
    --tf_initial_checkpoint='/media/cv/DataA/deeplab_v3+/research/deeplab/datasets/pascal_voc_seg/init_models/deeplabv3_pascal_train_aug/model.ckpt' \
    --train_logdir='/media/cv/DataA/deeplab_v3+/deeplab_model_512x512' \
    --dataset_dir='/media/cv/DataA/Data_ThirdParty/mapillary-vistas-dataset_public_v1.1/validation/deeplab_dataset/tfrecords'



python deeplab/eval.py \
  --logtostderr \
  --eval_split="val" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --dataset="mapillary_vistas" \
  --eval_crop_size="385,1025" \
  --checkpoint_dir="/media/cv/DataA/deeplab_v3+/deeplab_model_1024x384_2" \
  --eval_logdir="/media/cv/DataA/deeplab_v3+/eval" \
  --dataset_dir="/media/cv/DataA/Data_ThirdParty/mapillary-vistas-dataset_public_v1.1/training/deep_lab_dataset/tfrecords_1024x384" \
  --max_number_of_evaluations=1
  
  
  
python deeplab/vis.py \
  --logtostderr \
  --vis_split="trainval" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --dataset="mapillary_vistas" \
  --vis_crop_size="385,1025" \
  --checkpoint_dir="/media/cv/DataA/deeplab_v3+/deeplab_model_384x1024" \
  --vis_logdir="/media/cv/DataA/deeplab_v3+/vis" \
  --dataset_dir="/media/cv/DataA/Data_ThirdParty/mapillary-vistas-dataset_public_v1.1/validation/deeplab_dataset/tfrecords_1024x384" \
  --max_number_of_iterations=1
  
  
  /media/cv/DataA/Data_ThirdParty/mapillary-vistas-dataset_public_v1.1/validation/deeplab_dataset/tfrecords
  /media/cv/DataA/Data_ThirdParty/mapillary-vistas-dataset_public_v1.1/training/deep_lab_dataset/tfrecords_1024x384


python deeplab/export_model.py \
  --logtostderr \
  --checkpoint_path="/media/cv/DataA/deeplab_v3+/deeplab_model_384x1024/model.ckpt-48214" \
  --export_path="/media/cv/DataA/deeplab_v3+/frozen_pb/deeplab_1024x384_66.pb" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --num_classes=66 \
  --crop_size=385 \
  --crop_size=1025 \
  --inference_scales=1.0


	
python deeplab/datasets/build_voc2012_data.py \
  --image_folder="/media/cv/DataA/Data_ThirdParty/mapillary-vistas-dataset_public_v1.1/validation/images_1024x384" \
  --semantic_segmentation_folder="/media/cv/DataA/Data_ThirdParty/mapillary-vistas-dataset_public_v1.1/validation/labels_gray_8bit_1024x384" \
  --list_folder="/media/cv/DataA/Data_ThirdParty/mapillary-vistas-dataset_public_v1.1/validation/deeplab_dataset/flie_list" \
  --image_format="jpg" \
  --label_format="png" \
  --output_dir="/media/cv/DataA/Data_ThirdParty/mapillary-vistas-dataset_public_v1.1/validation/deeplab_dataset/tfrecords_1024x384"
