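Starting with this post, we document the full process of using fully convolutional networks (FCN) for semantic segmentation.
Code: https://github.com/shelhamer/fcn.berkeleyvision.org
This post covers three topics:
1. The public datasets provided officially
2. How to prepare your own dataset, mainly how to annotate the labels
3. How to colorize the results after training.
Public datasets
Here we cover the SIFT Flow dataset and the PASCAL VOC dataset.
1. PASCAL VOC
According to the pascal README in the data folder of the FCN code: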
# PASCAL VOC and SBD
PASCAL VOC is a standard recognition dataset and benchmark with detection and semantic segmentation challenges.
The semantic segmentation challenge annotates 20 object classes and background.
The Semantic Boundary Dataset (SBD) is a further annotation of the PASCAL VOC data that provides more semantic segmentation and instance segmentation masks.
PASCAL VOC has a private test set and [leaderboard for semantic segmentation](http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=6).
The train/val/test splits of PASCAL VOC segmentation challenge and SBD diverge.
Most notably VOC 2011 segval intersects with SBD train.
Care must be taken for proper evaluation by excluding images from the train or val splits.
We train on the 8,498 images of SBD train.
We validate on the non-intersecting set defined in the included `seg11valid.txt`.
Refer to `classes.txt` for the listing of classes in model output order.
Refer to `../voc_layers.py` for the Python data layer for this dataset.
See the dataset sites for download:
- PASCAL VOC 2012: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/
- SBD: see [homepage](http://home.bharathh.info/home/sbd) or [direct download](http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz)
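We can download the training set (SBD) and the test set (PASCAL VOC 2012).
Then go to fcn/data, create a folder named sbdd if it does not exist, extract the dataset folder of the benchmark archive into sbdd, and extract VOC2012 into the pascal folder under data. The split lists are already provided: train.txt for training and seg11valid.txt for validation, which serves as the test set here.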
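Before training, it can help to confirm that the extracted files line up with the split lists. The sketch below is a minimal check and not part of the FCN repository: `read_split` and `missing_files` are hypothetical helpers, and the `data/sbdd/dataset` path and the `img/{id}.jpg` pattern are assumptions about the extracted SBD layout that should be adjusted to match what is on your disk.

```python
import os

def read_split(split_path):
    """Return the image ids listed in a split file, one id per line."""
    with open(split_path) as f:
        return [line.strip() for line in f if line.strip()]

def missing_files(ids, pattern):
    """Return the ids whose file, built from `pattern`, does not exist."""
    return [i for i in ids if not os.path.isfile(pattern.format(i))]

if __name__ == "__main__":
    # Assumed location of the extracted SBD data; change as needed.
    sbdd = "data/sbdd/dataset"
    split = os.path.join(sbdd, "train.txt")
    if os.path.isfile(split):
        ids = read_split(split)
        lost = missing_files(ids, os.path.join(sbdd, "img", "{}.jpg"))
        print("%d ids, %d images missing" % (len(ids), len(lost)))
    else:
        print("split file not found:", split)
```

The same two helpers can be pointed at the label files by changing the pattern.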
2. SIFT-Flow
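Download the dataset: download link.
Extract it under /fcn.berkeleyvision.org/data/, overwriting the folder named sift-flow.
The FCN source code already ships train.txt and the other split files, so there is no need to regenerate them.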
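Preparing your own dataset
Training your own FCN segmentation model takes roughly three steps:
1. Create labels for your data;
2. Split your data into train, val and test sets;
3. Write your own input data layer, modeled on voc_layers.py.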
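The second step can be sketched as follows. This is a minimal sketch, not code from the FCN repository: `split_ids` and `write_splits` are hypothetical helpers, and the 80/10/10 ratios and the fixed seed are illustrative assumptions. The output is one image id per line, the same shape as the train.txt files used by the public datasets.

```python
import os
import random

def split_ids(ids, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle ids reproducibly and cut them into train/val/test lists."""
    ids = sorted(ids)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(ids)
    n_val = int(len(ids) * val_frac)
    n_test = int(len(ids) * test_frac)
    return {
        "val": ids[:n_val],
        "test": ids[n_val:n_val + n_test],
        "train": ids[n_val + n_test:],
    }

def write_splits(splits, out_dir):
    """Write each split to <out_dir>/<name>.txt, one id per line."""
    for name, items in splits.items():
        with open(os.path.join(out_dir, name + ".txt"), "w") as f:
            f.write("\n".join(items) + "\n")
```

A typical call would collect the ids from the image folder, for example with `os.path.splitext(name)[0]` for each file name, and then run `write_splits(split_ids(ids), out_dir)`.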
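In FCN, the input image size is not fixed. If the images in the dataset differ in size, only one image can be trained at a time, which is the default in the FCN code (batch_size=1). For batch training, all images in the dataset must have the same size, so we resize them. The originals are typically scaled to 256*256 or 500*500.
1. Resizing images
Below are several resizing functions, taken from: http://blog.csdn.net/u010402786/article/details/72883421
(1) Resizing a single image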
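from PIL import Image  # Pillow; the bare "import Image" only worked with the old PIL

def convert(width, height):
    # Resize one image in place; the path is a placeholder.
    im = Image.open("C:\\xxx\\test.jpg")
    # LANCZOS is the same filter as ANTIALIAS, which newer Pillow removed.
    out = im.resize((width, height), Image.LANCZOS)
    out.save("C:\\xxx\\test.jpg")

if __name__ == '__main__':
    convert(256, 256)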
(2) Resizing all images in a folder
import os
from PIL import Image

def convert(src_dir, width, height):
    # Resize every file in src_dir in place to width x height.
    file_list = os.listdir(src_dir)
    print(file_list)
    for filename in file_list:
        path = os.path.join(src_dir, filename)
        im = Image.open(path)
        out = im.resize((width, height), Image.LANCZOS)
        print("%s has been resized!" % filename)
        out.save(path)

if __name__ == '__main__':
    src_dir = input('please input the operate dir:')
    convert(src_dir, 256, 256)
(3) Resizing proportionally
from PIL import Image

def convert(width):
    # Scale to the given width and keep the aspect ratio.
    im = Image.open("C:\\workspace\\PythonLearn1\\test_1.jpg")
    (x, y) = im.size
    x_s = width
    y_s = y * x_s // x  # integer division: resize() needs integer sizes
    out = im.resize((x_s, y_s), Image.LANCZOS)
    out.save("C:\\workspace\\PythonLearn1\\test_1_out.jpg")

if __name__ == '__main__':
    convert(256)
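One caveat: the functions above use a smoothing filter, which suits input images but not label images. A label image has to be resized together with its input, and it should use nearest-neighbor sampling, because interpolating filters blend neighboring class ids into values that are not valid classes. The sketch below illustrates nearest-neighbor resizing with NumPy index sampling; `resize_label_nearest` is a hypothetical helper, not part of the FCN code. With Pillow, the equivalent is passing `Image.NEAREST` as the resampling filter to `resize`.

```python
import numpy as np

def resize_label_nearest(label, out_h, out_w):
    """Resize a 2-D array of class ids by nearest-neighbor index sampling."""
    in_h, in_w = label.shape
    # Map each output pixel center back to a source row/column index.
    rows = np.minimum(((np.arange(out_h) + 0.5) * in_h / out_h).astype(int), in_h - 1)
    cols = np.minimum(((np.arange(out_w) + 0.5) * in_w / out_w).astype(int), in_w - 1)
    return label[rows[:, None], cols]

label = np.zeros((4, 6), dtype=np.uint8)
label[:, 3:] = 15  # right half belongs to class 15
small = resize_label_nearest(label, 2, 2)
print(small)
```

Because every output pixel is copied from one source pixel, the resized label contains only class ids that were already present.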
Creating image labels
Step 1: annotate with an open-source tool from GitHub
URL: https://github.com/wkentaro/labelme
Usage
Annotation
Run labelme --help for detail.
labelme # Open GUI
labelme static/apc2016_obj3.jpg # Specify file
labelme static/apc2016_obj3.jpg -O static/apc2016_obj3.json # Close window after the save
The annotations are saved as a JSON file. The file includes the image itself.
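To see what such an annotation file contains, the sketch below reads one with the standard `json` module. It assumes the usual labelme layout: a top-level `shapes` list whose entries carry a `label` and a list of `points`. Field names can differ between labelme versions, so treat them as assumptions; `summarize_annotation` is a hypothetical helper.

```python
import json

def summarize_annotation(json_path):
    """Return (label, number_of_points) for every shape in a labelme file."""
    with open(json_path) as f:
        data = json.load(f)
    # Each shape is assumed to be a dict with "label" and "points" keys.
    return [(s["label"], len(s["points"])) for s in data.get("shapes", [])]
```

Listing the labels this way is a quick check that every annotated class name is spelled consistently before converting the files to label images.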
Visualization
To view the json file quickly, you can use utility script:
labelme_draw_json static/apc2016_obj3.json
Convert to Dataset
To convert the json to set of image and label, you can run following:
labelme_json_to_dataset static/apc2016_