The Setup of FCN

明天去哪

Categories: Computer Vision Code, Image Semantic Segmentation Code. Tags: FCN, Segmentation

Article link: https://blog.csdn.net/u014451076/article/details/79203117

Pipeline

1. Clone the code

Refer to https://github.com/shelhamer/fcn.berkeleyvision.org

2. File Structure

- data  // store the dataset
    - pascal
- voc-fcn32s  // store the code of voc-fcn32s
    - solve.py
- voc-fcn16s
- voc-fcn8s
- voc_layers.py  // custom Python data layer
- score.py  // computes the accuracy on the test set
- surgery.py  // weight initialization and model transplant

3. Caffe dependency

Since this model uses the official Caffe as its framework, we only need to point the Caffe root path in solve.py (the caffe_path variable) at the root of the official Caffe checkout.
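
As a minimal sketch of this step, the helper below checks that a Caffe checkout contains a python directory and puts it on sys.path before import caffe is attempted. The name add_caffe_to_path is hypothetical and not part of the FCN repository, and '/path/to/caffe' remains a placeholder.

```python
import os
import sys

def add_caffe_to_path(caffe_root):
    """Put <caffe_root>/python on sys.path and return the directory added.

    Hypothetical helper for illustration, not part of the FCN repository.
    """
    pycaffe_dir = os.path.join(caffe_root, 'python')
    if not os.path.isdir(pycaffe_dir):
        raise IOError('no python directory under ' + caffe_root)
    if pycaffe_dir not in sys.path:
        sys.path.insert(0, pycaffe_dir)
    return pycaffe_dir
```

Calling add_caffe_to_path('/path/to/caffe') and then import caffe fails early with a readable message when the path is wrong, instead of a bare ImportError later in the script.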

4. Make the dataset

If we use the original VOC 2012, we only need to put the VOC2012 folder extracted from the dataset archive into data/pascal. The resulting directory layout is as shown in the picture.
If we use the augmented dataset from Hariharan et al. (10,582 images), we can refer to my other blog post (https://blog.csdn.net/u014451076/article/details/79700653 or http://blog.csdn.net/Xmo_jiao/article/details/77897109) to build the dataset and put it in the same data/pascal directory.
Note: The original VOC train and val ground truth contains a white boundary around objects, so when we convert the ground truth from RGB to grayscale we shouldn't remove that color, (224, 224, 192) if I remember correctly.
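
To make the conversion concrete, here is a minimal sketch in plain Python (not the script from the linked posts). It builds the standard PASCAL VOC palette with the usual bit-shuffling rule and maps each RGB pixel to its class index, so the boundary color (224, 224, 192) becomes index 255 rather than being dropped. The function names are hypothetical.

```python
def voc_colormap(n=256):
    """Build the PASCAL VOC palette as a list: index -> (R, G, B)."""
    cmap = []
    for i in range(n):
        r = g = b = 0
        c = i
        for j in range(8):
            # spread the bits of the class index over the three channels
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        cmap.append((r, g, b))
    return cmap

def rgb_label_to_index(rgb_rows):
    """Map rows of (R, G, B) pixels to class indices; boundary pixels become 255."""
    lookup = {color: idx for idx, color in enumerate(voc_colormap())}
    return [[lookup[tuple(px)] for px in row] for row in rgb_rows]

label = rgb_label_to_index([[(0, 0, 0), (128, 0, 0)],
                            [(224, 224, 192), (192, 128, 128)]])
print(label)  # [[0, 1], [255, 15]]
```

Keeping the boundary as 255 matters because the loss layers in the FCN prototxt files are configured with ignore_label: 255, so those pixels are excluded from training and scoring.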

5. Train

After configuring the path of caffe_root, run solve.py to train and test.

import os
import sys

import numpy as np

# make pycaffe importable
caffe_path = '/path/to/caffe'
sys.path.append(caffe_path + os.sep + 'python')
import caffe

# surgery.py and score.py live in the parent directory,
# so extend the path before importing them
sys.path.append('../')
import surgery, score

try:
    import setproctitle
    setproctitle.setproctitle(os.path.basename(os.getcwd()))
except ImportError:
    pass

weights = 'VGG_ILSVRC_16_layers_conv.caffemodel'

# init
# caffe.set_device(int(sys.argv[1]))
caffe.set_mode_gpu()

solver = caffe.SGDSolver('solver.prototxt')
solver.net.copy_from(weights)

# surgeries: initialize the upsampling layers
interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
surgery.interp(solver.net, interp_layers)

# scoring
val = np.loadtxt('../data/pascal/seg11valid.txt', dtype=str)

# 25 rounds of 4000 iterations, with a test after each round
for _ in range(25):
    solver.step(4000)
    score.seg_tests(solver, False, val, layer='score')  # test

6. Test

Testing runs automatically during training. If we want to test by hand, we can comment out the score.seg_tests(solver, False, val, layer='score') call in solve.py and write our own code based on score.py, which is easy.

7. Result

After training, the result is computed automatically by the score.seg_tests function, as shown below (batch size is 1).

>>> 2018-01-31 11:16:09.116888 Begin seg tests
>>> 2018-01-31 11:17:38.188218 Iteration 100000 loss 50934.6016657
>>> 2018-01-31 11:17:38.188347 Iteration 100000 overall accuracy 0.910447562321
>>> 2018-01-31 11:17:38.188383 Iteration 100000 mean accuracy 0.780200850608
>>> 2018-01-31 11:17:38.188546 Iteration 100000 mean IU 0.638823816966
>>> 2018-01-31 11:17:38.188604 Iteration 100000 fwavacc 0.845708465144

This result is close to the official result (63.6 mean IU on seg11valid), see https://github.com/shelhamer/fcn.berkeleyvision.org.
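
To compare runs, the log lines can be read back into numbers. The parser below is a small sketch that assumes the exact line layout shown above ('>>> date time Iteration N metric value'); the function name parse_accuracy_log is hypothetical.

```python
import re

# '>>> <date> <time> Iteration <n> <metric name> <value>'
LINE_RE = re.compile(r'^>>> \S+ \S+ Iteration (\d+) (.+?) ([-+0-9.eE]+)$')

def parse_accuracy_log(lines):
    """Return {iteration: {metric_name: value}} from accuracy.log lines."""
    results = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # e.g. the 'Begin seg tests' marker
        iteration, name, value = int(m.group(1)), m.group(2), float(m.group(3))
        results.setdefault(iteration, {})[name] = value
    return results

sample = [
    '>>> 2018-01-31 11:16:09.116888 Begin seg tests',
    '>>> 2018-01-31 11:17:38.188383 Iteration 100000 mean accuracy 0.780200850608',
    '>>> 2018-01-31 11:17:38.188546 Iteration 100000 mean IU 0.638823816966',
]
print(parse_accuracy_log(sample))
```

In practice the list of lines would come from open('accuracy.log'), and the mean IU per iteration can then be plotted or compared against the official number.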

Source Code

voc_layers.py

From the voc_layers.py, we can see that the batch size is 1.

    def reshape(self, bottom, top):
        # load image + label image pair
        self.data = self.load_image(self.indices[self.idx])
        self.label = self.load_label(self.indices[self.idx])
        # reshape tops to fit (leading 1 is for batch dimension)
        top[0].reshape(1, *self.data.shape)
        top[1].reshape(1, *self.label.shape)

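When we want a larger effective batch size, we can set iter_size in solver.prototxt: Caffe accumulates the gradients over iter_size forward/backward passes before each weight update, so the effective batch size becomes 1 × iter_size.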
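
The effect of iter_size can be illustrated without Caffe. The toy sketch below uses a one-parameter least-squares model (an assumption chosen only for illustration, not the FCN loss) and shows that accumulating one gradient per sample and averaging gives the same update direction as computing the gradient on the whole batch at once.

```python
def grad_single(w, x, y):
    """Gradient of 0.5 * (w * x - y) ** 2 with respect to w."""
    return (w * x - y) * x

def grad_accumulated(w, samples):
    """One pass per sample, accumulate, then average (iter_size = len(samples))."""
    total = 0.0
    for x, y in samples:
        total += grad_single(w, x, y)
    return total / len(samples)

def grad_batch(w, samples):
    """Gradient of the mean loss over the whole batch in one pass."""
    return sum((w * x - y) * x for x, y in samples) / len(samples)

samples = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]
batch_size, iter_size = 1, len(samples)
print('effective batch size:', batch_size * iter_size)
print(grad_accumulated(0.5, samples), grad_batch(0.5, samples))
```

The trade-off is time rather than memory: each update still costs iter_size forward/backward passes, but only one image has to fit on the GPU at a time.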

The images are not resized.
If we want to resize them, we can modify voc_layers.py as shown in the figure below:
Note: There is a catch. If we don't resize the images, we can't modify the prototxt freely, so it is better to resize them.
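
The figure with the modified voc_layers.py is not reproduced in this text, so the following is only a sketch of the idea and not the author's change. One point worth making explicit is that label maps should be resized with nearest-neighbour sampling, so that no new class ids are invented between two classes (the image itself can use bilinear interpolation). The helper resize_nearest is hypothetical and works on nested lists for clarity.

```python
def resize_nearest(rows, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list such as a label map."""
    in_h, in_w = len(rows), len(rows[0])
    out = []
    for i in range(out_h):
        src_i = min(in_h - 1, i * in_h // out_h)
        out.append([rows[src_i][min(in_w - 1, j * in_w // out_w)]
                    for j in range(out_w)])
    return out

label = [[0, 15],
         [255, 1]]
for row in resize_nearest(label, 4, 4):
    print(row)
```

Every value in the output already exists in the input, which is the property that keeps the ignore label 255 and the class ids intact after resizing.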

score.py

This script computes the overall accuracy, mean accuracy, mean IU, and frequency-weighted IU. A simple modification writes these values to an accuracy log.

from __future__ import division
import caffe
import numpy as np
import os
import sys
from datetime import datetime
from PIL import Image

def fast_hist(a, b, n):
    k = (a >= 0) & (a < n)
    return np.bincount(n * a[k].astype(int) + b[k], minlength=n**2).reshape(n, n)

def compute_hist(net, save_dir, dataset, layer='score', gt='label'):
    n_cl = net.blobs[layer].channels
    if save_dir:
        os.mkdir(save_dir)
    hist = np.zeros((n_cl, n_cl))
    loss = 0
    for idx in dataset:
        net.forward()
        hist += fast_hist(net.blobs[gt].data[0, 0].flatten(),
                                net.blobs[layer].data[0].argmax(0).flatten(),
                                n_cl)

        if save_dir:
            im = Image.fromarray(net.blobs[layer].data[0].argmax(0).astype(np.uint8), mode='P')
            im.save(os.path.join(save_dir, idx + '.png'))
        # compute the loss as well
        loss += net.blobs['loss'].data.flat[0]
    return hist, loss / len(dataset)

def seg_tests(solver, save_format, dataset, layer='score', gt='label'):
    # mark the start of an evaluation run in the log
    with open("accuracy.log", "a+") as f:
        f.write('>>> ' + str(datetime.now()) + ' Begin seg tests\n')
    # evaluate the test net with the weights currently being trained
    solver.test_nets[0].share_with(solver.net)
    do_seg_tests(solver.test_nets[0], solver.iter, save_format, dataset, layer, gt)

def do_seg_tests(net, iter, save_format, dataset, layer='score', gt='label'):
    n_cl = net.blobs[layer].channels
    if save_format:
        save_format = save_format.format(iter)
    # hist[i, j] counts pixels of ground-truth class i predicted as class j
    hist, loss = compute_hist(net, save_format, dataset, layer, gt)
    with open("accuracy.log", "a+") as f:
        # mean loss
        f.write('>>> ' + str(datetime.now()) + ' Iteration ' + str(iter) + ' loss ' + str(loss) + '\n')
        # overall accuracy
        acc = np.diag(hist).sum() / hist.sum()
        f.write('>>> ' + str(datetime.now()) + ' Iteration ' + str(iter) + ' overall accuracy ' + str(acc) + '\n')
        # per-class accuracy
        acc = np.diag(hist) / hist.sum(1)
        f.write('>>> ' + str(datetime.now()) + ' Iteration ' + str(iter) + ' mean accuracy ' + str(np.nanmean(acc)) + '\n')
        # per-class IU
        iu = np.diag(hist) / (hist.sum(1) + hist.sum(0) - np.diag(hist))
        f.write('>>> ' + str(datetime.now()) + ' Iteration ' + str(iter) + ' mean IU ' + str(np.nanmean(iu)) + '\n')
        # frequency-weighted IU
        freq = hist.sum(1) / hist.sum()
        f.write('>>> ' + str(datetime.now()) + ' Iteration ' + str(iter) + ' fwavacc ' +
                str((freq[freq > 0] * iu[freq > 0]).sum()) + '\n')
    return hist
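
To see what the four numbers mean, here is a small worked example in plain Python that follows the same formulas on a hand-made confusion matrix. It is an illustrative sketch rather than a replacement for score.py; the function names are hypothetical, and classes that would produce NaN in the NumPy version are simply skipped, which is the effect of np.nanmean there.

```python
def confusion_matrix(gt, pred, n):
    """hist[i][j] counts pixels with ground truth i that were predicted as j."""
    hist = [[0] * n for _ in range(n)]
    for g, p in zip(gt, pred):
        if 0 <= g < n:  # skips the ignore label 255
            hist[g][p] += 1
    return hist

def segmentation_metrics(hist):
    n = len(hist)
    total = float(sum(sum(row) for row in hist))
    diag = [hist[i][i] for i in range(n)]
    gt_count = [sum(hist[i]) for i in range(n)]                         # row sums
    pred_count = [sum(hist[i][j] for i in range(n)) for j in range(n)]  # column sums
    acc = [diag[i] / float(gt_count[i]) for i in range(n) if gt_count[i] > 0]
    iu = {}
    for i in range(n):
        union = gt_count[i] + pred_count[i] - diag[i]
        if union > 0:
            iu[i] = diag[i] / float(union)
    return {
        'overall accuracy': sum(diag) / total,
        'mean accuracy': sum(acc) / len(acc),
        'mean IU': sum(iu.values()) / len(iu),
        'fwavacc': sum(gt_count[i] / total * iu[i]
                       for i in range(n) if gt_count[i] > 0),
    }

gt = [0, 0, 1, 1, 255, 1]
pred = [0, 1, 1, 1, 0, 0]
print(segmentation_metrics(confusion_matrix(gt, pred, 3)))
```

In this example class 2 never appears, so it is left out of the means, and the pixel labelled 255 is not counted at all. The overall accuracy is 3/5 and the mean IU is (1/3 + 1/2) / 2.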

surgery.py

Two main functions:

Interpolation for the upscore layers:
def interp(net, layers)
Model transplant from VGG16 to the fully convolutional net (not used in the official code):
def transplant(new_net, net, suffix='')
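
For intuition about the interpolation step: to my knowledge, interp fills each upsampling (deconvolution) layer with a bilinear interpolation kernel. The sketch below builds such a kernel in plain Python; the name bilinear_kernel is hypothetical, and it is written without NumPy, so it shows the idea rather than the repository's code.

```python
def bilinear_kernel(size):
    """Return a size x size bilinear interpolation kernel as nested lists."""
    factor = (size + 1) // 2
    # the kernel center sits on a pixel for odd sizes, between pixels for even sizes
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    one_d = [1 - abs(i - center) / float(factor) for i in range(size)]
    # the 2-D kernel is the outer product of the 1-D triangle with itself
    return [[a * b for b in one_d] for a in one_d]

for row in bilinear_kernel(4):
    print(row)
```

A 4 x 4 kernel like this one corresponds to 2x upsampling with stride 2. Starting the upsampling layers from bilinear weights means the network begins with plain interpolation of the coarse score map instead of random upsampling.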