Pipeline
1. Clone the code
Refer to https://github.com/shelhamer/fcn.berkeleyvision.org
2. File Structure
- data // stores the dataset
  - pascal
- voc-fcn32s // stores the code of voc-fcn32s
  - solve.py
- voc-fcn16s
- voc-fcn8s
- voc_layers.py // self-defined data layer
- score.py // the code that computes accuracy on the test set
- surgery.py // model transfer
3. Caffe dependency
This model is built on the official Caffe, so we only need to point caffe_root in solve.py to the root of the official Caffe installation.
4. Make the dataset
- If we use the original VOC 2012, we only need to put the VOC2012 folder extracted from the VOC2012 dataset into the data/pascal directory. The directory layout is as shown in the picture.
- If we use the augmented dataset from Hariharan et al., which has 10582 images, we can refer to my other blog posts (https://blog.csdn.net/u014451076/article/details/79700653 or http://blog.csdn.net/Xmo_jiao/article/details/77897109) to make the dataset and put it in the same directory, data/pascal.
- Note: The original VOC train and val ground truth contains white boundaries, so when we convert the ground truth from RGB to gray we shouldn't remove the boundary color (224, 224, 192), … if I remember it correctly.
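As a sketch of the RGB-to-gray conversion mentioned in the note, the snippet below maps an RGB ground-truth image to class indices and sends the boundary color to 255 instead of dropping it. The helper names voc_colormap and rgb_to_index are hypothetical, and the code assumes the standard PASCAL VOC palette with 21 classes and 255 as the ignore label; treat it as an illustration, not the exact script used to build this dataset.

```python
import numpy as np

def voc_colormap(n=256):
    """Build the standard PASCAL VOC palette (index -> RGB) by bit interleaving."""
    cmap = np.zeros((n, 3), dtype=np.uint8)
    for i in range(n):
        c, r, g, b = i, 0, 0, 0
        for j in range(8):
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        cmap[i] = (r, g, b)
    return cmap

def rgb_to_index(rgb, n_classes=21, ignore_index=255):
    """Map an H x W x 3 RGB label image to an H x W array of class indices.

    Any color that is not one of the n_classes palette entries (for example
    the boundary color) is kept as ignore_index instead of being removed.
    """
    cmap = voc_colormap()
    out = np.full(rgb.shape[:2], ignore_index, dtype=np.uint8)
    for cls in range(n_classes):
        out[(rgb == cmap[cls]).all(axis=-1)] = cls
    return out
```

With this palette, index 255 corresponds to (224, 224, 192), so boundary pixels end up with the value 255 in the gray label.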
5. Train
After configuring the path of caffe_root, run solve.py to train and test.
import numpy as np
import os
import sys

# make pycaffe and the repo-level modules importable before importing them
caffe_path = '/path/to/caffe'
sys.path.append(caffe_path + os.sep + 'python')
sys.path.append('../')

import caffe
import surgery, score

try:
    import setproctitle
    setproctitle.setproctitle(os.path.basename(os.getcwd()))
except ImportError:
    pass

weights = 'VGG_ILSVRC_16_layers_conv.caffemodel'

# init
# caffe.set_device(int(sys.argv[1]))
caffe.set_mode_gpu()

solver = caffe.SGDSolver('solver.prototxt')
solver.net.copy_from(weights)

# surgeries
interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
surgery.interp(solver.net, interp_layers)

# scoring
val = np.loadtxt('../data/pascal/seg11valid.txt', dtype=str)

for _ in range(25):
    solver.step(4000)
    score.seg_tests(solver, False, val, layer='score')  # test
6. Test
The test runs automatically during training. If we want to test by hand, we can comment out the score.seg_tests(solver, False, val, layer='score') call in solve.py and write our own code by referring to score.py, which is easy.
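A hand-written test mostly comes down to accumulating a confusion matrix over (label, prediction) pairs. The sketch below does that with plain NumPy arrays and leaves the forward pass to the caller. The names confusion and accumulate are hypothetical, and the code assumes that labels outside [0, n_cl), such as the 255 boundary, should be skipped, as fast_hist in score.py does.

```python
import numpy as np

def confusion(label, pred, n_cl):
    """Return an n_cl x n_cl matrix where entry [i, j] counts pixels with
    ground truth i that were predicted as j."""
    label, pred = np.ravel(label), np.ravel(pred)
    keep = (label >= 0) & (label < n_cl)  # drops ignore labels such as 255
    idx = n_cl * label[keep].astype(int) + pred[keep].astype(int)
    return np.bincount(idx, minlength=n_cl ** 2).reshape(n_cl, n_cl)

def accumulate(pairs, n_cl):
    """Sum the confusion matrices of an iterable of (label, prediction) arrays."""
    hist = np.zeros((n_cl, n_cl))
    for label, pred in pairs:
        hist += confusion(label, pred, n_cl)
    return hist
```

Here each prediction would be the per-pixel argmax of the 'score' blob after a forward pass, and the resulting matrix can be turned into metrics the same way score.py does.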
7. Result
After training, the result is computed automatically by the score.seg_tests function, as below (batch size is 1).
>>> 2018-01-31 11:16:09.116888 Begin seg tests
>>> 2018-01-31 11:17:38.188218 Iteration 100000 loss 50934.6016657
>>> 2018-01-31 11:17:38.188347 Iteration 100000 overall accuracy 0.910447562321
>>> 2018-01-31 11:17:38.188383 Iteration 100000 mean accuracy 0.780200850608
>>> 2018-01-31 11:17:38.188546 Iteration 100000 mean IU 0.638823816966
>>> 2018-01-31 11:17:38.188604 Iteration 100000 fwavacc 0.845708465144
This result is similar to the official result (63.6 mean IU on seg11valid), see https://github.com/shelhamer/fcn.berkeleyvision.org.
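Because the modified score.py writes one metric per line, the log is easy to post-process. Below is a minimal sketch that pulls the metrics out of lines in this format. parse_accuracy_log and its regular expression are hypothetical and assume the exact ">>> date time Iteration N name value" layout.

```python
import re

# ">>> <date> <time> Iteration <n> <metric name> <value>"
LINE_RE = re.compile(
    r">>> (?P<date>\S+) (?P<time>\S+) Iteration (?P<iter>\d+) "
    r"(?P<metric>.+?) (?P<value>[-+0-9.eE]+)$"
)

def parse_accuracy_log(lines):
    """Return {iteration: {metric name: value}} for all metric lines."""
    results = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:  # e.g. the "Begin seg tests" marker line
            continue
        per_iter = results.setdefault(int(match.group("iter")), {})
        per_iter[match.group("metric")] = float(match.group("value"))
    return results

sample = [
    ">>> 2018-01-31 11:16:09.116888 Begin seg tests",
    ">>> 2018-01-31 11:17:38.188218 Iteration 100000 loss 50934.6016657",
    ">>> 2018-01-31 11:17:38.188546 Iteration 100000 mean IU 0.638823816966",
]
metrics = parse_accuracy_log(sample)
print(metrics[100000]["mean IU"])
```

In practice the lines would come from open("accuracy.log"), which makes it easy to plot mean IU against the iteration number.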
Source Code
voc_layers.py
- From voc_layers.py, we can see that the batch size is 1.
def reshape(self, bottom, top):
    # load image + label image pair
    self.data = self.load_image(self.indices[self.idx])
    self.label = self.load_label(self.indices[self.idx])
    # reshape tops to fit (leading 1 is for batch dimension)
    top[0].reshape(1, *self.data.shape)
    top[1].reshape(1, *self.label.shape)
If we want a larger batch size, we can set iter_size in solver.prototxt: gradients are accumulated over iter_size forward/backward passes before each weight update, so the effective batch size is iter_size × 1.
- The images are not resized.
If we want to resize them, we can modify voc_layers.py as shown in the figure below:
- Note: One catch is that if we don't resize the images, we can't modify the prototxt freely, so we'd better resize them.
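A rough sketch of one way to resize, not necessarily the modification shown in the figure: labels have to be resized with nearest-neighbor sampling so that no new class ids appear. resize_nearest is a hypothetical helper written with NumPy only; the image itself would normally go through a smoother interpolation.

```python
import numpy as np

def resize_nearest(arr, out_h, out_w):
    """Nearest-neighbor resize of an H x W or H x W x C array.

    Every output pixel copies one input pixel, so a resized label map
    contains only class ids that were already present in the input.
    """
    in_h, in_w = arr.shape[:2]
    rows = (np.arange(out_h) * in_h / out_h).astype(int)
    cols = (np.arange(out_w) * in_w / out_w).astype(int)
    return arr[rows[:, None], cols]

label = np.array([[0, 15], [255, 1]], dtype=np.uint8)
resized = resize_nearest(label, 4, 4)
```

Inside voc_layers.py, a call like this would sit in load_label before the label is handed to the top blob, with the image resized to the same height and width in load_image.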
score.py
This code computes the overall accuracy, mean accuracy, mean IU, and frequency-weighted IU. A simple modification writes these values to an accuracy log.
from __future__ import division
import caffe
import numpy as np
import os
import sys
from datetime import datetime
from PIL import Image
def fast_hist(a, b, n):
    k = (a >= 0) & (a < n)
    return np.bincount(n * a[k].astype(int) + b[k], minlength=n**2).reshape(n, n)

def compute_hist(net, save_dir, dataset, layer='score', gt='label'):
    n_cl = net.blobs[layer].channels
    if save_dir:
        os.mkdir(save_dir)
    hist = np.zeros((n_cl, n_cl))
    loss = 0
    for idx in dataset:
        net.forward()
        hist += fast_hist(net.blobs[gt].data[0, 0].flatten(),
                          net.blobs[layer].data[0].argmax(0).flatten(),
                          n_cl)
        if save_dir:
            im = Image.fromarray(net.blobs[layer].data[0].argmax(0).astype(np.uint8), mode='P')
            im.save(os.path.join(save_dir, idx + '.png'))
        # compute the loss as well
        loss += net.blobs['loss'].data.flat[0]
    return hist, loss / len(dataset)
def seg_tests(solver, save_format, dataset, layer='score', gt='label'):
    f = open("accuracy.log", "a+")
    f.write('>>> ' + str(datetime.now()) + ' Begin seg tests\n')
    f.close()
    solver.test_nets[0].share_with(solver.net)
    do_seg_tests(solver.test_nets[0], solver.iter, save_format, dataset, layer, gt)

def do_seg_tests(net, iter, save_format, dataset, layer='score', gt='label'):
    n_cl = net.blobs[layer].channels
    if save_format:
        save_format = save_format.format(iter)
    hist, loss = compute_hist(net, save_format, dataset, layer, gt)
    f = open("accuracy.log", "a+")
    # mean loss
    f.write('>>> ' + str(datetime.now()) + ' Iteration ' + str(iter) + ' loss ' + str(loss) + '\n')
    # overall accuracy
    acc = np.diag(hist).sum() / hist.sum()
    f.write('>>> ' + str(datetime.now()) + ' Iteration ' + str(iter) + ' overall accuracy ' + str(acc) + '\n')
    # per-class accuracy
    acc = np.diag(hist) / hist.sum(1)
    f.write('>>> ' + str(datetime.now()) + ' Iteration ' + str(iter) + ' mean accuracy ' + str(np.nanmean(acc)) + '\n')
    # per-class IU
    iu = np.diag(hist) / (hist.sum(1) + hist.sum(0) - np.diag(hist))
    f.write('>>> ' + str(datetime.now()) + ' Iteration ' + str(iter) + ' mean IU ' + str(np.nanmean(iu)) + '\n')
    # frequency-weighted IU
    freq = hist.sum(1) / hist.sum()
    f.write('>>> ' + str(datetime.now()) + ' Iteration ' + str(iter) + ' fwavacc ' +
            str((freq[freq > 0] * iu[freq > 0]).sum()) + '\n')
    f.close()
    return hist
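To make the four numbers concrete, here is a small worked example that applies the same formulas to a 2-class confusion matrix. seg_metrics is a hypothetical helper, not part of score.py, and the matrix values are made up for illustration.

```python
import numpy as np

def seg_metrics(hist):
    """Compute the four scores from a confusion matrix where
    hist[i, j] counts pixels with ground truth i predicted as j."""
    hist = np.asarray(hist, dtype=float)
    tp = np.diag(hist)
    # classes that never occur give 0/0; nanmean skips them below
    with np.errstate(divide="ignore", invalid="ignore"):
        per_class_acc = tp / hist.sum(1)
        iu = tp / (hist.sum(1) + hist.sum(0) - tp)
    freq = hist.sum(1) / hist.sum()
    return {
        "overall accuracy": tp.sum() / hist.sum(),
        "mean accuracy": np.nanmean(per_class_acc),
        "mean IU": np.nanmean(iu),
        "fwavacc": (freq[freq > 0] * iu[freq > 0]).sum(),
    }

# 10 pixels: 4 of class 0 (3 correct) and 6 of class 1 (4 correct)
result = seg_metrics([[3, 1], [2, 4]])
# overall accuracy = 7/10
# mean accuracy    = (3/4 + 4/6) / 2
# mean IU          = (3/6 + 4/7) / 2
# fwavacc          = 0.4 * 3/6 + 0.6 * 4/7
```

The frequency-weighted score differs from mean IU only in that each class IU is weighted by how often the class appears in the ground truth.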
surgery.py
Two main functions:
- Interpolation for the upscore layers
def interp(net, layers)
- Model transplant from VGG16 to the fully convolutional network (not used in the official code)
def transplant(new_net, net, suffix='')
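For intuition on the first function, the sketch below builds the kind of 2-D bilinear kernel that an interpolation surgery would copy into each upscore (deconvolution) layer. bilinear_kernel is a hypothetical name and the formula is my reading of the usual FCN recipe, so check it against surgery.py before relying on it.

```python
import numpy as np

def bilinear_kernel(size):
    """Return a size x size bilinear interpolation kernel."""
    factor = (size + 1) // 2
    # the peak sits on a pixel for odd sizes and between two pixels for even sizes
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    rows, cols = np.ogrid[:size, :size]
    return (1 - abs(rows - center) / factor) * (1 - abs(cols - center) / factor)

kernel = bilinear_kernel(4)  # a 4 x 4 kernel, as used for 2x upsampling
```

In the surgery step, a kernel like this would be written into the weights of every layer whose name contains 'up', one copy per channel, so that the deconvolution starts out as plain bilinear upsampling.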