Reposted from: http://blog.csdn.net/wei1033701020/article/details/53815874
Most Caffe layers are written in C++, and thanks to C++'s efficiency networks can be trained quickly. Sometimes, however, we need to write our own input layer to handle a particular kind of data: for example, if you need to sample patches from images and do not want to convert everything into LMDB, you can consider writing the layer directly in Python. Input layers also do not need GPU acceleration, so they are fairly easy to write.
How to use a Python layer
Let's first look at an example from the web (from http://chrischoy.github.io/research/caffe-python-layer/).
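(The prototxt snippet the post shows at this point did not survive the repost; the layer definition below is a minimal stand-in sketch, with placeholder blob, module, and class names.)

layer {
  name: 'mylayer'
  type: 'Python'
  bottom: 'data'
  top: 'out'
  python_param {
    # placeholder module name -- normally the .py file name, which must be on $PYTHONPATH
    module: 'my_module'
    # placeholder class name defined inside that module
    layer: 'MyLayer'
  }
}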
Here the only possible type is Python; top and bottom work the same way as in ordinary layers; module is the name of your Python module, which is usually just the file name; and layer is the name of the class you defined.
In general the four methods setup, reshape, forward, and backward are required; other methods can be added as needed. Their signatures are as follows:
def setup(self, bottom, top)
def reshape(self, bottom, top)
def forward(self, bottom, top)
def backward(self, top, propagate_down, bottom)
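A minimal sketch of a layer implementing all four methods (this pass-through class is invented for illustration and is not from the original post):

import caffe


class IdentityLayer(caffe.Layer):
    """Illustrative pass-through layer: copies its single bottom to its single top."""

    def setup(self, bottom, top):
        # called once at net construction; check blob counts and parse self.param_str here
        if len(bottom) != 1 or len(top) != 1:
            raise Exception("Need exactly one bottom and one top.")

    def reshape(self, bottom, top):
        # called before every forward pass; size the top like the bottom
        top[0].reshape(*bottom[0].data.shape)

    def forward(self, bottom, top):
        # copy the input blob to the output blob unchanged
        top[0].data[...] = bottom[0].data

    def backward(self, top, propagate_down, bottom):
        # route the top gradient straight back to the bottom
        if propagate_down[0]:
            bottom[0].diff[...] = top[0].diff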
Here we take the code released with the paper Fully Convolutional Networks for Semantic Segmentation as an example to explain how to write a Python layer.
import caffe

import numpy as np
from PIL import Image

import random


class VOCSegDataLayer(caffe.Layer):
    """
    Load (input image, label image) pairs from PASCAL VOC
    one-at-a-time while reshaping the net to preserve dimensions.

    Use this to feed data to a fully convolutional network.
    """

    def setup(self, bottom, top):
        """
        Setup data layer according to parameters:

        - voc_dir: path to PASCAL VOC year dir
        - split: train / val / test
        - mean: tuple of mean values to subtract
        - randomize: load in random order (default: True)
        - seed: seed for randomization (default: None / current time)

        for PASCAL VOC semantic segmentation.

        example

        params = dict(voc_dir="/path/to/PASCAL/VOC2011",
            mean=(104.00698793, 116.66876762, 122.67891434),
            split="val")
        """
        # config
        params = eval(self.param_str)
        self.voc_dir = params['voc_dir']
        self.split = params['split']
        self.mean = np.array(params['mean'])
        self.random = params.get('randomize', True)
        self.seed = params.get('seed', None)

        # two tops: data and label
        if len(top) != 2:
            raise Exception("Need to define two tops: data and label.")
        # data layers have no bottoms
        if len(bottom) != 0:
            raise Exception("Do not define a bottom.")

        # load indices for images and labels
        split_f = '{}/ImageSets/Segmentation/{}.txt'.format(self.voc_dir,
                self.split)
        self.indices = open(split_f, 'r').read().splitlines()
        self.idx = 0

        # make eval deterministic
        if 'train' not in self.split:
            self.random = False

        # randomization: seed and pick
        if self.random:
            random.seed(self.seed)
            self.idx = random.randint(0, len(self.indices)-1)

    def reshape(self, bottom, top):
        # load image + label image pair
        self.data = self.load_image(self.indices[self.idx])
        self.label = self.load_label(self.indices[self.idx])
        # reshape tops to fit (leading 1 is for batch dimension)
        top[0].reshape(1, *self.data.shape)
        top[1].reshape(1, *self.label.shape)

    def forward(self, bottom, top):
        # assign output
        top[0].data[...] = self.data
        top[1].data[...] = self.label

        # pick next input
        if self.random:
            self.idx = random.randint(0, len(self.indices)-1)
        else:
            self.idx += 1
            if self.idx == len(self.indices):
                self.idx = 0

    def backward(self, top, propagate_down, bottom):
        pass

    def load_image(self, idx):
        """
        Load input image and preprocess for Caffe:
        - cast to float
        - switch channels RGB -> BGR
        - subtract mean
        - transpose to channel x height x width order
        """
        im = Image.open('{}/JPEGImages/{}.jpg'.format(self.voc_dir, idx))
        in_ = np.array(im, dtype=np.float32)
        in_ = in_[:,:,::-1]
        in_ -= self.mean
        in_ = in_.transpose((2,0,1))
        return in_

    def load_label(self, idx):
        """
        Load label image as 1 x height x width integer array of label indices.
        The leading singleton dimension is required by the loss.
        """
        im = Image.open('{}/SegmentationClass/{}.png'.format(self.voc_dir, idx))
        label = np.array(im, dtype=np.uint8)
        label = label[np.newaxis, ...]
        return label
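For completeness, here is a sketch of how such a data layer is referenced from a net prototxt: its parameters are passed as a string in param_str and parsed by the eval(self.param_str) call in setup(). The module name voc_layers and the path below are assumptions (they follow the FCN reference code's file naming; adjust them to wherever you save the class).

layer {
  name: "data"
  type: "Python"
  top: "data"
  top: "label"
  python_param {
    # assumed file name: the class above saved as voc_layers.py on $PYTHONPATH
    module: "voc_layers"
    layer: "VOCSegDataLayer"
    # parsed by eval(self.param_str) in setup(); keys follow the docstring example
    param_str: "dict(voc_dir='/path/to/PASCAL/VOC2011', mean=(104.00698793, 116.66876762, 122.67891434), split='val')"
  }
}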
Preparation:
Compile with the WITH_PYTHON_LAYER option
First, you have to build Caffe with the WITH_PYTHON_LAYER option [1]. Run make clean to delete all the compiled binaries. Then:
WITH_PYTHON_LAYER=1 make && make pycaffe
If you skip this, Caffe will complain that the layer factory function can't find the Python layer:
layer_factory.hpp:77] Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: Python
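The flag can also be set persistently instead of on the command line; if I remember the stock Makefile.config.example correctly, it ships with this line commented out, so uncommenting it and rebuilding has the same effect:

# In Makefile.config (uncomment this line), then rebuild with make && make pycaffe
WITH_PYTHON_LAYER := 1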
Python Layer
A gist from Evan Shelhamer summarizes the basics of the Python layer.
...
layer {
  type: 'Python'
  name: 'loss'
  top: 'loss'
  bottom: 'ipx'
  bottom: 'ipy'
  python_param {
    # the module name -- usually the filename -- that needs to be in $PYTHONPATH
    module: 'pyloss'
    # the layer name -- the class name in the module
    layer: 'EuclideanLossLayer'
  }
  # set loss weight so Caffe knows this is a loss layer
  loss_weight: 1
}
You have to define the Python layer in a module that lives on your $PYTHONPATH. In the prototxt, the module is pyloss, which means that the file containing EuclideanLossLayer should be named pyloss.py.
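If the directory holding pyloss.py is not already importable, one way to expose it (the path below is a placeholder) is to add it to $PYTHONPATH before launching Caffe:

export PYTHONPATH=/path/to/dir/containing/pyloss:$PYTHONPATH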
import caffe
import numpy as np


class EuclideanLossLayer(caffe.Layer):

    def setup(self, bottom, top):
        # check input pair
        if len(bottom) != 2:
            raise Exception("Need two inputs to compute distance.")

    def reshape(self, bottom, top):
        # check input dimensions match
        if bottom[0].count != bottom[1].count:
            raise Exception("Inputs must have the same dimension.")
        # difference is shape of inputs
        self.diff = np.zeros_like(bottom[0].data, dtype=np.float32)
        # loss output is scalar
        top[0].reshape(1)

    def forward(self, bottom, top):
        self.diff[...] = bottom[0].data - bottom[1].data
        top[0].data[...] = np.sum(self.diff**2) / bottom[0].num / 2.

    def backward(self, top, propagate_down, bottom):
        for i in range(2):
            if not propagate_down[i]:
                continue
            if i == 0:
                sign = 1
            else:
                sign = -1
            bottom[i].diff[...] = sign * self.diff / bottom[i].num
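As a quick sanity check of the math above, the forward pass computes sum((x - y)^2) / (2 * num) and the backward pass writes ±(x - y) / num into the diffs; the standalone sketch below (not part of the original post) verifies these formulas against a finite-difference estimate in plain NumPy, without touching Caffe.

import numpy as np

# stand-ins for bottom[0].data and bottom[1].data; num plays the role of blob.num
np.random.seed(0)
x = np.random.randn(4, 3)
y = np.random.randn(4, 3)
num = x.shape[0]

def loss(x, y):
    # same formula as forward(): sum of squared differences over 2 * num
    return np.sum((x - y) ** 2) / num / 2.

# what backward() writes into bottom[0].diff
analytic = (x - y) / num

# central finite differences on each element of x
numeric = np.zeros_like(x)
eps = 1e-6
for idx in np.ndindex(x.shape):
    x_plus, x_minus = x.copy(), x.copy()
    x_plus[idx] += eps
    x_minus[idx] -= eps
    numeric[idx] = (loss(x_plus, y) - loss(x_minus, y)) / (2 * eps)

# should print a value around 1e-9 or smaller
print(np.max(np.abs(analytic - numeric)))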
[1] https://github.com/BVLC/caffe/issues/2093