Using Python layers in Caffe

Most of Caffe's layers are written in C++, and thanks to C++'s efficiency networks can be trained quickly. But sometimes we need to write our own input layer to handle a particular kind of data — for example, when you want to crop patches from images on the fly rather than converting everything to LMDB first. In that case you can write the layer directly in Python. An input layer does not need GPU acceleration anyway, so it is also fairly easy to write.

How to use a Python layer

Let's first look at an example from the web (taken from http://chrischoy.github.io/research/caffe-python-layer/):

layer {
  type: 'Python'
  name: 'loss'
  top: 'loss'
  bottom: 'ipx'
  bottom: 'ipy'
  python_param {
    # the module name -- usually the filename -- that needs to be in $PYTHONPATH
    module: 'pyloss'
    # the layer name -- the class name in the module
    layer: 'EuclideanLossLayer'
  }
  # set loss weight so Caffe knows this is a loss layer
  loss_weight: 1
}

Here the type is always Python. top and bottom work just like in ordinary layers. module is the name of your Python module, which is usually the filename, and layer is the name of the class defined in that module.

In general the four methods setup, reshape, forward and backward are required; any other methods can be added as needed. Their signatures are as follows:

def setup(self, bottom, top)
def reshape(self, bottom, top)
def forward(self, bottom, top)
def backward(self, top, propagate_down, bottom)
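
To make these four methods concrete, here is a sketch of the EuclideanLossLayer that the prototxt above points at. It follows the pyloss example that ships with Caffe (examples/pycaffe/layers/pyloss.py) and computes the same loss as the built-in C++ EuclideanLossLayer; saved as pyloss.py somewhere on $PYTHONPATH, it matches the module: 'pyloss' / layer: 'EuclideanLossLayer' settings in the prototxt.

import caffe

import numpy as np

class EuclideanLossLayer(caffe.Layer):
    """Euclidean loss implemented in Python, mirroring the C++ layer."""

    def setup(self, bottom, top):
        # the layer expects exactly two bottoms (ipx and ipy in the prototxt above)
        if len(bottom) != 2:
            raise Exception("Need two inputs to compute distance.")

    def reshape(self, bottom, top):
        # the two inputs must hold the same number of elements
        if bottom[0].count != bottom[1].count:
            raise Exception("Inputs must have the same dimension.")
        # buffer for the elementwise difference, reused in backward
        self.diff = np.zeros_like(bottom[0].data, dtype=np.float32)
        # the loss output is a scalar
        top[0].reshape(1)

    def forward(self, bottom, top):
        self.diff[...] = bottom[0].data - bottom[1].data
        top[0].data[...] = np.sum(self.diff ** 2) / bottom[0].num / 2.

    def backward(self, top, propagate_down, bottom):
        # gradient is +diff for the first bottom and -diff for the second
        for i in range(2):
            if not propagate_down[i]:
                continue
            sign = 1 if i == 0 else -1
            bottom[i].diff[...] = sign * self.diff / bottom[i].num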


Below, the data layer released with the code for Fully Convolutional Networks for Semantic Segmentation is used as a fuller example of how to write a Python layer.

import caffe

import numpy as np
from PIL import Image

import random

class VOCSegDataLayer(caffe.Layer):
    """ Load (input image, label image) pairs from PASCAL VOC one-at-a-time while reshaping the net to preserve dimensions. Use this to feed data to a fully convolutional network. """

    def setup(self, bottom, top):
        """ Setup data layer according to parameters: - voc_dir: path to PASCAL VOC year dir - split: train / val / test - mean: tuple of mean values to subtract - randomize: load in random order (default: True) - seed: seed for randomization (default: None / current time) for PASCAL VOC semantic segmentation. example params = dict(voc_dir="/path/to/PASCAL/VOC2011", mean=(104.00698793, 116.66876762, 122.67891434), split="val") """
        # config
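        # self.param_str is the string set as param_str in the python_param block of the prototxt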
        params = eval(self.param_str)
        self.voc_dir = params['voc_dir']
        self.split = params['split']
        self.mean = np.array(params['mean'])
        self.random = params.get('randomize', True)
        self.seed = params.get('seed', None)

        # two tops: data and label
        if len(top) != 2:
            raise Exception("Need to define two tops: data and label.")
        # data layers have no bottoms
        if len(bottom) != 0:
            raise Exception("Do not define a bottom.")

        # load indices for images and labels
        split_f  = '{}/ImageSets/Segmentation/{}.txt'.format(self.voc_dir,
                self.split)
        self.indices = open(split_f, 'r').read().splitlines()
        self.idx = 0

        # make eval deterministic
        if 'train' not in self.split:
            self.random = False

        # randomization: seed and pick
        if self.random:
            random.seed(self.seed)
            self.idx = random.randint(0, len(self.indices)-1)


    def reshape(self, bottom, top):
        # load image + label image pair
        self.data = self.load_image(self.indices[self.idx])
        self.label = self.load_label(self.indices[self.idx])
        # reshape tops to fit (leading 1 is for batch dimension)
        top[0].reshape(1, *self.data.shape)
        top[1].reshape(1, *self.label.shape)


    def forward(self, bottom, top):
        # assign output
        top[0].data[...] = self.data
        top[1].data[...] = self.label

        # pick next input
        if self.random:
            self.idx = random.randint(0, len(self.indices)-1)
        else:
            self.idx += 1
            if self.idx == len(self.indices):
                self.idx = 0


    def backward(self, top, propagate_down, bottom):
        pass


    def load_image(self, idx):
        """ Load input image and preprocess for Caffe: - cast to float - switch channels RGB -> BGR - subtract mean - transpose to channel x height x width order """
        im = Image.open('{}/JPEGImages/{}.jpg'.format(self.voc_dir, idx))
        in_ = np.array(im, dtype=np.float32)
        in_ = in_[:,:,::-1]
        in_ -= self.mean
        in_ = in_.transpose((2,0,1))
        return in_


    def load_label(self, idx):
        """ Load label image as 1 x height x width integer array of label indices. The leading singleton dimension is required by the loss. """
        im = Image.open('{}/SegmentationClass/{}.png'.format(self.voc_dir, idx))
        label = np.array(im, dtype=np.uint8)
        label = label[np.newaxis, ...]
        return label
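
For reference, a data layer like this is hooked into the network with a prototxt entry along the following lines. The module name voc_layers, the dataset path and the values inside param_str are placeholders here; param_str is just the string that setup() receives as self.param_str and parses with eval:

layer {
  name: 'data'
  type: 'Python'
  top: 'data'
  top: 'label'
  python_param {
    # the file voc_layers.py (assumed name) must be on $PYTHONPATH
    module: 'voc_layers'
    layer: 'VOCSegDataLayer'
    # forwarded verbatim to the layer; setup() parses it with eval(self.param_str)
    param_str: "{'voc_dir': '/path/to/PASCAL/VOC2011', 'mean': (104.00698793, 116.66876762, 122.67891434), 'split': 'val'}"
  }
}

Since this is a data layer there is no loss_weight, and it has two tops and no bottoms, which matches the checks in setup().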

