Caffe2 - (14) Network-Building API: brew

AIHGF

Article link: https://blog.csdn.net/zziahgf/article/details/79167414

Caffe2 - Brew Models

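brew is Caffe2's new API for building models.

Previously, models were built with CNNModelHelper.

Caffe2 is good at more than CNNs, however, so it now provides a more generic ModelHelper object.

The new ModelHelper has much of the same functionality as CNNModelHelper, and brew wraps it to make building models easier.

The design idea behind brew is that the ModelHelper class only holds the network definition and parameter initialization, while the model-building logic lives in brew.

Compared with CNNModelHelper, which handled both model storage and model building, the ModelHelper + brew approach is more modular and easier to extend.

python/brew_test.py contains more detailed usage examples of brew.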

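1. Concepts - Ops vs Helper Functions

Caffe2 builds deep learning networks out of operators.

Operators are usually implemented in C++. Caffe2 also provides a Python API that wraps the C++ operators, which makes them flexible to use.

In Caffe2, operators are generally written in CamelCase, while the Python helper functions use a similar name in lowercase.

1.1 Ops

In Caffe2, operators are usually abbreviated as Op or Ops.

For example, the FC Op is the Fully-Connected operator: every neuron in the previous layer has a weighted connection to every neuron in the next layer.

Creating an FC Op: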

model.net.FC([blob_in, weights, bias], blob_out)

Creating a Copy Op:

model.net.Copy(blob_in, blob_out)

The Ops that Caffe2 provides are listed in the Operators Catalog.
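To make the three inputs of the FC Op concrete, here is a minimal pure-Python sketch of what a fully-connected layer computes, assuming the usual definition Y = X * W^T + b with a weight matrix of shape [dim_out, dim_in]. It is not Caffe2 code: the function name fc_forward and the sample values are made up for illustration.

```python
def fc_forward(x, weights, bias):
    """Compute Y = X * W^T + b for a batch of row vectors (pure Python)."""
    out = []
    for row in x:
        # One output value per weight row, i.e. per output neuron.
        out.append([
            sum(xi * wi for xi, wi in zip(row, w_row)) + b
            for w_row, b in zip(weights, bias)
        ])
    return out


# Batch of 2 samples with dim_in = 3.
x = [[1.0, 2.0, 3.0],
     [0.0, 1.0, 0.0]]
# Weight matrix of shape [dim_out, dim_in] = [2, 3].
weights = [[1.0, 0.0, 0.0],
           [0.5, 0.5, 0.5]]
# One bias value per output neuron.
bias = [0.0, 1.0]

y = fc_forward(x, weights, bias)
print(y)  # [[1.0, 4.0], [0.0, 1.5]]
```

Each row of weights belongs to one output neuron, which is why the weight blob has dim_out rows and dim_in columns.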

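1.2 Helper Functions

Building a network from individual operators alone is tedious, because parameter initialization and device selection have to be handled by hand.

For example, creating an FC layer takes several lines of code to initialize the weight and bias before they are passed to the Op: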

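from caffe2.python import model_helper

model = model_helper.ModelHelper(name="train")
# Initialize the weights; FC expects shape [dim_out, dim_in]
weights = model.param_init_net.XavierFill(
    [],
    blob_out + '_w',
    shape=[dim_out, dim_in],
    **kwargs  # e.g. options such as the GPU/device to use
)
# Initialize the bias
bias = model.param_init_net.ConstantFill(
    [],
    blob_out + '_b',
    shape=[dim_out],
    **kwargs
)
# Build the FC layer from the input blob and the two parameter blobs
model.net.FC([blob_in, weights, bias], blob_out, **kwargs)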

This approach is rather cumbersome.

Caffe2 helper functions offer a better way to handle it.

Helper functions are wrapper functions that create a complete layer for a model. They typically handle parameter initialization, operator definition, and engine selection.

By default, Caffe2 helper functions are named following Python PEP 8 conventions. For example, with python/helpers/fc.py, creating an FC layer through the helper function fc is simple:

fcLayer = fc(model, blob_in, blob_out, **kwargs) # returns a blob reference
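The idea of a helper function can be sketched without Caffe2 installed. The following is a toy illustration only: ToyNet, ToyModel and toy_fc are hypothetical stand-ins that merely record which operators would be added, and they do not reproduce the actual code in python/helpers/fc.py.

```python
class ToyNet:
    """Records operator calls instead of running them (stand-in, not Caffe2)."""

    def __init__(self):
        self.ops = []

    def __getattr__(self, op_type):
        # Accessing e.g. net.FC returns a function that logs the call.
        def add_op(inputs, output, **kwargs):
            self.ops.append((op_type, list(inputs), output))
            return output
        return add_op


class ToyModel:
    """Holds an init net and a main net, loosely mirroring the two nets used above."""

    def __init__(self):
        self.param_init_net = ToyNet()
        self.net = ToyNet()


def toy_fc(model, blob_in, blob_out, dim_in, dim_out):
    """Hypothetical helper: initialize weight and bias, then add the FC op."""
    weight = model.param_init_net.XavierFill(
        [], blob_out + "_w", shape=[dim_out, dim_in])
    bias = model.param_init_net.ConstantFill(
        [], blob_out + "_b", shape=[dim_out])
    return model.net.FC([blob_in, weight, bias], blob_out)


model = ToyModel()
out = toy_fc(model, "data", "fc1", dim_in=4, dim_out=2)
print(model.param_init_net.ops)
print(model.net.ops)
```

A single call to the helper adds two initialization operators and one FC operator, which is the bookkeeping the caller no longer has to write by hand.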

Some helper functions build more than one operator. For example, the LSTM function in python/rnn_cell.py creates an entire LSTM unit in the network.

Caffe2 ships a number of useful helper functions under caffe2/python/helpers/.

2. Brew

brew is a smart collection of helper functions. Importing the single brew module gives you access to all of Caffe2's helper functions.

For example, adding an FC layer:

from caffe2.python import brew
brew.fc(model, blob_in, blob_out, ...)

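For a single layer this looks much the same as calling the helper function directly. The advantage of brew shows once the model becomes more complex.

For example, here is the LeNet model from the MNIST Tutorial: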

from caffe2.python import brew

def AddLeNetModel(model, data):
    conv1 = brew.conv(model, data, 'conv1', 1, 20, 5)
    pool1 = brew.max_pool(model, conv1, 'pool1', kernel=2, stride=2)
    conv2 = brew.conv(model, pool1, 'conv2', 20, 50, 5)
    pool2 = brew.max_pool(model, conv2, 'pool2', kernel=2, stride=2)
    fc3 = brew.fc(model, pool2, 'fc3', 50 * 4 * 4, 500)
    fc3 = brew.relu(model, fc3, fc3)
    pred = brew.fc(model, fc3, 'pred', 500, 10)
    softmax = brew.softmax(model, pred, 'softmax')
    return softmax  # return the output blob so the caller can attach a loss

Each layer is created with brew, which in turn uses the corresponding operator to instantiate each Op.
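The input dimension 50 * 4 * 4 of fc3 follows from the layer shapes. The arithmetic below is a sketch under stated assumptions: a 28 x 28 MNIST input image, convolutions with stride 1 and no padding, and pooling with kernel 2 and stride 2. The helper names conv_out and pool_out are made up for this example.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of a pooling layer along one dimension."""
    return (size - kernel) // stride + 1


size = 28                              # assumed MNIST image side length
size = conv_out(size, kernel=5)        # conv1: 28 -> 24
size = pool_out(size, kernel=2, stride=2)  # pool1: 24 -> 12
size = conv_out(size, kernel=5)        # conv2: 12 -> 8
size = pool_out(size, kernel=2, stride=2)  # pool2: 8 -> 4

channels = 50                          # output channels of conv2
fc3_dim_in = channels * size * size
print(size, fc3_dim_in)                # 4 800
```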

2.1 arg_scope

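arg_scope is syntax sugar for setting default argument values of helper functions.

For example, suppose a ResNet-150 training script needs a different weight initialization. The direct approach is: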

# Change the weight initialization through weight_init
brew.conv(model, ..., weight_init=('XavierFill', {}),...)
...
# repeat 150 times
...
brew.conv(model, ..., weight_init=('XavierFill', {}),...)

Or use arg_scope:

with brew.arg_scope([brew.conv], weight_init=('XavierFill', {})):
    brew.conv(model, ...)  # no need to pass weight_init here
    brew.conv(model, ...)
    ...

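2.2 Custom Helper Functions

If you need an Op that brew does not cover and have to write your own helper function, you can register it with brew so that it benefits from brew's unified management and syntax sugar.

Define the new helper function;
Register it with brew using the .Register function;
Call it through brew.new_helper_function, that is, brew followed by the name of your function.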

def my_super_layer(model, blob_in, blob_out, **kwargs):
    """
    100x faster, awesome code that you'll share one day.
    """

brew.Register(my_super_layer)
brew.my_super_layer(model, blob_in, blob_out)
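The register-then-call pattern can be illustrated with a small registry object. This is a hedged sketch and not the actual brew source: ToyBrew and toy_brew are hypothetical names, and the my_super_layer body here only returns its output name so that the example can run without Caffe2.

```python
class ToyBrew:
    """Toy registry that exposes registered helpers as attributes."""

    def __init__(self):
        self._registry = {}

    def Register(self, helper):
        name = helper.__name__
        if name in self._registry:
            raise AttributeError("Helper {} already registered".format(name))
        self._registry[name] = helper

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        registry = self.__dict__.get("_registry", {})
        if name not in registry:
            raise AttributeError(
                "Helper function {} not registered".format(name))
        return registry[name]


toy_brew = ToyBrew()


def my_super_layer(model, blob_in, blob_out, **kwargs):
    """Placeholder body for illustration: just return the output blob name."""
    return blob_out


toy_brew.Register(my_super_layer)
result = toy_brew.my_super_layer(None, "data", "out")
print(result)  # out
```

Looking helpers up by name is what lets a registered function be called with the same syntax as the built-in ones.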