Caffe Study Notes 1
Blobs, Layers, Nets: the basic components of a Caffe model
Deep learning models process large amounts of data and are inherently compositional. Caffe defines a net layer-by-layer in its own schema; the model is defined bottom-to-top, from the input data layer to the loss layer.
- **Blobs:** the medium through which information travels in Caffe. A blob is a standard array and unified memory interface; data is propagated forward and backward, stored, and exchanged through blobs.
- **Layers:** the foundation of both the model and its computation.
- **Solving:** is configured separately to decouple modeling from optimization.
The following sections cover each of these in detail.
1、Blob storage and communication
A blob is the wrapper around the actual data being processed and passed along in Caffe. It also provides synchronization of data between the CPU and the GPU.
Caffe stores and communicates data using blobs. Blobs provide a unified memory interface holding data, e.g., batches of images, model parameters, and derivatives for optimization.
Blobs come in two kinds: data blobs and parameter blobs.
Data blob: the conventional blob dimensions for batches of image data are number N x channel K x height H x width W.
Parameter blob: dimensions vary according to the type and configuration of the layer. For a convolution layer with 96 filters of 11 x 11 spatial dimension and 3 inputs, the blob is 96 x 3 x 11 x 11. For an inner product / fully-connected layer with 1000 output channels and 1024 input channels, the parameter blob is 1000 x 1024.
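As a sanity check, the parameter blob shapes above can be turned into element counts with a few lines of Python (`blob_count` is just an illustrative helper, not part of Caffe):

```python
from functools import reduce
from operator import mul

def blob_count(shape):
    """Total number of elements in a blob with the given dimensions."""
    return reduce(mul, shape, 1)

# Convolution layer: 96 filters, 3 input channels, 11 x 11 kernels.
conv_param_shape = (96, 3, 11, 11)

# Fully-connected (inner product) layer: 1000 outputs, 1024 inputs.
fc_param_shape = (1000, 1024)

print(blob_count(conv_param_shape))  # 34848
print(blob_count(fc_param_shape))    # 1024000
```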
Custom data layers: for custom data it may be necessary to hack your own input preparation tool or data layer. However, once your data is in, your job is done; the modularity of layers accomplishes the rest of the work for you.
Blob details
As we are often interested in the values as well as the gradients of the blob, a Blob stores two chunks of memory: data and diff. The former is the normal data that we pass along; the latter is the gradient computed by the network.
Since the actual values may be stored on either the CPU or the GPU, there are two ways to access them: a const way, which does not change the values, and a mutable way, which returns a writable pointer:
const Dtype* cpu_data() const;
Dtype* mutable_cpu_data();
The same pattern applies on the GPU, and to the diff as well.
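The reason for the const/mutable split is synchronization: the blob tracks which copy of the data is current, and only a mutable access marks the other device's copy as stale. A toy sketch in plain Python (not Caffe's actual implementation, which lives in C++ `SyncedMemory`) of that bookkeeping:

```python
class ToyBlob:
    """Toy model of a blob that keeps a CPU and a GPU copy in sync."""

    def __init__(self, values):
        self._cpu = list(values)   # stand-in for host memory
        self._gpu = None           # stand-in for device memory
        self._head = "cpu"         # which copy holds the latest values

    def cpu_data(self):
        """Const access: read values on the CPU, syncing first if needed."""
        if self._head == "gpu":
            self._cpu = list(self._gpu)  # simulated device-to-host copy
            self._head = "synced"
        return tuple(self._cpu)          # read-only view

    def mutable_cpu_data(self):
        """Mutable access: the caller may write, so the GPU copy goes stale."""
        if self._head == "gpu":
            self._cpu = list(self._gpu)
        self._head = "cpu"               # CPU now owns the latest values
        return self._cpu

blob = ToyBlob([1.0, 2.0, 3.0])
blob.mutable_cpu_data()[0] = 9.0
print(blob.cpu_data())  # (9.0, 2.0, 3.0)
```

This is why the rule of thumb is: always use the const call if you do not intend to change the values, so the blob can skip unnecessary copies.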
2、Layer computation and connections
The layer is the essence of a model and the fundamental unit of computation. Layers convolve filters, pool, take inner products, apply nonlinearities (sigmoid, ReLU, ...), normalize, load data, and compute losses. See the layer catalogue for all the layer types.
A layer takes input from bottom blobs and, after computation, sends output to top blobs.
Each layer has three critical computations: setup, forward, and backward.
Setup: initialize the layer and its connections once, at model initialization.
Forward: given input from the bottom blobs, compute the output and send it to the top blobs.
Backward: given the gradient w.r.t. the top output, compute the gradient w.r.t. the input and send it to the bottom. A layer with parameters computes the gradient w.r.t. its parameters and stores it internally.
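The forward/backward contract above can be sketched with a toy layer in plain Python (this is not Caffe's actual C++ `Layer` class; the layer here simply scales its input by one learned parameter `w`):

```python
class ScaleLayer:
    """Toy layer: top = w * bottom, with a single learned parameter w."""

    def setup(self, w):
        # Setup: initialize the layer's parameter once.
        self.w = w
        self.w_diff = 0.0

    def forward(self, bottom):
        # Forward: compute the top blob from the bottom blob.
        self.bottom = bottom                     # cached for backward
        return [self.w * x for x in bottom]

    def backward(self, top_diff):
        # Backward: the gradient w.r.t. the input is sent to the bottom...
        bottom_diff = [self.w * d for d in top_diff]
        # ...and the gradient w.r.t. the parameter is stored internally.
        self.w_diff = sum(x * d for x, d in zip(self.bottom, top_diff))
        return bottom_diff

layer = ScaleLayer()
layer.setup(w=2.0)
print(layer.forward([1.0, 3.0]))   # [2.0, 6.0]
print(layer.backward([0.5, 0.5]))  # [1.0, 1.0]
print(layer.w_diff)                # 2.0
```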
In general, forward and backward each have both a GPU and a CPU implementation. If no GPU version is provided, the layer falls back to the CPU functions as a backup. This is handy for quick experiments, though it comes at the cost of extra data transfers between the GPU and the CPU.
3、Net definition and operation
The net connects the layers into a composed function: it composes each layer's output to perform a given task, and composes each layer's backward pass to compute the gradient. Caffe is an end-to-end machine learning tool.
The net is a computation graph composed of layers.
4、Model format
.prototxt: the models are defined in plaintext protocol buffer schema (prototxt).
.caffemodel: the learned models are serialized as binary protocol buffer (binaryproto) .caffemodel files.
The Caffe model format is defined in caffe.proto; the source is quite readable, and reading it yourself is recommended.
Caffe speaks Google Protocol Buffer for the following strengths: minimal-size binary strings when serialized, efficient serialization, a human-readable text format compatible with the binary version, and efficient interface implementations in multiple languages, most notably C++ and Python. This all contributes to the flexibility and extensibility of modeling in Caffe.
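To illustrate the plaintext schema, a single convolution layer (matching the 96 x 3 x 11 x 11 parameter blob example above) can be defined like this in a .prototxt file — a sketch only; the blob names "data" and "conv1" are arbitrary choices:

```protobuf
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"     # input blob
  top: "conv1"       # output blob
  convolution_param {
    num_output: 96   # 96 filters
    kernel_size: 11  # 11 x 11 spatial dimension
  }
}
```

The same definition, once trained, is paired with weights serialized in the binary .caffemodel file.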