1. Use Caffe's Python interface to create a Net and write it out as mnist_train.prototxt
from caffe import layers as L, params as P, to_proto

train_lmdb = '/media/mlxuan/LinuxH/project/SegNet/caffe-master/examples/mnist/mnist_train_lmdb'

def create_net(lmdb, batch_size, include_acc=False):
    # Layer 1: the data layer, which produces two tops: the image data and the label.
    # L.Data's arguments follow DataParameter in caffe.proto; P.Data.LMDB also comes
    # from DataParameter; transform_param = dict(...) follows TransformationParameter.
    # MNIST images are 28x28, so crop_size must not exceed 28.
    data, label = L.Data(source=lmdb, backend=P.Data.LMDB, batch_size=batch_size,
                         ntop=2, transform_param=dict(crop_size=24, mirror=True))
    # Layer 2: a convolution layer. The layers L can create are listed in LayerParameter.
    conv1 = L.Convolution(data, kernel_size=5, stride=1, num_output=16, pad=2,
                          weight_filler=dict(type='xavier'))
    # Activation layer (the constructor is L.ReLU; capitalization matters)
    relu1 = L.ReLU(conv1, in_place=True)
    # Pooling layer
    pool1 = L.Pooling(relu1, pool=P.Pooling.MAX, kernel_size=3, stride=2)
    conv2 = L.Convolution(pool1, kernel_size=3, stride=1, num_output=32, pad=1,
                          weight_filler=dict(type='xavier'))
    relu2 = L.ReLU(conv2, in_place=True)
    pool2 = L.Pooling(relu2, pool=P.Pooling.MAX, kernel_size=3, stride=2)
    # Fully connected layer
    fc3 = L.InnerProduct(pool2, num_output=1024, weight_filler=dict(type='xavier'))
    relu3 = L.ReLU(fc3, in_place=True)
    # Dropout layer (dropout_ratio defaults to 0.5)
    drop3 = L.Dropout(relu3, in_place=True)
    fc4 = L.InnerProduct(drop3, num_output=10, weight_filler=dict(type='xavier'))
    # Softmax loss layer
    loss = L.SoftmaxWithLoss(fc4, label)
    return to_proto(loss)

def write_net():
    # Write the network definition above into a prototxt file.
    with open('mnist_train.prototxt', 'w') as f:
        f.write(str(create_net(train_lmdb, batch_size=64)))

write_net()
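Running the script writes mnist_train.prototxt. Since to_proto() auto-generates layer and blob names, the data layer in the output looks roughly like this (names such as Data1/Data2 are auto-assigned and may differ):

layer {
  name: "Data1"
  type: "Data"
  top: "Data1"
  top: "Data2"
  transform_param {
    crop_size: 24
    mirror: true
  }
  data_param {
    source: "/media/mlxuan/LinuxH/project/SegNet/caffe-master/examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}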
Key points:
1. Q: L.Data, L.Convolution, L.SoftmaxWithLoss, ... — which layers can L create, and where can you look them up?
In message LayerParameter in caffe.proto:
message LayerParameter {
  ......
  optional AccuracyParameter accuracy_param = 102;
  optional ArgMaxParameter argmax_param = 103;
  optional BatchNormParameter batch_norm_param = 139;
  optional BiasParameter bias_param = 141;
  optional ClipParameter clip_param = 148;
  optional ConcatParameter concat_param = 104;
  ......
}
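Each optional *_param field corresponds to a constructor on L, and its message type lists that layer's options. A quick sketch (the bottoms a and b are assumed to be Tops defined elsewhere):

# concat_param = 104 (ConcatParameter)        ->  L.Concat
cat = L.Concat(a, b, axis=1)                  # axis is a ConcatParameter field
# batch_norm_param = 139 (BatchNormParameter) ->  L.BatchNorm
bn = L.BatchNorm(cat, use_global_stats=False)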
2. Q: What parameters does each layer (L.Data, etc.) take, and where can you look them up?
In the corresponding message in caffe.proto: DataParameter, ConvolutionParameter, and so on.
message DataParameter {
  enum DB {
    LEVELDB = 0;
    LMDB = 1;
  }
  optional uint32 batch_size = 4;
  optional DB backend = 8;
  ......
}
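The message's fields become keyword arguments of the layer function: batch_size and backend are passed directly, the enum value is reached as P.Data.LMDB, and nested messages such as TransformationParameter are passed as dicts. A minimal sketch (the LMDB path is a placeholder):

from caffe import layers as L, params as P

data, label = L.Data(source='/path/to/lmdb',             # DataParameter.source
                     backend=P.Data.LMDB,                # DataParameter.backend (enum DB)
                     batch_size=64,                      # DataParameter.batch_size
                     ntop=2,                             # net_spec option, not a proto field
                     transform_param=dict(mirror=True))  # nested TransformationParameter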
3. What does a layer function such as L.Pooling return?
Take L.Pooling as an example; see python/caffe/net_spec.py and src/caffe/layers/pooling_layer.cpp.
pool1, mask1 = L.Pooling(relu1, pool=P.Pooling.MAX, kernel_size=3, stride=2, ntop=2)
pool1 = L.Pooling(relu1, pool=P.Pooling.MAX, kernel_size=3, stride=2)
Both calls run successfully.
In python/caffe/net_spec.py, L.Pooling (like every L.<Name>) is generated by layer_fn:
def layer_fn(*args, **kwargs):
    fn = Function(name, args, kwargs)
    if fn.ntop == 0:
        return fn
    elif fn.ntop == 1:
        return fn.tops[0]
    else:
        return fn.tops
where Function() is defined as:
class Function(object):
    """A Function specifies a layer, its parameters, and its inputs (which
    are Tops from other layers)."""

    def __init__(self, type_name, inputs, params):
        self.type_name = type_name
        for index, input in enumerate(inputs):
            if not isinstance(input, Top):
                raise TypeError('%s input %d is not a Top (type is %s)' %
                                (type_name, index, type(input)))
        self.inputs = inputs
        self.params = params
        self.ntop = self.params.get('ntop', 1)
        # use del to make sure kwargs are not double-processed as layer params
        if 'ntop' in self.params:
            del self.params['ntop']
        self.in_place = self.params.get('in_place', False)
        if 'in_place' in self.params:
            del self.params['in_place']
        self.tops = tuple(Top(self, n) for n in range(self.ntop))
From these two snippets we can see that self.tops is the return value, and how many values are returned depends on the ntop (number of tops) we pass in.
What the returned values mean is determined by src/caffe/layers/pooling_layer.cpp:
the first return value corresponds to top[0] in pooling_layer.cpp, the result of pooling the input;
the second return value corresponds to top[1], the pooling mask, i.e. the positions of the selected maxima.
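As a quick check, to_proto() accepts any number of Tops, so printing the ntop=2 result shows a Pooling layer emitted with two top: entries, one per element of fn.tops. A sketch continuing from the network in section 1:

pool1, mask1 = L.Pooling(relu1, pool=P.Pooling.MAX, kernel_size=3, stride=2, ntop=2)
print(to_proto(pool1, mask1))  # the emitted Pooling layer has two `top:` lines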
4. How to give a layer two inputs
From the definition of layer_fn above (def layer_fn(*args, **kwargs): fn = Function(name, args, kwargs)), positional arguments go into args and keyword arguments go into kwargs. So MaxUps1 = L.Upsample(pool1, mask1, scale=2) means the layer takes two input blobs, pool1 and mask1.
In the layer's source, MaxUpsample.cpp, these two inputs appear as bottom[0] and bottom[1].
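Putting points 3 and 4 together, here is a minimal pool/unpool sketch (assuming a Caffe build that provides the Upsample layer, e.g. SegNet's fork, and a data Top defined as in section 1):

conv1 = L.Convolution(data, kernel_size=3, pad=1, num_output=16,
                      weight_filler=dict(type='xavier'))
# ntop=2: also keep the max-pooling mask so it can drive the unpooling later.
pool1, mask1 = L.Pooling(conv1, pool=P.Pooling.MAX, kernel_size=2, stride=2, ntop=2)
# Two positional arguments -> two bottoms: bottom[0] = pool1, bottom[1] = mask1.
up1 = L.Upsample(pool1, mask1, scale=2)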
5. How to set the learning rates of a convolution layer's weights and bias separately
Reference: https://www.cnblogs.com/houjun/p/9910413.html
Set param: the first entry gives the learning-rate and weight-decay multipliers for the weights, and the second entry gives them for the bias.
net.conv1 = caffe.layers.Convolution(
    net.data,
    # lr_mult scales base_lr from solver.prototxt for this parameter blob.
    # With two param entries, the first applies to the weights and the second to
    # the bias; the bias learning rate is conventionally twice that of the weights.
    param=[{"lr_mult": 1, "decay_mult": 1}, {"lr_mult": 2, "decay_mult": 1}],
    name="Conv1",
    kernel_size=3,
    stride=1,
    pad=1,
    num_output=20,
    group=2,
    weight_filler=dict(type='xavier'),
    bias_filler=dict(type='constant', value=0))
The corresponding .prototxt reads:
layer {
  name: "Convolution22"
  type: "Convolution"
  bottom: "Scale21"
  top: "Convolution22"
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 2.0
    decay_mult: 1.0
  }
  convolution_param {
    ......
  }
}
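A common use of these per-blob multipliers is freezing a layer during fine-tuning: an lr_mult of 0 keeps that parameter blob fixed. A minimal sketch (data is assumed to be an existing Top):

# lr_mult=0 on both entries freezes the weights and the bias of this layer.
frozen = L.Convolution(data,
                       param=[{"lr_mult": 0}, {"lr_mult": 0}],
                       kernel_size=3, pad=1, num_output=20,
                       weight_filler=dict(type='xavier'))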
6. Visualizing a prototxt network model:
https://ethereon.github.io/netscope/#/editor
To be continued.