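Whether the goal is to boost performance through ensembling or to design a network with a special purpose, fusing several already-trained models is a common step in Caffe engineering work.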
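1. Build the network definition of the fused model
Add a prefix to every layer of each source model and set each layer's learning rate to 0, so that only the fully connected fusion layer is trained.
Example code: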
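import re
import sys

# Matches name: "..." fields in the prototxt.
layer_name_regex = re.compile(r'name:\s*"(.*?)"')
# Matches explicit settings such as "lr_mult: 1" or "lr_mult: 0.5".
lr_mult_regex = re.compile(r'lr_mult:\s*\d+\.?\d*')

input_filepath = sys.argv[1]
output_filepath = sys.argv[2]
prefix = sys.argv[3]

with open(input_filepath, 'r') as fr, open(output_filepath, 'w') as fw:
    prototxt = fr.read()
    layer_names = set(layer_name_regex.findall(prototxt))
    for layer_name in layer_names:
        # Replace only the quoted name, so substrings of other identifiers
        # (e.g. "data" inside "data_param") are left alone. bottom/top
        # entries that reuse the layer name are prefixed as well.
        prototxt = prototxt.replace('"{}"'.format(layer_name),
                                    '"{}/{}"'.format(prefix, layer_name))
    # Freeze every explicitly declared learning rate. Layers with no
    # lr_mult entry keep Caffe's default and are not frozen by this script.
    prototxt = lr_mult_regex.sub('lr_mult: 0', prototxt)
    fw.write(prototxt)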
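To make the effect of this rewrite concrete, the sketch below applies the same two steps to a small inline layer definition instead of a file. The function name `add_prefix_and_freeze` and the sample layer are hypothetical, chosen only for illustration. The sketch replaces quoted names only, which assumes names always appear in double quotes in the prototxt.

```python
import re

LAYER_NAME_REGEX = re.compile(r'name:\s*"(.*?)"')
LR_MULT_REGEX = re.compile(r'lr_mult:\s*\d+\.?\d*')


def add_prefix_and_freeze(prototxt, prefix):
    """Prefix every quoted layer name and set explicit lr_mult values to 0."""
    for layer_name in set(LAYER_NAME_REGEX.findall(prototxt)):
        prototxt = prototxt.replace(
            '"{}"'.format(layer_name), '"{}/{}"'.format(prefix, layer_name))
    return LR_MULT_REGEX.sub('lr_mult: 0', prototxt)


# Hypothetical sample layer, not taken from a real model.
SAMPLE = '''layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
}'''

result = add_prefix_and_freeze(SAMPLE, 'even')
print(result)
```

In this sample the `name` and `top` fields become `"even/conv1"` and both `lr_mult` values become 0, while `bottom: "data"` stays unchanged because no layer in the sample is named `data`.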
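2. Load the weights of each model and generate the weights of the fused model
The idea is to load each model with pycaffe, copy the values over according to the layer-name correspondence, and then save the result. The code is as follows: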
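import sys
sys.path.append('/path/to/caffe/python')
import caffe

# The fused network is built from its definition only; its weights are filled in below.
fusion_net = caffe.Net('lenet_fusion_train_val.prototxt', caffe.TEST)

# (prefix, network definition, trained weights) for each source model.
model_list = [
    ('even', 'lenet_even_train_val.prototxt', 'mnist_lenet_even_iter_30000.caffemodel'),
    ('odd', 'lenet_odd_train_val.prototxt', 'mnist_lenet_odd_iter_30000.caffemodel')
]

for prefix, model_def, model_weight in model_list:
    net = caffe.Net(model_def, model_weight, caffe.TEST)
    # items() is available on both Python 2 and Python 3.
    for layer_name, param in net.params.items():
        fused_name = '{}/{}'.format(prefix, layer_name)
        try:
            # Copy every parameter blob of this layer into the matching
            # prefixed layer of the fused network.
            for i in range(len(param)):
                fusion_net.params[fused_name][i].data[...] = param[i].data
        except Exception as e:
            # Raised when the fused net lacks this layer or the shapes differ.
            print('{}: {}'.format(fused_name, e))

fusion_net.save('init_fusion.caffemodel')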
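Before copying, it can help to check which layers of a source model actually have a prefixed counterpart in the fused network, since a mismatch otherwise only shows up as a printed exception. The sketch below does this on plain lists of names. The helper `match_layers` and the layer names in the example are hypothetical; in practice the names would presumably come from `net.params.keys()` and `fusion_net.params.keys()`.

```python
def match_layers(prefix, source_names, fusion_names):
    """Split source layers into those with and without a prefixed counterpart.

    Returns (matched, missing): matched is a list of
    (source_name, fused_name) pairs, missing is a list of source names.
    """
    fusion_names = set(fusion_names)
    matched, missing = [], []
    for name in source_names:
        fused_name = '{}/{}'.format(prefix, name)
        if fused_name in fusion_names:
            matched.append((name, fused_name))
        else:
            missing.append(name)
    return matched, missing


# Hypothetical layer names, for illustration only.
source = ['conv1', 'conv2', 'ip1']
fused = ['even/conv1', 'even/conv2', 'odd/conv1', 'odd/conv2', 'fusion/ip']
matched, missing = match_layers('even', source, fused)
print(matched)
print(missing)
```

Here `ip1` ends up in `missing`, which would be expected if the original classifier layer was left out of the fused definition in favor of a new fusion layer; any other unexpected entry in `missing` suggests a naming mismatch worth fixing before training.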
After that, you can start training.