I. Freezing layers (i.e., fixing a layer's parameters so they do not change during training)
https://keras.io/getting-started/faq/#how-can-i-freeze-keras-layers
1.1 How to freeze:
x = Dense(100, activation='relu', name='dense_100', trainable=False)(inputs)
or
model.trainable = False
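Mechanically, "frozen" just means the optimizer skips those parameters when applying updates; the forward pass, and even the gradients, are unaffected. A minimal NumPy sketch of one SGD step (an illustration of the idea only, not Keras internals):

```python
import numpy as np

rng = np.random.default_rng(0)
W_frozen = rng.normal(size=(4, 3))   # pretend this layer was frozen
W_train = rng.normal(size=(3, 2))    # this layer stays trainable
W_frozen_before = W_frozen.copy()

lr = 0.1
grad_frozen = rng.normal(size=W_frozen.shape)  # a gradient still exists...
grad_train = rng.normal(size=W_train.shape)

# ...but the optimizer only applies the update to trainable parameters
W_train = W_train - lr * grad_train
# W_frozen is deliberately left untouched

assert np.array_equal(W_frozen, W_frozen_before)  # frozen weights: unchanged
```

This is exactly the behavior the weight-probe experiment below confirms for Keras.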
1.2 Practical notes on freezing:
1. Experiment: what freezing does to the weights during training.
1) Without freezing:
# ■■■■■■■■ [2] Model definition ■■■■■■■■
####### main model #######
inputs = Input(shape=(784,))
x = Dense(100, activation='relu', name='dense_100')(inputs)
outputs = Dense(10, activation='softmax')(x)
model_main = Model(inputs=inputs, outputs=outputs)
####### main model #######
model_main.load_weights('my_model_weights.h5')
# model_main.trainable = False
model = model_main
# ****** weight probe ********
a = model.get_weights()
print("weights of the 'dense_100' layer:", a[1])  # a[1] is the bias vector of 'dense_100' (a[0] is its kernel)
# ****** weight probe ********
[Result]
>>> weights of the 'dense_100' layer:
[ 0.00609367 0.01774433 0.00127991 0.01685369 -0.00588948 0.0022781
0.00694803 0.00636634 -0.00108383 -0.00480387 0.01123319 0.01685128
0.0071973 0.00373418 0.0015275 -0.0011526 -0.00451979 -0.00653248
0.01192301 -0.00078739 -0.00056679 -0.00057205 0.0220937 -0.00158271
-0.00026968 -0.00664996 -0.00085808 -0.00305471 0.00620055 0.0064344
-0.00938795 0.00266371 0.00623808 0.0083605 -0.00238177 -0.00048903
0.00059158 0.00824707 0.00500612 0.00873516 -0.0032067 0.00337419
0.01087511 0.004928 0.01195703 0.01690748 0.01420193 -0.0064415
0.00545023 0.01340502 -0.00258121 0.01323839 0.00632899 0.01284719
0.00555667 0.01261076 -0.00088008 0.01200596 0.00733639 0.01783392
-0.00440101 0.00118115 0.01178464 0.0074486 0.00896501 0.00357948
0.00705922 0.00520497 0.01415215 -0.00202574 0.00927804 0.0138014
0.0098721 0.0129296 0.00189565 0.01651774 0.00946718 -0.00534614
0.00506906 -0.00030766 -0.00026362 0.00419401 0.00212149 -0.00304823
-0.00427098 0.0041138 0.01505729 0.00112592 -0.00334759 0.00820872
-0.01345768 -0.00101386 -0.00698254 0.02179425 0.00819413 0.00404393
-0.00315165 0.01334981 0.01426365 0.00202925]
2) After freezing, with the probe placed after model.fit:
# ■■■■■■■■ [2] Model definition ■■■■■■■■
####### main model #######
inputs = Input(shape=(784,))
x = Dense(100, activation='relu', name='dense_100')(inputs)
outputs = Dense(10, activation='softmax')(x)
model_main = Model(inputs=inputs, outputs=outputs)
####### main model #######
model_main.load_weights('my_model_weights.h5')
model_main.trainable = False  # freeze BEFORE compiling, or it has no effect
model = model_main
# ■■■■■■■■ [3] Compile ■■■■■■■■
sgd = SGD(lr=0.2)  # optimizer
# compile with a loss function; track accuracy during training
model.compile(optimizer=sgd,
              loss='mse',
              metrics=['accuracy'])
# ■■■■■■■■ [4] Train ■■■■■■■■
model.fit(x_train, y_train, batch_size=256, epochs=1)
# ****** weight probe ********
a = model.get_weights()
print("after frozen training, weights of the 'dense_100' layer:", a[1])  # a[1] is the bias vector of 'dense_100'
# ****** weight probe ********
[Result]
>>> after frozen training, weights of the 'dense_100' layer: (no change at all!)
[... exactly the same values as printed before training — the frozen weights did not change ...]
2. Caveats when freezing a model restored from disk:
1) If you intend to freeze a restored model, save it as [architecture (model.to_json()) + weights (model.save_weights())] rather than as a single file.
# Parameters of the model when everything is normal:
=================================================================
Total params: 2,100,362
Trainable params: 2,100,362
Non-trainable params: 0
_________________________________________________________________
Reason:
> A model restored with model.save() / load_model() reports wrong weight counts when you freeze it.
from keras.models import load_model
model1 = load_model('CIFAR10_model_epoch_1.h5')
model1.trainable = False
model1.summary()
# ———— list the weights that would still be trained ————
print('Trainable weights:')
for x in model1.trainable_weights:
    print(x.name)
print('\n')
# —————————————————————————————————
[Result]
>>>
=================================================================
Total params: 4,200,724 (why did the total double?)
Trainable params: 2,100,362 (and why does it still report trainable params????)
Non-trainable params: 2,100,362
_________________________________________________________________
Trainable weights: (yet none are actually listed — very odd!)
(none)
> With [architecture (model.to_json()) + weights (model.save_weights())], freezing works correctly:
from keras.models import model_from_json
model1 = model_from_json(open('my_model_architecture.json').read())
model1.trainable = False
model1.load_weights('model_weight_epoch_1.h5')
model1.summary()
[Result]
>>>
=================================================================
Total params: 2,100,362
Trainable params: 0 (this time the counts are correct!)
Non-trainable params: 2,100,362
_________________________________________________________________
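As a sanity check of the recommended [to_json() + save_weights()] route, a minimal round-trip sketch. This assumes TensorFlow 2.x with its bundled tf.keras; the paths are temporary stand-ins, not the files used above:

```python
import os
import tempfile
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# build and initialize a small model
inputs = keras.Input(shape=(784,))
x = layers.Dense(100, activation='relu', name='dense_100')(inputs)
outputs = layers.Dense(10, activation='softmax')(x)
model = keras.Model(inputs, outputs)

tmpdir = tempfile.mkdtemp()
arch_path = os.path.join(tmpdir, 'arch.json')
w_path = os.path.join(tmpdir, 'weights.ckpt')

# save architecture and weights separately
with open(arch_path, 'w') as f:
    f.write(model.to_json())
model.save_weights(w_path)

# rebuild: structure first, then freeze, then load weights
model2 = keras.models.model_from_json(open(arch_path).read())
model2.trainable = False
model2.load_weights(w_path)

# frozen copy: no trainable weights, same total parameter count, same values
assert len(model2.trainable_weights) == 0
assert model2.count_params() == model.count_params()
assert np.allclose(model.get_layer('dense_100').get_weights()[0],
                   model2.get_layer('dense_100').get_weights()[0])
```

In current tf.keras versions the load_model() route also behaves correctly, so the doubled parameter count above is likely a quirk of the older Keras release used for these experiments.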
3. If a layer is defined with trainable=False, e.g.:
y = Dense(units=128, activation='relu', kernel_initializer='he_normal', trainable=False)(y)
then model.trainable = True will NOT change that layer's frozen state;
x = Input(shape=(32, 32, 3))
y = x
y = Convolution2D(filters=64, kernel_size=3, strides=1, padding='same', activation='relu', kernel_initializer='he_normal')(y)
y = MaxPooling2D(pool_size=2, strides=2, padding='valid')(y)
y = Flatten()(y)
y = Dense(units=128, activation='relu', kernel_initializer='he_normal', trainable=False)(y)
y = Dropout(0.5)(y)
y = Dense(units=nb_classes, activation='softmax', kernel_initializer='he_normal')(y)  # nb_classes = 10 here
model1 = Model(inputs=x, outputs=y, name='model1')
model1.trainable = True  # try to make every layer trainable
model1.summary()
# ———— list the weights that are excluded from training ————
print('Non-trainable weights:')
for x in model1.non_trainable_weights:
    print(x.name)
print('\n')
# —————————————————————————————————
[Result]
>>>
=================================================================
Total params: 2,100,362
Trainable params: 3,082
Non-trainable params: 2,097,280 (the Dense-128 layer is still frozen)
_________________________________________________________________
Non-trainable weights: (these are the parameters that stay frozen)
dense_1/kernel:0
dense_1/bias:0
But it CAN be changed per layer, via model1.layers[4].trainable = True:
x = Input(shape=(32, 32, 3))
y = x
y = Convolution2D(filters=64, kernel_size=3, strides=1, padding='same', activation='relu', kernel_initializer='he_normal')(y)
y = MaxPooling2D(pool_size=2, strides=2, padding='valid')(y)
y = Flatten()(y)
y = Dense(units=128, activation='relu', kernel_initializer='he_normal', trainable=False)(y)
y = Dropout(0.5)(y)
y = Dense(units=nb_classes, activation='softmax', kernel_initializer='he_normal')(y)
model1 = Model(inputs=x, outputs=y, name='model1')
model1.layers[4].trainable = True  # un-freeze the Dense-128 layer (index 4: Input, Conv, Pool, Flatten, Dense)
model1.summary()
# ———— list the weights that are excluded from training ————
print('Non-trainable weights:')
for x in model1.non_trainable_weights:
    print(x.name)
print('\n')
# —————————————————————————————————
[Result]
>>>
=================================================================
Total params: 2,100,362
Trainable params: 2,100,362
Non-trainable params: 0 (no frozen parameters any more!)
_________________________________________________________________
Non-trainable weights:
(none)
4. How to list the trainable and non-trainable weights:
Trainable weights — model.trainable_weights:
print('Trainable weight names:')
for x in model.trainable_weights:
    print(x.name)
print('\n')
Non-trainable weights — model.non_trainable_weights:
print('Non-trainable weight names:')
for x in model.non_trainable_weights:
    print(x.name)
print('\n')
II. Extracting the output of an intermediate layer
https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer
# ■■■■■■■■ [2] Model definition ■■■■■■■■
# ———— main model
inputs = Input(shape=(784,))
x = Dense(100, activation='relu')(inputs)
outputs = Dense(10, activation='softmax')(x)
model_main = Model(inputs=inputs, outputs=outputs)
model_main.load_weights('my_model_weights.h5')
# model_main.trainable = False
# model = model_main
# ———— extract the output of the 'dense_1' layer (the auto-generated name of the first Dense above)
layer_name = 'dense_1'
intermediate_layer_model = Model(inputs=inputs,
                                 outputs=model_main.get_layer(layer_name).output)
x_train_Dense = intermediate_layer_model.predict(x_train)  # feed x_train and collect the 'dense_1' output
print('x_train_Dense', x_train_Dense)
print('x_train_Dense.shape', x_train_Dense.shape)
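An extracted intermediate output can be sanity-checked by recomputing it from the layer's weights: for a Dense layer with relu activation, the output is relu(x·W + b). A NumPy sketch, where W, b, and x are made-up stand-ins for get_weights() and an input batch:

```python
import numpy as np

def dense_relu(x, W, b):
    """Forward pass of Dense(activation='relu'): relu(x @ W + b)."""
    return np.maximum(x @ W + b, 0.0)

rng = np.random.default_rng(42)
W = rng.normal(size=(784, 100)) * 0.01  # stand-in for the layer kernel
b = np.zeros(100)                       # stand-in for the layer bias
x = rng.normal(size=(2, 784))           # a fake 2-sample batch

out = dense_relu(x, W, b)
assert out.shape == (2, 100)   # one 100-dim feature vector per sample
assert (out >= 0).all()        # relu output is non-negative
```

Comparing this against intermediate_layer_model.predict(x) for the real W and b should give (numerically close to) identical results.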
III. Fine-tuning
http://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
(1) Fine-tuning with the main model held fixed (not frozen) and excluded from training
import numpy as np
from keras.datasets import mnist
from keras.utils import np_utils
from keras.models import Model
from keras.layers import Input, Dense, Conv2D, Activation, MaxPooling2D, Flatten, Conv2DTranspose, ZeroPadding2D
from keras.regularizers import l2
from keras.optimizers import SGD
from keras import backend as K
# ■■■■■■■■ [1] Load data ■■■■■■■■
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print('x_shape:', x_train.shape)  # (60000, 28, 28)
print('y_shape:', y_train.shape)  # (60000,)
# flatten and scale: (60000, 28, 28) -> (60000, 784)
x_train = x_train.reshape(x_train.shape[0], -1) / 255.0
x_test = x_test.reshape(x_test.shape[0], -1) / 255.0
# convert labels to one-hot
y_train = np_utils.to_categorical(y_train, num_classes=10)
y_test = np_utils.to_categorical(y_test, num_classes=10)
# ■■■■■■■■ [2] Model definition ■■■■■■■■
# ———— main model
inputs = Input(shape=(784,))
x = Dense(100, activation='relu')(inputs)
outputs = Dense(10, activation='softmax')(x)
model_main = Model(inputs=inputs, outputs=outputs)
model_main.load_weights('my_model_weights.h5')
# model_main.trainable = False
# model = model_main
# ———— extract the output of the 'dense_1' layer
layer_name = 'dense_1'
intermediate_layer_model = Model(inputs=inputs,
                                 outputs=model_main.get_layer(layer_name).output)
x_train_Dense = intermediate_layer_model.predict(x_train)  # 'dense_1' output for x_train
np.save('bottleneck_features.npy', x_train_Dense)  # cache the extracted 'dense_1' features in a .npy file
train_data = np.load('bottleneck_features.npy')    # reload the cached feature vectors
print('x_train_Dense', train_data)
print('x_train_Dense.shape', train_data.shape)
# ———— fine-tune model
inputs1 = Input(shape=(100,))  # the extracted 'dense_1' output of model_main is 100-dimensional
x = Dense(100, activation='relu')(inputs1)
outputs1 = Dense(10, activation='softmax')(x)
model = Model(inputs1, outputs1)
model.summary()
# ■■■■■■■■ [3] Compile ■■■■■■■■
sgd = SGD(lr=0.2)  # optimizer
# compile with a loss function; track accuracy during training
model.compile(optimizer=sgd,
              loss='mse',
              metrics=['accuracy'])
# ■■■■■■■■ [4] Train ■■■■■■■■
# model.fit(x_train, y_train, batch_size=64, epochs=1)       # use this to train the main model
model.fit(x_train_Dense, y_train, batch_size=64, epochs=1)   # use this to train the fine-tune model
# ■■■■■■■■ [5] Evaluate ■■■■■■■■
# loss, accuracy = model.evaluate(x_test, y_test)
# print('\ntest loss', loss)
# print('accuracy', accuracy)
# # save / load weights
# model.save_weights('my_model_weights.h5')
# model.load_weights('my_model_weights.h5')
K.clear_session()
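The bottleneck-feature caching in step [2] is just an np.save / np.load round-trip. A self-contained sketch with fake features standing in for the extracted 'dense_1' output (the path is a temporary placeholder):

```python
import os
import tempfile
import numpy as np

# stand-in for the extracted 100-dim features of 60000 training samples
features = np.random.default_rng(1).normal(size=(60000, 100)).astype('float32')

path = os.path.join(tempfile.mkdtemp(), 'bottleneck_features.npy')
np.save(path, features)   # cache extracted features to disk
restored = np.load(path)  # reload them for fine-tune training

assert restored.shape == (60000, 100)
assert np.array_equal(restored, features)  # lossless round-trip
```

Caching like this means the (fixed) main model only has to run its forward pass over the training set once, no matter how many epochs the fine-tune head trains for.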
(2) Fine-tuning without freezing the main model — this also lets you verify that the weights loaded into the main model carry over into the assembled fine-tune model.
# ■■■■■■■■ [2] Model definition ■■■■■■■■
# —————— main model ——————
inputs = Input(shape=(784,))
x = Dense(100, activation='relu', name='dense_100')(inputs)
outputs = Dense(10, activation='softmax')(x)
model_main = Model(inputs=inputs, outputs=outputs)
model_main.load_weights('my_model_weights.h5')
# ———— (fine-tune assembly) take the 'dense_100' output and add a new head ——————
layer_name = 'dense_100'
intermediate_layer_model = Model(inputs=inputs,
                                 outputs=model_main.get_layer(layer_name).output)
outputs_inter = Dense(10, activation='softmax')(intermediate_layer_model.output)
model_inter = Model(inputs=inputs, outputs=outputs_inter)
model = model_inter
# ****** weight probe ********
a = model.get_weights()
print("after load_weights, with a new softmax head attached, weights of the original 'dense_100' layer:", a[1])  # a[1] is the bias vector of 'dense_100'
# ****** weight probe ********
[Result] (the loaded weights did indeed end up in the fine-tune model)
>>> after load_weights, with a new softmax head attached, weights of the original 'dense_100' layer:
[... exactly the same values as loaded from my_model_weights.h5 in section I ...]
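The reason the loaded weights show up in the fine-tune model is that stitching a new head onto model_main.get_layer(...).output reuses the same layer objects rather than copying them. A tf.keras sketch (assumes TensorFlow 2.x; layer names mirror the example above):

```python
from tensorflow import keras
from tensorflow.keras import layers

# main model
inputs = keras.Input(shape=(784,))
x = layers.Dense(100, activation='relu', name='dense_100')(inputs)
outputs = layers.Dense(10, activation='softmax')(x)
model_main = keras.Model(inputs, outputs)

# fine-tune model: a new softmax head on top of the intermediate output
new_head = layers.Dense(10, activation='softmax', name='new_softmax')
model_inter = keras.Model(inputs,
                          new_head(model_main.get_layer('dense_100').output))

# dense_100 is SHARED between the two models, not copied, so whatever
# model_main.load_weights() puts there is exactly what model_inter sees
assert model_inter.get_layer('dense_100') is model_main.get_layer('dense_100')
```

Sharing cuts both ways: training model_inter will also move the dense_100 weights inside model_main, which is why this variant is "fine-tuning without freezing".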
(3) Excluding specific layers from training.
Using model.layers:
# ——————————————————————— main model ——————————————————————————
# .... omitted ....
model1 = Model(inputs=x, outputs=y, name='model1')
# ——————————— train only the last 3 layers (14 layers in total) ———————————
print('\nnumber of layers (parameter-free layers such as relu also count):', len(model1.layers))
model1.trainable = True  # to train selected layers, FIRST make every layer trainable, THEN freeze the ones to exclude
# freeze the layers that should not be trained
for layer in model1.layers[:11]:
    layer.trainable = False
model1.summary()
# ————————————————————————————————————————————————————————————
[Result]
>>>
Total params: 1,671,114
Trainable params: 525,706
Non-trainable params: 1,145,408
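The slice-based pattern above ("unfreeze everything, then freeze layers[:11]") can be illustrated without Keras using stand-in layer objects. The layer names and parameter counts below are made up purely for illustration:

```python
# minimal stand-in for a Keras layer: a name, a parameter count,
# and a trainable flag (hypothetical numbers, not a real network)
class FakeLayer:
    def __init__(self, name, n_params):
        self.name = name
        self.n_params = n_params
        self.trainable = True

model_layers = [FakeLayer('layer_%d' % i, 100 * (i + 1)) for i in range(14)]

# step 1: make everything trainable first...
for layer in model_layers:
    layer.trainable = True
# step 2: ...then freeze all but the last 3 layers
for layer in model_layers[:11]:
    layer.trainable = False

trainable = sum(l.n_params for l in model_layers if l.trainable)
frozen = sum(l.n_params for l in model_layers if not l.trainable)
assert trainable == 100 * (12 + 13 + 14)  # only the last 3 layers train
```

Doing the "unfreeze everything" step first matters: it resets any per-layer flags left over from earlier experiments, so the subsequent slice freeze is the only thing that determines what trains.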