Fixing the large gap between training and validation accuracy when doing transfer learning with Keras InceptionV3 and ResNet50 models

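The Kaggle Human Protein Atlas Image Classification competition has wrapped up, so I finally have some free time to write about the pitfalls I fell into along the way.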

The Keras version is 2.2.4.

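Has anyone else run into this? When doing transfer learning in Keras with a model that contains BN (batch normalization) layers, such as InceptionV3 or ResNet50, the training and validation results differ wildly, for example like this: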

Epoch 1/20
1500/1500 [==============================] - 24s 16ms/step - loss: 2.1168 - binary_accuracy: 0.9169 - f1_keras: 0.0617 - val_loss: 2.2727 - val_binary_accuracy: 0.9258 - val_f1_keras: 0.0377
Epoch 2/20
1500/1500 [==============================] - 19s 13ms/step - loss: 1.1976 - binary_accuracy: 0.9480 - f1_keras: 0.1084 - val_loss: 2.4163 - val_binary_accuracy: 0.9218 - val_f1_keras: 0.0356
Epoch 3/20
1500/1500 [==============================] - 19s 13ms/step - loss: 0.9935 - binary_accuracy: 0.9540 - f1_keras: 0.1608 - val_loss: 2.7485 - val_binary_accuracy: 0.9114 - val_f1_keras: 0.0359
Epoch 4/20
1500/1500 [==============================] - 19s 13ms/step - loss: 0.8294 - binary_accuracy: 0.9572 - f1_keras: 0.1902 - val_loss: 2.9039 - val_binary_accuracy: 0.9166 - val_f1_keras: 0.0402
Epoch 5/20
1500/1500 [==============================] - 19s 13ms/step - loss: 0.7250 - binary_accuracy: 0.9606 - f1_keras: 0.2482 - val_loss: 3.1574 - val_binary_accuracy: 0.9057 - val_f1_keras: 0.0485

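As you can see, the training loss keeps decreasing while the validation loss keeps increasing, and the validation accuracy and F1 score are also far from the training results. Some of you may suspect overfitting, and I suspected that too. So I ran another experiment in which I replaced the validation set with the training set, meaning that both were exactly the same samples. In that case the training and validation results should be the same. Instead, I got almost the same results as above.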

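I then repeated the same experiment with VGG-19 in place of InceptionV3, and models without BN layers, such as VGG-19, did not show the problem. So I suspected the BN layers were the culprit, and after some digging I found that the problem lay in the code that builds the model. Here is the faulty model-building code first (this is only my own view; if I have got something wrong, I would welcome corrections). The code below is the example given in the official Keras documentation, and the results above were produced with this same model-building structure (the structure is identical, the details differ slightly).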

from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K

# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)

# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False

# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# train the model on the new data for a few epochs
model.fit_generator(...)
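
To make the BN suspicion concrete, here is a minimal pure-Python sketch of how a batch normalization layer normalizes the same input differently in training mode and in inference mode. It is not Keras code: the function batch_norm, the sample values and the stored statistics are all hypothetical, and gamma and beta are fixed at 1 and 0 for simplicity. My understanding, which I have not verified here, is that in Keras 2.2.4 a BN layer with trainable = False still normalizes with the statistics of the current mini-batch while fitting, but with its stored moving statistics while validating.

```python
import math
import statistics


def batch_norm(xs, moving_mean, moving_var, training, eps=1e-3):
    """Toy BN over a list of scalars (gamma=1, beta=0; hypothetical helper).

    training=True  -> normalize with the mean/variance of the current batch
    training=False -> normalize with the stored moving mean/variance
    """
    if training:
        mean = statistics.mean(xs)
        var = statistics.pvariance(xs, mu=mean)
    else:
        mean, var = moving_mean, moving_var
    return [(x - mean) / math.sqrt(var + eps) for x in xs]


# Assumed numbers: moving statistics left over from pre-training, and a
# mini-batch from a new dataset whose distribution is clearly different.
moving_mean, moving_var = 0.0, 1.0
batch = [4.0, 5.0, 6.0, 7.0]

train_out = batch_norm(batch, moving_mean, moving_var, training=True)
infer_out = batch_norm(batch, moving_mean, moving_var, training=False)

print("training mode :", [round(v, 3) for v in train_out])
print("inference mode:", [round(v, 3) for v in infer_out])
```

In training mode the outputs are centered on zero, while in inference mode the very same samples stay far from zero because the stored statistics do not match the new data. If the layers after the BN layers are fitted on the first kind of activations and then validated on the second kind, a gap like the one in the log above could appear even when the training and validation samples are identical.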

Run model.summary() to take a look at the model structure:

activation_20 (Activation)      (None, None, None, 6 0           batch_normalization_20[0][0]     
__________________________________________________________________________________________________
activation_22 (Activation)      (None, None, None, 6 0           batch_normalization_22[0][0]     
__________________________________________________________________________________________________
activation_25 (Activation)      (None, None, None, 9 0           batch_normalization_25[0][0]     
__________________________________________________________________________________________________
activation_26 (Activation)      (None, None, None, 6 0           batch_normalization_26[0][0]     
__________________________________________________________________________________________________
mixed2 (Concatenate)            (None, None, None, 2 0           activation_20[0][0]              
                                                                 activation_22[0][0]              
                                                                 activation_25[0][0]              
                                                                 activation_26[0][0]              
__________________________________________________________________________________________________
conv2d_28 (Conv2D)              (None, None, None, 6 18432       mixed2[0][0]                     
__________________________________________________________________________________________________
batch_normalization_28 (BatchNo (None, None, None, 6 192         conv2d_28[0][0]                  
__________________________________________________________________________________________________
activation_28 (Activation)      (None, None, None, 6 0           batch_normalization_28[0][0]     
__________________________________________________________________________________________________
conv2d_29 (Conv2D)              (None, None, None, 9 55296       activation_28[0][0]              
__________________________________________________________________________________________________
batch_normalization_29 (BatchNo (None, None, None, 9 288         conv2d_29[0][0]                  
__________________________________________________________________________________________________
activation_29 (Activation)      (None, None, None, 9 0           batch_normalization_29[0][0]     
__________________________________________________________________________________________________
conv2d_27 (Conv2D)              (None, None, None, 3 995328      mixed2[0][0]                     
__________________________________________________________________________________________________
conv2d_30 (Conv2D)              (None, None, None, 9 82944       activation_29[0][0]              
__________________________________________________________________________________________________
batch_normalization_27 (BatchNo (None, None, None, 3 1152        conv2d_27[0][0]                  
__________________________________________________________________________________________________
batch_normalization_30 (BatchNo (None, None, None, 9 288         conv2d_30[0][0]                  
__________________________________________________________________________________________________
activation_27 (Activation)      (None, None, None, 3 0           batch_normalization_27[0][0]     
__________________________________________________________________________________________________
activation_30 (Activation)      (None, None, None, 9 0           batch_normalization_30[0][0]     
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D)  (None, None, None, 2 0           mixed2[0][0]                     
__________________________________________________________________________________________________
mixed3 (Concatenate)            (None, None, None, 7 0           activation_27[0][0]              
                                                                 activation_30[0][0]              
                                                                 max_pooling2d_3[0][0]            
__________________________________________________________________________________________________
conv2d_35 (Conv2D)              (None, None, None, 1 98304       mixed3[0][0]                     
__________________________________________________________________________________________________
batch_normalization_35 (BatchNo (None, None, None, 1 384         conv2d_35[0][0]                  
__________________________________________________________________________________________________
activation_35 (Activation)      (None, None, None, 1 0           batch_normalization_35[0][0]     
__________________________________________________________________________________________________
conv2d_36 (Conv2D)              (None, None, None, 1 114688      activation_35[0][0]              
__________________________________________________________________________________________________
batch_normalization_36 (BatchNo (None, None, None, 1 384         conv2d_36[0][0]                  
__________________________________________________________________________________________________
activation_36 (Activation)      (None, None, None, 1 0           batch_normalization_36[0][0]     
__________________________________________________________________________________________________
conv2d_32 (Conv2D)              (None, None, None, 1 98304       mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_37 (Conv2D)              (None, None, None, 1 114688      activation_36[0][0]              
__________________________________________________________________________________________________
batch_normalization_32 (BatchNo (None, None, None, 1 384         conv2d_32[0][0]                  
__________________________________________________________________________________________________
batch_normalization_37 (BatchNo (None, None, None, 1 384         conv2d_37[0][0]                  
__________________________________________________________________________________________________
activation_32 (Activation)      (None, None, None, 1 0           batch_normalization_32[0][0]     
__________________________________________________________________________________________________
activation_37 (Activation)      (None, None, None, 1 0           batch_normalization_37[0][0]     
__________________________________________________________________________________________________
conv2d_33 (Conv2D)              (None, None, None, 1 114688      activation_32[0][0]              
__________________________________________________________________________________________________
conv2d_38 (Conv2D)              (None, None, None, 1 114688      activation_37[0][0]              
__________________________________________________________________________________________________
batch_normalization_33 (BatchNo (None, None, None, 1 384         conv2d_33[0][0]                  
__________________________________________________________________________________________________
batch_normalization_38 (BatchNo (None, None, None, 1 384         conv2d_38[0][0]                  
__________________________________________________________________________________________________
activation_33 (Activation)      (None, None, None, 1 0           batch_normalization_33[0][0]     
__________________________________________________________________________________________________
activation_38 (Activation)      (None, None, None, 1 0           batch_normalization_38[0][0]     
__________________________________________________________________________________________________
average_pooling2d_4 (AveragePoo (None, None, None, 7 0           mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_31 (Conv2D)              (None, None, None, 1 147456      mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_34 (Conv2D)              (None, None, None, 1 172032      activation_33[0][0]              
__________________________________________________________________________________________________
conv2d_39 (Conv2D)              (None, None, None, 1 172032      activation_38[0][0]              
__________________________________________________________________________________________________
conv2d_40 (Conv2D)              (None, None, None, 1 147456      average_pooling2d_4[0][0]        
__________________________________________________________________________________________________
batch_normalization_31 (BatchNo (None, None, None, 1 576         conv2d_31[0][0]                  
__________________________________________________________________________________________________
batch_normalization_34 (BatchNo (None, None, None, 1 576         conv2d_34[0][0]                  
__________________________________________________________________________________________________
batch_normalization_39 (BatchNo (None, None, None, 1 576         conv2d_39[0][0]                  
__________________________________________________________________________________________________
batch_normalization_40 (BatchNo (None, None, None, 1 576         conv2d_40[0][0]                  
__________________________________________________________________________________________________
activation_31 (Activation)      (None, None, None, 1 0           batch_normalization_31[0][0]     
__________________________________________________________________________________________________
activation_34 (Activation)      (None, None, None, 1 0           batch_normalization_34[0][0]     
__________________________________________________________________________________________________
activation_39 (Activation)      (None, None, None, 1 0           batch_normalization_39[0][0]     
__________________________________________________________________________________________________
activation_40 (Activation)      (None, None, None, 1 0           batch_normalization_40[0][0]     
__________________________________________________________________________________________________
mixed4 (Concatenate)            (None, None, None, 7 0           activation_31[0][0]              
                                                                 activation_34[0][0]              
                                                                 activation_39[0][0]              
                                                                 activation_40[0][0]              
__________________________________________________________________________________________________
conv2d_45 (Conv2D)              (None, None, None, 1 122880      mixed4[0][0]                     
__________________________________________________________________________________________________
batch_normalization_45 (BatchNo (None, None, None, 1 480         conv2d_45[0][0]                  
__________________________________________________________________________________________________
activation_45 (Activation)      (None, None, None, 1 0           batch_normalization_45[0][0]     
__________________________________________________________________________________________________
conv2d_46 (Conv2D)              (None, None, None, 1 179200      activation_45[0][0]              
__________________________________________________________________________________________________
batch_normalization_46 (BatchNo (None, None, None, 1 480         conv2d_46[0][0]                  
__________________________________________________________________________________________________
activation_46 (Activation)      (None, None, None, 1 0           batch_normalization_46[0][0]     
__________________________________________________________________________________________________
conv2d_42 (Conv2D)              (None, None, None, 1 122880      mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_47 (Conv2D)              (None, None, None, 1 179200      activation_46[0][0]              
__________________________________________________________________________________________________
batch_normalization_42 (BatchNo (None, None, None, 1 480         conv2d_42[0][0]                  
__________________________________________________________________________________________________
batch_normalization_47 (BatchNo (None, None, None, 1 480         conv2d_47[0][0]                  
__________________________________________________________________________________________________
activation_42 (Activation)      (None, None, None, 1 0           batch_normalization_42[0][0]     
__________________________________________________________________________________________________
activation_47 (Activation)      (None, None, None, 1 0           batch_normalization_47[0][0]     
__________________________________________________________________________________________________
conv2d_43 (Conv2D)              (None, None, None, 1 179200      activation_42[0][0]              
__________________________________________________________________________________________________
conv2d_48 (Conv2D)              (None, None, None, 1 179200      activation_47[0][0]              
__________________________________________________________________________________________________
batch_normalization_43 (BatchNo (None, None, None, 1 480         conv2d_43[0][0]                  
__________________________________________________________________________________________________
batch_normalization_48 (BatchNo (None, None, None, 1 480         conv2d_48[0][0]                  
__________________________________________________________________________________________________
activation_43 (Activation)      (None, None, None, 1 0           batch_normalization_43[0][0]     
__________________________________________________________________________________________________
activation_48 (Activation)      (None, None, None, 1 0           batch_normalization_48[0][0]     
__________________________________________________________________________________________________
average_pooling2d_5 (AveragePoo (None, None, None, 7 0           mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_41 (Conv2D)              (None, None, None, 1 147456      mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_44 (Conv2D)              (None, None, None, 1 215040      activation_43[0][0]              
__________________________________________________________________________________________________
conv2d_49 (Conv2D)              (None, None, None, 1 215040      activation_48[0][0]              
__________________________________________________________________________________________________
conv2d_50 (Conv2D)              (None, None, None, 1 147456      average_pooling2d_5[0][0]        
__________________________________________________________________________________________________
batch_normalization_41 (BatchNo (None, None, None, 1 576         conv2d_41[0][0]                  
__________________________________________________________________________________________________
batch_normalization_44 (BatchNo (None, None, None, 1 576         conv2d_44[0][0]                  
__________________________________________________________________________________________________
batch_normalization_49 (BatchNo (None, None, None, 1 576         conv2d_49[0][0]                  
__________________________________________________________________________________________________
batch_normalization_50 (BatchNo (None, None, None, 1 576         conv2d_50[0][0]                  
__________________________________________________________________________________________________
activation_41 (Activation)      (None, None, None, 1 0           batch_normalization_41[0][0]     
__________________________________________________________________________________________________
activation_44 (Activation)      (None, None, None, 1 0           batch_normalization_44[0][0]     
__________________________________________________________________________________________________
activation_49 (Activation)      (None, None, None, 1 0           batch_normalization_49[0][0]     
__________________________________________________________________________________________________
activation_50 (Activation)      (None, None, None, 1 0           batch_normalization_50[0][0]     
__________________________________________________________________________________________________
mixed5 (Concatenate)            (None, None, None, 7 0           activation_41[0][0]              
                                                                 activation_44[0][0]              
                                                                 activation_49[0][0]              
                                                                 activation_50[0][0]              
__________________________________________________________________________________________________
conv2d_55 (Conv2D)              (None, None, None, 1 122880      mixed5[0][0]                     
__________________________________________________________________________________________________
batch_normalization_55 (BatchNo (None, None, None, 1 480         conv2d_55[0][0]                  
__________________________________________________________________________________________________
activation_55 (Activation)      (None, None, None, 1 0           batch_normalization_55[0][0]     
__________________________________________________________________________________________________
conv2d_56 (Conv2D)              (None, None, None, 1 179200      activation_55[0][0]              
__________________________________________________________________________________________________
batch_normalization_56 (BatchNo (None, None, None, 1 480         conv2d_56[0][0]                  
__________________________________________________________________________________________________
activation_56 (Activation)      (None, None, None, 1 0           batch_normalization_56[0][0]     
__________________________________________________________________________________________________
conv2d_52 (Conv2D)              (None, None, None, 1 122880      mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_57 (Conv2D)              (None, None, None, 1 179200      activation_56[0][0]              
__________________________________________________________________________________________________
batch_normalization_52 (BatchNo (None, None, None, 1 480         conv2d_52[0][0]                  
__________________________________________________________________________________________________
batch_normalization_57 (BatchNo (None, None, None, 1 480         conv2d_57[0][0]                  
__________________________________________________________________________________________________
activation_52 (Activation)      (None, None, None, 1 0           batch_normalization_52[0][0]     
__________________________________________________________________________________________________
activation_57 (Activation)      (None, None, None, 1 0           batch_normalization_57[0][0]     
__________________________________________________________________________________________________
conv2d_53 (Conv2D)              (None, None, None, 1 179200      activation_52[0][0]              
__________________________________________________________________________________________________
conv2d_58 (Conv2D)              (None, None, None, 1 179200      activation_57[0][0]              
__________________________________________________________________________________________________
batch_normalization_53 (BatchNo (None, None, None, 1 480         conv2d_53[0][0]                  
__________________________________________________________________________________________________
batch_normalization_58 (BatchNo (None, None, None, 1 480         conv2d_58[0][0]                  
__________________________________________________________________________________________________
activation_53 (Activation)      (None, None, None, 1 0           batch_normalization_53[0][0]     
__________________________________________________________________________________________________
activation_58 (Activation)      (None, None, None, 1 0           batch_normalization_58[0][0]     
__________________________________________________________________________________________________
average_pooling2d_6 (AveragePoo (None, None, None, 7 0           mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_51 (Conv2D)              (None, None, None, 1 147456      mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_54 (Conv2D)              (None, None, None, 1 215040      activation_53[0][0]              
__________________________________________________________________________________________________
conv2d_59 (Conv2D)              (None, None, None, 1 215040      activation_58[0][0]              
__________________________________________________________________________________________________
conv2d_60 (Conv2D)              (None, None, None, 1 147456      average_pooling2d_6[0][0]        
__________________________________________________________________________________________________
batch_normalization_51 (BatchNo (None, None, None, 1 576         conv2d_51[0][0]                  
__________________________________________________________________________________________________
batch_normalization_54 (BatchNo (None, None, None, 1 576         conv2d_54[0][0]                  
__________________________________________________________________________________________________
batch_normalization_59 (BatchNo (None, None, None, 1 576         conv2d_59[0][0]                  
__________________________________________________________________________________________________
batch_normalization_60 (BatchNo (None, None, None, 1 576         conv2d_60[0][0]                  
__________________________________________________________________________________________________
activation_51 (Activation)      (None, None, None, 1 0           batch_normalization_51[0][0]     
__________________________________________________________________________________________________
activation_54 (Activation)      (None, None, None, 1 0           batch_normalization_54[0][0]     
__________________________________________________________________________________________________
activation_59 (Activation)      (None, None, None, 1 0           batch_normalization_59[0][0]     
__________________________________________________________________________________________________
activation_60 (Activation)      (None, None, None, 1 0           batch_normalization_60[0][0]     
__________________________________________________________________________________________________
mixed6 (Concatenate)            (None, None, None, 7 0           activation_51[0][0]              
                                                                 activation_54[0][0]              
                                                                 activation_59[0][0]              
                                                                 activation_60[0][0]              
__________________________________________________________________________________________________
conv2d_65 (Conv2D)              (None, None, None, 1 147456      mixed6[0][0]                     
__________________________________________________________________________________________________
batch_normalization_65 (BatchNo (None, None, None, 1 576         conv2d_65[0][0]                  
__________________________________________________________________________________________________
activation_65 (Activation)      (None, None, None, 1 0           batch_normalization_65[0][0]     
__________________________________________________________________________________________________
conv2d_66 (Conv2D)              (None, None, None, 1 258048      activation_65[0][0]              
__________________________________________________________________________________________________
batch_normalization_66 (BatchNo (None, None, None, 1 576         conv2d_66[0][0]                  
__________________________________________________________________________________________________
activation_66 (Activation)      (None, None, None, 1 0           batch_normalization_66[0][0]     
__________________________________________________________________________________________________
conv2d_62 (Conv2D)              (None, None, None, 1 147456      mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_67 (Conv2D)              (None, None, None, 1 258048      activation_66[0][0]              
__________________________________________________________________________________________________
batch_normalization_62 (BatchNo (None, None, None, 1 576         conv2d_62[0][0]                  
__________________________________________________________________________________________________
batch_normalization_67 (BatchNo (None, None, None, 1 576         conv2d_67[0][0]                  
__________________________________________________________________________________________________
activation_62 (Activation)      (None, None, None, 1 0           batch_normalization_62[0][0]     
__________________________________________________________________________________________________
activation_67 (Activation)      (None, None, None, 1 0           batch_normalization_67[0][0]     
__________________________________________________________________________________________________
conv2d_63 (Conv2D)              (None, None, None, 1 258048      activation_62[0][0]              
__________________________________________________________________________________________________
conv2d_68 (Conv2D)              (None, None, None, 1 258048      activation_67[0][0]              
__________________________________________________________________________________________________
batch_normalization_63 (BatchNo (None, None, None, 1 576         conv2d_63[0][0]                  
__________________________________________________________________________________________________
batch_normalization_68 (BatchNo (None, None, None, 1 576         conv2d_68[0][0]                  
__________________________________________________________________________________________________
activation_63 (Activation)      (None, None, None, 1 0           batch_normalization_63[0][0]     
__________________________________________________________________________________________________
activation_68 (Activation)      (None, None, None, 1 0           batch_normalization_68[0][0]     
__________________________________________________________________________________________________
average_pooling2d_7 (AveragePoo (None, None, None, 7 0           mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_61 (Conv2D)              (None, None, None, 1 147456      mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_64 (Conv2D)              (None, None, None, 1 258048      activation_63[0][0]              
__________________________________________________________________________________________________
conv2d_69 (Conv2D)              (None, None, None, 1 258048      activation_68[0][0]              
__________________________________________________________________________________________________
conv2d_70 (Conv2D)              (None, None, None, 1 147456      average_pooling2d_7[0][0]        
__________________________________________________________________________________________________
batch_normalization_61 (BatchNo (None, None, None, 1 576         conv2d_61[0][0]                  
__________________________________________________________________________________________________
batch_normalization_64 (BatchNo (None, None, None, 1 576         conv2d_64[0][0]                  
__________________________________________________________________________________________________
batch_normalization_69 (BatchNo (None, None, None, 1 576         conv2d_69[0][0]                  
__________________________________________________________________________________________________
batch_normalization_70 (BatchNo (None, None, None, 1 576         conv2d_70[0][0]                  
__________________________________________________________________________________________________
activation_61 (Activation)      (None, None, None, 1 0           batch_normalization_61[0][0]     
__________________________________________________________________________________________________
activation_64 (Activation)      (None, None, None, 1 0           batch_normalization_64[0][0]     
__________________________________________________________________________________________________
activation_69 (Activation)      (None, None, None, 1 0           batch_normalization_69[0][0]     
__________________________________________________________________________________________________
activation_70 (Activation)      (None, None, None, 1 0           batch_normalization_70[0][0]     
__________________________________________________________________________________________________
mixed7 (Concatenate)            (None, None, None, 7 0           activation_61[0][0]              
                                                                 activation_64[0][0]              
                                                                 activation_69[0][0]              
                                                                 activation_70[0][0]              
__________________________________________________________________________________________________
conv2d_73 (Conv2D)              (None, None, None, 1 147456      mixed7[0][0]                     
__________________________________________________________________________________________________
batch_normalization_73 (BatchNo (None, None, None, 1 576         conv2d_73[0][0]                  
__________________________________________________________________________________________________
activation_73 (Activation)      (None, None, None, 1 0           batch_normalization_73[0][0]     
__________________________________________________________________________________________________
conv2d_74 (Conv2D)              (None, None, None, 1 258048      activation_73[0][0]              
__________________________________________________________________________________________________
batch_normalization_74 (BatchNo (None, None, None, 1 576         conv2d_74[0][0]                  
__________________________________________________________________________________________________
activation_74 (Activation)      (None, None, None, 1 0           batch_normalization_74[0][0]     
__________________________________________________________________________________________________
conv2d_71 (Conv2D)              (None, None, None, 1 147456      mixed7[0][0]                     
__________________________________________________________________________________________________
conv2d_75 (Conv2D)              (None, None, None, 1 258048      activation_74[0][0]              
__________________________________________________________________________________________________
batch_normalization_71 (BatchNo (None, None, None, 1 576         conv2d_71[0][0]                  
__________________________________________________________________________________________________
batch_normalization_75 (BatchNo (None, None, None, 1 576         conv2d_75[0][0]                  
__________________________________________________________________________________________________
activation_71 (Activation)      (None, None, None, 1 0           batch_normalization_71[0][0]     
__________________________________________________________________________________________________
activation_75 (Activation)      (None, None, None, 1 0           batch_normalization_75[0][0]     
__________________________________________________________________________________________________
conv2d_72 (Conv2D)              (None, None, None, 3 552960      activation_71[0][0]              
__________________________________________________________________________________________________
conv2d_76 (Conv2D)              (None, None, None, 1 331776      activation_75[0][0]              
__________________________________________________________________________________________________
batch_normalization_72 (BatchNo (None, None, None, 3 960         conv2d_72[0][0]                  
__________________________________________________________________________________________________
batch_normalization_76 (BatchNo (None, None, None, 1 576         conv2d_76[0][0]                  
__________________________________________________________________________________________________
activation_72 (Activation)      (None, None, None, 3 0           batch_normalization_72[0][0]     
__________________________________________________________________________________________________
activation_76 (Activation)      (None, None, None, 1 0           batch_normalization_76[0][0]     
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D)  (None, None, None, 7 0           mixed7[0][0]                     
__________________________________________________________________________________________________
mixed8 (Concatenate)            (None, None, None, 1 0           activation_72[0][0]              
                                                                 activation_76[0][0]              
                                                                 max_pooling2d_4[0][0]            
__________________________________________________________________________________________________
conv2d_81 (Conv2D)              (None, None, None, 4 573440      mixed8[0][0]                     
__________________________________________________________________________________________________
batch_normalization_81 (BatchNo (None, None, None, 4 1344        conv2d_81[0][0]                  
__________________________________________________________________________________________________
activation_81 (Activation)      (None, None, None, 4 0           batch_normalization_81[0][0]     
__________________________________________________________________________________________________
conv2d_78 (Conv2D)              (None, None, None, 3 491520      mixed8[0][0]                     
__________________________________________________________________________________________________
conv2d_82 (Conv2D)              (None, None, None, 3 1548288     activation_81[0][0]              
__________________________________________________________________________________________________
batch_normalization_78 (BatchNo (None, None, None, 3 1152        conv2d_78[0][0]                  
__________________________________________________________________________________________________
batch_normalization_82 (BatchNo (None, None, None, 3 1152        conv2d_82[0][0]                  
__________________________________________________________________________________________________
activation_78 (Activation)      (None, None, None, 3 0           batch_normalization_78[0][0]     
__________________________________________________________________________________________________
activation_82 (Activation)      (None, None, None, 3 0           batch_normalization_82[0][0]     
__________________________________________________________________________________________________
conv2d_79 (Conv2D)              (None, None, None, 3 442368      activation_78[0][0]              
__________________________________________________________________________________________________
conv2d_80 (Conv2D)              (None, None, None, 3 442368      activation_78[0][0]              
__________________________________________________________________________________________________
conv2d_83 (Conv2D)              (None, None, None, 3 442368      activation_82[0][0]              
__________________________________________________________________________________________________
conv2d_84 (Conv2D)              (None, None, None, 3 442368      activation_82[0][0]              
__________________________________________________________________________________________________
average_pooling2d_8 (AveragePoo (None, None, None, 1 0           mixed8[0][0]                     
__________________________________________________________________________________________________
conv2d_77 (Conv2D)              (None, None, None, 3 409600      mixed8[0][0]                     
__________________________________________________________________________________________________
batch_normalization_79 (BatchNo (None, None, None, 3 1152        conv2d_79[0][0]                  
__________________________________________________________________________________________________
batch_normalization_80 (BatchNo (None, None, None, 3 1152        conv2d_80[0][0]                  
__________________________________________________________________________________________________
batch_normalization_83 (BatchNo (None, None, None, 3 1152        conv2d_83[0][0]                  
__________________________________________________________________________________________________
batch_normalization_84 (BatchNo (None, None, None, 3 1152        conv2d_84[0][0]                  
__________________________________________________________________________________________________
conv2d_85 (Conv2D)              (None, None, None, 1 245760      average_pooling2d_8[0][0]        
__________________________________________________________________________________________________
batch_normalization_77 (BatchNo (None, None, None, 3 960         conv2d_77[0][0]                  
__________________________________________________________________________________________________
activation_79 (Activation)      (None, None, None, 3 0           batch_normalization_79[0][0]     
__________________________________________________________________________________________________
activation_80 (Activation)      (None, None, None, 3 0           batch_normalization_80[0][0]     
__________________________________________________________________________________________________
activation_83 (Activation)      (None, None, None, 3 0           batch_normalization_83[0][0]     
__________________________________________________________________________________________________
activation_84 (Activation)      (None, None, None, 3 0           batch_normalization_84[0][0]     
__________________________________________________________________________________________________
batch_normalization_85 (BatchNo (None, None, None, 1 576         conv2d_85[0][0]                  
__________________________________________________________________________________________________
activation_77 (Activation)      (None, None, None, 3 0           batch_normalization_77[0][0]     
__________________________________________________________________________________________________
mixed9_0 (Concatenate)          (None, None, None, 7 0           activation_79[0][0]              
                                                                 activation_80[0][0]              
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, None, None, 7 0           activation_83[0][0]              
                                                                 activation_84[0][0]              
__________________________________________________________________________________________________
activation_85 (Activation)      (None, None, None, 1 0           batch_normalization_85[0][0]     
__________________________________________________________________________________________________
mixed9 (Concatenate)            (None, None, None, 2 0           activation_77[0][0]              
                                                                 mixed9_0[0][0]                   
                                                                 concatenate_1[0][0]              
                                                                 activation_85[0][0]              
__________________________________________________________________________________________________
conv2d_90 (Conv2D)              (None, None, None, 4 917504      mixed9[0][0]                     
__________________________________________________________________________________________________
batch_normalization_90 (BatchNo (None, None, None, 4 1344        conv2d_90[0][0]                  
__________________________________________________________________________________________________
activation_90 (Activation)      (None, None, None, 4 0           batch_normalization_90[0][0]     
__________________________________________________________________________________________________
conv2d_87 (Conv2D)              (None, None, None, 3 786432      mixed9[0][0]                     
__________________________________________________________________________________________________
conv2d_91 (Conv2D)              (None, None, None, 3 1548288     activation_90[0][0]              
__________________________________________________________________________________________________
batch_normalization_87 (BatchNo (None, None, None, 3 1152        conv2d_87[0][0]                  
__________________________________________________________________________________________________
batch_normalization_91 (BatchNo (None, None, None, 3 1152        conv2d_91[0][0]                  
__________________________________________________________________________________________________
activation_87 (Activation)      (None, None, None, 3 0           batch_normalization_87[0][0]     
__________________________________________________________________________________________________
activation_91 (Activation)      (None, None, None, 3 0           batch_normalization_91[0][0]     
__________________________________________________________________________________________________
conv2d_88 (Conv2D)              (None, None, None, 3 442368      activation_87[0][0]              
__________________________________________________________________________________________________
conv2d_89 (Conv2D)              (None, None, None, 3 442368      activation_87[0][0]              
__________________________________________________________________________________________________
conv2d_92 (Conv2D)              (None, None, None, 3 442368      activation_91[0][0]              
__________________________________________________________________________________________________
conv2d_93 (Conv2D)              (None, None, None, 3 442368      activation_91[0][0]              
__________________________________________________________________________________________________
average_pooling2d_9 (AveragePoo (None, None, None, 2 0           mixed9[0][0]                     
__________________________________________________________________________________________________
conv2d_86 (Conv2D)              (None, None, None, 3 655360      mixed9[0][0]                     
__________________________________________________________________________________________________
batch_normalization_88 (BatchNo (None, None, None, 3 1152        conv2d_88[0][0]                  
__________________________________________________________________________________________________
batch_normalization_89 (BatchNo (None, None, None, 3 1152        conv2d_89[0][0]                  
__________________________________________________________________________________________________
batch_normalization_92 (BatchNo (None, None, None, 3 1152        conv2d_92[0][0]                  
__________________________________________________________________________________________________
batch_normalization_93 (BatchNo (None, None, None, 3 1152        conv2d_93[0][0]                  
__________________________________________________________________________________________________
conv2d_94 (Conv2D)              (None, None, None, 1 393216      average_pooling2d_9[0][0]        
__________________________________________________________________________________________________
batch_normalization_86 (BatchNo (None, None, None, 3 960         conv2d_86[0][0]                  
__________________________________________________________________________________________________
activation_88 (Activation)      (None, None, None, 3 0           batch_normalization_88[0][0]     
__________________________________________________________________________________________________
activation_89 (Activation)      (None, None, None, 3 0           batch_normalization_89[0][0]     
__________________________________________________________________________________________________
activation_92 (Activation)      (None, None, None, 3 0           batch_normalization_92[0][0]     
__________________________________________________________________________________________________
activation_93 (Activation)      (None, None, None, 3 0           batch_normalization_93[0][0]     
__________________________________________________________________________________________________
batch_normalization_94 (BatchNo (None, None, None, 1 576         conv2d_94[0][0]                  
__________________________________________________________________________________________________
activation_86 (Activation)      (None, None, None, 3 0           batch_normalization_86[0][0]     
__________________________________________________________________________________________________
mixed9_1 (Concatenate)          (None, None, None, 7 0           activation_88[0][0]              
                                                                 activation_89[0][0]              
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, None, None, 7 0           activation_92[0][0]              
                                                                 activation_93[0][0]              
__________________________________________________________________________________________________
activation_94 (Activation)      (None, None, None, 1 0           batch_normalization_94[0][0]     
__________________________________________________________________________________________________
mixed10 (Concatenate)           (None, None, None, 2 0           activation_86[0][0]              
                                                                 mixed9_1[0][0]                   
                                                                 concatenate_2[0][0]              
                                                                 activation_94[0][0]              
__________________________________________________________________________________________________
global_average_pooling2d_1 (Glo (None, 2048)         0           mixed10[0][0]                    
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 1024)         2098176     global_average_pooling2d_1[0][0] 
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 200)          205000      dense_1[0][0]                    
==================================================================================================
Total params: 24,105,960
Trainable params: 2,303,176
Non-trainable params: 21,802,784
__________________________________________________________________________________________________

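If your transfer-learning model's summary looks like this, with every layer of the base network listed individually, then something is wrong.

Changing the code above to the following fixes it: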

from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D, Input
from keras import backend as K

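# create the base pre-trained model and call it on a new Input tensor,
# so that it appears in the final model as a single nested layer
Inp = Input((224, 224, 3))
base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
x = base_model(Inp)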
# add a global spatial average pooling layer
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=Inp, outputs=predictions)

# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False

# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# train the model on the new data for a few epochs
model.fit_generator(...)

Run model.summary() again and look at the model structure:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
inception_v3 (Model)         (None, 5, 5, 2048)        21802784  
_________________________________________________________________
global_average_pooling2d_2 ( (None, 2048)              0         
_________________________________________________________________
dense_3 (Dense)              (None, 1024)              2098176   
_________________________________________________________________
dense_4 (Dense)              (None, 200)               205000    
=================================================================
Total params: 24,105,960
Trainable params: 2,303,176
Non-trainable params: 21,802,784
_________________________________________________________________
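The parameter counts in this summary can be checked by hand. The sketch below is plain arithmetic rather than a Keras call: `dense_params` is a helper name introduced here for illustration, and the only inputs are the layer sizes and the 21,802,784 figure taken from the summary above.

```python
def dense_params(inputs, units):
    """Weights plus one bias per unit for a fully-connected layer."""
    return inputs * units + units


# Frozen InceptionV3 body, copied from the "inception_v3 (Model)" row.
base_params = 21802784

dense_1 = dense_params(2048, 1024)  # pooled features -> 1024 hidden units
dense_2 = dense_params(1024, 200)   # 1024 hidden units -> 200 classes

trainable = dense_1 + dense_2
total = base_params + trainable

print(dense_1, dense_2, trainable, total)
```

Only the two Dense layers contribute trainable parameters, which is why the trainable count equals their sum and the non-trainable count equals the size of the frozen base.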

Now look at the correct results:

Epoch 1/20
1500/1500 [==============================] - 27s 18ms/step - loss: 2.4664 - binary_accuracy: 0.9125 - f1_keras: 0.0521 - val_loss: 1.4697 - val_binary_accuracy: 0.9456 - val_f1_keras: 0.0619
Epoch 2/20
1500/1500 [==============================] - 19s 13ms/step - loss: 1.2806 - binary_accuracy: 0.9467 - f1_keras: 0.0795 - val_loss: 1.2819 - val_binary_accuracy: 0.9466 - val_f1_keras: 0.0839
Epoch 3/20
1500/1500 [==============================] - 19s 13ms/step - loss: 1.0431 - binary_accuracy: 0.9526 - f1_keras: 0.1203 - val_loss: 1.3012 - val_binary_accuracy: 0.9468 - val_f1_keras: 0.0908
Epoch 4/20
1500/1500 [==============================] - 19s 13ms/step - loss: 0.9168 - binary_accuracy: 0.9555 - f1_keras: 0.1493 - val_loss: 1.3257 - val_binary_accuracy: 0.9445 - val_f1_keras: 0.0922
Epoch 5/20
1500/1500 [==============================] - 19s 13ms/step - loss: 0.8281 - binary_accuracy: 0.9577 - f1_keras: 0.1959 - val_loss: 1.3123 - val_binary_accuracy: 0.9468 - val_f1_keras: 0.0969

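As you can see, the validation accuracy is now normal. Careful readers will notice that the validation F1 score still lags behind the training one. That is because I trained on only 1,500 samples to test the model, so some overfitting is to be expected.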
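For readers wondering how a BN layer can make training and validation metrics diverge on identical data, here is a minimal pure-Python sketch. It is not Keras internals: `batch_norm`, the stored statistics, and the activation values are all made up for illustration. It only shows that normalising with the current batch's statistics (training phase) and normalising with stored moving statistics (inference phase) can give very different outputs when the new data is distributed differently from the data the layer was pre-trained on.

```python
import statistics


def batch_norm(values, mean, var, eps=1e-3):
    """Normalise values with the given mean and variance (no scale/shift)."""
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


# Hypothetical statistics stored in a pre-trained BN layer.
moving_mean, moving_var = 0.0, 1.0

# Hypothetical activations from the new dataset: shifted and more spread out.
batch = [4.0, 5.0, 6.0, 7.0, 8.0]

# Training phase: use the statistics of the current batch.
batch_mean = statistics.mean(batch)
batch_var = statistics.pvariance(batch)
train_out = batch_norm(batch, batch_mean, batch_var)

# Inference phase: use the stored moving statistics instead.
infer_out = batch_norm(batch, moving_mean, moving_var)

print([round(v, 2) for v in train_out])
print([round(v, 2) for v in infer_out])
```

The training-phase output is centred on zero, while the inference-phase output is not, so the layers that follow see different inputs in the two phases even though the samples are the same.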

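If you want to unfreeze the last N layers of base_model, first run the code below to see how many layers there are and what they are called:

for i, layer in enumerate(base_model.layers):
    print(i, layer.name)

Then unfreeze the last N layers as needed. Note that the slices must be taken from base_model.layers: in the corrected structure, model.layers contains only five entries (the input, the nested inception_v3 model, the pooling layer, and the two Dense layers). Compile the model again afterwards so that the change takes effect.

for layer in base_model.layers[:-N]:
    layer.trainable = False
for layer in base_model.layers[-N:]:
    layer.trainable = True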
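One pitfall of the negative-slice idiom above is N = 0: in Python `layers[:-0]` is empty and `layers[-0:]` is the whole list, so every layer would become trainable. The sketch below avoids this by computing an explicit split index. `unfreeze_last_n` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real Keras layers, used only so the snippet runs without Keras installed.

```python
from types import SimpleNamespace


def unfreeze_last_n(layers, n):
    """Freeze every layer except the last n; n = 0 freezes all of them."""
    if not 0 <= n <= len(layers):
        raise ValueError("n must be between 0 and the number of layers")
    split = len(layers) - n
    for layer in layers[:split]:
        layer.trainable = False
    for layer in layers[split:]:
        layer.trainable = True


# Stand-in layer objects that expose only the attributes used here.
layers = [SimpleNamespace(name="layer_%d" % i, trainable=True) for i in range(6)]

unfreeze_last_n(layers, 2)
flags = [layer.trainable for layer in layers]
print(flags)

unfreeze_last_n(layers, 0)
all_frozen = not any(layer.trainable for layer in layers)
print(all_frozen)
```

With a real model the same function could be called on `base_model.layers`, followed by compiling the model again.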

If this solved your problem, leave a like before you go!

Reference: https://github.com/keras-team/keras/pull/9965#discussion_r187806860
