These are lessons I summarized after studying the top performers in Kaggle competitions. My skills are still average, so if anything here is wrong, please leave me a comment — this is a valuable learning opportunity for me.
1. First try every network architecture you know, pick the two that perform best, and start experimenting with those.
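Selecting the top performers from a round of trials is just a small ranking step; a sketch (the architecture names and scores below are made-up placeholders, not real results):

```python
def top_k_models(val_scores, k=2):
    """Return the names of the k models with the best validation scores."""
    return sorted(val_scores, key=val_scores.get, reverse=True)[:k]

# Hypothetical leaderboard of architectures you have already trained:
scores = {"EfficientNetB7": 0.961, "DenseNet201": 0.958, "ResNet50": 0.952}
print(top_k_models(scores))  # ['EfficientNetB7', 'DenseNet201']
```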
2. The validation set can also be folded back into the training set, skipping validation entirely; this may improve the final score.
3. Try model ensembling: blend the two best-performing models together.
Finding best alpha
Our final model is simply a blend of the two models presented above. In the first commit it was the arithmetic mean (alpha = 0.5). Note that if you use the validation data for training, the model will fit it with accuracy near 1.0, so the linear combination of models below can only be tuned on data the models have not trained on:
prob = alpha * prob(model1) + (1 - alpha) * prob(model2)
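A minimal sketch of the alpha search for a binary task, assuming you have each model's probabilities on held-out data (`prob1`, `prob2`, and the labels `y_val` are placeholders for your own arrays):

```python
import numpy as np

def find_best_alpha(prob1, prob2, y_val, n_steps=101):
    """Grid-search the blend weight that maximizes held-out accuracy."""
    best_alpha, best_acc = 0.0, -1.0
    for alpha in np.linspace(0.0, 1.0, n_steps):
        blended = alpha * prob1 + (1 - alpha) * prob2
        acc = np.mean((blended > 0.5) == y_val)
        if acc > best_acc:
            best_alpha, best_acc = alpha, acc
    return best_alpha, best_acc
```

The same loop works with any metric (AUC, log loss) by swapping the accuracy line for your competition's scoring function.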
4. Then tackle the learning rate.
V8: Pushing the max LR up to 0.0001 * strategy.num_replicas_in_sync (0.96059)
V9: max LR = 0.00003 * strategy.num_replicas_in_sync (0.95955)
V10: max LR = 0.00006 * strategy.num_replicas_in_sync (0.96114)
V11: LR_EXP_DECAY = .5 (from .8) (0.96256)
V12: LR_EXP_DECAY = .9 (0.96056)
V13: LR_RAMPUP_EPOCHS = 3 and LR_EXP_DECAY = .5 (0.96044)
V14: Manually interrupted
V15: LR_RAMPUP_EPOCHS = 5 and LR_EXP_DECAY = .7
All of the following parameters affect the score:
import tensorflow as tf

# Schedule constants; peak and decay follow the V12/V15 settings above,
# LR_START and LR_MIN are assumed example values — tune per experiment:
LR_START = 0.00001
LR_MAX = 0.00006
LR_MIN = 0.00001
LR_RAMPUP_EPOCHS = 5
LR_SUSTAIN_EPOCHS = 0
LR_EXP_DECAY = 0.7

def lrfn(epoch):
    if epoch < LR_RAMPUP_EPOCHS:
        # Linear warm-up from LR_START to LR_MAX
        lr = (LR_MAX - LR_START) / LR_RAMPUP_EPOCHS * epoch + LR_START
    elif epoch < LR_RAMPUP_EPOCHS + LR_SUSTAIN_EPOCHS:
        # Hold at the peak learning rate
        lr = LR_MAX
    else:
        # Exponential decay toward LR_MIN
        lr = (LR_MAX - LR_MIN) * LR_EXP_DECAY ** (epoch - LR_RAMPUP_EPOCHS - LR_SUSTAIN_EPOCHS) + LR_MIN
    return lr

lr_callback = tf.keras.callbacks.LearningRateScheduler(lrfn, verbose=True)
The learning rate can also be decayed in fixed steps.
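A fixed-step decay is a one-line schedule; the starting rate, drop factor, and interval below are illustrative defaults, and the function can be passed to the same LearningRateScheduler callback:

```python
def step_decay(epoch, lr_start=1e-3, drop=0.5, epochs_per_drop=10):
    """Multiply the learning rate by `drop` every `epochs_per_drop` epochs."""
    return lr_start * (drop ** (epoch // epochs_per_drop))
```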
Using Adam as the optimizer is sufficient.
5. Image size also affects the score; find the size that works best.
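Sweeping candidate sizes can be framed as one more grid search; `evaluate` below is a hypothetical stand-in for a full train-and-validate run at the given resolution, and the sizes and scores are invented for illustration:

```python
def pick_best_size(candidate_sizes, evaluate):
    """Return the image size with the highest validation score, plus all scores."""
    scores = {size: evaluate(size) for size in candidate_sizes}
    return max(scores, key=scores.get), scores

# Stub evaluator with made-up scores, in place of an expensive training run:
stub = {224: 0.958, 384: 0.962, 512: 0.960}.get
best, scores = pick_best_size([224, 384, 512], stub)
print(best)  # 384
```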