Deep learning models can take hours, days or even weeks to train, and if a training run is stopped unexpectedly, you can lose a lot of work. In this lesson you will discover how you can checkpoint your deep learning models during training in Python using the Keras library. After completing this lesson you will know:
- The importance of checkpointing neural network models when training.
- How to checkpoint each improvement to a model during training.
- How to checkpoint the very best model observed during training.
1.1 Checkpointing Neural Network Models
Application checkpointing is a fault tolerance technique for long-running processes. It is an approach where a snapshot of the state of the system is taken in case of system failure. If there is a problem, not all is lost. The checkpoint may be used directly, or used as the starting point for a new run, picking up where it left off. When training deep learning models, the checkpoint captures the weights of the model. These weights can be used to make predictions as-is, or used as the basis for ongoing training.
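The idea is independent of any deep learning library. As a minimal sketch in plain Python, the loop below saves its state to a JSON file after every step and resumes from that file if one exists. The file name state.json, the helper names load_state and run, and the simulated crash are all hypothetical, chosen only for illustration.

```python
import json
import os
import tempfile

def load_state(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}

def run(path, n_steps, fail_at=None):
    """Accumulate a running total, checkpointing after every step."""
    state = load_state(path)
    for step in range(state["step"], n_steps):
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated crash")
        state["total"] += step
        state["step"] = step + 1
        # Snapshot the state so a later run can pick up from here
        with open(path, "w") as f:
            json.dump(state, f)
    return state

checkpoint_path = os.path.join(tempfile.mkdtemp(), "state.json")
try:
    run(checkpoint_path, n_steps=10, fail_at=6)  # stops part-way through
except RuntimeError:
    pass
resumed = run(checkpoint_path, n_steps=10)  # continues from the last snapshot
print(resumed)
```

The second call does not repeat the steps completed before the crash, which is the same benefit a weights checkpoint gives a long training run.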
The Keras library provides a checkpointing capability through its callback API. The ModelCheckpoint callback class allows you to define where to checkpoint the model weights, how the file should be named, and under what circumstances to make a checkpoint of the model. The API allows you to specify which metric to monitor, such as loss or accuracy on the training or validation dataset. You can specify whether to look for an improvement in maximizing or minimizing the score. Finally, the filename that you use to store the weights can include variables like the epoch number or metric. The ModelCheckpoint instance can then be passed to the training process when calling the fit() function on the model. Note that you may need to install the h5py library to write files in HDF5 format.
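The variables in the checkpoint filename are written using Python's format-string syntax. The sketch below uses plain str.format to show what such a template expands to. The epoch number and score are made-up values, and treating the expansion as equivalent to str.format is an assumption made for illustration rather than a statement about Keras internals.

```python
# Filename template with placeholders for the epoch and the monitored metric
template = "weights-improvement-{epoch:02d}-{val_accuracy:.2f}.hdf5"

# Hypothetical values, as they might look at the end of an epoch
filename = template.format(epoch=5, val_accuracy=0.7283)
print(filename)
```

The :02d specifier zero-pads the epoch to two digits and :.2f rounds the score to two decimal places, so the files sort in a readable order in a directory listing.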
1.2 Checkpoint Neural Network Model Improvements
A good use of checkpointing is to output the model weights each time an improvement is observed during training.
Checkpointing is set up to save the network weights only when there is an improvement in classification accuracy on the validation dataset (monitor='val_accuracy' and mode='max'). The weights are stored in a file that includes the epoch and the score in the filename: weights-improvement-{epoch:02d}-{val_accuracy:.2f}.hdf5.
# Checkpoint Model Improvements
# Checkpoint the weights when validation accuracy improves
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import ModelCheckpoint
import numpy as np
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)
# load pima indians dataset
dataset = np.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
# create model
model = Sequential()
model.add(Dense(12, input_dim=8, kernel_initializer='normal', activation='relu'))
model.add(Dense(8, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# checkpoint: write a new file each time validation accuracy reaches a new best
# (save_weights_only=True stores just the weights rather than the full model)
filepath = "weights-improvement-{epoch:02d}-{val_accuracy:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', verbose=1, save_best_only=True, save_weights_only=True, mode='max')
callbacks_list = [checkpoint]
# Fit the model
model.fit(X, Y, validation_split=0.33, epochs=150, batch_size=10, callbacks=callbacks_list, verbose=0)
Running the example prints a message each time an improvement in the model accuracy on the validation dataset results in a new weight file being written to disk.

You will also see a number of files in your working directory containing the network weights in HDF5 format, each named with the epoch and validation accuracy at which it was saved.

This is a very simple checkpointing strategy. It may create a lot of unnecessary checkpoint files if the validation accuracy moves up and down over training epochs. Nevertheless, it will ensure that you have a snapshot of the best model discovered during your run.
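To see how quickly the files can accumulate, the sketch below mimics the save-on-improvement rule in plain Python. It is not the Keras implementation: the function name epochs_that_checkpoint and the list of validation accuracies are invented for illustration.

```python
def epochs_that_checkpoint(scores, mode="max"):
    """Return the 1-based epochs at which a new best score is observed."""
    if mode not in ("max", "min"):
        raise ValueError("mode must be 'max' or 'min'")
    best = None
    saved = []
    for epoch, score in enumerate(scores, start=1):
        improved = (
            best is None
            or (mode == "max" and score > best)
            or (mode == "min" and score < best)
        )
        if improved:
            best = score
            saved.append(epoch)
    return saved

# Hypothetical validation accuracy for eight epochs
val_accuracy = [0.65, 0.67, 0.66, 0.70, 0.69, 0.71, 0.71, 0.74]
saved_epochs = epochs_that_checkpoint(val_accuracy, mode="max")
print(saved_epochs)
```

Each epoch in the returned list would write its own file when the filename includes the epoch number, yet only the last of them holds the best weights.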
1.3 Checkpoint Best Neural Network Model Only
A simpler checkpoint strategy is to save the model weights to the same file, if and only if the validation accuracy improves. This can be done easily using the same code from above and changing the output filename to be fixed (not include score or epoch information). In this case, model weights are written to the file weights.best.hdf5 only if the classification accuracy of the model on the validation dataset improves over the best seen so far.
# Checkpoint Best Model Only
# Checkpoint the weights for the best model on validation accuracy
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import ModelCheckpoint
import numpy as np
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)
# load pima indians dataset
dataset = np.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
# create model
model = Sequential()
model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# checkpoint: overwrite a single file whenever validation accuracy reaches a new best
# (save_weights_only=True stores just the weights rather than the full model)
filepath = "weights.best.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', verbose=1, save_best_only=True, save_weights_only=True, mode='max')
callbacks_list = [checkpoint]
# Fit the model
model.fit(X, Y, validation_split=0.33, epochs=150, batch_size=10, callbacks=callbacks_list, verbose=0)

You should see the weight file in your local directory.

1.4 Loading a Saved Neural Network Model
Now that you have seen how to checkpoint your deep learning models during training, you need to review how to load and use a checkpointed model. The checkpoint only includes the model weights. It assumes you know the network structure. The structure too can be serialized to file in JSON or YAML format. In the example below, the model structure is known and the best weights are loaded from the previous experiment, stored in the working directory in the weights.best.hdf5 file. The model is then evaluated on the entire dataset.
# Load and Evaluate a Model Checkpoint
# How to load and use weights from a checkpoint
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)
# create model (must match the structure used when the weights were saved)
model = Sequential()
model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
# load weights
model.load_weights("weights.best.hdf5")
# Compile model (required to evaluate the model)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print("Created model and loaded weights from file")
# load pima indians dataset
dataset = np.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
# estimate accuracy on whole dataset using loaded weights
scores = model.evaluate(X, Y, verbose=0)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1] * 100))
Running the example produces the following output:
Created model and loaded weights from file
accuracy: 72.66%
1.5 Summary
In this lesson you have discovered the importance of checkpointing deep learning models for long training runs. You learned:
- How to use Keras to checkpoint each time an improvement to the model is observed.
- How to only checkpoint the very best model observed during training.
- How to load a checkpointed model from file and use it later to make predictions.
1.5.1 Next
You now know how to checkpoint your deep learning models in Keras during long training runs. In the next lesson you will discover how to collect, inspect and plot metrics collected about your model during training.
