hellboge

Fine-tuning a model from an existing checkpoint

Rather than training from scratch, we'll often want to start from a pre-trained model and fine-tune it. To indicate a checkpoint from which to fine-tune, we'll call training with the --checkpoint_path flag and assign it an absolute path to a checkpoint file.

When fine-tuning a model, we need to be careful about restoring checkpoint weights. In particular, when we fine-tune a model on a new task with a different number of output labels, we won't be able to restore the final logits (classifier) layer. For this, we'll use the --checkpoint_exclude_scopes flag, which prevents certain variables from being loaded. When fine-tuning on a classification task using a different number of classes than the trained model, the new model will have a final 'logits' layer whose dimensions differ from the pre-trained model. For example, if fine-tuning an ImageNet-trained model on Flowers, the pre-trained logits layer will have dimensions [2048 x 1001] but our new logits layer will have dimensions [2048 x 5]. Consequently, this flag tells TF-Slim to avoid loading these weights from the checkpoint.
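
To make the effect of the exclusion flag concrete, here is a minimal sketch of scope-prefix filtering in plain Python. It is not TF-Slim's own code: the helper `variables_to_restore` and the variable names are illustrative assumptions, and only the idea (skip any variable whose name starts with an excluded scope) is taken from the description above.

```python
def variables_to_restore(variable_names, exclude_scopes):
    """Return the names that do not fall under any excluded scope prefix."""
    # Parse the comma-separated flag value into a list of scope prefixes.
    exclusions = [s.strip() for s in exclude_scopes.split(",") if s.strip()]
    return [
        name for name in variable_names
        if not any(name.startswith(prefix) for prefix in exclusions)
    ]


# Illustrative variable names (assumed for this sketch).
names = [
    "InceptionV3/Conv2d_1a_3x3/weights",
    "InceptionV3/Mixed_7c/Branch_0/Conv2d_0a_1x1/weights",
    "InceptionV3/Logits/Conv2d_1c_1x1/weights",
    "InceptionV3/AuxLogits/Conv2d_2b_1x1/weights",
]
restored = variables_to_restore(
    names, "InceptionV3/Logits,InceptionV3/AuxLogits"
)
for name in restored:
    print("restore from checkpoint:", name)
```

Only the two non-logits variables are kept for restoring. The logits variables keep their fresh initialization, which sidesteps the [2048 x 1001] versus [2048 x 5] shape mismatch.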

Keep in mind that warm-starting from a checkpoint affects the model's weights only during the initialization of the model. Once a model has started training, a new checkpoint will be created in ${TRAIN_DIR}. If the fine-tuning training is stopped and restarted, this new checkpoint will be the one from which weights are restored, not ${CHECKPOINT_PATH}. Consequently, the flags --checkpoint_path and --checkpoint_exclude_scopes are only used during the 0-th global step (model initialization). Typically, when fine-tuning, one only wants to train a subset of layers, so the flag --trainable_scopes lets you specify which subsets of layers should be trained; the rest remain frozen.
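
The precedence described above can be sketched as a small decision function. This is a simplified illustration, not the training script's logic: the helper `pick_restore_source` is hypothetical, and it assumes that a training run leaves a state file named "checkpoint" in the training directory.

```python
import os
import tempfile


def pick_restore_source(train_dir, checkpoint_path):
    """Decide where the initial weights come from (simplified sketch)."""
    # Assumption: an earlier run left a "checkpoint" state file in train_dir.
    if os.path.exists(os.path.join(train_dir, "checkpoint")):
        return ("resume", train_dir)
    if checkpoint_path:
        return ("warm_start", checkpoint_path)
    return ("from_scratch", None)


pretrained = "/tmp/my_checkpoints/inception_v3.ckpt"
with tempfile.TemporaryDirectory() as train_dir:
    # First launch: the training directory is empty, so warm-start.
    first = pick_restore_source(train_dir, pretrained)
    # Simulate an interrupted run that already wrote a checkpoint.
    open(os.path.join(train_dir, "checkpoint"), "w").close()
    second = pick_restore_source(train_dir, pretrained)

print(first[0], second[0])
```

On the second launch the pre-trained path is ignored, which is why the warm-start flags only matter at global step 0.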

Below we give an example of fine-tuning inception_v3 on flowers. inception_v3 was trained on ImageNet with 1000 class labels, but the flowers dataset only has 5 classes. Since the dataset is quite small, we will only train the new layers.

$ DATASET_DIR=/tmp/flowers
$ TRAIN_DIR=/tmp/flowers-models/inception_v3
$ CHECKPOINT_PATH=/tmp/my_checkpoints/inception_v3.ckpt
$ python train_image_classifier.py \
    --train_dir=${TRAIN_DIR} \
    --dataset_dir=${DATASET_DIR} \
    --dataset_name=flowers \
    --dataset_split_name=train \
    --model_name=inception_v3 \
    --checkpoint_path=${CHECKPOINT_PATH} \
    --checkpoint_exclude_scopes=InceptionV3/Logits,InceptionV3/AuxLogits \
    --trainable_scopes=InceptionV3/Logits,InceptionV3/AuxLogits
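
In the command above, the trainable scopes are the same two scopes that were excluded from restoring, so only the freshly initialized layers are updated. A rough sketch of that split follows. The helper `split_trainable` and the variable names are assumptions made for illustration, and treating an unset flag as "train everything" is also an assumption of this sketch.

```python
def split_trainable(variable_names, trainable_scopes=None):
    """Split names into (trainable, frozen) by scope prefix."""
    if trainable_scopes is None:
        # Assumption: with no flag given, every variable is trained.
        return list(variable_names), []
    scopes = [s.strip() for s in trainable_scopes.split(",") if s.strip()]
    trainable, frozen = [], []
    for name in variable_names:
        if any(name.startswith(scope) for scope in scopes):
            trainable.append(name)
        else:
            frozen.append(name)
    return trainable, frozen


# Illustrative variable names (assumed for this sketch).
names = [
    "InceptionV3/Conv2d_1a_3x3/weights",
    "InceptionV3/Logits/Conv2d_1c_1x1/weights",
    "InceptionV3/AuxLogits/Conv2d_2b_1x1/weights",
]
trainable, frozen = split_trainable(
    names, "InceptionV3/Logits,InceptionV3/AuxLogits"
)
print("trainable:", trainable)
print("frozen:", frozen)
```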

Evaluating performance of a model

To evaluate the performance of a model (whether pretrained or your own), you can use the eval_image_classifier.py script, as shown below.

Below we give an example of evaluating a previously downloaded pretrained Inception model on the ImageNet dataset.

$ CHECKPOINT_FILE=${CHECKPOINT_DIR}/inception_v3.ckpt  # Example
$ python eval_image_classifier.py \
    --alsologtostderr \
    --checkpoint_path=${CHECKPOINT_FILE} \
    --dataset_dir=${DATASET_DIR} \
    --dataset_name=imagenet \
    --dataset_split_name=validation \
    --model_name=inception_v3

See the evaluation module example for how to evaluate a model at multiple checkpoints during or after training.

Exporting the Inference Graph

The export_inference_graph.py script saves out a GraphDef containing the architecture of the model.

To use it with a model name defined by slim, run:

$ python export_inference_graph.py \
  --alsologtostderr \
  --model_name=inception_v3 \
  --output_file=/tmp/inception_v3_inf_graph.pb

$ python export_inference_graph.py \
  --alsologtostderr \
  --model_name=mobilenet_v1 \
  --image_size=224 \
  --output_file=/tmp/mobilenet_v1_224.pb

Freezing the exported Graph

If you then want to use the resulting model with your own or pretrained checkpoints as part of a mobile model, you can run freeze_graph to get a graph def with the variables inlined as constants using:

bazel build tensorflow/python/tools:freeze_graph

bazel-bin/tensorflow/python/tools/freeze_graph \
  --input_graph=/tmp/inception_v3_inf_graph.pb \
  --input_checkpoint=/tmp/checkpoints/inception_v3.ckpt \
  --input_binary=true \
  --output_graph=/tmp/frozen_inception_v3.pb \
  --output_node_names=InceptionV3/Predictions/Reshape_1

The output node names will vary depending on the model, but you can inspect and estimate them using the summarize_graph tool:

bazel build tensorflow/tools/graph_transforms:summarize_graph

bazel-bin/tensorflow/tools/graph_transforms/summarize_graph \
  --in_graph=/tmp/inception_v3_inf_graph.pb
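
One plausible heuristic for estimating output nodes is to look for nodes that no other node consumes. The sketch below applies that idea to a toy graph. It is not the summarize_graph implementation: the helper `guess_output_nodes` is hypothetical and the graph is a heavily simplified, assumed stand-in for a real GraphDef.

```python
def guess_output_nodes(graph):
    """graph maps each node name to the list of its input node names."""
    # Collect every node that is used as an input somewhere.
    consumed = {inp for inputs in graph.values() for inp in inputs}
    # Nodes that nothing consumes are candidate outputs.
    return sorted(name for name in graph if name not in consumed)


# Toy graph, assumed for illustration only.
toy_graph = {
    "input": [],
    "InceptionV3/Logits/SpatialSqueeze": ["input"],
    "InceptionV3/Predictions/Softmax": ["InceptionV3/Logits/SpatialSqueeze"],
    "InceptionV3/Predictions/Reshape_1": ["InceptionV3/Predictions/Softmax"],
}
candidates = guess_output_nodes(toy_graph)
print(candidates)
```

For the toy graph, the only candidate is the final reshape node, which matches the name passed to --output_node_names earlier.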

Run label image in C++

To run the resulting graph in C++, you can look at the label_image sample code:

bazel build tensorflow/examples/label_image:label_image

bazel-bin/tensorflow/examples/label_image/label_image \
  --image=${HOME}/Pictures/flowers.jpg \
  --input_layer=input \
  --output_layer=InceptionV3/Predictions/Reshape_1 \
  --graph=/tmp/frozen_inception_v3.pb \
  --labels=/tmp/imagenet_slim_labels.txt \
  --input_mean=0 \
  --input_std=255
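
The --input_mean and --input_std values control how pixel values are scaled before they reach the graph. Assuming the sample applies (pixel - input_mean) / input_std to each value, the settings above map 8-bit pixels into the range [0, 1]. The helper `normalize_pixel` below is a hypothetical name used only for this sketch.

```python
def normalize_pixel(value, input_mean=0.0, input_std=255.0):
    """Scale one pixel value, assuming (value - mean) / std."""
    return (value - input_mean) / input_std


# With mean 0 and std 255, the 8-bit range maps onto [0, 1].
scaled = [normalize_pixel(v) for v in (0, 255)]
print(scaled)

# A different, purely illustrative choice: mean 128 and std 128
# centers the values around zero instead.
centered = [normalize_pixel(v, 128.0, 128.0) for v in (0, 128, 255)]
print(centered)
```

Whichever values are used, they should match the preprocessing the model saw during training.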