TensorFlow Estimator API: checkpoint save behavior during training and checkpoint skip behavior during validation

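This post walks through how checkpoints are saved when training with the TensorFlow Estimator API, and how the validation phase skips evaluation when the checkpoint has not changed. Under `experiment.train_and_evaluate()`, the training part goes through `experiment.train()`, which calls `estimator._train_model()` and relies on CheckpointSaverHook to save the model periodically. For the validation part, a policy such as `SecondOrStepTimer.should_trigger_for_step` determines whether a new checkpoint has been saved by the time validation runs; if none has, the evaluation for that step is skipped.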
INFO:tensorflow:Create CheckpointSaverHook.
2018-01-15 16:24:33.513942: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-01-15 16:24:34.390763: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:89:00.0
totalMemory: 10.91GiB freeMemory: 10.75GiB
2018-01-15 16:24:34.390813: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:89:00.0, compute capability: 6.1)
2018-01-15 16:25:58.010092: I tensorflow/core/kernels/shuffle_dataset_op.cc:110] Filling up shuffle buffer (this may take a while): 499 of 1000
2018-01-15 16:26:07.689469: I tensorflow/core/kernels/shuffle_dataset_op.cc:121] Shuffle buffer filled.
INFO:tensorflow:Saving checkpoints for 1 into /train/mymodels/model.ckpt.
INFO:tensorflow:loss = 22.2663, step = 1
......
DEBUG:tensorflow:Skipping evaluation due to same checkpoint /train/mymodels/model.ckpt-1 for step 100 as for step 50.

The execution flow is as follows:

experiment.train_and_evaluate()

# Validation is implemented as a hook (a ValidationMonitor added to the training monitors):
if self._min_eval_frequency:
   self._train_monitors += [
       monitors.ValidationMonitor(
           input_fn=self._eval_input_fn,
           eval_steps=self._eval_steps,
           metrics=self._eval_metrics,
           every_n_steps=self._min_eval_frequency,
           name=eval_dir_suffix,
           hooks=self._eval_hooks)
   ]

# The training part ultimately calls estimator._train_model(); note that the very first training step saves a checkpoint!
self.train(delay_secs=0)
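
To make the "Skipping evaluation due to same checkpoint" message in the log above more concrete, here is a minimal sketch of the check the validation monitor appears to perform, judging by that message: evaluate only when the newest checkpoint path differs from the one evaluated last time. This is a simplified pure-Python model, not the library's code. `EvalSkipCheck` and `latest_checkpoint_fn` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Optional


class EvalSkipCheck:
    """Toy model of 'skip evaluation if the checkpoint has not changed'.

    Hypothetical helper for illustration; not part of TensorFlow.
    """

    def __init__(self, latest_checkpoint_fn: Callable[[], Optional[str]]):
        # Assumption: a callable that returns the newest checkpoint path, or None.
        self._latest_checkpoint_fn = latest_checkpoint_fn
        self._last_path: Optional[str] = None  # checkpoint evaluated last time
        self._last_step: Optional[int] = None  # step at which it was evaluated

    def should_evaluate(self, step: int) -> bool:
        path = self._latest_checkpoint_fn()
        if not path:
            # Nothing has been saved yet, so there is nothing to evaluate.
            return False
        if path == self._last_path:
            print("Skipping evaluation due to same checkpoint %s for step %d "
                  "as for step %d." % (path, step, self._last_step))
            return False
        self._last_path, self._last_step = path, step
        return True


# Only one checkpoint exists (the one saved at step 1), as in the log above.
check = EvalSkipCheck(lambda: "/train/mymodels/model.ckpt-1")
results = [check.should_evaluate(step) for step in (50, 100)]
print(results)
```

With a validation interval of 50 steps, the first call (step 50) evaluates `model.ckpt-1`, and the second call (step 100) finds the same path and skips, which reproduces the wording of the debug line in the log.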

Training part

experiment.train(delay_secs=0) -> experiment._estimator.train-> estimator._train_model()

# Code of estimator._train_model()
# ...
      # 1. Add loss monitoring (via hooks ...
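
Why the checkpoint is still `model.ckpt-1` at steps 50 and 100 is easier to see with a small model of a timer that fires on elapsed seconds or elapsed steps. The sketch below is a simplified re-implementation written for this post, not the TensorFlow source. `SimpleSecondOrStepTimer`, the injected `clock`, the 600-second interval and the 0.5 seconds per step are all illustrative assumptions, not values taken from the run above.

```python
import time
from typing import Callable, Optional


class SimpleSecondOrStepTimer:
    """Simplified stand-in for a 'seconds or steps' trigger (illustration only)."""

    def __init__(self, every_secs: Optional[float] = None,
                 every_steps: Optional[int] = None,
                 clock: Callable[[], float] = time.time):
        if (every_secs is None) == (every_steps is None):
            raise ValueError("Provide exactly one of every_secs or every_steps.")
        self._every_secs = every_secs
        self._every_steps = every_steps
        self._clock = clock  # injectable so the example runs without waiting
        self._last_time: Optional[float] = None
        self._last_step: Optional[int] = None

    def should_trigger_for_step(self, step: int) -> bool:
        if self._last_step is None:
            return True   # never triggered before: the first call always fires
        if step == self._last_step:
            return False  # never fire twice for the same step
        if self._every_secs is not None:
            return self._clock() >= self._last_time + self._every_secs
        return step >= self._last_step + self._every_steps

    def update_last_triggered_step(self, step: int) -> None:
        self._last_time = self._clock()
        self._last_step = step


# Simulate 100 training steps at an assumed 0.5 s per step with a
# time-based save interval (assumed 600 s).
fake_now = [0.0]
timer = SimpleSecondOrStepTimer(every_secs=600, clock=lambda: fake_now[0])
saved_steps = []
for step in range(1, 101):
    if timer.should_trigger_for_step(step):
        timer.update_last_triggered_step(step)
        saved_steps.append(step)  # a checkpoint would be written here
    fake_now[0] += 0.5
print(saved_steps)
```

Under these assumptions only step 1 triggers a save, because the remaining 99 steps finish long before the interval elapses. A validation pass at step 50 and another at step 100 would therefore both see the same checkpoint. Switching to a step-based interval (`every_steps`) no larger than the validation interval would give each validation pass a fresh checkpoint.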