Tensor2Tensor: Notes on Pitfalls

Latest recommended article published on 2024-08-08 08:22:14

Davidddl

Categories: python, machine learning. Tags: TensorFlow, Tensor2Tensor

Article link: https://blog.csdn.net/Davidddl/article/details/81709243

How to Use Tensor2Tensor (Transformer)

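Setting Up the Environment

Install CUDA 9.0 (it must be 9.0, not 9.2, because the prebuilt TensorFlow 1.8 GPU packages are built against CUDA 9.0)
Install TensorFlow (1.8 at the time of writing)
Install Tensor2Tensor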

Getting Started

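Data preprocessing: write task-specific preprocessing code for your own data, for example string normalization or feature-vector generation.
Write a custom problem:
1. Decorate the custom problem class with @registry.register_problem. Without the decorator the problem is not registered.
2. Name the class in CamelCase and the .py file in the matching snake_case form. The problem is registered under the snake_case version of the class name (MyProblem becomes my_problem), and that is the name passed to --problem.
3. The class must inherit from one of the problem base classes. T2T already ships problems that handle data generation, so decide which category your task falls into and subclass the corresponding parent. Running t2t-datagen prints the list of registered problems.
4. Import the custom problem file in __init__.py.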
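The four rules above can be sketched as a small scaffold script that writes a minimal user directory. This is only an illustration under several assumptions: the directory name scripts, the input file pairs.tsv and the choice of Text2TextProblem as parent class are hypothetical, and the property names follow the text-problem interface of recent T2T releases, so they may differ in the version you have installed.

```python
from pathlib import Path
import textwrap

# Hypothetical names, used only for illustration.
USR_DIR = Path("scripts")
PROBLEM_MODULE = "my_problem"  # snake_case file name -> my_problem.py

# Source of the problem file that the scaffold writes out.
PROBLEM_SOURCE = textwrap.dedent('''\
    import os

    from tensor2tensor.data_generators import problem
    from tensor2tensor.data_generators import text_problems
    from tensor2tensor.utils import registry


    @registry.register_problem  # rule 1: registered as "my_problem"
    class MyProblem(text_problems.Text2TextProblem):  # rules 2 and 3
        """Toy text-to-text problem; pick the parent class by task type."""

        @property
        def approx_vocab_size(self):
            return 2**13  # about 8k subword tokens

        @property
        def is_generate_per_split(self):
            # False: generate_samples runs once and T2T splits the data.
            return False

        @property
        def dataset_splits(self):
            return [
                {"split": problem.DatasetSplit.TRAIN, "shards": 9},
                {"split": problem.DatasetSplit.EVAL, "shards": 1},
            ]

        def generate_samples(self, data_dir, tmp_dir, dataset_split):
            # Assumed input: tmp_dir/pairs.tsv, one "source<TAB>target" per line.
            with open(os.path.join(tmp_dir, "pairs.tsv")) as f:
                for line in f:
                    source, target = line.rstrip("\\n").split("\\t")
                    yield {"inputs": source, "targets": target}
''')


def write_usr_dir(usr_dir=USR_DIR, module=PROBLEM_MODULE):
    """Create usr_dir/<module>.py plus an __init__.py that imports it."""
    usr_dir = Path(usr_dir)
    usr_dir.mkdir(parents=True, exist_ok=True)
    (usr_dir / (module + ".py")).write_text(PROBLEM_SOURCE)
    # Rule 4: the user directory is imported as a module, so the problem is
    # only registered if __init__.py imports the file that defines it.
    (usr_dir / "__init__.py").write_text("from . import {}\n".format(module))
    return usr_dir


if __name__ == "__main__":
    print("wrote", write_usr_dir())
```

The generated my_problem.py is the part that matters; writing it from a script simply keeps the file layout and the __init__.py import visible in one place.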
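Use t2t-datagen to convert the preprocessed data into T2T-formatted datasets:
1. Run t2t-datagen --help or t2t-datagen --helpfull to see the available flags.
2. If the custom problem code emits data in the wrong format, this command fails with an error.
3. Double-check the paths; relative paths are resolved against the current working directory.
4. E.g. cd scripts && t2t-datagen --t2t_usr_dir=./ --data_dir=../train_data --tmp_dir=../tmp_data --problem=my_problem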
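Since most failures at this step come from paths, one way to guard against them is to build the command from Python with every path made absolute. The sketch below assumes only the four flags used in the example above; the helper name build_datagen_cmd is hypothetical and not part of T2T.

```python
import os


def build_datagen_cmd(usr_dir, data_dir, tmp_dir, problem):
    """Return the t2t-datagen argument list with every path made absolute."""
    paths = [
        ("t2t_usr_dir", os.path.abspath(usr_dir)),
        ("data_dir", os.path.abspath(data_dir)),
        ("tmp_dir", os.path.abspath(tmp_dir)),
    ]
    # Fail early with a clear message if the user directory is not importable.
    init_file = os.path.join(paths[0][1], "__init__.py")
    if not os.path.isfile(init_file):
        raise FileNotFoundError("missing " + init_file)
    cmd = ["t2t-datagen"]
    cmd += ["--{}={}".format(flag, path) for flag, path in paths]
    cmd.append("--problem=" + problem)
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True), which then behaves the same regardless of the directory the script is started from.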
Use t2t-trainer to train on the generated dataset:
1. Run t2t-trainer --help or t2t-trainer --helpfull to see the available flags.
2. E.g. cd scripts && t2t-trainer --t2t_usr_dir=./ --problem=my_problem --data_dir=../train_data --model=transformer --hparams_set=transformer_base --output_dir=../output --train_steps=20 --eval_steps=100
3. The main flags are:
  --eval_steps: Number of steps in evaluation. By default, eval will stop after eval_steps or when it runs through the eval dataset once in full, whichever comes first, so this can be a very large number. (default: '100')
  
  --output_dir: Base output directory for run. (default: '')
  
  --t2t_usr_dir: Path to a Python module that will be imported. The __init__.py file should include the necessary imports. The imported files should contain registrations, e.g. @registry.register_model calls, that will then be available to the t2t-trainer.
  
  --keep_checkpoint_every_n_hours: Number of hours between each checkpoint to be saved. The default value 10,000 hours effectively disables it. (default: '10000') (an integer)
  
  --keep_checkpoint_max: How many recent checkpoints to keep. (default: '20') (an integer)
  
  --local_eval_frequency: Save checkpoints and run evaluation every N steps during local training. (default: '1000') (an integer)
  
  --train_steps: The number of steps to run training for. (default: '250000') (an integer)
  
  --warm_start_from: Warm start from checkpoint.
  
  --worker_gpu: How many GPUs to use. (default: '1') (an integer)
Use t2t-decoder to run prediction on the test set:
1. To decode with a specific checkpoint, edit the first line of the checkpoint file in the output directory, model_checkpoint_path: "model.ckpt-xxxx", and change the trailing step number to the checkpoint you want.
2. E.g. cd scripts && t2t-decoder --t2t_usr_dir=./ --problem=my_problem --data_dir=../train_data --model=transformer --hparams_set=transformer_base --output_dir=../output --decode_hparams="beam_size=5,alpha=0.6" --decode_from_file=../decode_in/test_in.txt --decode_to_file=../decode_out/test_out.txt
3. Double-check the paths.
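The manual edit described in item 1 can also be scripted. The following is a minimal sketch, assuming the output directory contains the usual checkpoint index file and that each saved step has a model.ckpt-<step>.index file next to it; the helper name select_checkpoint is hypothetical.

```python
import os


def select_checkpoint(output_dir, step):
    """Point the `checkpoint` file in output_dir at model.ckpt-<step>."""
    name = "model.ckpt-{}".format(step)
    # Refuse to point at a step whose files were never saved or were
    # already deleted by checkpoint rotation.
    if not os.path.exists(os.path.join(output_dir, name + ".index")):
        raise FileNotFoundError(name + " not found in " + output_dir)
    index_path = os.path.join(output_dir, "checkpoint")
    with open(index_path) as f:
        lines = f.read().splitlines()
    # Only the first line (model_checkpoint_path) is rewritten.
    if not lines or not lines[0].startswith("model_checkpoint_path:"):
        raise ValueError("unexpected format in " + index_path)
    lines[0] = 'model_checkpoint_path: "{}"'.format(name)
    with open(index_path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return name
```

For example, select_checkpoint("../output", 15000) followed by the t2t-decoder command above would decode with the step-15000 weights, provided that step is still on disk (see --keep_checkpoint_max).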
Use t2t-exporter to export the trained model
Analyze the results