Environment
- PaddlePaddle 1.6
- CUDA 10.0
- cuDNN 7.6
- Python 3.5.6
- Windows 10
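Running the task
ERNIE Git repository: ERNIE repository
I downloaded the pretrained model and the data and put them into the project. Following the instructions on GitHub, I edited the model path and the data path in script/zh_task/ernie_base/run_msra_ner.sh. (I'm not sure why dev_set has two values.)
After that, the task can be started directly with sh ./script/zh_task/ernie_base/run_msra_ner.sh
Because I wanted to run it from PyCharm, I wrote the arguments into run_sequence_labeling.py instead.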
# Added to run_sequence_labeling.py; os, args, log and the helper
# functions used below are already available in that file.
def set_args(args):
    # Paths to the pretrained model and the task data
    MODEL_PATH = "./ERNIE_1.0/"
    TASK_DATA_PATH = "./data/task_data/"

    # Run mode
    args.use_cuda = True
    args.do_train = True
    args.do_val = True
    args.do_test = True

    # Model and data
    args.batch_size = 16
    args.init_pretraining_params = os.path.join(MODEL_PATH, "params")
    args.num_labels = 7
    args.chunk_scheme = "IOB"
    args.label_map_config = os.path.join(TASK_DATA_PATH, "msra_ner/label_map.json")
    args.train_set = os.path.join(TASK_DATA_PATH, "msra_ner/train.tsv")
    args.dev_set = os.path.join(TASK_DATA_PATH, "msra_ner/dev.tsv")
    args.test_set = os.path.join(TASK_DATA_PATH, "msra_ner/test.tsv")
    args.vocab_path = os.path.join(MODEL_PATH, "vocab.txt")
    args.ernie_config_path = os.path.join(MODEL_PATH, "ernie_config.json")

    # Training schedule and checkpoints
    args.checkpoints = "./checkpoints"
    args.save_steps = 100000
    args.weight_decay = 0.01
    args.warmup_proportion = 0.0
    args.validation_steps = 100
    args.use_fp16 = False
    args.epoch = 6
    args.max_seq_len = 256
    args.learning_rate = 5e-5
    args.skip_steps = 10
    args.num_iteration_per_drop_scope = 1
    args.random_seed = 1
    return args


if __name__ == '__main__':
    prepare_logger(log)
    # Override the parsed command-line arguments with the values above
    args = set_args(args)
    print_arguments(args)
    check_cuda(args.use_cuda)
    main(args)
Running it directly then failed with this error:
2020-02-27 12:41:08,592-WARNING: paddle.fluid.layers.py_reader() may be deprecated in the near future. Please use paddle.fluid.io.DataLoader.from_generator() instead.
[WARNING] 2020-02-27 12:41:08,592 [ io.py: 707]: paddle.fluid.layers.py_reader() may be deprecated in the near future. Please use paddle.fluid.io.DataLoader.from_generator() instead.
Traceback (most recent call last):
File "E:/PyProject/ERNIE-release-r2.1.0/run_sequence_labeling.py", line 385, in <module>
main(args)
File "E:/PyProject/ERNIE-release-r2.1.0/run_sequence_labeling.py", line 106, in main
ernie_config=ernie_config)
File "E:\PyProject\ERNIE-release-r2.1.0\finetune\sequence_label.py", line 79, in create_model
lod_labels = fluid.layers.sequence_unpad(labels, seq_lens)
File "D:\Anaconda3\envs\python35\lib\site-packages\paddle\fluid\layers\sequence_lod.py", line 1013, in sequence_unpad
outputs={'Out': out})
File "D:\Anaconda3\envs\python35\lib\site-packages\paddle\fluid\layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "D:\Anaconda3\envs\python35\lib\site-packages\paddle\fluid\framework.py", line 2525, in append_op
attrs=kwargs.get("attrs", None))
File "D:\Anaconda3\envs\python35\lib\site-packages\paddle\fluid\framework.py", line 1880, in __init__
self.desc.infer_shape(self.block.desc)
paddle.fluid.core_avx.EnforceNotMet:
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
Windows not support stack backtrace yet.
------------------------------------------
Python Call Stacks (More useful to users):
------------------------------------------
File "D:\Anaconda3\envs\python35\lib\site-packages\paddle\fluid\framework.py", line 2525, in append_op
attrs=kwargs.get("attrs", None))
File "D:\Anaconda3\envs\python35\lib\site-packages\paddle\fluid\layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "D:\Anaconda3\envs\python35\lib\site-packages\paddle\fluid\layers\sequence_lod.py", line 1013, in sequence_unpad
outputs={'Out': out})
File "E:\PyProject\ERNIE-release-r2.1.0\finetune\sequence_label.py", line 79, in create_model
lod_labels = fluid.layers.sequence_unpad(labels, seq_lens)
File "E:/PyProject/ERNIE-release-r2.1.0/run_sequence_labeling.py", line 106, in main
ernie_config=ernie_config)
File "E:/PyProject/ERNIE-release-r2.1.0/run_sequence_labeling.py", line 385, in <module>
main(args)
----------------------
Error Message Summary:
----------------------
Error: The shape of Input(Length) should be [batch_size].
[Hint: Expected len_dims.size() == 1, but received len_dims.size():2 != 1:1.] at (D:\1.7.0\paddle\paddle\fluid\operators\sequence_ops\sequence_unpad_op.cc:41)
[operator < sequence_unpad > error]
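To make the error message concrete, here is a minimal pure-Python sketch of the idea behind sequence_unpad: it strips padding from each row using a list of lengths, and it expects those lengths to be one-dimensional (one value per sample). This is not Paddle's implementation. The function names and the example shape [batch_size, 1] are assumptions chosen for illustration; the traceback only tells us that the lengths input had 2 dimensions instead of 1.

```python
def sequence_unpad(padded, lengths):
    """Illustrative stand-in (not Paddle's op): drop padding from each row."""
    # Mirror the check in the error message: lengths must have shape [batch_size].
    if any(isinstance(n, (list, tuple)) for n in lengths):
        raise ValueError("The shape of Input(Length) should be [batch_size].")
    if len(padded) != len(lengths):
        raise ValueError("Expected one length per row.")
    return [row[:n] for row, n in zip(padded, lengths)]


def flatten_lengths(lengths):
    """Hypothetical helper: turn [[3], [1]] (shape [batch_size, 1]) into [3, 1]."""
    return [n[0] if isinstance(n, (list, tuple)) else n for n in lengths]


padded_labels = [[1, 2, 3, 0], [4, 0, 0, 0]]  # 0 is the padding value
seq_lens_2d = [[3], [1]]                      # 2 dimensions: rejected
seq_lens_1d = flatten_lengths(seq_lens_2d)    # 1 dimension: accepted

unpadded = sequence_unpad(padded_labels, seq_lens_1d)
print(unpadded)  # [[1, 2, 3], [4]]
```

Passing seq_lens_2d instead of seq_lens_1d raises the ValueError, which is the same kind of shape mismatch that the operator reports above.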
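I couldn't find anything about this error online. Later I joined the ERNIE discussion group (group number: 760439550) and asked there. The answer was that this is a Paddle version incompatibility with the sequence_unpad op: paddle 1.5.2 works, but paddle 1.6 does not.
When I went to install it the next day, I noticed that paddlepaddle 1.7 had been released, so I tried 1.7 first. It had the same problem.
So I went ahead and installed paddlepaddle 1.5.2, and that resolved the problem.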
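To catch this earlier next time, a small version guard could run before training starts. The sketch below only encodes what this post observed (1.5.2 works; 1.6 and 1.7 fail with the sequence_unpad error). The function names and the "untested" category are my own assumptions, not part of ERNIE or Paddle.

```python
import re


def parse_version(version):
    """Parse a version string such as '1.5.2' into a tuple like (1, 5, 2)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def classify_paddle_version(version):
    """Classify a version by the results reported in this post."""
    major, minor, patch = parse_version(version)
    if (major, minor, patch) == (1, 5, 2):
        return "works"
    if (major, minor) in ((1, 6), (1, 7)):
        return "fails"
    return "untested"


for v in ("1.5.2", "1.6.0", "1.7.0"):
    print(v, classify_paddle_version(v))
```

In the training script this could be called as classify_paddle_version(paddle.__version__) after import paddle, printing a warning when the result is not "works".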