1 System environment
Hardware (Ascend/GPU/CPU): Ascend
MindSpore version: 2.2
Execution mode (PyNative/Graph): either
2 Error information
2.1 Problem description
An error is reported when the data is converted to MindRecord format:
For more details, please refer to the FAQ at https://www.mindspore.cn/docs/en/master/faq/data_processing.html
Traceback (most recent call last):
  File "wizardcoder/run_wizardcoder.py", line 149, in <module>
    device_id=args.device_id)
  File "wizardcoder/run_wizardcoder.py", line 81, in main
    task.train(train_checkpoint=ckpt, resume=resume)
  File "/home/wizardcoder/1_wizardcoder-mindformers-916/mindformers/trainer/trainer.py", line 423, in train
    is_full_config=True, **kwargs)
  File "/home/wizardcoder/1_wizardcoder-mindformers-916/mindformers/trainer/causal_language_modeling/causal_language_modeling.py", line 106, in train
    **kwargs)
  File "/home/wizardcoder/1_wizardcoder-mindformers-916/mindformers/trainer/base_trainer.py", line 644, in training_process
    initial_epoch=config.runner_config.initial_epoch)
  File "/root/anaconda3/envs/wizardcoder/lib/python3.7/site-packages/mindspore/train/model.py", line 1066, in train
    initial_epoch=initial_epoch)
  File "/root/anaconda3/envs/wizardcoder/lib/python3.7/site-packages/mindspore/train/model.py", line 113, in wrapper
    func(self, *args, **kwargs)
  File "/root/anaconda3/envs/wizardcoder/lib/python3.7/site-packages/mindspore/train/model.py", line 620, in _train
    cb_params, sink_size, initial_epoch, valid_infos)
  File "/root/anaconda3/envs/wizardcoder/lib/python3.7/site-packages/mindspore/train/model.py", line 703, in _train_dataset_sink_process
    outputs = train_network(*inputs)
  File "/root/anaconda3/envs/wizardcoder/lib/python3.7/site-packages/mindspore/nn/cell.py", line 637, in __call__
    out = self.compile_and_run(*args, **kwargs)
  File "/root/anaconda3/envs/wizardcoder/lib/python3.7/site-packages/mindspore/nn/cell.py", line 961, in compile_and_run
    self.compile(*args, **kwargs)
  File "/root/anaconda3/envs/wizardcoder/lib/python3.7/site-packages/mindspore/nn/cell.py", line 939, in compile
    jit_config_dict=self._jit_config_dict, *compile_args, **kwargs)
  File "/root/anaconda3/envs/wizardcoder/lib/python3.7/site-packages/mindspore/common/api.py", line 1623, in compile
    result = self._graph_executor.compile(obj, args, kwargs, phase, self._use_vm_mode())
  File "/root/anaconda3/envs/wizardcoder/lib/python3.7/site-packages/mindspore/ops/primitive.py", line 647, in infer
    out[track] = fn(*(x[track] for x in args))
  File "/root/anaconda3/envs/wizardcoder/lib/python3.7/site-packages/mindspore/ops/operations/math_ops.py", line 81, in infer_shape
    return get_broadcast_shape(x_shape, y_shape, self.name)
  File "/root/anaconda3/envs/wizardcoder/lib/python3.7/site-packages/mindspore/ops/_utils/utils.py", line 70, in get_broadcast_shape
    raise ValueError(f"For '{prim_name}', {arg_name}.shape and {arg_name2}.shape need to")
ValueError: For 'Mul', x.shape and y.shape need to broadcast. The value of x.shape[-2] or y.shape[-2] must be 1 or -1 when they are not the same, but got x.shape = [2, 2047, 2047] and y.shape = [1, 2048, 2048]
3 Root cause analysis
MindFormers is built on MindSpore. It first tokenizes the data, then converts it to MindRecord format, and trains the model from that MindRecord data; the transformers library does not require such a dedicated data format.
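The shape conflict in the traceback can be reproduced outside of MindSpore, since NumPy applies the same broadcasting rules as the 'Mul' operator. This is only an illustrative sketch using the shapes taken from the error message:

```python
import numpy as np

# Tensors with the shapes reported in the ValueError above.
x = np.zeros((2, 2047, 2047))  # mask built from samples that are one token short
y = np.zeros((1, 2048, 2048))  # model's lower-triangular mask for seq_length = 2048

try:
    _ = x * y  # the same elementwise multiply that fails during graph compilation
except ValueError as e:
    print("broadcast error:", e)

# Broadcasting only succeeds when trailing dimensions are equal or one of them
# is 1, so 2047 vs 2048 cannot be reconciled.
```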
4 Solution
Because of how MindFormers is designed, the tokenized data must be split into chunks of length seq_length + 1 before being saved to MindRecord format. The trainer derives the seq_length-long inputs and labels from each chunk by shifting it one token; this is why the traceback shows 2047 against 2048 when the data is split at seq_length instead. With chunks of seq_length + 1, the shapes match and the error no longer occurs.
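A minimal sketch of this preprocessing step, assuming a flat stream of token ids (the chunking logic is illustrative; the tokenizer and the subsequent MindRecord write are not shown):

```python
SEQ_LENGTH = 2048  # must match seq_length in the training config

def make_chunks(token_ids, seq_length=SEQ_LENGTH):
    """Split a flat token stream into chunks of seq_length + 1 tokens.

    Saving seq_length + 1 tokens per sample lets the trainer build
    inputs = chunk[:-1] and labels = chunk[1:], each of length seq_length,
    so the attention-mask shapes line up during graph compilation.
    """
    step = seq_length + 1
    chunks = [token_ids[i:i + step] for i in range(0, len(token_ids), step)]
    # Drop the trailing partial chunk; padding it up to step is also an option.
    return [c for c in chunks if len(c) == step]

# Each chunk would then be written as an {"input_ids": chunk} record with
# mindspore.mindrecord.FileWriter before training starts.
```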