1. System environment
Hardware environment (Ascend/GPU/CPU): Ascend
Execution mode: static graph
Python version: 3.7
Operating system: Linux
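2. Error information
2.1 Problem description
When running model parallelism with MindSpore, a randomly initialized model runs normally, but loading a pretrained model raises a NumPy error. The call stack shows that the failing call is reached from the parameter-slicing code. Error: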
ValueError: array split does not result in an equal division
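The message itself comes from NumPy: `np.split` raises it whenever the axis being split is not evenly divisible by the requested number of sections. The following is a minimal sketch that reproduces the message outside MindSpore. The weight shape, the device count, and the helper `can_split_evenly` are hypothetical and chosen only for illustration. They are not taken from the failing model.

```python
import numpy as np


def can_split_evenly(shape, axis, num_slices):
    """Return True if dimension `axis` of `shape` divides evenly into `num_slices`."""
    return shape[axis] % num_slices == 0


# Hypothetical checkpoint weight: 10 rows cannot be shared equally by 4 devices.
weight = np.zeros((10, 8), dtype=np.float32)
num_devices = 4  # assumed model-parallel degree, for illustration only

try:
    np.split(weight, num_devices, axis=0)
    error_message = None
except ValueError as exc:
    # np.split requires an exact division along the chosen axis.
    error_message = str(exc)

print(error_message)
print(can_split_evenly(weight.shape, 0, num_devices))
```

If the same check is run over the shapes stored in the pretrained checkpoint, any parameter whose sliced dimension is not a multiple of the slice count is a likely candidate for the failure, assuming the sharding strategy splits along that axis.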
Error message:
Traceback (most recent call last):
  File "/opt/huawei/schedule-train/algorithm/*/main.py", line 701, in <module>
    main(config_)
  File "/opt/huawei/schedule-train/algorithm/*/main.py", line 673, in main
    train_prompt, train_actor_logprobs, train_sft_logprobs, train_critic_r, train_reward_r)
  File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/mindspore/nn/cell.py", line 636, in __call__
    out = self.compile_and_run(*args, **kwargs)
  File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/mindspore/nn/cell