Fixing the "CANN Common Error Analysis" error when installing MindSpore

skywalk8163

Last modified 2024-09-12 17:22:58


First published 2024-09-12 10:34:47

Article link: https://blog.csdn.net/skywalk8163/article/details/142166029


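While installing mindformers and upgrading MindSpore, I ran into a "CANN Common Error Analysis" error and went looking for a fix.

For the background, see: Installing a Python 3.10 environment in the OpenI (启智社区) debugging environment (unfinished) - CSDN blog

Reproducing the failure

Install Python 3.10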

conda create --name py310 python=3.10

Update conda

conda update -n base -c conda-forge conda

Install MindSpore

conda install mindspore=2.3.1 -c mindspore -c conda-forge

The conda install was slow, so I switched to installing with pip

pip install https://ms-release.obs.cn-north-4.myhuaweicloud.com/2.3.1/MindSpore/unified/aarch64/mindspore-2.3.1-cp310-cp310-linux_aarch64.whl --trusted-host ms-release.obs.cn-north-4.myhuaweicloud.com -i https://pypi.tuna.tsinghua.edu.cn/simple

Test

python -c "import mindspore as ms; x = ms.Tensor((2,3)); y = x + 1; print(y)"

This failed

Installing manually with pip

export MS_VERSION=2.3.0
pip install https://ms-release.obs.cn-north-4.myhuaweicloud.com/${MS_VERSION}/MindSpore/unified/aarch64/mindspore-${MS_VERSION/-/}-cp310-cp310-linux_aarch64.whl --trusted-host ms-release.obs.cn-north-4.myhuaweicloud.com -i https://pypi.tuna.tsinghua.edu.cn/simple
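The two pip commands above share one URL pattern: release version, then MindSpore/unified/<arch>, then a wheel name carrying the Python tag. As a minimal sketch, assuming that pattern holds for other versions too (not checked here), the URL can be built in Python. The helper name mindspore_wheel_url and its parameters are hypothetical, introduced only for illustration.

```python
def mindspore_wheel_url(version: str, py_tag: str = "cp310", arch: str = "aarch64") -> str:
    """Build a wheel URL following the pattern of the pip commands in this post.

    Assumption: other versions, Python tags and architectures follow the same layout.
    """
    base = "https://ms-release.obs.cn-north-4.myhuaweicloud.com"
    # Mirror the shell expansion ${MS_VERSION/-/}: drop the first hyphen only.
    file_version = version.replace("-", "", 1)
    wheel = f"mindspore-{file_version}-{py_tag}-{py_tag}-linux_{arch}.whl"
    return f"{base}/{version}/MindSpore/unified/{arch}/{wheel}"


print(mindspore_wheel_url("2.3.1"))
```

Printing the URL before handing it to pip makes it easy to spot a mismatched Python tag (cp310 versus cp39) or architecture.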

The test failed

So I picked an environment that ships MindSpore 2.2 and tried upgrading it to 2.3

Staying on Python 3.9

pip install --upgrade mindspore==2.3.1 -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple

Upgrade mindformers

pip install mindformers -U

The test passed

python -c "import mindspore as ms; x = ms.Tensor((2,3)); y = x + 1; print(y)"

But I later found that it was running on the CPU

print(mindspore.context.get_context("device_target"))  
CPU

ms.run_check()
MindSpore version:  2.3.1
The result of multiplication calculation is correct, MindSpore has been installed on platform [CPU] successfully!
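The CPU fallback above is easy to miss, because the import and the small tensor test both succeed. A guard at the top of a script can make it obvious. This is a minimal sketch: current_device_target is a hypothetical helper, and it assumes mindspore.get_context is available as in MindSpore 2.x.

```python
import importlib.util


def current_device_target():
    """Return MindSpore's device target, or None when MindSpore is not installed."""
    if importlib.util.find_spec("mindspore") is None:
        return None
    import mindspore as ms  # imported lazily so the check also runs without MindSpore
    return ms.get_context("device_target")


target = current_device_target()
if target is None:
    print("mindspore is not installed in this environment")
elif target != "Ascend":
    print(f"Running on {target}, not on the NPU")
else:
    print("Running on Ascend")
```

If the target is CPU, mindspore.set_context(device_target="Ascend") asks for the NPU explicitly. Whether that succeeds depends on the CANN installation in the environment, which this post does not resolve.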

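Test script

import mindspore
from mindformers import AutoConfig, AutoModel, AutoTokenizer

# Use graph mode (mode=0) and select the id of the card to run on
mindspore.set_context(mode=0, device_id=0)

tokenizer = AutoTokenizer.from_pretrained('glm2_6b')

# There are two ways to instantiate the model; pick one of them
# 1. Instantiate directly from the default configuration
model = AutoModel.from_pretrained('glm2_6b')
# 2. Instantiate after customizing the configuration
config = AutoConfig.from_pretrained('glm2_6b')
config.use_past = True                  # override the default: incremental inference speeds up generation
# config.xxx = xxx                      # adjust other model settings as needed
model = AutoModel.from_config(config)   # instantiate the model from the customized configuration

inputs = tokenizer("你好")["input_ids"]
# The first model.generate() call includes graph compilation time, so its timing is not representative;
# call it several times to get an accurate measure of inference performance
outputs = model.generate(inputs, max_new_tokens=20, do_sample=True, top_k=3)
response = tokenizer.decode(outputs)
print(response)
# ['你好，作为一名人工智能助手，我欢迎您随时向我提问。']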

This basically worked, although the download was rather slow

Downloading:  50%|██████████████████████▌                      | 6.26G/12.5G [09:07<09:43, 10.7MB/s]

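But since it ran on the CPU, this counts as a failure.

Figuring out how to use the NPU

I tried switching to an environment that ships MindSpore 2.3

and upgrading on top of it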

pip install --upgrade mindspore

import mindspore as ms
ms.run_check()

!pip install mindformers

import mindspore
import mindformers

!git clone https://gitee.com/mindspore/mindformers

!cd mindformers && bash scripts/examples/glm3/run_glm3_predict.sh single configs/glm3/predict_glm3_6b.yaml

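Code

import mindspore
from mindformers import AutoConfig, AutoModel, AutoTokenizer

# Use graph mode (mode=0) and select the id of the card to run on
mindspore.set_context(mode=0, device_id=0)

tokenizer = AutoTokenizer.from_pretrained('glm3_6b')

# There are two ways to instantiate the model; pick one of them
# 1. Instantiate directly from the default configuration
model = AutoModel.from_pretrained('glm3_6b')
# 2. Instantiate after customizing the configuration
config = AutoConfig.from_pretrained('glm3_6b')
config.use_past = True # override the default: incremental inference speeds up generation
# config.xxx = xxx # adjust other model settings as needed
model = AutoModel.from_config(config) # instantiate the model from the customized configuration

inputs = tokenizer("你好")["input_ids"]
# The first model.generate() call includes graph compilation time, so its timing is not representative;
# call it several times to get an accurate measure of inference performance
outputs = model.generate(inputs, max_new_tokens=20, do_sample=True, top_k=3)
response = tokenizer.decode(outputs)
print(response)
# ['你好，作为一名人工智能助手，我欢迎您随时向我提问。']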

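Conclusion:

The problem under Python 3.10 is still unsolved. Testing did show, however, that MindSpore 2.3 and mindformers 1.2 work together on Python 3.9.

So mindformers is usable this way!

Debugging

import mindformers raises an error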

---> 27 from mindspore._c_expression import swap_cache
     29 from mindformers import models, MindFormerRegister, MindFormerModuleType
     30 from mindformers import build_context, logger, build_parallel_config, GenerationConfig

ImportError: cannot import name 'swap_cache' from 'mindspore._c_expression' (/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/_c_expression.cpython-39-aarch64-linux-gnu.so)

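An issue suggests either upgrading MindSpore or commenting out the line from mindspore._c_expression import swap_cache: glm32k data processing: calling mindformers fails with cannot import name "swap_cache" · Issue #I9T7M1 · MindSpore/mindformers - Gitee.com

In other words, the error comes from a version mismatch between MindSpore and mindformers.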
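Since the root cause is a version mismatch, it helps to print both installed versions before debugging anything else. A minimal sketch using only the standard library; installed_versions is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    result = {}
    for name in packages:
        try:
            result[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            result[name] = None
    return result


print(installed_versions(["mindspore", "mindformers"]))
```

Reading versions from package metadata avoids importing mindformers itself, which is exactly the step that fails here.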

The same problem shows up again later, and this file needs to be modified:

/tmp/code/mindformers/mindformers/model_runner.py

Comment out the line from mindspore._c_expression import swap_cache in it
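The edit can also be scripted, which is handy when the environment gets rebuilt. This is a minimal sketch, assuming the import is a single-line statement as it appears in the traceback. comment_out_line is a hypothetical helper, not part of mindformers, and it keeps a .bak copy of the original file.

```python
import shutil
from pathlib import Path


def comment_out_line(path, needle):
    """Comment out every uncommented line containing `needle`; return how many changed."""
    file = Path(path)
    backup = file.with_name(file.name + ".bak")
    if not backup.exists():
        shutil.copy2(file, backup)  # keep the untouched original once
    lines = file.read_text(encoding="utf-8").splitlines(keepends=True)
    changed = 0
    for i, line in enumerate(lines):
        stripped = line.lstrip()
        if needle in stripped and not stripped.startswith("#"):
            indent = line[: len(line) - len(stripped)]
            lines[i] = f"{indent}# {stripped}"
            changed += 1
    file.write_text("".join(lines), encoding="utf-8")
    return changed


target = Path("/tmp/code/mindformers/mindformers/model_runner.py")
if target.exists():
    print(comment_out_line(target, "from mindspore._c_expression import swap_cache"))
```

Commenting out the import only gets past the ImportError. Any code path that actually calls swap_cache would then fail with a NameError, so upgrading MindSpore to a matching version remains the cleaner fix.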

Execution fails at fcntl.LOCK_EX

/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindformers/models/tokenization_utils_base.py:1951, in PreTrainedTokenizerBase._download_using_name(cls, name_or_path)
   1949         logger.info("Download the vocab from the url %s to %s.", url_file, file_path)
   1950         download_with_progress_bar(url_file, file_path)
-> 1951     try_sync_file(file_path)
   1953 config = MindFormerConfig(yaml_file)
   1954 return config, cache_path

File /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindformers/tools/utils.py:323, in try_sync_file(file_name)
    321 if fcntl:
    322     with open(file_name, 'r') as fp:
--> 323         fcntl.flock(fp.fileno(), fcntl.LOCK_EX)

OSError: [Errno 9] Bad file descriptor

I don't understand the cause, so I tried the bash script instead
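On the Bad file descriptor error: one possible cause, not verified here, is the file system. The Linux flock(2) man page notes that on NFS an exclusive lock is emulated with fcntl byte-range locks and requires the file to be opened for writing, while the traceback shows the file opened with 'r'. Whether the cache directory in this environment is on NFS is an assumption. The sketch below checks whether an exclusive lock succeeds on a read-write handle; can_lock_exclusively is a hypothetical helper, not part of mindformers.

```python
import fcntl  # POSIX only


def can_lock_exclusively(path):
    """Try an exclusive flock on `path` opened read-write; return True on success."""
    try:
        with open(path, "r+") as fp:
            fcntl.flock(fp.fileno(), fcntl.LOCK_EX)
            fcntl.flock(fp.fileno(), fcntl.LOCK_UN)
        return True
    except OSError:
        return False
```

If this returns True for the downloaded vocab file while the original call still fails, the read-only open mode is the likely culprit. If it returns False as well, the cause lies elsewhere.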

Running the script fails with: Only support 'single' or 'multirole', but got PARALLEL.

!cd mindformers && bash scripts/examples/glm3/run_glm3_predict.sh PARALLEL configs/glm3/predict_glm3_6b.yaml
