Python on Windows: quickly trying the ChatGLM2-6B-int4 pre-trained model locally (and changing the Python package install path)

Article link: https://blog.csdn.net/m0_60688978/article/details/132529456

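Conclusion first

On this machine it took about 20 minutes to generate a single sentence. During that time memory usage was around 50% and CPU usage was around 85%.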

Machine configuration

Hardware: 4 CPU cores, 16 GB RAM
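
As a rough sanity check on the memory usage reported above, the size of the weights alone can be estimated from the parameter count and the number of bits per weight. The sketch below assumes about 6.2 billion parameters for ChatGLM2-6B (an approximation, not a figure from this post) and ignores activations, the attention cache, and any tensors that stay unquantized, so the result is only a lower bound.

```python
def estimate_weight_gib(n_params: float, bits_per_weight: int) -> float:
    """Return the approximate size of the model weights alone, in GiB."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


# Assumption: approximate parameter count, not taken from this post.
ASSUMED_PARAMS = 6.2e9

for label, bits in [("int4", 4), ("fp16", 16), ("fp32", 32)]:
    size = estimate_weight_gib(ASSUMED_PARAMS, bits)
    print(f"{label}: roughly {size:.1f} GiB of weights")
```

Actual memory use while the model is running will be higher than this estimate.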

Setting up the Python environment

To change the Python package install path, see: https://blog.csdn.net/qq_27466827/article/details/131163026
pip3 install torch torchvision torchaudio -i https://mirrors.aliyun.com/pypi/simple/
pip install transformers sentencepiece
pip install rouge_chinese cpm_kernels
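
After running the pip commands above, it can help to confirm which packages actually landed in the active environment and where that environment keeps its packages. The following is a minimal sketch using only the standard library. The package list is copied from the commands above, and the helper name `installed_versions` is made up for illustration.

```python
import sysconfig
from importlib import metadata


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names taken from the pip commands in this post.
REQUIRED = [
    "torch", "torchvision", "torchaudio",
    "transformers", "sentencepiece",
    "rouge_chinese", "cpm_kernels",
]

if __name__ == "__main__":
    # Directory where pip installs pure-Python packages for this interpreter.
    print("site-packages:", sysconfig.get_paths()["purelib"])
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name:15s} {version or 'NOT INSTALLED'}")
```

If the printed site-packages directory is not the one you expect, the install path change has probably not taken effect for this interpreter.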

Setting up the GCC environment

Download TDM-GCC 10.3.0 from https://jmeubank.github.io/tdm-gcc/download/
Note: during installation, make sure the openmp option is selected.
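
One way to check that the compiler is reachable and was installed with OpenMP support is to compile a tiny OpenMP program. This is a sketch under assumptions: it assumes the compiler is invoked as `gcc` and is on PATH, and the helper names `find_compiler` and `supports_openmp` are invented for this example.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Minimal C program that only builds if the OpenMP header and runtime are present.
OPENMP_TEST_SOURCE = """
#include <omp.h>
int main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }
"""


def find_compiler(name="gcc"):
    """Return the full path of the compiler if it is on PATH, else None."""
    return shutil.which(name)


def supports_openmp(compiler):
    """Try to compile a small OpenMP program; True if the build succeeds."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "omp_check.c"
        exe = Path(tmp) / "omp_check.exe"
        src.write_text(OPENMP_TEST_SOURCE)
        try:
            result = subprocess.run(
                [compiler, "-fopenmp", str(src), "-o", str(exe)],
                capture_output=True,
                text=True,
            )
        except OSError:
            # The compiler executable could not be started at all.
            return False
        return result.returncode == 0


if __name__ == "__main__":
    gcc = find_compiler()
    print("gcc path:", gcc)
    if gcc:
        print("OpenMP build works:", supports_openmp(gcc))
```

If `find_compiler()` returns None in a new cmd window, the TDM-GCC bin directory is probably not on PATH yet.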

Downloading the model files

Download from Tsinghua Cloud

Cloud drive link: https://cloud.tsinghua.edu.cn/d/674208019e314311ab5c/?p=%2Fchatglm2-6b-int4&mode=list
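
When the files are downloaded by hand, it is easy to miss one. The sketch below lists which expected files are absent from a local model directory. The file list is an assumption based on the usual layout of the chatglm2-6b-int4 repository, so compare it against the cloud drive listing before relying on it. The directory in the usage section is the path used later in this post.

```python
from pathlib import Path

# Assumption: typical contents of the chatglm2-6b-int4 repository.
EXPECTED_FILES = [
    "config.json",
    "tokenizer.model",
    "tokenizer_config.json",
    "modeling_chatglm.py",
    "quantization.py",
    "pytorch_model.bin",
]


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the names from `expected` that are not files inside `model_dir`."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    model_dir = r"D:\jpdir\localKnow\models\chatglm2-6b-int4\chatglm2-6b-int4"
    missing = missing_model_files(model_dir)
    print("missing files:", missing if missing else "none")
```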

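Download via code

# If the download fails, just run the script again a few times.
from huggingface_hub import snapshot_download

# Note: this repo_id is the unquantized model; the quantized one is "THUDM/chatglm2-6b-int4".
repo_id = "THUDM/chatglm2-6b"
local_dir = "/opt/models/chatglm2-6b/"  # target directory; adjust for your machine
local_dir_use_symlinks = False  # store real files rather than symlinks into the cache
revision = "main"
snapshot_download(repo_id=repo_id,
                  local_dir=local_dir,
                  local_dir_use_symlinks=local_dir_use_symlinks,
                  revision=revision)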
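
Since the download sometimes has to be run several times, the repetition can be automated. Below is a minimal sketch of a generic retry helper. The name `retry`, the attempt count, and the delay are arbitrary choices for illustration. The `download_model` function reuses the arguments from the script above and imports `huggingface_hub` only when called.

```python
import time


def retry(func, attempts=5, delay_seconds=2.0):
    """Call func() until it succeeds or the attempts run out."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as error:  # broad on purpose: network errors vary
            last_error = error
            print(f"attempt {attempt}/{attempts} failed: {error}")
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


def download_model():
    """Download the model snapshot with the same arguments as the script above."""
    from huggingface_hub import snapshot_download  # imported only when needed

    return snapshot_download(
        repo_id="THUDM/chatglm2-6b",
        local_dir="/opt/models/chatglm2-6b/",
        revision="main",
    )


if __name__ == "__main__":
    print("model stored at:", retry(download_model))
```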

Download from Hugging Face (mirror)

https://hf-mirror.com/THUDM/chatglm2-6b-int4
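
Besides downloading from the mirror page in a browser, `huggingface_hub` can be pointed at a mirror through the `HF_ENDPOINT` environment variable. The sketch below assumes the mirror accepts the same requests as the official hub. The helper names and the `local_dir` argument are placeholders. The variable has to be set before `huggingface_hub` is imported, which is why the import sits inside the function.

```python
import os


def use_mirror(endpoint="https://hf-mirror.com"):
    """Point huggingface_hub at a mirror; call before importing huggingface_hub."""
    os.environ["HF_ENDPOINT"] = endpoint
    return endpoint


def download_int4(local_dir):
    """Download the int4 model through the mirror into local_dir (placeholder path)."""
    use_mirror()
    # Imported late so that HF_ENDPOINT is already set when the library loads.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id="THUDM/chatglm2-6b-int4", local_dir=local_dir)
```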

Trying it directly from the command line

Open a cmd window and enter the following commands one by one for a quick try:

python
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("G://glm2-int4", trust_remote_code=True)
model = AutoModel.from_pretrained("G://glm2-int4",trust_remote_code=True).float()
model = model.eval()
response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=[])
print(response)
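
Because generation on CPU is slow, it can be useful to record how long each reply takes. The sketch below wraps the same `model.chat(tokenizer, prompt, history=...)` call in a small helper. `timed_chat` is a made-up name, and `model` and `tokenizer` are assumed to be the objects created by the commands above.

```python
import time


def timed_chat(model, tokenizer, prompts):
    """Send prompts in order, carrying history forward, and time each reply."""
    history = []
    results = []
    for prompt in prompts:
        start = time.perf_counter()
        response, history = model.chat(tokenizer, prompt, history=history)
        elapsed = time.perf_counter() - start
        results.append((prompt, response, elapsed))
    return results


def print_results(results):
    """Print each prompt, its reply, and the time taken in seconds."""
    for prompt, response, elapsed in results:
        print(f"[{elapsed:.1f}s] {prompt}\n{response}\n")
```

With the model loaded, `print_results(timed_chat(model, tokenizer, ["你好"]))` would print the reply together with its generation time.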

>>> tokenizer = AutoTokenizer.from_pretrained("D:\\jpdir\\localKnow\\models\\chatglm2-6b-int4\\chatglm2-6b-int4", trust_remote_code=True,revision="v1.1.0")
>>> model = AutoModel.from_pretrained("D:\\jpdir\\localKnow\\models\\chatglm2-6b-int4\\chatglm2-6b-int4", trust_remote_code=True,revision="v1.1.0").float()
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "你好", history=[])
>>> print(response)
你好👋！我是人工智能助手 ChatGLM2-6B，很高兴见到你，欢迎问我任何问题。
>>> response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)

Trying the web UI

Modify the model-loading code so that it points at the local model directory:

	tokenizer = AutoTokenizer.from_pretrained("D:\\jpdir\\localKnow\\models\\chatglm2-6b-int4\\chatglm2-6b-int4", trust_remote_code=True,revision="v1.1.0")
	model = AutoModel.from_pretrained("D:\\jpdir\\localKnow\\models\\chatglm2-6b-int4\\chatglm2-6b-int4", trust_remote_code=True,revision="v1.1.0").float()
	model = model.eval()

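After downloading the code, there are two UI scripts:
web_demo.py
web_demo2.py
Running the first one gave me an error.
The second one ran without problems; the command is: streamlit run web_demo2.py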
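
If the `streamlit` command is not found in the cmd window, the demo can also be started through the current Python interpreter. This is a small sketch, assuming streamlit is installed in that interpreter and the script sits in the working directory. The helper names are invented for illustration.

```python
import subprocess
import sys
from pathlib import Path


def build_streamlit_command(script="web_demo2.py"):
    """Build the command that runs a Streamlit script with this interpreter."""
    return [sys.executable, "-m", "streamlit", "run", str(script)]


def launch(script="web_demo2.py"):
    """Start the demo and return its exit code once the server stops."""
    if not Path(script).is_file():
        raise FileNotFoundError(f"{script} not found in {Path.cwd()}")
    return subprocess.run(build_streamlit_command(script), check=False).returncode
```

Calling `launch()` from the directory that contains web_demo2.py blocks until the Streamlit server is stopped.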

Problems encountered

Deploying chatglm-6b-int4 on CPU under Windows fails with errors such as "Could not find module 'nvcuda.dll'":

Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Failed to load cpm_kernels:Could not find module 'nvcuda.dll'. Try using the full path with constructor syntax.
Load parallel cpu kernel failed C:\Users\xxx\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.so: Traceback (most recent call last):
  File "C:\Users\l84196432/.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization.py", line 148, in __init__
    kernels = ctypes.cdll.LoadLibrary(kernel_file)
  File "D:\ProgramData\miniconda3\envs\glm\lib\ctypes\__init__.py", line 447, in LoadLibrary
    return self._dlltype(name)
  File "D:\ProgramData\miniconda3\envs\glm\lib\ctypes\__init__.py", line 369, in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'C:\Users\xxx\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.so'. Try using the full path with constructor syntax.
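
The post does not record a fix for these messages, so the following is only a diagnostic sketch. A missing nvcuda.dll is generally what one would expect on a machine without an NVIDIA driver. The second failure may mean the CPU kernel was never compiled or cannot be loaded, for example because gcc is not on PATH. That is a guess based on the GCC step earlier in this post, not a confirmed cause. The helper names are made up, and the kernel path must be replaced with the one from your own log.

```python
import ctypes
import shutil
from pathlib import Path


def can_load_library(path):
    """Return True if ctypes can load the given shared library."""
    try:
        ctypes.CDLL(str(path))
        return True
    except OSError:
        return False


def diagnose(kernel_file):
    """Collect a few facts that are relevant to the errors in the log."""
    return {
        "nvcuda.dll loadable": can_load_library("nvcuda.dll"),
        "gcc on PATH": shutil.which("gcc") is not None,
        "kernel file exists": Path(kernel_file).is_file(),
        "kernel file loadable": can_load_library(kernel_file),
    }


if __name__ == "__main__":
    # Placeholder: copy the full quantization_kernels_parallel.so path from your log.
    for key, value in diagnose("quantization_kernels_parallel.so").items():
        print(f"{key}: {value}")
```

If the kernel file exists but is not loadable, a missing runtime dependency of the compiled library is one possibility worth checking.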