ChatGLM-6B deployment error: quantization_kernels_parallel.so' (or one of its dependencies). Try using the full pat

FL1768317420

Published 2024-03-30 07:17:10


Original link: https://blog.csdn.net/FL1623863129/article/details/136838987?spm=1001.2014.3001.5502


Deploying ChatGLM2 with Python fails with the following error:

FileNotFoundError: Could not find module 'C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.so' (or one of its dependencies). Try using the full path with constructor syntax.

Problem analysis and solution:

When ChatGLM loads the model, it automatically compiles and loads two C files: quantization_kernels_parallel.c and quantization_kernels.c.

Both files are stored as base64 strings in quantization.py in the model folder. When the model is loaded, they are decoded and written out to disk.
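The decode-and-write step can be illustrated with a minimal sketch. The names `EMBEDDED_SOURCE_B64` and `write_embedded_source` and the tiny C snippet are hypothetical placeholders, not taken from quantization.py; the real file may handle the string differently.

```python
import base64
import tempfile
from pathlib import Path

# Hypothetical stand-in for a base64 string embedded in a Python module.
EMBEDDED_SOURCE_B64 = base64.b64encode(
    b"int add(int a, int b) { return a + b; }\n"
).decode("ascii")


def write_embedded_source(encoded: str, target: Path) -> Path:
    """Decode a base64 string and write the resulting C source to target."""
    target.write_bytes(base64.b64decode(encoded))
    return target


with tempfile.TemporaryDirectory() as tmp:
    source_path = write_embedded_source(
        EMBEDDED_SOURCE_B64, Path(tmp) / "demo_kernels.c"
    )
    decoded_text = source_path.read_text(encoding="utf-8")
```

After this runs, `decoded_text` holds the C source that was written to disk, which is the kind of file a compiler would then turn into a shared library.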

In this case quantization_kernels_parallel.c compiles successfully, but loading the result fails:

FileNotFoundError: Could not find module '[omitted]\quantization_kernels_parallel.so' (or one of its dependencies). Try using the full path with constructor syntax.

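This does no harm, however, because ChatGLM then tries to compile and load quantization_kernels, and that attempt succeeds. The two kernels serve the same purpose, so the model can start as long as one of them loads.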
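The fallback behaviour described here can be sketched roughly as follows. This is only an illustration of the pattern, not the code in quantization.py: `load_first_available` and `fake_loader` are hypothetical names, and the fake loader stands in for `ctypes.CDLL` so the sketch runs without real .so files.

```python
import ctypes
from typing import Callable, Iterable, Optional


def load_first_available(
    candidates: Iterable[str],
    loader: Callable[[str], object] = ctypes.CDLL,
) -> Optional[object]:
    """Return the first library that loads, or None if every candidate fails."""
    for path in candidates:
        try:
            return loader(path)
        except OSError as exc:  # FileNotFoundError is a subclass of OSError
            print(f"Could not load {path}: {exc}")
    return None


# Fake loader (assumption for the demo): the parallel kernel always fails.
def fake_loader(path: str) -> str:
    if "parallel" in path:
        raise FileNotFoundError(f"Could not find module '{path}'")
    return f"loaded:{path}"


kernels = load_first_available(
    ["quantization_kernels_parallel.so", "quantization_kernels.so"],
    loader=fake_loader,
)
```

With the fake loader, the first candidate fails, the failure is printed, and the second candidate is returned, which mirrors why the error message appears but the model still starts.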

Still, I looked into why quantization_kernels_parallel fails to load:

https://github.com/THUDM/ChatGLM-6B/issues/967

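According to that issue, the cause is an unresolved problem with ctypes on Windows, and Python 3.10 is still affected. Since Python 3.8, ctypes on Windows searches only a restricted set of directories when resolving a DLL's dependencies, which is why the message says "or one of its dependencies".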

The ctypes.cdll.LoadLibrary call in [model folder]\quantization.py needs to be changed:

# Search for this line
kernels = ctypes.cdll.LoadLibrary(kernel_file)

# Replace it with the line below (the original line can simply be commented out)
#kernels = ctypes.cdll.LoadLibrary(kernel_file)
kernels = ctypes.CDLL(kernel_file, winmode=0)

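Important: ctypes.cdll.LoadLibrary and ctypes.CDLL are not the same function. Do not write ctypes.cdll.LoadLibrary(kernel_file, winmode=0); LoadLibrary accepts only the library name, so the winmode argument has to go to the ctypes.CDLL constructor.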
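As a rough sketch of the difference between the two calls, the helper below passes `winmode=0` only on Windows and shows that `LoadLibrary` rejects the keyword. The helper names `cdll_kwargs` and `load_kernel` are hypothetical and are not part of quantization.py.

```python
import ctypes
import sys


def cdll_kwargs(platform: str = sys.platform) -> dict:
    """Extra keyword arguments for ctypes.CDLL on the given platform."""
    # winmode only matters on Windows; 0 is the value used in the fix above.
    return {"winmode": 0} if platform == "win32" else {}


def load_kernel(kernel_file: str) -> ctypes.CDLL:
    """Load a compiled kernel through the ctypes.CDLL constructor."""
    return ctypes.CDLL(kernel_file, **cdll_kwargs())


# LoadLibrary takes only the library name, so passing winmode fails
# with TypeError before any file is opened.
try:
    ctypes.cdll.LoadLibrary("quantization_kernels_parallel.so", winmode=0)
    load_library_accepts_winmode = True
except TypeError:
    load_library_accepts_winmode = False
```

Keeping the keyword in one small function makes it easy to apply the same workaround wherever a kernel is loaded, without touching the non-Windows path.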

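Some people compile the kernels manually instead. The steps are below and are also worth trying.
(1) Compile manually, in the model path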

gcc -fPIC -pthread -fopenmp -std=c99 quantization_kernels_parallel.c -shared -o quantization_kernels_parallel.so
gcc -fPIC -pthread -fopenmp -std=c99 quantization_kernels.c -shared -o quantization_kernels.so
(2) Then, after loading the model as before, load the manually compiled kernel explicitly

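from transformers import AutoModel
# Replace "Your Kernel Path" with the path to the manually compiled .so file
model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4",trust_remote_code=True).float()
model = model.quantize(bits=4, kernel_file="Your Kernel Path")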
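If the manual compile in step (1) needs to be repeated, it could be scripted along these lines. This is a minimal sketch that assumes gcc is on PATH; `gcc_command` and `compile_kernel` are hypothetical helper names, and the flags are copied from the commands above.

```python
import shutil
import subprocess
from pathlib import Path


def gcc_command(source: Path) -> list:
    """Build the gcc invocation from step (1) for one kernel source file."""
    output = source.with_suffix(".so")
    return [
        "gcc", "-fPIC", "-pthread", "-fopenmp", "-std=c99",
        str(source), "-shared", "-o", str(output),
    ]


def compile_kernel(source: Path) -> Path:
    """Run gcc on source and return the path of the resulting .so file."""
    if shutil.which("gcc") is None:
        raise RuntimeError("gcc was not found on PATH")
    subprocess.run(gcc_command(source), check=True)
    return source.with_suffix(".so")


# Only builds the argument list here; nothing is compiled until
# compile_kernel() is called on a real source file.
command = gcc_command(Path("quantization_kernels.c"))
```

Calling `compile_kernel(Path("quantization_kernels_parallel.c"))` from the model folder would then produce the .so file whose path goes into `kernel_file` in step (2).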
