Deploying ChatGLM on CPU fails with "No compiled kernel found."

像夏天一样热

Article link: https://blog.csdn.net/qq_37356556/article/details/129941703

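This article describes a compilation problem encountered when using HuggingFace's Transformers library: compiling the quantization kernels fails. The author tried compiling the kernels manually and loading them to work around it; some compile errors remained, but the model appeared to be usable. The recommendation is to compile the kernel manually and pass its path when loading the quantized model.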

No compiled kernel found.
Compiling kernels : C:\Users\admin.cache\huggingface\modules\transformers_modules\local\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\admin.cache\huggingface\modules\transformers_modules\local\quantization_kernels_parallel.c -shared -o C:\Users\admin.cache\huggingface\modules\transformers_modules\local\quantization_kernels_parallel.so
d:/mingw/bin/…/lib/gcc/mingw32/6.3.0/…/…/…/…/mingw32/bin/ld.exe: cannot find -lpthread
collect2.exe: error: ld returned 1 exit status
Compile failed, using default cpu kernel code.
Compiling gcc -O3 -fPIC -std=c99 C:\Users\admin.cache\huggingface\modules\transformers_modules\local\quantization_kernels.c -shared -o C:\Users\admin.cache\huggingface\modules\transformers_modules\local\quantization_kernels.so
Kernels compiled : C:\Users\admin.cache\huggingface\modules\transformers_modules\local\quantization_kernels.so
Cannot load cpu kernel, don’t use quantized model on cpu.

(1) Compile the kernels manually, from inside the model directory:

gcc -fPIC -pthread -fopenmp -std=c99 quantization_kernels_parallel.c -shared -o quantization_kernels_parallel.so
gcc -fPIC -pthread -fopenmp -std=c99 quantization_kernels.c -shared -o quantization_kernels.so
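
The two commands above can also be driven from Python, which makes it easier to see which of them fails and why. This is only a minimal sketch: the helper names `build_gcc_command` and `compile_kernel` are hypothetical, and it assumes `gcc` is on the PATH and that the script is run from the model directory.

```python
import subprocess
from pathlib import Path


def build_gcc_command(source):
    """Return the gcc argument list from step (1) for one kernel source file."""
    source = Path(source)
    # quantization_kernels.c -> quantization_kernels.so
    output = source.with_suffix(".so")
    return ["gcc", "-fPIC", "-pthread", "-fopenmp", "-std=c99",
            str(source), "-shared", "-o", str(output)]


def compile_kernel(source):
    """Run gcc on one source file; return the output path, or None on failure."""
    cmd = build_gcc_command(source)
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        # Raised when the gcc executable itself cannot be found.
        print("gcc not found on PATH")
        return None
    if result.returncode != 0:
        # Show the compiler/linker message, e.g. "cannot find -lpthread".
        print(result.stderr)
        return None
    return cmd[-1]


if __name__ == "__main__":
    for name in ("quantization_kernels_parallel.c", "quantization_kernels.c"):
        print(name, "->", compile_kernel(name))
```

A `None` result for one file does not prevent the other from being compiled, so the non-parallel kernel can still be used if only the parallel one fails.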

(2) Then, after loading the model as before, explicitly load the manually compiled kernel:


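from transformers import AutoModel
# Load the INT4 model as float32 for CPU, then point quantize() at the manually compiled kernel.
model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True).float()
model = model.quantize(bits=4, kernel_file="Your Kernel Path")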
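
Before passing a path to `kernel_file`, it can help to confirm that the compiled library loads at all, since the log above ends with "Cannot load cpu kernel". The following is a minimal sketch, assuming the kernel is an ordinary shared library that `ctypes` can open; `can_load_kernel` is a hypothetical helper and not part of ChatGLM or Transformers.

```python
import ctypes
from pathlib import Path


def can_load_kernel(kernel_path):
    """Return True if the shared library at kernel_path exists and can be opened."""
    path = Path(kernel_path)
    if not path.is_file():
        return False
    try:
        # Raises OSError if the file is not a loadable library for this interpreter.
        ctypes.cdll.LoadLibrary(str(path.resolve()))
    except OSError:
        return False
    return True
```

If this returns False for a file that exists, one possible cause (not confirmed by the log) is that the compiler and the Python interpreter target different architectures.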