Notes on converting ChatGLM with fastllm and deploying it

Latest recommended article published on 2024-07-06 00:06:29

ZZZZyh00000


Article link: https://blog.csdn.net/ZZZZyh00000/article/details/135019885


I'm writing this post to record how I did it. I'm only a user of the tool, and I'm noting the steps down so I don't forget them later.

1. Download fastllm

GitHub repository:

https://github.com/ztxz16/fastllm

2. Install fastllm

The installation steps from the official README are:

cd fastllm
mkdir build
cd build
cmake .. -DUSE_CUDA=ON # to build without GPU support, use cmake .. -DUSE_CUDA=OFF instead
make -j
cd tools && python setup.py install

3. Problems encountered

The cmake step kept failing for me, with two different errors.

Running cmake .. -DUSE_CUDA=ON first produced this error:

-- The CXX compiler identification is unknown
CMake Error at CMakeLists.txt:3 (project):
No CMAKE_CXX_COMPILER could be found.

Tell CMake where to find the compiler by setting either the environment
variable "CXX" or the CMake cache entry CMAKE_CXX_COMPILER to the full path
to the compiler, or to the compiler name if it is in the PATH.

This error means CMake could not find a C++ compiler, so you need to install one.

On Debian/Ubuntu:
```
sudo apt-get install g++
```
On CentOS/RHEL:
```
sudo yum install gcc-c++
```
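
Before rerunning cmake, it can help to confirm which build tools are actually on the PATH. The following is a minimal sketch and not part of fastllm: the helper name `missing_tools` and the tool list are my own assumptions, based only on the commands used in the build steps of this post (cmake, make, g++).

```python
import shutil


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed tool list: the commands the build steps in this post invoke.
    missing = missing_tools(["cmake", "make", "g++"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("cmake, make and g++ were all found")
```

If g++ shows up as missing, the install commands above are the ones to run first.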

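With that fixed, the build then kept failing with:

c++: error: unrecognized command line option ‘--std=c++17’

This happens because the C++ compiler installed in the previous step is too old to recognize this option, so it needs to be upgraded.

My server runs CentOS 7, where the following commands are needed (on Debian/Ubuntu you can upgrade the compiler yourself; I have not verified that, so I don't list commands for it):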

sudo yum install centos-release-scl
sudo yum install devtoolset-9-gcc-c++
scl enable devtoolset-9 bash
# verify the version
g++ --version
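
Instead of reading the version number, you can also ask the compiler directly whether it accepts the flag from the error message. The sketch below is one way to do that, under stated assumptions: `accepts_cxx17` is a hypothetical helper name, and the extra flags (`-x c++`, `-fsyntax-only`, and `-` for reading from stdin) are GCC-style options, so the check only makes sense for g++ or a compatible compiler.

```python
import shutil
import subprocess


def accepts_cxx17(compiler="g++"):
    """Return True if `compiler` is on the PATH and accepts --std=c++17."""
    path = shutil.which(compiler)
    if path is None:
        return False
    # Feed a trivial program on stdin; -fsyntax-only avoids writing output files.
    result = subprocess.run(
        [path, "--std=c++17", "-x", "c++", "-fsyntax-only", "-"],
        input="int main() { return 0; }\n",
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    print("g++ accepts --std=c++17:", accepts_cxx17())
```

Because scl enable devtoolset-9 bash starts a new shell, run the check inside that shell; otherwise it will still see the old system compiler.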

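At this point you can rerun the installation.

Note that, as far as I remember, I only ran the steps up to make -j and did not run the final command (cd tools && python setup.py install).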

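4. Convert the model

Once all the commands above succeed, the build folder you created will contain many generated files. Find chatglm_export.py among them and change the model path in it to the location of your own local model.

Script location:

your_path_to/build/tools/chatglm_export.py

Script contents: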

import sys
from transformers import AutoTokenizer, AutoModel
from fastllm_pytools import torch2flm

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained("/your_path_to/chatglm3-6b", trust_remote_code=True)  # replace with your local model path
    model = AutoModel.from_pretrained("/your_path_to/chatglm3-6b", trust_remote_code=True)  # replace with your local model path
    model = model.eval()

    dtype = sys.argv[2] if len(sys.argv) >= 3 else "float16"
    exportPath = sys.argv[1] if len(sys.argv) >= 2 else "chatglm-6b-" + dtype + ".flm"  # I changed the quotes here; the official version seemed to have a problem
    torch2flm.tofile(exportPath, model, tokenizer, dtype = dtype)
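
The script reads two optional positional arguments: the export path first, then the dtype. To make the defaults easier to see, the sketch below copies just those two argument-handling lines into a standalone helper. `resolve_export_args` is a hypothetical name used for illustration and is not part of fastllm.

```python
def resolve_export_args(argv):
    """Mirror the script's defaults: argv[1] is the export path, argv[2] the dtype."""
    dtype = argv[2] if len(argv) >= 3 else "float16"
    export_path = argv[1] if len(argv) >= 2 else "chatglm-6b-" + dtype + ".flm"
    return export_path, dtype


if __name__ == "__main__":
    # No arguments: both defaults apply.
    print(resolve_export_args(["chatglm_export.py"]))
    # Only the export path given: dtype still defaults to float16.
    print(resolve_export_args(["chatglm_export.py", "my-model.flm"]))
```

In other words, running the script with no arguments writes chatglm-6b-float16.flm, and because the dtype is the second argument, choosing a different dtype means passing the export path as well.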

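Then run the script with python. On success, a .flm file appears in the current directory; this is the converted model.

You can then run it in any of the ways the official README gives (I verified the first two):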

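# Command-line chat program with typewriter-style output
./main -p model.flm

# Simple web UI with streaming output and dynamic batching; supports multiple concurrent sessions
./webui -p model.flm --port 1234

# Python command-line chat program, demonstrating model creation and streaming chat
python tools/cli_demo.py -p model.flm

# Python simple web UI; install streamlit-chat first
streamlit run tools/web_demo.py model.flm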