1. Summary
This post walks through deploying Langchain-Chatchat v0.2.10 with chatglm3-6b and bge-large-zh on Linux.
The GPU is a Tesla P100-PCIE-16GB.
2. Preparation
2.1 Create a Python 3.10 virtual environment
Supported Python versions: 3.8 to 3.11.
I chose Python 3.10.
conda create -n Langchain-Chatchat-v0.2.10 python=3.10
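The version constraint can be checked from inside the new environment. This is a minimal sketch; the helper name python_version_supported is hypothetical and not part of Langchain-Chatchat.

```python
import sys


def python_version_supported(version=None, low=(3, 8), high=(3, 11)):
    """Return True if the (major, minor) version lies in [low, high]."""
    if version is None:
        version = sys.version_info
    return low <= tuple(version[:2]) <= high


# Report the interpreter that is currently active.
print(sys.version.split()[0], "supported:", python_version_supported())
```

Run it with the environment activated; if it reports False, the wrong interpreter is probably on the PATH.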
2.2 Download Langchain-Chatchat v0.2.10
Download the v0.2.10 release of Langchain-Chatchat from https://github.com/chatchat-space/Langchain-Chatchat.
2.3 Download the models
git lfs install
git clone https://huggingface.co/THUDM/chatglm3-6b
git clone https://huggingface.co/BAAI/bge-large-zh
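If Git LFS was not active during the clone, the large weight files are left as small text stubs (LFS pointer files) and the model will fail to load later. The sketch below scans a cloned directory for such stubs. The function name find_lfs_pointers is hypothetical, and the directory names in the usage note assume the clones were made in the current directory.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return files under model_dir that are still Git LFS pointer stubs."""
    root = Path(model_dir)
    stubs = []
    for path in sorted(root.rglob("*")):
        # Skip directories and Git's own metadata.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        with path.open("rb") as fh:
            if fh.read(len(LFS_HEADER)) == LFS_HEADER:
                stubs.append(path)
    return stubs
```

For example, find_lfs_pointers("chatglm3-6b") returning an empty list suggests that all weight files were actually downloaded; any paths it returns can be fetched with git lfs pull inside that repository.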
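3. Setup
3.1 Install dependencies
Activate the virtual environment (recent conda releases use conda activate in place of source activate):
source activate Langchain-Chatchat-v0.2.10
Change into the project directory: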
cd Langchain-Chatchat-0.2.10
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install -r requirements_api.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install -r requirements_webui.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
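After the three installs finish, a quick import check can catch a failed or partial install. A minimal sketch: the helper missing_modules is hypothetical, and the module list is only an assumed sample of what the requirements files pull in, so adjust it as needed.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed sample of top-level modules; an empty list means all were found.
print(missing_modules(["torch", "langchain", "fastapi", "streamlit"]))
```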
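3.2 Edit the configuration files
3.2.1 First, run the following command to generate the config files from the bundled examples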
python copy_config_example.py
3.2.2 Edit configs/model_config.py
Three places in configs/model_config.py need to change: the EMBEDDING_MODEL name, and the bge-large-zh and chatglm3-6b entries, which must point at the local model directories (here under /data):
# Name of the embedding model to use
EMBEDDING_MODEL = "bge-large-zh"  # change 1
MODEL_PATH = {
    "embed_model": {
        "ernie-tiny": "nghuyong/ernie-3.0-nano-zh",
        "ernie-base": "nghuyong/ernie-3.0-base-zh",
        "text2vec-base": "shibing624/text2vec-base-chinese",
        "text2vec": "GanymedeNil/text2vec-large-chinese",
        "text2vec-paraphrase": "shibing624/text2vec-base-chinese-paraphrase",
        "text2vec-sentence": "shibing624/text2vec-base-chinese-sentence",
        "text2vec-multilingual": "shibing624/text2vec-base-multilingual",
        "text2vec-bge-large-chinese": "shibing624/text2vec-bge-large-chinese",
        "m3e-small": "moka-ai/m3e-small",
        "m3e-base": "moka-ai/m3e-base",
        "m3e-large": "moka-ai/m3e-large",
        "bge-small-zh": "BAAI/bge-small-zh",
        "bge-base-zh": "BAAI/bge-base-zh",
        "bge-large-zh": "/data/bge-large-zh",  # change 2: local path
        "bge-large-zh-noinstruct": "BAAI/bge-large-zh-noinstruct",
        "bge-base-zh-v1.5": "BAAI/bge-base-zh-v1.5",
        "bge-large-zh-v1.5": "BAAI/bge-large-zh-v1.5",
        "piccolo-base-zh": "sensenova/piccolo-base-zh",
        "piccolo-large-zh": "sensenova/piccolo-large-zh",
        "nlp_gte_sentence-embedding_chinese-large": "damo/nlp_gte_sentence-embedding_chinese-large",
        "text-embedding-ada-002": "your OPENAI_API_KEY",
    },
    "llm_model": {
        "chatglm2-6b": "THUDM/chatglm2-6b",
        "chatglm2-6b-32k": "THUDM/chatglm2-6b-32k",
        "chatglm3-6b": "/data/chatglm3-6b",  # change 3: local path
        "chatglm3-6b-32k": "THUDM/chatglm3-6b-32k",
        "Orion-14B-Chat": "OrionStarAI/Orion-14B-Chat",
        "Orion-14B-Chat-Plugin": "OrionStarAI/Orion-14B-Chat-Plugin",
        "Orion-14B-LongChat": "OrionStarAI/Orion-14B-LongChat",
        # ... remaining entries unchanged
    },
}
3.3 Initialize the knowledge base
python init_database.py --recreate-vs
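If initialization fails while loading a model, a wrong local path in configs/model_config.py is a likely cause. The sketch below lists every absolute path in a MODEL_PATH-style dictionary that does not exist on disk. The helper missing_local_paths is hypothetical; entries such as "BAAI/bge-base-zh" are Hugging Face repository IDs rather than local paths, so they are skipped.

```python
import os


def missing_local_paths(model_path):
    """Return {model_name: path} for absolute paths that are not directories."""
    missing = {}
    for group in model_path.values():
        for name, path in group.items():
            # Only absolute paths are treated as local model directories.
            if os.path.isabs(path) and not os.path.isdir(path):
                missing[name] = path
    return missing


# Usage from the project root (assumes the config module imports cleanly):
#   from configs.model_config import MODEL_PATH
#   print(missing_local_paths(MODEL_PATH))
```

An empty dictionary means every local path resolves to a directory; it does not verify that the directory holds valid model files.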
3.4 Start the project with the following command
python startup.py -a
Once startup succeeds, GPU memory usage is about 12 GB.
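The observed figure is in line with a rough weights-only estimate. The sketch below assumes about 6 billion parameters (taken from the "6b" in the model name) stored as 16-bit floats; both numbers are assumptions, and the embedding model, activations, and CUDA overhead come on top.

```python
def weight_memory_gib(num_params, bytes_per_param=2):
    """Estimate the memory needed for model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Assumed: ~6e9 parameters at 2 bytes each (fp16).
print(f"{weight_memory_gib(6e9):.1f} GiB")
```

This comes to a little over 11 GiB, which leaves limited headroom on a 16 GB card.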
4. Web UI examples
Open http://ip:8501 in a browser,
where ip is the IP address of the server running Langchain-Chatchat.
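If the page does not load, it helps to confirm first that the port is reachable from your machine. A minimal sketch using a plain TCP connection; the helper is_port_open is hypothetical, and the address in the usage note is a placeholder for your server's IP.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, is_port_open("192.0.2.10", 8501) returning False while the service is running usually points to a firewall rule or to the service listening on a different interface.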
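4.1 Chat page
Set the dialogue mode to "LLM chat" (LLM对话).
4.2 Knowledge base management page
4.3 Add the document "统计学习方法李航.pdf" to the knowledge base
The document was added successfully.
4.4 Knowledge base Q&A
Set the dialogue mode to "Knowledge base Q&A" (知识库问答).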