I. Starting Xinference
1 Change into the installation directory
2 Run the launch command
   Linux:   xinference-local --host 0.0.0.0 --port 9997
   Windows: xinference-local --host 127.0.0.1 --port 9997
3 Open http://127.0.0.1:9997 in a browser
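Before opening the browser, you can confirm the server is actually listening. A minimal sketch, assuming the default host/port used above (the function name is ours, not part of Xinference):

```python
import socket

def xinference_is_up(host: str = "127.0.0.1", port: int = 9997, timeout: float = 2.0) -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    print("Xinference reachable:", xinference_is_up())
```

If this prints False, check the launch command's output for startup errors before troubleshooting the browser side.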
II. Starting langchain-chatchat
1 Initialize: python chatchat/cli.py init
2 Load the knowledge base
3 Start all services: python chatchat/cli.py start -a
III. Troubleshooting
1 Running Xinference on Windows fails with "llama.dll not found"
   This is a version issue; downgrading llama-cpp-python from 0.2.90 to 0.2.0 resolved it:
   pip install llama-cpp-python==0.2.0 --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu
2 Pin gradio: pip install gradio==4.21.0
3 Install the chatglm-cpp wheel: pip install chatglm_cpp-0.4.2-cp311-cp311-win_amd64.whl
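To verify that the pinned versions above (llama-cpp-python 0.2.0, gradio 4.21.0) actually landed in the environment, a small sketch using the standard library's package metadata (the helper name is ours):

```python
from importlib import metadata

def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

if __name__ == "__main__":
    # Use the names pip knows the packages by
    for pkg in ("llama-cpp-python", "gradio"):
        print(pkg, "->", installed_version(pkg))
```

Run this inside the Xinference conda environment; if llama-cpp-python still reports 0.2.90, the downgrade was installed into a different environment.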
4 Changing the cache storage location
   Edit (Anaconda3 install folder)\envs\Xinference\Lib\site-packages\xinference\constants.py
   and modify the get_xinference_home() function as follows:
import os
from pathlib import Path

def get_xinference_home() -> str:
    # Hard-code the storage location instead of reading the XINFERENCE_HOME env var
    home_path = r"E:\XinferenceFiles"
    if home_path is None:
        home_path = str(Path.home() / ".xinference")
    else:
        # if user has already set `XINFERENCE_HOME` env, change huggingface and modelscope default download path
        os.environ["HUGGINGFACE_HUB_CACHE"] = os.path.join(home_path, "huggingface")
        os.environ["MODELSCOPE_CACHE"] = os.path.join(home_path, "modelscope")
        # In multi-tenant mode,
        # gradio's temporary files are stored in their respective home directories,
        # to prevent insufficient permissions
        # os.environ["GRADIO_TEMP_DIR"] = os.path.join(home_path, "tmp", "gradio")
    return home_path
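The effect of the edit above can be checked in isolation before restarting Xinference. A minimal sketch of the same cache-redirect logic (the function name and the temp directory stand-in for E:\XinferenceFiles are ours):

```python
import os
import tempfile

def redirect_caches(home_path: str) -> None:
    """Point the HuggingFace and ModelScope cache env vars under home_path,
    mirroring what the edited get_xinference_home() does."""
    os.environ["HUGGINGFACE_HUB_CACHE"] = os.path.join(home_path, "huggingface")
    os.environ["MODELSCOPE_CACHE"] = os.path.join(home_path, "modelscope")

if __name__ == "__main__":
    home = tempfile.mkdtemp()  # stand-in for E:\XinferenceFiles
    redirect_caches(home)
    print(os.environ["HUGGINGFACE_HUB_CACHE"])
    print(os.environ["MODELSCOPE_CACHE"])
```

Note these env vars only affect processes started afterwards, so the redirect must happen before the model downloaders are imported; that is why the change goes into constants.py rather than a shell profile.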