Fine-tuning a Personal Assistant's Self-Identity with XTuner

outsideinthesun

Last modified: 2024-09-01 22:57:00

Tags: artificial intelligence, language models

First published: 2024-09-01 19:15:19

Article link: https://blog.csdn.net/outsideinthesun/article/details/141785417


1 Fine-tuning Prerequisites

For the basic concepts behind fine-tuning, see the article "XTuner Fine-tuning Prerequisites".

2 Preparation

Create a dev machine on InternStudio and select the image Cuda12.2-conda.

Once it is running, clone the tutorial repository onto it:

mkdir -p /root/InternLM/Tutorial
git clone -b camp3  https://github.com/InternLM/Tutorial /root/InternLM/Tutorial

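Run the following commands in order; the comments explain what each one does.

# 2.2 Create the virtual environment
conda create -n xtuner0121 python=3.10 -y

# Activate the virtual environment (note: every later step must run inside it)
conda activate xtuner0121

# Install PyTorch and its companion libraries
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia -y
# Install the other dependencies
pip install transformers==4.39.3
pip install streamlit==1.36.0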


# 2.3 Install XTuner
# Create a directory to hold the source code
mkdir -p /root/InternLM/code

cd /root/InternLM/code
git clone -b v0.1.21  https://github.com/InternLM/XTuner /root/InternLM/code/XTuner

# Enter the source directory
cd /root/InternLM/code/XTuner
conda activate xtuner0121
# Install XTuner in editable mode with the DeepSpeed extras
pip install -e '.[deepspeed]'



# Check that the installation succeeded
xtuner version
# Show the help text
xtuner help


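# 2.4 Prepare the model
# Create a directory for all fine-tuning materials; every later step runs from this path
mkdir -p /root/InternLM/XTuner
cd /root/InternLM/XTuner
mkdir -p Shanghai_AI_Laboratory

# Link the shared model weights into the working directory instead of copying them
ln -s /root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b Shanghai_AI_Laboratory/internlm2-chat-1_8b


# Inspect the directory structure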
apt-get install -y tree
tree -l
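
Before moving on, it can be worth confirming that the pinned versions actually ended up in the environment. The following is a minimal sketch, not part of the original tutorial: the helper name `find_mismatches` is hypothetical, and it assumes the conda-installed PyTorch is visible to Python's package metadata under the distribution name `torch`.

```python
from importlib import metadata
from typing import Callable, Dict

# Pins copied from the install commands above. Assumption: the conda PyTorch
# package registers itself under the distribution name "torch".
EXPECTED = {"torch": "2.1.2", "transformers": "4.39.3", "streamlit": "1.36.0"}


def find_mismatches(expected: Dict[str, str],
                    lookup: Callable[[str], str] = metadata.version) -> Dict[str, str]:
    """Return {package: found version or 'missing'} for every pin that differs."""
    problems = {}
    for package, wanted in expected.items():
        try:
            found = lookup(package)
        except metadata.PackageNotFoundError:
            found = "missing"
        # Ignore local build suffixes such as "+cu121" when comparing.
        if found.split("+")[0] != wanted:
            problems[package] = found
    return problems


if __name__ == "__main__":
    print(find_mismatches(EXPECTED) or "all pinned versions match")
```

Run it inside the activated xtuner0121 environment; an empty result means the three pins match.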

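3 Quick Start

As a worked example, this section fine-tunes the internlm2-chat-1_8b model with QLoRA so that it identifies itself as your own personal assistant.

3.1 Chatting with the model before fine-tuning

Launch the demo straight from the command line:

# Start the Streamlit demo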
conda activate xtuner0121
streamlit run /root/InternLM/Tutorial/tools/xtuner_streamlit_demo.py

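To forward the port, run the following command in PowerShell on your local machine:

# Replace the trailing port number 43551 with the SSH port of your own dev machine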
ssh -CNg -L 8501:127.0.0.1:8501 root@ssh.intern-ai.org.cn -p 43551

Then open 127.0.0.1:8501 in a browser (the local port forwarded by the ssh command above).
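
If the page does not load, a quick way to tell a tunnel problem from a browser problem is to test the forwarded port directly. This is a small sketch using only the Python standard library; the helper name `is_port_open` is hypothetical, and 8501 is the local port given to `ssh -L` above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8501 is the local side of the ssh tunnel; run this on the local machine.
    print("tunnel reachable:", is_port_open("127.0.0.1", 8501))
```

A `False` result while the ssh command is still running usually means the demo is not listening on the remote side or the wrong port number was typed.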

3.2 Instruction-following fine-tuning

# Prepare the data file
cd /root/InternLM/XTuner
mkdir -p datas
touch datas/assistant.json

# Copy xtuner_generate_assistant.py into the current directory
cd /root/InternLM/XTuner
cp /root/InternLM/Tutorial/tools/xtuner_generate_assistant.py ./

# After copying, edit xtuner_generate_assistant.py and replace '伍鲜同志' with the name you want


# Run the script to generate the data file
cd /root/InternLM/XTuner
conda activate xtuner0121

python xtuner_generate_assistant.py
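
The contents of xtuner_generate_assistant.py are not shown in this post. To give a rough idea of what a generator of this kind might do, here is a minimal sketch, assuming the training file is a JSON list of single-turn `conversation` records and that the data is just a few self-introduction samples repeated many times. The function names, the sample sentences, and the `repeat` parameter are all hypothetical and are not taken from the real script.

```python
import json
from pathlib import Path
from typing import Dict, List


def build_records(name: str, repeat: int) -> List[Dict]:
    """Return `repeat` copies of each hypothetical self-introduction sample."""
    samples = [
        {"conversation": [{"input": "Please introduce yourself",
                           "output": f"I am the personal assistant of {name}."}]},
        {"conversation": [{"input": "Who are you?",
                           "output": f"I am an assistant fine-tuned for {name}."}]},
    ]
    return [sample for _ in range(repeat) for sample in samples]


def write_dataset(records: List[Dict], path: str) -> None:
    """Write records as UTF-8 JSON, creating the parent directory if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps non-ASCII names such as '伍鲜同志' readable.
    target.write_text(json.dumps(records, ensure_ascii=False, indent=4),
                      encoding="utf-8")
```

With this sketch, `write_dataset(build_records("your name", 100), "datas/assistant.json")` would produce 200 records. The repeat count directly controls how many times the model sees the same sentences during training.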

After the script finishes, check the resulting directory structure.

3.2.2 Prepare the configuration file

# List the ready-to-use config files that XTuner provides for internlm2
conda activate xtuner0121

xtuner list-cfg -p internlm2

# Copy one of the preset config files
cd /root/InternLM/XTuner
conda activate xtuner0121

xtuner copy-cfg internlm2_chat_1_8b_qlora_alpaca_e3 .

Edit the copied config file internlm2_chat_1_8b_qlora_alpaca_e3_copy.py. A modified version to compare against is available in the InternLM/Tutorial repository on GitHub (camp3 branch) at configs/internlm2_chat_1_8b_qlora_alpaca_e3_copy.py.
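
The post does not list which settings to change. As a hedged illustration only, the values below are the ones this walkthrough's paths suggest need adjusting. The variable names follow the stock alpaca config as I understand it, and the evaluation prompts are illustrative, so check everything against your own copy and the reference config in the Tutorial repository.

```python
# Hypothetical excerpt of the edited config: assumptions, not the author's file.

# Base model: the symlink created under /root/InternLM/XTuner in section 2.4.
pretrained_model_name_or_path = (
    "/root/InternLM/XTuner/Shanghai_AI_Laboratory/internlm2-chat-1_8b"
)

# Training data: the JSON file written by xtuner_generate_assistant.py,
# relative to the directory from which `xtuner train` is launched.
alpaca_en_path = "datas/assistant.json"

# Prompts used for sample generations during training (illustrative values).
evaluation_inputs = ["请介绍一下你自己", "Please introduce yourself"]
```

The dataset section of the config would also need to read a local JSON file rather than the original alpaca dataset. The reference config shows the exact form, so it is not reproduced here.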

3.2.3 Start fine-tuning

cd /root/InternLM/XTuner
conda activate xtuner0121

xtuner train ./internlm2_chat_1_8b_qlora_alpaca_e3_copy.py

3.2.4 Convert the model format

cd /root/InternLM/XTuner
conda activate xtuner0121

# Pick the most recently saved .pth checkpoint
pth_file=$(ls -t ./work_dirs/internlm2_chat_1_8b_qlora_alpaca_e3_copy/*.pth | head -n 1)
export MKL_SERVICE_FORCE_INTEL=1
export MKL_THREADING_LAYER=GNU
xtuner convert pth_to_hf ./internlm2_chat_1_8b_qlora_alpaca_e3_copy.py ${pth_file} ./hf
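
The `ls -t ... | head -n 1` line picks the checkpoint with the newest modification time. The same selection can be sketched in Python, which also makes the "no checkpoint found" case explicit. The helper name `latest_checkpoint` is hypothetical; the directory is the one used in the commands above.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(work_dir: str) -> Optional[Path]:
    """Return the most recently modified *.pth entry in work_dir, or None."""
    candidates = list(Path(work_dir).glob("*.pth"))
    if not candidates:
        return None
    # Same ordering criterion as `ls -t`: newest modification time first.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    found = latest_checkpoint("./work_dirs/internlm2_chat_1_8b_qlora_alpaca_e3_copy")
    print(found if found is not None else "no .pth checkpoint found; did training finish?")
```

Checking for `None` first avoids passing an empty path to `xtuner convert pth_to_hf` when training stopped before any checkpoint was saved.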

3.2.5 Merge the model

cd /root/InternLM/XTuner
conda activate xtuner0121

export MKL_SERVICE_FORCE_INTEL=1
export MKL_THREADING_LAYER=GNU
xtuner convert merge /root/InternLM/XTuner/Shanghai_AI_Laboratory/internlm2-chat-1_8b ./hf ./merged --max-shard-size 2GB

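3.3 Chatting with the model after fine-tuning

Before launching, make sure the demo script loads the merged model in /root/InternLM/XTuner/merged rather than the base model; otherwise the chat will still use the original weights.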

conda activate xtuner0121

streamlit run /root/InternLM/Tutorial/tools/xtuner_streamlit_demo.py

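Then forward the port as described in section 3.1 and open 127.0.0.1:8501 in a browser to chat with the fine-tuned model.

During deployment I ran into overfitting. It went away after I lowered one value in the data-generation step from 8000 to 3500.

After every change to this value, regenerate the data: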

cd /root/InternLM/XTuner
conda activate xtuner0121

python xtuner_generate_assistant.py
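
After regenerating, it can help to confirm that the file really changed size. The sketch below counts how often each question/answer pair occurs, assuming datas/assistant.json is a JSON list of records with a `conversation` list of `input`/`output` turns. That layout and the helper name `summarize_dataset` are assumptions, not something shown in the post.

```python
import json
from collections import Counter


def summarize_dataset(path: str) -> Counter:
    """Count how many times each (input, output) pair occurs in the file."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    counts = Counter()
    for record in records:
        for turn in record["conversation"]:
            counts[(turn["input"], turn["output"])] += 1
    return counts
```

For example, `sum(summarize_dataset("datas/assistant.json").values())` gives the total number of turns, and `len(...)` gives the number of distinct pairs. A very large total over only a handful of distinct pairs is the kind of repetition that tends to make a small model overfit.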

Then run the training, conversion, merge, and deployment commands again:

cd /root/InternLM/XTuner
conda activate xtuner0121

xtuner train ./internlm2_chat_1_8b_qlora_alpaca_e3_copy.py


# Pick the most recently saved .pth checkpoint
pth_file=$(ls -t ./work_dirs/internlm2_chat_1_8b_qlora_alpaca_e3_copy/*.pth | head -n 1)
export MKL_SERVICE_FORCE_INTEL=1
export MKL_THREADING_LAYER=GNU
xtuner convert pth_to_hf ./internlm2_chat_1_8b_qlora_alpaca_e3_copy.py ${pth_file} ./hf

cd /root/InternLM/XTuner

export MKL_SERVICE_FORCE_INTEL=1
export MKL_THREADING_LAYER=GNU
xtuner convert merge /root/InternLM/XTuner/Shanghai_AI_Laboratory/internlm2-chat-1_8b ./hf ./merged --max-shard-size 2GB

streamlit run /root/InternLM/Tutorial/tools/xtuner_streamlit_demo.py

# Done

With the reduced value, the fine-tuned assistant worked very well.
