Full-Parameter Fine-Tuning of a Large Model with SWIFT and Qwen1.5-0.5B-Chat (ModelScope Community)

m0_65156252

Last modified: 2024-09-24 17:36:43

First published: 2024-09-24 17:35:56

Article link: https://blog.csdn.net/m0_65156252/article/details/142493500

I. Environment Setup

Because Notebooks in the ModelScope community manage virtual environments through their own mechanism, we simply use the stock image as is.

1. Install the SWIFT framework

pip install ms-swift
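Since the command-line flags accepted by swift sft depend on the installed release, it can help to record which version pip actually installed. The following is a minimal sketch using only the Python standard library; the helper name installed_version is hypothetical and not part of ms-swift.

```python
from importlib import metadata


def installed_version(package="ms-swift"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the current environment.
        return None
```

Calling print(installed_version()) in the Notebook after the pip command shows the version to compare against the flags used in the fine-tuning script.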

2. Download the model

git clone https://www.modelscope.cn/qwen/Qwen1.5-0.5B-Chat.git
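A clone made without Git LFS support leaves small text pointer files in place of the real weights, which only shows up later as a confusing load error. The sketch below checks a cloned directory for such stubs. It assumes the weights use the .safetensors or .bin suffix, and the helper name find_lfs_pointers is hypothetical.

```python
from pathlib import Path

# Git LFS pointer files are small text files that begin with this prefix.
LFS_POINTER_PREFIX = b"version https://git-lfs"


def find_lfs_pointers(model_dir):
    """Return the names of weight files that are still Git LFS pointer stubs."""
    stubs = []
    for path in sorted(Path(model_dir).iterdir()):
        # Assumption: weight files end in .safetensors or .bin.
        if path.suffix not in {".safetensors", ".bin"}:
            continue
        with path.open("rb") as fh:
            if fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                stubs.append(path.name)
    return stubs
```

An empty list from find_lfs_pointers("/mnt/workspace/Qwen1.5-0.5B-Chat") suggests the weight files were actually downloaded; any names returned are files that still need to be fetched.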

II. Dataset Preparation

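We use the ShenNong large-model traditional Chinese medicine (TCM) dialogue data and take the first 140 records to build our own dataset, ChatMed_TCM-v0.2.json.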

Download the dataset: git clone https://www.modelscope.cn/datasets/xiaofengalg/ShenNong_TCM_Dataset.git
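The article does not show how the first 140 records were extracted, so the following is only one possible sketch. It assumes the source file is either a JSON array or a JSON Lines file, makes no assumption about the field names inside each record, and writes the subset as a JSON array. The helper names load_records and take_first are hypothetical.

```python
import json
from pathlib import Path


def load_records(path):
    """Load records from a JSON array file or a JSON Lines file."""
    text = Path(path).read_text(encoding="utf-8").strip()
    if text.startswith("["):
        # Whole file is a single JSON array.
        return json.loads(text)
    # Otherwise treat each non-empty line as one JSON object.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def take_first(src, dst, n=140):
    """Write the first n records of src to dst as a JSON array; return the count."""
    records = load_records(src)[:n]
    out = Path(dst)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8")
    return len(records)
```

For example, take_first(source_path, "/mnt/workspace/dataset/ChatMed_TCM-v0.2.json") would produce the file referenced by the fine-tuning script, where source_path is wherever the cloned data file lives.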

III. Fine-Tuning

Write the fine-tuning script: vim run.sh

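Enter the following. The flag names are the ones used when this article was written; newer ms-swift releases rename several of them (for example, --sft_type became --train_type), so adjust them to match your installed version.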

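# Number of processes (GPUs); must be set because gradient_accumulation_steps is derived from it below.
nproc_per_node=1
CUDA_VISIBLE_DEVICES=0 \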
swift sft \
--model_type qwen1half-0_5b-chat \
--model_id_or_path /mnt/workspace/Qwen1.5-0.5B-Chat \
--model_revision master \
--sft_type full \
--tuner_backend swift \
--template_type AUTO \
--dtype AUTO \
--output_dir ./llm_sft_output \
--ddp_backend nccl \
--custom_train_dataset_path /mnt/workspace/dataset/ChatMed_TCM-v0.2.json \
--train_dataset_sample -1 \
--num_train_epochs 1 \
--max_length 4096 \
--check_dataset_strategy warning \
--gradient_checkpointing true \
--batch_size 1 \
--weight_decay 0.01 \
--learning_rate 1e-4 \
--gradient_accumulation_steps $(expr 8 / $nproc_per_node) \
--max_grad_norm 0.5 \
--warmup_ratio 0.03 \
--eval_steps 100 \
--save_steps 100 \
--save_total_limit 3 \
--logging_steps 10 \
--use_flash_attn false \
--save_only_model true \
--self_cognition_sample 500 \
--model_name "专属AI助手" "Dedicated AI Assistant" \
--model_author "技术团队" "Tech Team"
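The script derives gradient_accumulation_steps from the number of processes so that the effective batch size stays the same regardless of how many GPUs are used. The arithmetic can be sketched as follows, assuming a single process and the 140-record dataset only. The function names are hypothetical, and the step count is an approximation because trainers differ in how they handle a final partial batch.

```python
import math


def grad_accum_steps(total=8, nproc_per_node=1):
    """Mirror the shell expression `expr 8 / $nproc_per_node` (integer division)."""
    return total // nproc_per_node


def effective_batch_size(per_device_batch, nproc_per_node, accum_steps):
    """Samples that contribute to one optimizer update."""
    return per_device_batch * nproc_per_node * accum_steps


def optimizer_steps_per_epoch(num_samples, eff_batch):
    """Approximate optimizer updates per epoch (rounds a partial batch up)."""
    return math.ceil(num_samples / eff_batch)


accum = grad_accum_steps(8, 1)            # 8
eff = effective_batch_size(1, 1, accum)   # 1 * 1 * 8 = 8
steps = optimizer_steps_per_epoch(140, eff)
print(accum, eff, steps)
```

With two processes, grad_accum_steps(8, 2) gives 4, and the effective batch size remains 8, which is the purpose of dividing by nproc_per_node in the script.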