TRL SFT: Single-machine multi-GPU training of llama2-7b-hf with QLoRA 4-bit quantization, DeepSpeed ZeRO-3 and FlashAttention-2 (notes)

Table of Contents

1. Environment

  1.1 Environment setup

  1.2 Installing flash-attn

  1.3 Issues you may hit with VS Code remote

2. Code

  2.1 Bash script

  2.2 utils.py: annotations and tweaks

  2.3 train.py: annotations and tweaks

  2.4 Model / arguments

    2.4.1 The quantized model

      2.4.1.1 Quantized model structure

      2.4.1.2 Quantized model layers

    2.4.2 Arguments

      2.4.2.1 Training args

      2.4.2.2 PEFT args

      2.4.2.3 Model args

3. The TRL library

  3.1 SFTTrainer

  3.2 Other code

    3.2.1 datasets.map with load_from_cache_file=False for easier debugging

4. Summary

  4.1 When SFTTrainer builds the PEFT model, why are prepare_model_for_kbit_training and peft_module_casting_to_bf16 skipped once QLoRA + FSDP / DS-ZeRO3 is enabled, and what do those two functions do? With QLoRA + FSDP / DS-ZeRO3 and no offload, why is the model on the CPU right after loading?

  4.2 The difference between bfloat16 and float16

  4.3 Absolute vs. relative positional encoding, and why today's large models use RoPE

5. Notes on other TRL Trainers

  5.1 DPOTrainer notes

  5.2 ...


  • Project repository

peft/examples/sft at main · huggingface/peft · GitHub: https://github.com/huggingface/peft/tree/main/examples/sft

  • Documentation

https://huggingface.co/docs/peft/accelerate/deepspeed

1. Environment

OS: Ubuntu
CUDA version: 12.1
torch version: 2.2.0
Python version: 3.10

CUDA version inside the conda virtual environment
cuda: 12.1  # make sure it matches the system-level ("outside") CUDA
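
A quick way to confirm the match is to compare the CUDA version PyTorch was built with against the toolkit visible on the machine. This is a minimal, illustrative sketch (not part of the original notes; the version strings shown in comments are examples):

# check_env.py -- sanity-check CUDA / GPU setup before installing flash-attn and deepspeed (illustrative)
import subprocess
import torch

print("torch:", torch.__version__)
print("torch built with CUDA:", torch.version.cuda)        # expect "12.1" here
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    print("compute capability:", f"{major}.{minor}")        # >= 8.0 means bf16 and FlashAttention-2 are supported
# the system toolkit reported by nvcc should match torch.version.cuda
print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)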

  1.1 Environment setup

pip install -r ...

    Option 1

a) Updated 2024-04-28

git+https://github.com/huggingface/accelerate
git+https://github.com/huggingface/peft
git+https://github.com/huggingface/trl
git+https://github.com/huggingface/datatrove.git
unsloth[conda]@git+https://github.com/unslothai/unsloth.git
git+https://github.com/huggingface/transformers
deepspeed==0.14.0
PyGithub
# flash-attn==2.5.7 is installed separately
# First: make sure the system-level ("outside") CUDA version matches the CUDA version in the conda env
# Second: install c++, g++ and ninja (if the c++/g++/ninja versions are too old, the build below may fail)
# Third: follow the official install command:
huggingface-hub
evaluate
datasets
bitsandbytes
einops
wandb
tensorboard
tiktoken
pandas
numpy
scipy
matplotlib
sentencepiece
nltk
xformers
hf_transfer

loguru
tqdm
transformers_stream_generator
torch==2.2.1
openpyxl
httpx
joblib
scikit_learn

b) Updated 2024-07-07 (added vllm)

git+https://github.com/huggingface/accelerate
git+https://github.com/huggingface/peft
# git+https://github.com/huggingface/trl
git+https://github.com/huggingface/datatrove.git
git+https://github.com/huggingface/transformers

unsloth[conda]@git+https://github.com/unslothai/unsloth.git
trl==0.8.6
# flash-attn==2.5.9.post1 is installed separately
deepspeed==0.14.0
torch==2.3.0
vllm
# pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.3  # install the FlashInfer attention inference kernel (for vLLM) separately; only for CUDA 12.1 & torch 2.3
# vllm-flash-attn==2.5.9

ray
numpy==1.26.4
PyGithub
huggingface-hub
evaluate
datasets
bitsandbytes
einops
wandb
tensorboard
tiktoken
pandas
scipy
matplotlib
sentencepiece
nltk
xformers
hf_transfer
loguru
tqdm
transformers_stream_generator
openpyxl
httpx
joblib
scikit_learn

Updated 2024-06-19 (added vllm and ray)

# git+https://github.com/huggingface/accelerate
# git+https://github.com/huggingface/peft
# git+https://github.com/huggingface/trl
# git+https://github.com/huggingface/datatrove.git
# git+https://github.com/huggingface/transformers

unsloth[conda]@git+https://github.com/unslothai/unsloth.git
accelerate==0.31.0
peft==0.11.1
datatrove==0.2.0
trl==0.8.6
transformers==4.41.2
# flash-attn==2.5.9.post1 is installed separately
# First: make sure the system-level ("outside") CUDA version matches the CUDA version in the conda env
# Second: install c++, g++ and ninja (if the c++/g++/ninja versions are too old, the build below may fail)
# Third: follow the official install command:
deepspeed==0.14.0
torch==2.3.0
vllm==0.5.0.post1
vllm-flash-attn==2.5.9

ray
numpy==1.26.4
PyGithub
huggingface-hub
evaluate
datasets
bitsandbytes
einops
wandb
tensorboard
tiktoken
pandas
scipy
matplotlib
sentencepiece
nltk
xformers
hf_transfer
loguru
tqdm
transformers_stream_generator
openpyxl
httpx
joblib
scikit_learn

If pulling the latest versions of the packages above causes version mismatches, refer to the pinned version sets below and adjust the package versions accordingly.

    Option 2

a) (ZeRO-3, PEFT LoRA in bf16), updated 2024-04-28

absl-py==2.1.0
accelerate==0.30.0
aiohttp==3.9.5
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.3.0
async-timeout==4.0.3
attrs==23.2.0
bitsandbytes==0.43.1
certifi==2024.2.2
cffi==1.16.0
charset-normalizer==3.3.2
click==8.1.7
contourpy==1.2.1
cryptography==42.0.7
cycler==0.12.1
datasets==2.19.1
datatrove==0.2.0
deepspeed==0.14.0
Deprecated==1.2.14
dill==0.3.8
docker-pycreds==0.4.0
docstring_parser==0.16
einops==0.8.0
et-xmlfile==1.1.0
evaluate==0.4.2
exceptiongroup==1.2.1
filelock==3.14.0
# flash-attn==2.5.7 is installed separately
# First: make sure the system-level ("outside") CUDA version matches the CUDA version in the conda env
# Second: install c++, g++ and ninja (if the c++/g++/ninja versions are too old, the build below may fail)
# Third: follow the official install command:
fonttools==4.51.0
frozenlist==1.4.1
fsspec==2024.3.1
gitdb==4.0.11
GitPython==3.1.43
grpcio==1.64.0
h11==0.14.0
hf_transfer==0.1.6
hjson==3.1.0
httpcore==1.0.5
httpx==0.27.0
huggingface-hub==0.23.1
humanize==4.9.0
idna==3.7
Jinja2==3.1.4
joblib==1.4.2
kiwisolver==1.4.5
loguru==0.7.2
Markdown==3.6
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.9.0
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.16
networkx==3.3
ninja==1.11.1.1
nltk==3.8.1
numpy==1.26.4
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.5.40
nvidia-nvtx-cu12==12.1.105
openpyxl==3.1.2
packaging==24.0
pandas==2.2.2
peft==0.10.0
pillow==10.3.0
pip==24.0
platformdirs==4.2.2
protobuf==3.20.3
psutil==5.9.8
py-cpuinfo==9.0.0
pyarrow==16.1.0
pyarrow-hotfix==0.6
pycparser==2.22
pydantic==2.7.1
pydantic_core==2.18.2
PyGithub==2.3.0
Pygments==2.18.0
PyJWT==2.8.0
PyNaCl==1.5.0
pynvml==11.5.0
pyparsing==3.1.2
python-dateutil==2.9.0.post0
pytz==2024.1
PyYAML==6.0.1
regex==2024.5.15
requests==2.32.2
rich==13.7.1
safetensors==0.4.3
scikit-learn==1.5.0
scipy==1.13.1
sentencepiece==0.2.0
sentry-sdk==2.3.1
setproctitle==1.3.3
setuptools==69.5.1
shtab==1.7.1
six==1.16.0
smmap==5.0.1
sniffio==1.3.1
sympy==1.12
tensorboard==2.16.2
tensorboard-data-server==0.7.2
threadpoolctl==3.5.0
tiktoken==0.7.0
tokenizers==0.19.1
torch==2.2.1
tqdm==4.66.4
transformers==4.40.1
transformers-stream-generator==0.0.5
triton==2.2.0
trl==0.8.3
typing_extensions==4.12.0
tyro==0.8.4
tzdata==2024.1
unsloth==2024.5
urllib3==2.2.1
wandb==0.17.0
Werkzeug==3.0.3
wheel==0.43.0
wrapt==1.16.0
xformers==0.0.25
xxhash==3.4.1
yarl==1.9.4

b) (ZeRO-3, PEFT LoRA in float32), updated 2024-06-10

Package                       Version
----------------------------- -----------
absl-py                       2.1.0
accelerate                    0.31.0.dev0
aiohttp                       3.9.5
aiosignal                     1.3.1
annotated-types               0.7.0
anyio                         4.4.0
async-timeout                 4.0.3
attrs                         23.2.0
bitsandbytes                  0.43.1
certifi                       2024.6.2
cffi                          1.16.0
charset-normalizer            3.3.2
click                         8.1.7
contourpy                     1.2.1
cryptography                  42.0.8
cycler                        0.12.1
datasets                      2.19.2
datatrove                     0.2.0
deepspeed                     0.14.0
Deprecated                    1.2.14
dill                          0.3.8
docker-pycreds                0.4.0
docstring_parser              0.16
einops                        0.8.0
et-xmlfile                    1.1.0
evaluate                      0.4.2
exceptiongroup                1.2.1
filelock                      3.14.0
fonttools                     4.53.0
frozenlist                    1.4.1
fsspec                        2024.3.1
gitdb                         4.0.11
GitPython                     3.1.43
grpcio                        1.64.1
h11                           0.14.0
hf_transfer                   0.1.6
hjson                         3.1.0
httpcore                      1.0.5
httpx                         0.27.0
huggingface-hub               0.23.3
humanize                      4.9.0
idna                          3.7
Jinja2                        3.1.4
joblib                        1.4.2
kiwisolver                    1.4.5
loguru                        0.7.2
Markdown                      3.6
markdown-it-py                3.0.0
MarkupSafe                    2.1.5
matplotlib                    3.9.0
mdurl                         0.1.2
mpmath                        1.3.0
multidict                     6.0.5
multiprocess                  0.70.16
networkx                      3.3
ninja                         1.11.1.1
nltk                          3.8.1
numpy                         1.26.4
nvidia-cublas-cu12            12.1.3.1
nvidia-cuda-cupti-cu12        12.1.105
nvidia-cuda-nvrtc-cu12        12.1.105
nvidia-cuda-runtime-cu12      12.1.105
nvidia-cudnn-cu12             8.9.2.26
nvidia-cufft-cu12             11.0.2.54
nvidia-curand-cu12            10.3.2.106
nvidia-cusolver-cu12          11.4.5.107
nvidia-cusparse-cu12          12.1.0.106
nvidia-nccl-cu12              2.19.3
nvidia-nvjitlink-cu12         12.5.40
nvidia-nvtx-cu12              12.1.105
openpyxl                      3.1.3
packaging                     24.0
pandas                        2.2.2
peft                          0.11.2.dev0
pillow                        10.3.0
pip                           24.0
platformdirs                  4.2.2
protobuf                      3.20.3
psutil                        5.9.8
py-cpuinfo                    9.0.0
pyarrow                       16.1.0
pyarrow-hotfix                0.6
pycparser                     2.22
pydantic                      2.7.3
pydantic_core                 2.18.4
PyGithub                      2.3.0
Pygments                      2.18.0
PyJWT                         2.8.0
PyNaCl                        1.5.0
pynvml                        11.5.0
pyparsing                     3.1.2
python-dateutil               2.9.0.post0
pytz                          2024.1
PyYAML                        6.0.1
regex                         2024.5.15
requests                      2.32.3
rich                          13.7.1
safetensors                   0.4.3
scikit-learn                  1.5.0
scipy                         1.13.1
sentencepiece                 0.2.0
sentry-sdk                    2.4.0
setproctitle                  1.3.3
setuptools                    69.5.1
shtab                         1.7.1
six                           1.16.0
smmap                         5.0.1
sniffio                       1.3.1
sympy                         1.12.1
tensorboard                   2.16.2
tensorboard-data-server       0.7.2
threadpoolctl                 3.5.0
tiktoken                      0.7.0
tokenizers                    0.19.1
torch                         2.2.1
tqdm                          4.66.4
transformers                  4.42.0.dev0
transformers-stream-generator 0.0.5
triton                        2.2.0
trl                           0.8.6
typing_extensions             4.12.1
tyro                          0.8.4
tzdata                        2024.1
unsloth                       2024.5
urllib3                       2.2.1
wandb                         0.17.0
Werkzeug                      3.0.3
wheel                         0.43.0
wrapt                         1.16.0
xformers                      0.0.25
xxhash                        3.4.1
yarl                          1.9.4

  1.2 Installing flash-attn

Before installing flash-attn and deepspeed, try to make sure that:

1. c++ and g++ are installed
sudo apt-get update
sudo apt-get install build-essential

2. Ninja is installed
sudo apt-get install ninja-build         ----- sometimes needed when debugging deepspeed in VS Code
There may be other dependencies as well: sudo apt-get install -y ninja-build libssl-dev libffi-dev libaio-dev

3. Install flash-attn
    Following the official command mentioned above:
    pip install packaging  or  conda install packaging
    # pip install flash-attn==2.5.7 --no-build-isolation          ----- the flash-attn build takes quite a while, be patient
    pip install flash-attn==2.6.1 --no-build-isolation          ----- the flash-attn build takes quite a while, be patient

    MAX_JOBS=1 pip install flash-attn --no-build-isolation -i https://pypi.python.org/simple

    Pick whichever of the commands above fits your situation.
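
After the build finishes, a quick import check confirms the wheel actually loads against the current torch/CUDA combination. A minimal, illustrative sketch (not part of the original notes):

# verify_flash_attn.py -- post-install sanity check (illustrative)
import torch
import flash_attn

print("flash-attn version:", flash_attn.__version__)
major, _ = torch.cuda.get_device_capability()
# FlashAttention-2 requires an Ampere (compute capability 8.0) or newer GPU
print("GPU supports FlashAttention-2 / bf16:", major >= 8)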

  1.3 Issues you may hit with VS Code remote

  • Chinese characters in paths not recognized
Create a settings.json file under .vscode in the project directory and add the following:
{
    "remote.SSH.env": {
        "LC_ALL": "en_US.UTF-8",
        "LANG": "en_US.UTF-8"
    },
    "remote.SSH.useLocalServer": false,
    "remote.SSH.connectTimeout": 60
}
  • Port-forwarding problems
If you hit "Failed to set up socket for dynamic port forward" in VS Code:
vim /etc/ssh/sshd_config and set
AllowAgentForwarding yes
AllowTcpForwarding yes

Restart the sshd service
systemctl restart sshd

Delete the generated VS Code server files
rm -rf ~/.vscode-server/

Reconnect over ssh
https://github.com/microsoft/vscode-remote-release/issues/8132

2. Code

peft/examples/sft at main · huggingface/peft · GitHub: https://github.com/huggingface/peft/tree/main/examples/sft

  2.1 Bash script

PYTHONPATH=$PWD
export PYTHONPATH
echo "当前bash执行目录: $PWD, 已经将PYTHONPATH设置为: $PYTHONPATH"


# --resume_from_checkpoint dir   tells the trainer to resume training from the checkpoint in dir
# Commented out here because it cannot coexist with wandb:
# 2>&1 | tee -a examples/sft/qlora_ds_zero3_log.out
accelerate launch --config_file "examples/sft/configs/deepspeed_config_z3_qlora.yaml"  examples/sft/train.py \
    --seed 100 \
    --model_name_or_path "/workspace/Llama-2-7b-chat-hf" \
    --dataset_name "smangrul/ultrachat-10k-chatml" \
    --chat_template_format "chatml" \
    --add_special_tokens False \
    --append_concat_token False \
    --splits "train,test" \
    --max_seq_len 2048 \
    --num_train_epochs 2 \
    --logging_steps 5 \
    --log_level "info" \
    --logging_strategy "steps" \
    --evaluation_strategy "epoch" \
    --save_strategy "steps" \
    --save_steps 100 \
    --save_total_limit 10 \
    --bf16 True \
    --packing True \
    --learning_rate 1e-4 \
    --lr_scheduler_type "cosine" \
    --weight_decay 1e-4 \
    --warmup_ratio 0.0 \
    --max_grad_norm 1.0 \
    --output_dir "/workspace/output/llama-sft-qlora-dsz3" \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 2 \
    --gradient_accumulation_steps 4 \
    --use_flash_attn True \
    --gradient_checkpointing True \
    --use_reentrant True \
    --dataset_text_field "content" \
    --use_peft_lora True \
    --lora_r 8 \
    --lora_alpha 16 \
    --lora_dropout 0.1 \
    --lora_target_modules "all-linear" \
    --use_4bit_quantization True \
    --use_nested_quant True \
    --bnb_4bit_compute_dtype "bfloat16" \
    --bnb_4bit_quant_storage_dtype "bfloat16" \
    --resume_from_checkpoint /workspace/output/llama-sft-qlora-dsz3/checkpoint-100 \
    2>&1 | tee -a examples/sft/qlora_ds_zero3_log.out

    # Arguments for pushing to the Hub
    # --push_to_hub \
    # --hub_private_repo True \
    # --hub_strategy "every_save" \
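
The --resume_from_checkpoint path above is hard-coded to checkpoint-100. If you would rather resume from whatever checkpoint was written last, transformers provides a small helper; a minimal sketch (the output_dir is the one used in the script above):

# locate the newest checkpoint-XXX directory under output_dir (illustrative)
from transformers.trainer_utils import get_last_checkpoint

last_ckpt = get_last_checkpoint("/workspace/output/llama-sft-qlora-dsz3")
print("resume from:", last_ckpt)  # a path like .../checkpoint-100, or None if no checkpoint exists yet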

  2.2 utils.py: annotations and tweaks

import os
from enum import Enum

import torch
from datasets import DatasetDict, load_dataset, load_from_disk
from datasets.builder import DatasetGenerationError
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)

from peft import LoraConfig

# DEFAULT_CHATML_CHAT_TEMPLATE is a jinja2 template string used to format chat messages
# jinja2 is a popular Python templating engine; it lets you embed logic in templates, making them dynamic and programmable
# In this template, {% for message in messages %} is a jinja2 for-loop that iterates over each message in the messages list
# {{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}
# defines how each message is formatted:
#   1. <|im_start|>: a special token marking the start of a message role (user, system or assistant)
#   2. message['role']: the role of the current message, e.g. user, system or assistant
#   3. \n: a newline between the role and the message content
#   4. message['content']: the actual content of the current message
#   5. <|im_end|>: a special token marking the end of the message content
#   6. \n: a newline appended after each message
# {% if loop.last and add_generation_prompt %}{{'<|im_start|>assistant\n' }}{% endif %}
# is a jinja2 conditional: when the loop reaches the last message and add_generation_prompt is True,
# '<|im_start|>assistant\n' is appended as a prompt telling the model to generate the assistant's reply
# This formatting turns raw chat records into the input format the language model expects for dialogue generation
DEFAULT_CHATML_CHAT_TEMPLATE = "{% for message in messages %}\n{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% if loop.last and add_generation_prompt %}{{'<|im_start|>assistant\n' }}{% endif %}{% endfor %}"
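# Illustrative example (not part of the original script): with this template, a two-turn conversation
# renders roughly as follows (assuming transformers' default jinja2 settings, which strip the newline
# after block tags):
#   messages = [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello!"}]
#   tokenizer.apply_chat_template(messages, tokenize=False)
#   # -> "<|im_start|>user\nHi<|im_end|>\n<|im_start|>assistant\nHello!<|im_end|>\n"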


# DEFAULT_ZEPHYR_CHAT_TEMPLATE is similar to DEFAULT_CHATML_CHAT_TEMPLATE: a jinja2 template for formatting chat messages
# The difference lies in the formatting and the special tokens used
# {% for message in messages %} is again a for-loop over the message list
# {% if message['role'] == 'user' %} is a conditional checking whether the current message's role is user
# If it is user, {{ '<|user|>\n' + message['content'] + eos_token }} formats the message as:
#   1. <|user|>: the special token for the user role
#   2. \n: a newline
#   3. message['content']: the message content
#   4. eos_token: the end-of-sequence token, e.g. </s>
# {% elif message['role'] == 'system' %} is another branch, checking whether the role is system
# If it is system, the message is formatted with {{ '<|system|>\n' + message['content'] + eos_token }}
# {% elif message['role'] == 'assistant' %} is the third branch, checking whether the role is assistant
# If it is assistant, the message is formatted with {{ '<|assistant|>\n'  + message['content'] + eos_token }}
# {% if loop.last and add_generation_prompt %}\n{{ '<|assistant|>' }}\n{% endif %}
# behaves like in DEFAULT_CHATML_CHAT_TEMPLATE: when the last message is reached and add_generation_prompt is True,
# '<|assistant|>\n' is appended as a prompt telling the model to generate the assistant's reply
# Overall this also converts raw chat records into language-model input, just with a different set of special tokens
DEFAULT_ZEPHYR_CHAT_TEMPLATE = "{% for message in messages %}\n{% if message['role'] == 'user' %}\n{{ '<|user|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'system' %}\n{{ '<|system|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'assistant' %}\n{{ '<|assistant|>\n'  + message['content'] + eos_token }}\n{% endif %}\n{% if loop.last and add_generation_prompt %}\n{{ '<|assistant|>' }}\n{% endif %}\n{% endfor %}"

# ZephyrSpecialTokens is an enum class inheriting from both str and Enum
# It defines the special tokens used by the Zephyr chat format: the user token, assistant token, system token, etc.
# An enum groups a set of related constants together and gives better readability and type safety
# Each special token is defined as a class attribute whose value is the corresponding string
# For example, user = "<|user|>" means the user token is the string "<|user|>"
class ZephyrSpecialTokens(str, Enum):
    user = "<|user|>"
    assistant = "<|assistant|>"
    system = "<|system|>"
    eos_token = "</s>"      # end-of-sequence token, marks the end of a sentence or sequence
    bos_token = "<s>"       # beginning-of-sequence token, marks the start of a sentence or sequence
    pad_token = "<pad>"     # padding token, used to pad sequences to a given length

    # list is a classmethod that returns all the special-token strings defined in this enum
    # It is typically used when initializing the tokenizer, to add these special tokens to the vocabulary
    @classmethod
    def list(cls):
        return [c.value for c in cls]

# ChatmlSpecialTokens is analogous to ZephyrSpecialTokens: an enum defining the special tokens of the ChatML chat format
# The difference is the concrete token strings
# For example, the user token is "<|im_start|>user" in ChatML but "<|user|>" in Zephyr
class ChatmlSpecialTokens(str, Enum):
    user = "<|im_start|>user"
    assistant = "<|im_start|>assistant"
    system = "<|im_start|>system"
    eos_token = "<|im_end|>"
    bos_token = "<s>"
    pad_token = "<pad>"

    @classmethod
    def list(cls):
        return [c.value for c in cls]
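# Illustrative note (not in the original script): ChatmlSpecialTokens.list() returns
# ["<|im_start|>user", "<|im_start|>assistant", "<|im_start|>system", "<|im_end|>", "<s>", "<pad>"],
# which is exactly what gets passed to the tokenizer as additional_special_tokens further below.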

# create_datasets builds the train and test datasets
# Arguments:
#   tokenizer: the tokenizer object used to tokenize and encode text
#   data_args: data-related configuration, e.g. dataset name and splits
#   training_args: training-related configuration
#   apply_chat_template (bool): whether to preprocess the data with the chat template, default False
def create_datasets(tokenizer, data_args, training_args, apply_chat_template=False):
    # preprocess is an inner function that preprocesses data samples
    # It takes a dict of samples as input; the "messages" key maps to a list whose elements are conversations
    def preprocess(samples):
        batch = []     # list that will hold the preprocessed conversations
        # TODO modified from the upstream code
        batch_tokens = []
        # iterate over every conversation in the batch of samples
        for conversation in samples["messages"]:
            # preprocess each conversation with tokenizer.apply_chat_template
            # tokenize=False means only formatting is done, no tokenization
            # https://huggingface.co/docs/transformers/main/zh/chat_templating
            # TODO modified from the upstream code
            chat_tmp = tokenizer.apply_chat_template(conversation, tokenize=False)
            batch.append(chat_tmp)
            chat_tmp_tokens = tokenizer.tokenize(chat_tmp)
            batch_tokens.append(chat_tmp_tokens)
        # return a dict where "content" holds the formatted conversations and "content_tokens" their tokens
        return {"content": batch, "content_tokens":batch_tokens}

    raw_datasets = DatasetDict()   # empty DatasetDict that will hold the datasets
    # iterate over the dataset splits given in data_args.splits (e.g. train, test)
    for split in data_args.splits.split(","):
        try:
            # Try first if the dataset is on a Hub repo
            dataset = load_dataset(data_args.dataset_name, split=split)
        except DatasetGenerationError:
            # If not, check for a local dataset on disk
            dataset = load_from_disk(os.path.join(data_args.dataset_name, split))

        # store the dataset under the matching key of raw_datasets depending on the split type
        if "train" in split:
            raw_datasets["train"] = dataset
        elif "test" in split:
            raw_datasets["test"] = dataset
        else:
            raise ValueError(f"Split type {split} not recognized as one of test or train.")

    # if apply_chat_template is True, preprocess the datasets with the preprocess function
    if apply_chat_template:
        raw_datasets = raw_datasets.map(
            preprocess,
            batched=True,         # process samples in batches for efficiency
            remove_columns=raw_datasets["train"].column_names,
            # TODO added code: disable the cache, which makes debugging easier
            load_from_cache_file = False
        )

    train_data = raw_datasets["train"]  # the training dataset
    valid_data = raw_datasets["test"]   # the test dataset

    # TODO print from the main process only
    if training_args.local_rank == 0 or training_args.local_rank == -1:
        print(f"Size of the train set: {len(train_data)}. Size of the validation set: {len(valid_data)}")  # dataset sizes
        print(f"A sample of train dataset: {train_data[0]}")  # first sample of the training set

    return train_data, valid_data


# create_and_prepare_model builds and prepares the model
# Arguments:
#   args: model-related configuration, e.g. model name and whether quantization is used
#   data_args: data-related configuration, e.g. maximum sequence length
#   training_args: training-related configuration, e.g. whether gradient checkpointing is used
def create_and_prepare_model(args, data_args, training_args):
    if args.use_unsloth:
        # if the Unsloth library (a library for speeding up LLM training) is used, import FastLanguageModel
        from unsloth import FastLanguageModel
    bnb_config = None    # BitsAndBytesConfig for quantization, initialized to None
    quant_storage_dtype = None   # quantization storage dtype, initialized to None

    # if this is distributed training and Unsloth is requested, raise NotImplementedError,
    # because the current version of Unsloth does not support distributed training
    if (
        torch.distributed.is_available()
        and torch.distributed.is_initialized()
        and torch.distributed.get_world_size() > 1
        and args.use_unsloth
    ):
        raise NotImplementedError("Unsloth is not supported in distributed training")

    # if 4-bit quantization is used, set the compute dtype and the quantization storage dtype
    if args.use_4bit_quantization:
        # resolve the compute dtype: getattr turns the string "bfloat16" into torch.bfloat16
        compute_dtype = getattr(torch, args.bnb_4bit_compute_dtype)
        # resolve the quantization storage dtype: getattr turns the string "bfloat16" into torch.bfloat16
        quant_storage_dtype = getattr(torch, args.bnb_4bit_quant_storage_dtype)

        # build the BitsAndBytesConfig that holds the quantization settings
        # BitsAndBytesConfig manages the quantization config: quantization type, compute dtype, storage dtype, etc.
        bnb_config = BitsAndBytesConfig(
            load_in_4bit=args.use_4bit_quantization,          # whether to load in 4-bit
            bnb_4bit_quant_type=args.bnb_4bit_quant_type,     # 4-bit quantization type, e.g. nf4
            bnb_4bit_compute_dtype=compute_dtype,             # compute dtype
            bnb_4bit_use_double_quant=args.use_nested_quant,  # whether to use double (nested) quantization
            # TODO code changed for QLoRA + ZeRO-3
            bnb_4bit_quant_storage=quant_storage_dtype,       # quantization storage dtype
        )

        # if the compute dtype is float16 and 4-bit quantization is used, print a hint when the GPU supports bfloat16
        if compute_dtype == torch.float16 and args.use_4bit_quantization:
            major, _ = torch.cuda.get_device_capability()
            if major >= 8:
                print("=" * 80)
                print("Your GPU supports bfloat16, you can accelerate training with the argument --bf16")
                print("=" * 80)
        # if 8-bit quantization is used instead, build the corresponding BitsAndBytesConfig
        elif args.use_8bit_quantization:
            bnb_config = BitsAndBytesConfig(load_in_8bit=args.use_8bit_quantization)

    # if the Unsloth library is used
    if args.use_unsloth:
        # Load model: use FastLanguageModel.from_pretrained, passing the model path, max sequence length and whether to load in 4-bit
        model, _ = FastLanguageModel.from_pretrained(
            model_name=args.model_name_or_path,
            max_seq_length=data_args.max_seq_length,
            dtype=None,
            load_in_4bit=args.use_4bit_quantization,
        )
    else: # if Unsloth is not used, load the model with AutoModelForCausalLM.from_pretrained
        # TODO code changed for QLoRA + ZeRO-3
        # if quant_storage_dtype is set and is a floating-point type, use it; otherwise fall back to torch.float32
        torch_dtype = (
            quant_storage_dtype if quant_storage_dtype and quant_storage_dtype.is_floating_point else torch.float32
        )
        # load the language model with AutoModelForCausalLM.from_pretrained, passing the model path, quantization config, trust_remote_code, attention implementation and dtype
        model = AutoModelForCausalLM.from_pretrained(
            args.model_name_or_path,
            quantization_config=bnb_config,
            trust_remote_code=True,
            # attention implementation: flash_attention_2 or eager
            attn_implementation="flash_attention_2" if args.use_flash_attn else "eager",
            # TODO code changed for QLoRA + ZeRO-3
                # Note: torch_dtype for AutoModelForCausalLM must be the same dtype as bnb_4bit_quant_storage. That's it; everything else is handled by the Trainer and TRL.
            torch_dtype=torch_dtype,
        )

    peft_config = None      # PEFT config, initialized to None
    chat_template = None    # chat template, initialized to None
    # if PEFT LoRA is enabled and Unsloth is not used, build a LoraConfig
    # PEFT (Parameter-Efficient Fine-Tuning) fine-tunes only a small subset of parameters while keeping most of the model frozen
    # LoRA (Low-Rank Adaptation) is one PEFT method: it adds low-rank matrices to adapt the model to a new task
    if args.use_peft_lora and not args.use_unsloth:
        peft_config = LoraConfig(
            lora_alpha=args.lora_alpha,         # LoRA alpha, scales the contribution of the LoRA layers
            lora_dropout=args.lora_dropout,
            r=args.lora_r,
            bias="none",                       # whether to apply LoRA to bias terms
            task_type="CAUSAL_LM",             # task type, here causal language modeling
            target_modules=args.lora_target_modules.split(",")
            if args.lora_target_modules != "all-linear"
            else args.lora_target_modules,
        )
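    # Illustrative parameter count (not part of the original script): with r=8 on a 4096x4096 q_proj,
    # LoRA adds lora_A (8 x 4096) + lora_B (4096 x 8) = 65,536 trainable parameters, versus 16,777,216
    # frozen base weights in that layer. Applied to all linear layers of llama2-7b ("all-linear"),
    # this comes to roughly 20M trainable parameters in total.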

    special_tokens = None   # special tokens, initialized to None
    chat_template = None    # chat template, initialized to None
    # choose the special tokens and chat template according to args.chat_template_format
    if args.chat_template_format == "chatml":
        special_tokens = ChatmlSpecialTokens              # ChatML special tokens
        chat_template = DEFAULT_CHATML_CHAT_TEMPLATE      # ChatML chat template
    elif args.chat_template_format == "zephyr":
        special_tokens = ZephyrSpecialTokens            # Zephyr special tokens
        chat_template = DEFAULT_ZEPHYR_CHAT_TEMPLATE    # Zephyr chat template

    # if special tokens were selected
    if special_tokens is not None:
        # load the tokenizer with AutoTokenizer.from_pretrained,
        # setting the pad, bos, eos and additional special tokens
        tokenizer = AutoTokenizer.from_pretrained(
            args.model_name_or_path,
            pad_token=special_tokens.pad_token.value,     # padding token
            bos_token=special_tokens.bos_token.value,     # beginning-of-sequence token
            eos_token=special_tokens.eos_token.value,     # end-of-sequence token
            additional_special_tokens=special_tokens.list(),  # the remaining special tokens
            trust_remote_code=True,
        )
        tokenizer.chat_template = chat_template           # set the chat template
        # make embedding resizing configurable?
        # resize the model's token embeddings so they can hold the newly added special tokens
        # pad_to_multiple_of=8 pads the embedding size for alignment, which improves GPU efficiency
        model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=8)
    else:
        # if no special tokens were selected, load the tokenizer directly
        tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path, trust_remote_code=True)
        tokenizer.pad_token = tokenizer.eos_token     # use the eos token as the padding token


    # if the Unsloth library is used
    if args.use_unsloth:
        # Do model patching and add fast LoRA weights
        # FastLanguageModel.get_peft_model patches the model and adds fast LoRA weights,
        # taking the LoRA hyperparameters (alpha, dropout, rank, ...) plus gradient checkpointing, seed and max sequence length
        model = FastLanguageModel.get_peft_model(
            model,
            lora_alpha=args.lora_alpha,
            lora_dropout=args.lora_dropout,
            r=args.lora_r,
            target_modules=args.lora_target_modules.split(",")
            if args.lora_target_modules != "all-linear"
            else args.lora_target_modules,
            use_gradient_checkpointing=training_args.gradient_checkpointing,
            random_state=training_args.seed,
            max_seq_length=data_args.max_seq_length,
        )

    return model, peft_config, tokenizer       # return the model, PEFT config and tokenizer

  2.3 train.py: annotations and tweaks

import os
import sys
import torch
from dataclasses import dataclass, field
from typing import Optional

import torch.distributed
from transformers import HfArgumentParser, TrainingArguments, set_seed, Seq2SeqTrainingArguments
from trl import SFTTrainer    # SFTTrainer is TRL's trainer for supervised fine-tuning (SFT) of language models
from utils import create_and_prepare_model, create_datasets  # helper functions from utils.py for building the model and datasets

# TODO added code: wandb conflicts with the bash redirect to log.out, so disable it
os.environ["WANDB_DISABLED"] = "true" # disable wandb

# Define and parse arguments. ModelArguments holds the model-related arguments
@dataclass
class ModelArguments:
    """
    Arguments pertaining to which model/config/tokenizer we are going to fine-tune from.
    """
    # path to the pretrained model, or its identifier on huggingface.co/models; any causal LM checkpoint (e.g. GPT-2, Llama 2) can be used
    model_name_or_path: str = field(
        metadata={"help": "Path to pretrained model or model identifier from huggingface.co/models"}
    )
    # format of the chat data; options:
    # 1) chatml: the ChatML format, i.e. <|im_start|>user ... <|im_end|> style markup (see DEFAULT_CHATML_CHAT_TEMPLATE in utils.py)
    # 2) zephyr: the Zephyr format, i.e. <|user|> / <|assistant|> style markup (see DEFAULT_ZEPHYR_CHAT_TEMPLATE in utils.py)
    # 3) none: set to none if the dataset is already formatted with a chat template
    # this lets you handle chat data in different formats flexibly
    chat_template_format: Optional[str] = field(
        default="none",
        metadata={
            "help": "chatml|zephyr|none. Pass `none` if the dataset is already formatted with the chat template."
        },
    )
    lora_alpha: Optional[int] = field(default=16)    # lora_alpha scales the LoRA update; typical values are 16 or 32
    lora_dropout: Optional[float] = field(default=0.1)  # dropout rate applied inside the LoRA layers, helps prevent overfitting
    # lora_r is the rank of the LoRA low-rank matrices; a lower rank means fewer extra parameters but may hurt quality
    lora_r: Optional[int] = field(default=64)
    # lora_target_modules lists the modules LoRA is applied to
    # the default covers the attention projections (q_proj, k_proj, v_proj, o_proj) and the feed-forward layers (down_proj, up_proj, gate_proj)
    # it can also be set to "all-linear" to apply LoRA to every linear layer
    # choosing the target modules selectively trades off quality against the number of trainable parameters
    lora_target_modules: Optional[str] = field(
        default="q_proj,k_proj,v_proj,o_proj,down_proj,up_proj,gate_proj",
        metadata={"help": "comma separated list of target modules to apply LoRA layers to"},
    )
    # use_nested_quant enables nested (double) quantization: the quantization constants of the 4-bit model are quantized again,
    # which further reduces memory at a small potential cost in accuracy
    use_nested_quant: Optional[bool] = field(
        default=False,
        metadata={"help": "Activate nested quantization for 4bit base models"},
    )
    # bnb_4bit_compute_dtype is the compute dtype of the 4-bit model, e.g. float16 or bfloat16; lower precision is faster but may affect accuracy
    bnb_4bit_compute_dtype: Optional[str] = field(
        default="float16",
        metadata={"help": "Compute dtype for 4bit base models"},
    )
    # bnb_4bit_quant_storage_dtype is the storage dtype of the 4-bit weights, e.g. uint8, float16 or bfloat16;
    # lower-precision storage shrinks the model in memory, trading off against accuracy
    bnb_4bit_quant_storage_dtype: Optional[str] = field(
        default="uint8",
        metadata={"help": "Quantization storage dtype for 4bit base models"},
    )
    # bnb_4bit_quant_type is the 4-bit quantization type: fp4 or nf4 (NormalFloat4, a newer data format that usually works better in practice)
    bnb_4bit_quant_type: Optional[str] = field(
        default="nf4",
        metadata={"help": "Quantization type fp4 or nf4"},
    )
    # use_flash_attn enables Flash Attention,
    # an efficient attention implementation that speeds up training through better memory access patterns and parallelism
    use_flash_attn: Optional[bool] = field(
        default=False,
        metadata={"help": "Enables Flash attention for training."},
    )
    # use_peft_lora enables PEFT (Parameter-Efficient Fine-Tuning) with LoRA
    use_peft_lora: Optional[bool] = field(
        default=False,
        metadata={"help": "Enables PEFT LoRA for training."},
    )
    # use_8bit_quantization loads the model in 8-bit
    use_8bit_quantization: Optional[bool] = field(
        default=False,
        metadata={"help": "Enables loading model in 8bit."},
    )
    # use_4bit_quantization loads the model in 4-bit, shrinking the weights to roughly 1/4 of their fp16 size to save memory, at a possible cost in accuracy
    use_4bit_quantization: Optional[bool] = field(
        default=False,
        metadata={"help": "Enables loading model in 4bit."},
    )
    # use_reentrant is a gradient-checkpointing option; gradient checkpointing saves memory by recomputing activations at the cost of extra compute
    # it selects the reentrant checkpointing implementation, which can save a bit more memory; use it to trade memory against compute
    use_reentrant: Optional[bool] = field(
        default=False,
        metadata={"help": "Gradient Checkpointing param. Refer the related docs"},
    )
    # use_unsloth enables training with the Unsloth library,
    # an optimization library that speeds up PEFT LoRA training via memory optimizations and parallel computation
    # it can further improve training efficiency
    use_unsloth: Optional[bool] = field(
        default=False,
        metadata={"help": "Enables UnSloth for training."},
    )


# DataTrainingArguments holds the dataset and data-processing arguments
@dataclass
class DataTrainingArguments:
    # name or path of the dataset to use; defaults to the OpenAssistant Guanaco dataset
    dataset_name: Optional[str] = field(
        default="timdettmers/openassistant-guanaco",
        metadata={"help": "The preference dataset to use."},
    )

    # packing enables dataset packing: multiple samples are packed into one longer sequence,
    # which improves training efficiency; use it to trade training speed against memory
    packing: Optional[bool] = field(
        default=False,
        metadata={"help": "Use packing dataset creating."},
    )

    # dataset_text_field is the dataset field used as the input text, which makes it easy to handle datasets with different schemas
    dataset_text_field: str = field(default="text", metadata={"help": "Dataset field to use as input text."})
    # max_seq_length is the maximum input sequence length; longer inputs are truncated; it trades off speed, memory and quality
    max_seq_length: Optional[int] = field(default=512)

    # append_concat_token: when packing, whether to append a concatenation token (eos_token_id) at the end of each sample; this controls the packed dataset format
    append_concat_token: Optional[bool] = field(
        default=False,
        metadata={"help": "If True, appends `eos_token_id` at the end of each sample being packed."},
    )

    # add_special_tokens: when packing, whether the tokenizer adds special tokens (e.g. <bos> and <eos>) to each sample; this also controls the packed dataset format
    add_special_tokens: Optional[bool] = field(
        default=False,
        metadata={"help": "If True, tokenizers adds special tokens to each sample being packed."},
    )
    # splits lists the dataset splits to use (e.g. train, test, val), comma separated, so different parts of the dataset can be used for training and evaluation
    splits: Optional[str] = field(
        default="train,test",
        metadata={"help": "Comma separate list of the splits to use from the dataset."},
    )

# TODO added code: print each model parameter's name, dtype, shape, device and whether it is trainable
def print_model_allarguments_name_dtype(model):
    for n,v in model.named_parameters():
        if v.requires_grad:
            print(f"trainable model arguments: {n} - {v.dtype} - {v.shape} - {v.device}")
        else:
            print(f"not trainable model arguments: {n} - {v.dtype} - {v.shape} - {v.device}")


def main(model_args, data_args, training_args):
    # Set seed for reproducibility
    set_seed(training_args.seed) # set the random seed so the run is reproducible

    # model: create_and_prepare_model builds the model, PEFT config and tokenizer from the arguments
    model, peft_config, tokenizer = create_and_prepare_model(model_args, data_args, training_args)

    # gradient ckpt
    # configure the KV cache and gradient checkpointing: the cache speeds up attention but uses more memory,
    # while gradient checkpointing saves memory at the cost of extra compute; when Unsloth is used, gradient checkpointing is not needed here
    model.config.use_cache = not training_args.gradient_checkpointing
    training_args.gradient_checkpointing = training_args.gradient_checkpointing and not model_args.use_unsloth
    if training_args.gradient_checkpointing:
        training_args.gradient_checkpointing_kwargs = {"use_reentrant": model_args.use_reentrant}

    # datasets
    # create_datasets builds the train and eval datasets from the arguments; apply_chat_template controls whether the chat template is applied
    train_dataset, eval_dataset = create_datasets(
        tokenizer,
        data_args,
        training_args,
        apply_chat_template=model_args.chat_template_format != "none",
    )

    # TODO added code for easier debugging: check whether distributed training is active
    if (torch.distributed.is_available() and torch.distributed.is_initialized()):
        torch.distributed.barrier()  # synchronize: block until all processes reach this point

    # trainer
    # build the SFTTrainer used for supervised fine-tuning of the language model,
    # passing the model, tokenizer, training args, train/eval datasets, PEFT config, etc.
    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        peft_config=peft_config,
        packing=data_args.packing,
        dataset_kwargs={
            "append_concat_token": data_args.append_concat_token,
            "add_special_tokens": data_args.add_special_tokens,
        },
        dataset_text_field=data_args.dataset_text_field,
        # maximum sequence length used by the trainer
        max_seq_length=data_args.max_seq_length,    
    )
    # TODO added code: only the main process prints the model info and trainable parameters, which helps in understanding the model structure
    # SFTTrainer freezes both the quantized and the non-quantized parameters of the base model, whereas PEFT's get_peft_model only freezes the quantized layers
    if training_args.local_rank == 0 or training_args.local_rank == -1:
        print("---> model layers")
        print_model_allarguments_name_dtype(model = trainer.model)     # note: use trainer.model here
        print(f"---> Training/evaluation parameters:\n{training_args}")
        print(f"---> Model parameters:\n{model_args}")
        print(f"---> Datas parameters:\n{data_args}")
        print(f"---> model config:\n{trainer.model.config}")
        print(f"---> PEFT config:\n{peft_config}")

    trainer.accelerator.print(f"{trainer.model}")
    trainer.model.print_trainable_parameters()
    
    # train
    # if a checkpoint path was given, resume training from that checkpoint,
    # which lets you continue training on top of a previous run
    checkpoint = None
    if training_args.resume_from_checkpoint is not None:
        checkpoint = training_args.resume_from_checkpoint
    trainer.train(resume_from_checkpoint=checkpoint)

    # saving final model
    # if FSDP (Fully Sharded Data Parallelism) is enabled, set the state-dict type first
    # FSDP is a distributed training technique that shards the model across multiple GPUs
    if trainer.is_fsdp_enabled:
        trainer.accelerator.state.fsdp_plugin.set_state_dict_type("FULL_STATE_DICT")
    # save the trained model
    # this writes the trained model to disk for later use
    trainer.save_model()


if __name__ == "__main__":
    parser = HfArgumentParser((ModelArguments, DataTrainingArguments, TrainingArguments))
    if len(sys.argv) == 2 and sys.argv[1].endswith(".json"):
        # If we pass only one argument to the script and it's the path to a json file,
        # let's parse it to get our arguments.
        model_args, data_args, training_args = parser.parse_json_file(json_file=os.path.abspath(sys.argv[1]))
    else:
        model_args, data_args, training_args = parser.parse_args_into_dataclasses()

    # TODO added code for easier debugging: report whether distributed training is active
    if (torch.distributed.is_available() and torch.distributed.is_initialized()):
        print(f"---> Torch distributed enable, Torch distributed initialized, This local rank is: {training_args.local_rank}, Word_size: {torch.torch.distributed.get_world_size()}")

    main(model_args, data_args, training_args)

  2.4 Model / arguments

    2.4.1 The quantized model
      2.4.1.1 Quantized model structure
PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): LlamaForCausalLM(
      (model): LlamaModel(
        (embed_tokens): Embedding(32008, 4096)
        (layers): ModuleList(
          (0-31): 32 x LlamaDecoderLayer(
            (self_attn): LlamaFlashAttention2(
              (q_proj): lora.Linear4bit(
                (base_layer): Linear4bit(in_features=4096, out_features=4096, bias=False)
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=4096, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=4096, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (k_proj): lora.Linear4bit(
                (base_layer): Linear4bit(in_features=4096, out_features=4096, bias=False)
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=4096, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=4096, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (v_proj): lora.Linear4bit(
                (base_layer): Linear4bit(in_features=4096, out_features=4096, bias=False)
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=4096, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=4096, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (o_proj): lora.Linear4bit(
                (base_layer): Linear4bit(in_features=4096, out_features=4096, bias=False)
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=4096, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=4096, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (rotary_emb): LlamaRotaryEmbedding()
            )
            (mlp): LlamaMLP(
              (gate_proj): lora.Linear4bit(
                (base_layer): Linear4bit(in_features=4096, out_features=11008, bias=False)
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=4096, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=11008, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (up_proj): lora.Linear4bit(
                (base_layer): Linear4bit(in_features=4096, out_features=11008, bias=False)
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=4096, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=11008, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (down_proj): lora.Linear4bit(
                (base_layer): Linear4bit(in_features=11008, out_features=4096, bias=False)
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=11008, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=4096, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (act_fn): SiLU()
            )
            (input_layernorm): LlamaRMSNorm()
            (post_attention_layernorm): LlamaRMSNorm()
          )
        )
        (norm): LlamaRMSNorm()
      )
      (lm_head): Linear(in_features=4096, out_features=32008, bias=False)
    )
  )
)
      2.4.1.2 Quantized model layers

Note: each frozen base_layer weight below is a packed 4-bit tensor. A 4096x4096 linear layer has 16,777,216 4-bit values = 8,388,608 bytes, which appears as torch.Size([4194304, 1]) because the quant storage dtype is bfloat16 (2 bytes per element); likewise a 4096x11008 layer appears as torch.Size([11272192, 1]).
---> model layers
not trainable model arguments: base_model.model.model.embed_tokens.weight - torch.bfloat16 - torch.Size([32008, 4096])
not trainable model arguments: base_model.model.model.layers.0.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.0.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.0.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.0.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.0.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.0.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.0.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.0.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.0.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.0.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.0.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.0.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.0.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.0.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.0.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.0.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.0.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.0.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.0.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.0.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.0.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.0.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.1.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.1.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.1.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.1.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.1.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.1.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.1.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.1.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.1.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.1.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.1.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.1.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.1.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.1.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.1.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.1.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.1.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.1.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.1.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.1.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.1.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.1.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.1.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.2.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.2.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.2.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.2.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.2.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.2.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.2.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.2.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.2.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.2.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.2.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.2.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.2.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.2.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.2.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.2.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.2.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.2.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.2.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.2.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.2.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.2.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.2.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.3.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.3.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.3.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.3.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.3.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.3.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.3.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.3.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.3.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.3.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.3.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.3.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.3.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.3.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.3.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.3.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.3.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.3.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.3.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.3.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.3.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.3.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.3.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.4.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.4.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.4.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.4.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.4.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.4.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.4.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.4.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.4.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.4.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.4.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.4.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.4.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.4.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.4.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.4.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.4.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.4.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.4.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.4.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.4.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.4.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.4.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.5.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.5.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.5.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.5.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.5.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.5.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.5.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.5.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.5.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.5.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.5.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.5.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.5.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.5.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.5.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.5.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.5.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.5.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.5.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.5.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.5.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.5.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.5.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.6.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.6.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.6.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.6.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.6.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.6.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.6.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.6.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.6.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.6.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.6.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.6.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.6.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.6.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.6.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.6.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.6.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.6.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.6.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.6.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.6.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.6.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.6.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.7.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.7.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.7.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.7.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.7.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.7.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.7.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.7.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.7.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.7.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.7.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.7.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.7.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.7.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.7.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.7.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.7.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.7.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.7.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.7.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.7.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.7.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.7.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.8.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.8.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.8.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.8.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.8.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.8.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.8.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.8.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.8.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.8.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.8.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.8.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.8.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.8.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.8.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.8.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.8.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.8.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.8.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.8.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.8.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.8.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.8.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.9.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.9.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.9.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.9.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.9.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.9.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.9.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.9.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.9.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.9.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.9.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.9.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.9.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.9.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.9.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.9.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.9.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.9.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.9.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.9.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.9.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.9.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.9.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.10.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.10.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.10.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.10.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.10.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.10.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.10.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.10.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.10.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.10.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.10.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.10.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.10.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.10.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.10.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.10.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.10.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.10.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.10.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.10.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.10.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.10.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.10.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.11.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.11.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.11.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.11.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.11.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.11.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.11.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.11.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.11.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.11.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.11.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.11.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.11.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.11.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.11.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.11.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.11.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.11.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.11.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.11.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.11.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.11.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.11.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.12.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.12.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.12.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.12.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.12.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.12.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.12.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.12.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.12.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.12.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.12.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.12.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.12.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.12.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.12.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.12.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.12.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.12.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.12.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.12.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.12.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.12.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.12.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.13.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.13.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.13.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.13.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.13.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.13.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.13.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.13.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.13.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.13.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.13.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.13.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.13.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.13.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.13.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.13.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.13.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.13.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.13.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.13.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.13.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.13.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.13.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.14.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.14.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.14.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.14.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.14.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.14.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.14.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.14.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.14.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.14.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.14.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.14.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.14.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.14.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.14.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.14.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.14.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.14.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.14.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.14.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.14.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.14.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.14.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.15.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.15.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.15.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.15.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.15.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.15.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.15.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.15.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.15.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.15.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.15.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.15.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.15.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.15.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.15.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.15.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.15.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.15.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.15.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.15.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.15.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.15.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.15.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.16.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.16.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.16.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.16.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.16.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.16.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.16.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.16.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.16.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.16.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.16.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.16.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.16.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.16.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.16.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.16.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.16.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.16.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.16.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.16.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.16.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.16.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.16.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.17.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.17.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.17.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.17.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.17.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.17.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.17.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.17.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.17.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.17.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.17.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.17.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.17.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.17.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.17.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.17.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.17.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.17.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.17.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.17.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.17.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.17.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.17.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.18.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.18.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.18.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.18.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.18.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.18.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.18.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.18.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.18.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.18.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.18.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.18.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.18.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.18.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.18.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.18.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.18.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.18.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.18.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.18.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.18.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.18.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.18.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.19.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.19.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.19.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.19.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.19.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.19.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.19.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.19.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.19.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.19.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.19.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.19.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.19.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.19.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.19.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.19.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.19.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.19.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.19.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.19.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.19.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.19.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.19.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.20.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.20.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.20.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.20.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.20.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.20.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.20.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.20.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.20.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.20.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.20.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.20.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.20.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.20.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.20.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.20.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.20.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.20.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.20.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.20.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.20.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.20.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.20.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.21.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.21.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.21.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.21.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.21.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.21.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.21.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.21.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.21.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.21.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.21.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.21.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.21.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.21.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.21.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.21.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.21.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.21.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.21.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.21.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.21.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.21.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.21.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.22.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.22.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.22.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.22.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.22.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.22.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.22.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.22.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.22.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.22.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.22.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.22.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.22.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.22.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.22.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.22.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.22.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.22.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.22.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.22.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.22.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.22.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.22.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.23.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.23.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.23.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.23.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.23.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.23.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.23.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.23.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.23.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.23.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.23.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.23.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.23.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.23.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.23.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.23.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.23.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.23.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.23.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.23.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.23.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.23.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.23.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.24.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.24.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.24.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.24.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.24.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.24.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.24.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.24.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.24.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.24.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.24.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.24.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.24.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.24.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.24.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.24.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.24.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.24.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.24.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.24.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.24.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.24.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.24.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.25.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.25.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.25.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.25.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.25.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.25.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.25.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.25.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.25.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.25.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.25.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.25.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.25.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.25.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.25.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.25.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.25.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.25.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.25.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.25.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.25.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.25.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.25.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.26.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.26.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.26.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.26.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.26.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.26.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.26.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.26.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.26.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.26.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.26.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.26.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.26.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.26.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.26.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.26.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.26.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.26.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.26.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.26.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.26.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.26.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.26.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.27.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.27.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.27.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.27.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.27.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.27.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.27.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.27.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.27.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.27.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.27.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.27.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.27.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.27.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.27.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.27.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.27.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.27.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.27.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.27.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.27.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.27.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.27.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.28.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.28.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.28.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.28.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.28.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.28.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.28.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.28.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.28.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.28.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.28.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.28.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.28.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.28.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.28.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.28.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.28.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.28.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.28.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.28.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.28.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.28.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.28.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.29.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.29.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.29.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.29.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.29.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.29.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.29.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.29.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.29.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.29.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.29.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.29.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.29.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.29.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.29.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.29.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.29.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.29.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.29.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.29.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.29.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.29.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.29.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.30.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.30.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.30.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.30.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.30.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.30.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.30.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.30.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.30.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.30.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.30.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.30.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.30.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.30.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.30.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.30.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.30.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.30.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.30.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.30.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.30.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.30.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.30.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.31.self_attn.q_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.31.self_attn.q_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.31.self_attn.q_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.31.self_attn.k_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.31.self_attn.k_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.31.self_attn.k_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.31.self_attn.v_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.31.self_attn.v_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.31.self_attn.v_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.31.self_attn.o_proj.base_layer.weight - torch.bfloat16 - torch.Size([4194304, 1])
trainable model arguments: base_model.model.model.layers.31.self_attn.o_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.31.self_attn.o_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.31.mlp.gate_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.31.mlp.gate_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.31.mlp.gate_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.31.mlp.up_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.31.mlp.up_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 4096])
trainable model arguments: base_model.model.model.layers.31.mlp.up_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([11008, 8])
not trainable model arguments: base_model.model.model.layers.31.mlp.down_proj.base_layer.weight - torch.bfloat16 - torch.Size([11272192, 1])
trainable model arguments: base_model.model.model.layers.31.mlp.down_proj.lora_A.default.weight - torch.bfloat16 - torch.Size([8, 11008])
trainable model arguments: base_model.model.model.layers.31.mlp.down_proj.lora_B.default.weight - torch.bfloat16 - torch.Size([4096, 8])
not trainable model arguments: base_model.model.model.layers.31.input_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.layers.31.post_attention_layernorm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.model.norm.weight - torch.bfloat16 - torch.Size([4096])
not trainable model arguments: base_model.model.lm_head.weight - torch.bfloat16 - torch.Size([32008, 4096])
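A listing like the one above can be produced with a short loop over named_parameters. The sketch below is an illustration under that assumption (the helper name print_model_parameters is hypothetical), not the repository's exact logging code:

def print_model_parameters(model):
    """Print trainability, dtype and shape of every parameter, plus a summary line."""
    trainable, total = 0, 0
    for name, param in model.named_parameters():
        total += param.numel()
        if param.requires_grad:
            trainable += param.numel()
            print(f"trainable model arguments: {name} - {param.dtype} - {param.shape}")
        else:
            print(f"not trainable model arguments: {name} - {param.dtype} - {param.shape}")
    print(f"trainable params: {trainable} || all params: {total} || trainable%: {100 * trainable / total:.4f}")

# Usage: print_model_parameters(trainer.model)
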
    2.4.2、参数
     2.4.2.1 training args
Training/evaluation parameters TrainingArguments(
_n_gpu=1,
accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'gradient_accumulation_kwargs': None},
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=True,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_persistent_workers=False,
dataloader_pin_memory=True,
dataloader_prefetch_factor=None,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
dispatch_batches=None,
do_eval=True,
do_predict=False,
do_train=False,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=epoch,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
gradient_accumulation_steps=4,
gradient_checkpointing=True,
gradient_checkpointing_kwargs={'use_reentrant': True},
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_always_push=False,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=every_save,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
include_inputs_for_metrics=False,
include_num_input_tokens_seen=False,
include_tokens_per_second=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=0.0001,
length_column_name=length,
load_best_model_at_end=False,
local_rank=0,
log_level=info,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=/workspace/output/llama-sft-qlora-dsz3/runs/Apr12_05-25-13_afa6d91ea8f6,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=5,
logging_strategy=steps,
lr_scheduler_kwargs={},
lr_scheduler_type=cosine,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
neftune_noise_alpha=None,
no_cuda=False,
num_train_epochs=2.0,
optim=adamw_torch,
optim_args=None,
optim_target_modules=None,
output_dir=/workspace/output/llama-sft-qlora-dsz3,
overwrite_output_dir=False,
past_index=-1,
per_device_eval_batch_size=2,
per_device_train_batch_size=1,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=['tensorboard'],
resume_from_checkpoint=/workspace/output/llama-sft-qlora-dsz3/checkpoint-100,
run_name=/workspace/output/llama-sft-qlora-dsz3,
save_on_each_node=False,
save_only_model=False,
save_safetensors=True,
save_steps=100,
save_strategy=steps,
save_total_limit=10,
seed=100,
skip_memory_metrics=True,
split_batches=None,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_cpu=False,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0001,
)
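For reference, the non-default values in the dump above roughly correspond to a TrainingArguments constructed as below; this is a readability-oriented reconstruction, not the project's actual argument parsing. Note that deepspeed=None in the dump is consistent with the ZeRO-3 config being supplied through the accelerate/deepspeed launcher rather than through TrainingArguments.

from transformers import TrainingArguments

# Minimal sketch (assumption): only the key non-default fields from the dump above are shown.
training_args = TrainingArguments(
    output_dir="/workspace/output/llama-sft-qlora-dsz3",
    bf16=True,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    weight_decay=1e-4,
    num_train_epochs=2,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": True},
    evaluation_strategy="epoch",
    save_strategy="steps",
    save_steps=100,
    save_total_limit=10,
    logging_steps=5,
    log_level="info",
    report_to=["tensorboard"],
    seed=100,
)
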
     2.4.2.2 peft args
PEFT parameters LoraConfig(peft_type=<PeftType.LORA: 'LORA'>, auto_mapping=None, base_model_name_or_path='/workspace/Llama-2-7b-chat-hf', revision=None, task_type='CAUSAL_LM', inference_mode=False, r=8, target_modules={'q_proj', 'down_proj', 'k_proj', 'v_proj', 'up_proj', 'o_proj', 'gate_proj'}, lora_alpha=16, lora_dropout=0.1, fan_in_fan_out=False, bias='none', use_rslora=False, modules_to_save=None, init_lora_weights=True, layers_to_transform=None, layers_pattern=None, rank_pattern={}, alpha_pattern={}, megatron_config=None, megatron_core='megatron.core', loftq_config={}, use_dora=False, layer_replication=None)
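The printed LoraConfig corresponds to roughly the following construction (a sketch; values are taken from the dump above, everything else is left at its default):

from peft import LoraConfig

peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
)
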
     2.4.2.3 model args
model parameters LlamaConfig {
  "_name_or_path": "/workspace/Llama-2-7b-chat-hf",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "pretraining_tp": 1,
  "quantization_config": {
    "_load_in_4bit": true,
    "_load_in_8bit": false,
    "bnb_4bit_compute_dtype": "bfloat16",
    "bnb_4bit_quant_storage": "bfloat16",# 新参数:qlora + deepspeed zero3 flash atten v2 多卡训练
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": true,
    "llm_int8_enable_fp32_cpu_offload": false,
    "llm_int8_has_fp16_weight": false,
    "llm_int8_skip_modules": null,
    "llm_int8_threshold": 6.0,
    "load_in_4bit": true,
    "load_in_8bit": false,
    "quant_method": "bitsandbytes"
  },
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.40.0.dev0",
  "use_cache": false,
  "vocab_size": 32008
}
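The quantization_config block above can be reproduced with a BitsAndBytesConfig like the sketch below. The important detail for this setup is bnb_4bit_quant_storage: it should match the training dtype (bf16 here) so that DeepSpeed ZeRO-3 / FSDP can shard the 4-bit weights as ordinary bf16 storage tensors. The attn_implementation flag is shown under the assumption that Flash Attention v2 is installed; the model path is the one from the dump.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Minimal sketch (assumption): loads the base model with the same quantization settings as above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    # quant_storage must match the training dtype so ZeRO-3 / FSDP can shard the 4-bit weights.
    bnb_4bit_quant_storage=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "/workspace/Llama-2-7b-chat-hf",
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)
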

三、Trl 库

  3.1、SFTTrainer

The annotated SFTTrainer code below consists of three main parts:

  1. Preparing a non-packed dataset:

When packing=False, the dataset is preprocessed by the _prepare_non_packed_dataloader method. It defines a tokenize function that tokenizes each sample (adding special tokens and truncating to max_seq_length), then applies it to the whole dataset with dataset.map to obtain a tokenized dataset. Padding and batching are handled later by the data collator (DataCollatorForLanguageModeling by default) when the Trainer builds its DataLoader.

  2. Preparing a packed dataset:

When packing=True, the dataset is preprocessed by the _prepare_packed_dataloader method, which wraps it in a ConstantLengthDataset. ConstantLengthDataset is a TRL dataset class that concatenates texts of varying lengths and packs them into fixed-length token sequences, improving memory utilization and training throughput. Creating it requires the text field name (or a formatting function), the maximum sequence length, an estimate of characters per token, and related parameters; the packed dataset is then handed to the Trainer, which builds the DataLoader as usual.

  3. Activating and removing NEFTune noisy embeddings:

The _trl_activate_neftune method activates NEFTune noisy embeddings. NEFTune injects noise into the input embeddings during training, which can improve performance on instruction fine-tuning tasks. The method fetches the model's input-embedding layer and registers a forward hook that, during the forward pass, adds uniform noise scaled by alpha / sqrt(seq_len * hidden_dim) to the embedding output. After training finishes, train() removes the hook and deletes the stored noise scale, restoring the model's original forward behaviour; an illustrative sketch follows right after this list.

Note that NEFTune noise is only applied while the model is in training mode; no noise is added at inference time. The technique is aimed primarily at instruction fine-tuning, and its benefit on other tasks is not well established.
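The sketch below illustrates the idea behind the hook. It mirrors TRL's behaviour conceptually rather than copying its exact implementation, and the alpha value shown is hypothetical.

import torch

# Illustrative forward hook (assumption): adds uniform noise scaled by
# alpha / sqrt(seq_len * hidden_dim) to the embedding output, but only in training mode.
def neftune_hook(module, inputs, output):
    if module.training:
        dims = output.size(1) * output.size(2)               # seq_len * hidden_dim
        mag_norm = module.neftune_noise_alpha / dims ** 0.5
        output = output + torch.zeros_like(output).uniform_(-mag_norm, mag_norm)
    return output

# embeddings = model.get_input_embeddings()
# embeddings.neftune_noise_alpha = 5.0                       # hypothetical alpha
# handle = embeddings.register_forward_hook(neftune_hook)    # activate
# ... trainer.train() ...
# handle.remove(); del embeddings.neftune_noise_alpha        # deactivate / restore
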

Overall, this code implements dataset preprocessing and loading as well as the activation and removal of NEFTune noisy embeddings, providing the plumbing needed for supervised fine-tuning; the detailed inline comments in the annotated source below explain each part. A minimal end-to-end usage sketch comes first.
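The following sketch shows how the pieces from section 2.4.2 fit together. It is a minimal example, not the project's train.py: the JSON files and the "text" column name are hypothetical, and model, training_args and peft_config refer to the loading sketches shown earlier.

from datasets import load_dataset
from transformers import AutoTokenizer
from trl import SFTTrainer

tokenizer = AutoTokenizer.from_pretrained("/workspace/Llama-2-7b-chat-hf")
tokenizer.pad_token = tokenizer.eos_token        # Llama-2 has no pad token by default
tokenizer.padding_side = "right"                 # avoids the half-precision overflow warning below

# Hypothetical JSON files with a "text" column; replace with the project's own data.
ds = load_dataset("json", data_files={"train": "train.json", "test": "eval.json"})

trainer = SFTTrainer(
    model=model,                     # 4-bit base model from the loading sketch above
    args=training_args,              # TrainingArguments sketch above
    peft_config=peft_config,         # LoraConfig sketch above; SFTTrainer calls get_peft_model() internally
    train_dataset=ds["train"],
    eval_dataset=ds["test"],
    tokenizer=tokenizer,
    dataset_text_field="text",       # hypothetical column name
    packing=False,                   # non-packed path: DataCollatorForLanguageModeling is used
    max_seq_length=2048,
    # neftune_noise_alpha=5.0,       # optionally enable NEFTune noisy embeddings
)
trainer.train()
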

# SFTTrainer是一个基于transformers.Trainer的包装类,用于进行监督微调(Supervised Finetuning)训练
# 它提供了一些额外的功能,如自动初始化PEFT模型、创建数据集等
# 通过继承transformers.Trainer,SFTTrainer可以复用其中的许多功能,同时添加了一些针对监督微调的定制化支持
class SFTTrainer(Trainer):
    r"""
    监督微调训练器(SFT Trainer)的类定义。
    这个类是对transformers.Trainer类的包装,继承了其所有的属性和方法。
    当用户传入PeftConfig对象时,该训练器会负责正确地初始化PeftModel。
    
    参数:
        model (Union[`transformers.PreTrainedModel`, `nn.Module`, `str`]):
            要训练的模型,可以是一个预训练的transformers模型(PreTrainedModel)、一个自定义的PyTorch模块(nn.Module)或一个字符串(表示要从Hugging Face缓存或在线下载的预训练模型名称)。
            如果传入了PeftConfig对象,该模型也可以转换为PeftModel(一种用于高效微调的模型结构)。
        args (Optional[`transformers.TrainingArguments`]):
            微调训练的参数配置,包括诸如学习率、批次大小、训练epochs等超参数设置。请参考transformers.TrainingArguments的官方文档以了解更多详细信息。
        data_collator (Optional[`transformers.DataCollator`]):
            用于训练的数据收集器(DataCollator)。DataCollator负责对样本进行padding、batching等操作,以便于输入模型进行训练。如果未指定,将使用默认的DataCollator。
        train_dataset (Optional[`datasets.Dataset`]):
            用于训练的数据集,可以是一个Hugging Face datasets或者PyTorch Dataset。我们建议使用trl.trainer.ConstantLengthDataset创建数据集,这种格式对于序列长度可变的语料来说更加高效。
        eval_dataset (Optional[Union[`datasets.Dataset`, Dict[`str`, `datasets.Dataset`]]]):
            用于评估的数据集,可以是一个单独的datasets.Dataset,也可以是一个将数据集名称映射到对应数据集对象的字典。我们建议使用trl.trainer.ConstantLengthDataset创建数据集。
        tokenizer (Optional[`transformers.PreTrainedTokenizer`]):
            用于训练的分词器(tokenizer),如果未指定,将使用与模型关联的默认分词器。分词器负责将原始文本转换为模型可以理解的token ID序列。
        model_init (`Callable[[], transformers.PreTrainedModel]`):
            用于训练的模型初始化函数,如果未指定,将使用默认的模型初始化函数。该函数应该返回一个预训练的模型实例。
        compute_metrics (`Callable[[transformers.EvalPrediction], Dict]`, *optional* defaults to None):
            用于计算评估指标的函数,它接收一个transformers.EvalPrediction对象作为输入,并返回一个将指标名称映射到指标值的字典。如果未指定,评估过程中只会计算损失(loss)。
        callbacks (`List[transformers.TrainerCallback]`):
            用于训练的回调函数列表。回调函数可以在训练的不同阶段执行自定义操作,如记录日志、保存模型检查点等。
        optimizers (`Tuple[torch.optim.Optimizer, torch.optim.lr_scheduler.LambdaLR]`):
            用于训练的优化器(Optimizer)和学习率调度器(LRScheduler)对象。如果未指定,将使用默认的优化器和学习率调度器。
        preprocess_logits_for_metrics (`Callable[[torch.Tensor, torch.Tensor], torch.Tensor]`):
            用于在计算指标之前预处理模型输出(logits)的函数。该函数接收模型的原始输出(logits)和标签(labels)作为输入,并返回预处理后的logits。
        peft_config (`Optional[PeftConfig]`):
            用于初始化PeftModel的PeftConfig对象。PeftModel是一种高效的微调方法,可以显著减少需要更新的参数数量,从而加快微调速度并节省内存。
        dataset_text_field (`Optional[str]`):
            数据集中文本字段的名称,如果传入,训练器将自动基于该字段创建ConstantLengthDataset。ConstantLengthDataset是一种高效的数据格式,适用于序列长度可变的语料。
        formatting_func (`Optional[Callable]`):
            用于创建ConstantLengthDataset的格式化函数。该函数接收一个样本作为输入,并返回一个经过预处理的字符串列表,用于构建输入序列。如果未指定,将使用默认的格式化函数。
        max_seq_length (`Optional[int]`):
            用于ConstantLengthDataset和自动创建数据集的最大序列长度,默认为512。超过该长度的序列将被截断。
        infinite (`Optional[bool]`):
            是否使用无限数据集,默认为False。如果设置为True,训练将在达到max_steps或max_epochs时停止,而不会因为数据集被遍历完而停止。(此参数已弃用,建议使用TrainingArguments中的max_steps或num_train_epochs参数来控制训练长度)
        num_of_sequences (`Optional[int]`):
            ConstantLengthDataset使用的序列数量,默认为1024。该参数控制了ConstantLengthDataset在内存中缓存的序列数量。
        chars_per_token (`Optional[float]`):
            ConstantLengthDataset使用的每个token的字符数,默认为3.6。该参数用于估计输入序列的长度,以便对序列进行截断和padding操作。您可以在stack-llama示例中查看如何计算该值。
        packing (`Optional[bool]`):
            仅在传入dataset_text_field时使用。如果设置为True,则使用ConstantLengthDataset对数据进行打包,这种格式更加高效,尤其是在处理长序列时。如果设置为False,则使用默认的DataCollatorForLanguageModeling对数据进行处理。
        dataset_num_proc (`Optional[int]`):
            用于标记数据的工作进程数,仅在packing=False时使用,默认为None,即使用主进程进行标记。增加工作进程数量可以加速数据预处理的速度。
        dataset_batch_size (`int`):
            每批标记的示例数量,如果batch_size <= 0或batch_size == None,则将整个数据集标记为单个批次,默认为1000。该参数控制了数据预处理的内存占用和速度,需要根据实际情况进行调整。
        neftune_noise_alpha (`Optional[float]`):
            如果不为None,这将激活NEFTune噪声嵌入。NEFTune是一种噪声注入技术,它通过在输入嵌入中添加噪声,可以提高模型在指令微调任务中的性能。具体细节请参考原论文和代码。
        model_init_kwargs: (`Optional[Dict]`, *optional*):
            实例化模型(从字符串)时传递的可选关键字参数,如指定模型权重文件的本地路径等。
        dataset_kwargs: (`Optional[Dict]`, *optional*):
            创建打包或非打包数据集时传递的可选关键字参数,用于对数据集的构建行为进行更多控制。
        eval_packing: (`Optional[bool]`, *optional*):
            是否也对评估数据集进行打包,如果为None,则默认为packing参数的值。即如果训练数据集使用了打包,评估数据集也会使用打包,反之亦然。
    """
    
    _tag_names = ["trl", "sft"]  # 模型标签名称,用于在推送到Hugging Face Hub时标记模型

    def __init__(
        self,
        model: Optional[Union[PreTrainedModel, nn.Module, str]] = None,
        args: Optional[TrainingArguments] = None,
        data_collator: Optional[DataCollator] = None,  # type: ignore
        train_dataset: Optional[Dataset] = None,
        eval_dataset: Optional[Union[Dataset, Dict[str, Dataset]]] = None,
        tokenizer: Optional[PreTrainedTokenizerBase] = None,
        model_init: Optional[Callable[[], PreTrainedModel]] = None,
        compute_metrics: Optional[Callable[[EvalPrediction], Dict]] = None,
        callbacks: Optional[List[TrainerCallback]] = None,
        optimizers: Tuple[torch.optim.Optimizer, torch.optim.lr_scheduler.LambdaLR] = (None, None),
        preprocess_logits_for_metrics: Optional[Callable[[torch.Tensor, torch.Tensor], torch.Tensor]] = None,
        peft_config: Optional["PeftConfig"] = None,
        dataset_text_field: Optional[str] = None,
        packing: Optional[bool] = False,
        formatting_func: Optional[Callable] = None,
        max_seq_length: Optional[int] = None,
        infinite: Optional[bool] = None,
        num_of_sequences: Optional[int] = 1024,
        chars_per_token: Optional[float] = 3.6,
        dataset_num_proc: Optional[int] = None,
        dataset_batch_size: int = 1000,
        neftune_noise_alpha: Optional[float] = None,
        model_init_kwargs: Optional[Dict] = None,
        dataset_kwargs: Optional[Dict] = None,
        eval_packing: Optional[bool] = None,
    ):
        # 处理model_init_kwargs参数
        if model_init_kwargs is None:
            model_init_kwargs = {}
        elif not isinstance(model, str):
            raise ValueError("You passed model_kwargs to the SFTTrainer. But your model is already instantiated.")

        # 处理infinite参数(已弃用)
        if infinite is not None:
            warnings.warn(
                "The `infinite` argument is deprecated and will be removed in a future version of TRL. Use `TrainingArguments.max_steps` or `TrainingArguments.num_train_epochs` instead to control training length."
            )

        # 如果model是一个字符串,自动创建一个AutoModelForCausalLM或PeftModel
        if isinstance(model, str):
            warnings.warn(
                "You passed a model_id to the SFTTrainer. This will automatically create an "
                "`AutoModelForCausalLM` or a `PeftModel` (if you passed a `peft_config`) for you."
            )
            # 从Hugging Face Hub或本地下载并创建模型实例
            model = AutoModelForCausalLM.from_pretrained(model, **model_init_kwargs)

        # 如果使用了packing,且传入了DataCollatorForCompletionOnlyLM,抛出错误
        if packing and data_collator is not None and isinstance(data_collator, DataCollatorForCompletionOnlyLM):
            raise ValueError(
                "You passed a `DataCollatorForCompletionOnlyLM` to the SFTTrainer. This is not compatible with the `packing` argument."
            )

        # 如果使用了PEFT,检查peft_config是否为PeftConfig对象
        if is_peft_available() and peft_config is not None:
            if not isinstance(peft_config, PeftConfig):
                raise ValueError(
                    "If you want to use the PeftModel, you need to pass a PeftConfig object to the SFTTrainer."
                    f" and you passed a {type(peft_config)}."
                )

            # 如果模型不是PeftModel,则初始化PeftModel
            if not isinstance(model, PeftModel):
                _support_gc_kwargs = hasattr(
                    args, "gradient_checkpointing_kwargs"
                ) and "gradient_checkpointing_kwargs" in list(
                    inspect.signature(prepare_model_for_kbit_training).parameters
                )
                # 获取梯度检查点(gradient checkpointing)相关设置
                gradient_checkpointing_kwargs = getattr(args, "gradient_checkpointing_kwargs", None) or {}
                is_sharded_qlora = False
                # 检查是否使用了QLoRA + FSDP / DS-Zero3
                # QLoRA是一种用于高效微调的技术,FSDP和DS-Zero3是分布式训练的方法
                # 注意:FSDP和DS-Zero3 不要调用prepare_model_for_kbit_training 和 peft_module_casting_to_bf16
                if getattr(model, "is_loaded_in_4bit", False):
                    for _, param in model.named_parameters():
                        if param.__class__.__name__ == "Params4bit":
                            is_sharded_qlora = param.data.device.type == "cpu"
                            break
                # 如果使用了8位或4位量化(除了QLoRA + FSDP / DS-Zero3),则准备模型以支持kbit训练
                if getattr(model, "is_loaded_in_8bit", False) or (
                    getattr(model, "is_loaded_in_4bit", False) and not is_sharded_qlora
                ):
                    prepare_model_kwargs = {
                        "use_gradient_checkpointing": getattr(args, "gradient_checkpointing", False)
                    }

                    if _support_gc_kwargs:
                        prepare_model_kwargs["gradient_checkpointing_kwargs"] = gradient_checkpointing_kwargs

                    # prepare_model_for_kbit_training是一个函数,用于将模型转换为支持kbit训练的格式
                    # kbit训练是一种高效的训练方式,可以减少内存占用和计算量
                    model = prepare_model_for_kbit_training(model, **prepare_model_kwargs)

                    if args is not None:
                        # 在准备好模型后,关闭梯度检查点功能
                        args = dataclasses.replace(args, gradient_checkpointing=False)
                # 如果使用了梯度检查点,但没有指定use_reentrant参数或use_reentrant为True
                # 则需要为输入嵌入层注册一个钩子函数,以确保其梯度可以正确计算
                elif getattr(args, "gradient_checkpointing", False) and (
                    "use_reentrant" not in gradient_checkpointing_kwargs
                    or gradient_checkpointing_kwargs["use_reentrant"]
                ):
                    # 为向后兼容旧版transformers
                    # 这部分代码用于向后兼容性。检查模型对象 model 是否有 enable_input_require_grads 方法。如果有,说明模型支持直接启用输入的梯度计算功能,然后调用此方法。这可能是在更新或者高版本的transformers库中新增的功能,旨在确保模型的输入可以参与梯度计算,这对于某些特定的训练或微调任务很重要。
                    if hasattr(model, "enable_input_require_grads"):
                        model.enable_input_require_grads()
                    else:
                        # 定义一个钩子函数,在前向传播时将输出的requires_grad设置为True
                        # 如果 model 没有 enable_input_require_grads 方法,定义一个名为 make_inputs_require_grad 的函数。这个函数接收三个参数:module、input 和 output,并将输出的 requires_grad 属性设置为 True,确保模型的输出可以计算梯度。这是为了在旧版本的模型或transformers库中手动实现相似的功能
                        def make_inputs_require_grad(module, input, output):
                            output.requires_grad_(True)

                        # 注册钩子函数到输入嵌入层
                        # 调用 model.get_input_embeddings() 获取模型的输入嵌入层,并为之注册一个前向钩子 make_inputs_require_grad。这意味着在模型前向传播时,make_inputs_require_grad 函数会被自动调用,确保嵌入层输出的梯度可以被计算
                        model.get_input_embeddings().register_forward_hook(make_inputs_require_grad)

                # 使用get_peft_model函数将模型转换为PeftModel
                model = get_peft_model(model, peft_config)
                
                # 检查多个条件以确定是否需要将模型转换为BF16数据格式。这几个条件包括:args 对象不为 None、args.bf16 为 True(表示意图使用BF16格式)、模型具有属性 is_loaded_in_4bit 且为 True、is_sharded_qlora 为 False。满足这些条件意味着用户希望将模型转换为BF16数据类型,并且模型是以4比特格式加载的,但不是QLoRA + FSDP / DS-Zero3模型
                if (
                    args is not None
                    and args.bf16
                    and getattr(model, "is_loaded_in_4bit", False)
                    and not is_sharded_qlora
                ):
                    peft_module_casting_to_bf16(model)

        # 如果未传入tokenizer,根据模型自动创建一个
        if tokenizer is None:
            tokenizer = AutoTokenizer.from_pretrained(model.config._name_or_path)
            # 如果tokenizer没有设置pad_token,则使用eos_token作为pad_token
            if getattr(tokenizer, "pad_token", None) is None:
                tokenizer.pad_token = tokenizer.eos_token

        # 如果未传入max_seq_length,设置一个默认值
        if max_seq_length is None:
            # 取tokenizer的最大序列长度和1024中的较小值作为默认max_seq_length
            max_seq_length = min(tokenizer.model_max_length, 1024)
            warnings.warn(
                f"You didn't pass a `max_seq_length` argument to the SFTTrainer, this will default to {max_seq_length}"
            )

        self.dataset_num_proc = dataset_num_proc
        self.dataset_batch_size = dataset_batch_size

        # 检查是否支持neftune_noise_alpha参数
        self._trainer_supports_neftune = hasattr(args, "neftune_noise_alpha")

        # 处理neftune_noise_alpha参数
        if neftune_noise_alpha is not None and self._trainer_supports_neftune:
            args.neftune_noise_alpha = neftune_noise_alpha
            warnings.warn(
                "You passed a `neftune_noise_alpha` argument to the SFTTrainer, the value you passed will override the one in the `TrainingArguments`."
            )
        elif not self._trainer_supports_neftune:
            self.neftune_noise_alpha = neftune_noise_alpha

        # 根据数据集的格式确定合适的formatting_func
        if formatting_func is None and dataset_text_field is None:
            # 如果没有传入formatting_func和dataset_text_field
            # 则尝试从训练数据集中自动推断出合适的格式化函数
            formatting_func = get_formatting_func_from_dataset(train_dataset, tokenizer)

        # 如果不使用packing,检查是否传入了dataset_text_field或formatting_func
        if not packing:
            if dataset_text_field is None and formatting_func is None:
                raise ValueError(
                    "You passed `packing=False` to the SFTTrainer, but you didn't pass a `dataset_text_field` or `formatting_func` argument."
                )

            # 如果没有传入data_collator,则使用默认的DataCollatorForLanguageModeling
            if data_collator is None:
                data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

        # 预处理数据集,只在每个节点的主进程上执行一次,其余进程使用缓存
        with PartialState().local_main_process_first():
            if dataset_kwargs is None:
                dataset_kwargs = {}
            # 预处理训练数据集
            if train_dataset is not None:
                train_dataset = self._prepare_dataset(
                    train_dataset,
                    tokenizer,
                    packing,
                    dataset_text_field,
                    max_seq_length,
                    formatting_func,
                    num_of_sequences,
                    chars_per_token,
                    remove_unused_columns=args.remove_unused_columns if args is not None else True,
                    **dataset_kwargs,
                )
            # 预处理评估数据集
            if eval_dataset is not None:
                _multiple = isinstance(eval_dataset, dict)
                _eval_datasets = eval_dataset if _multiple else {"singleton": eval_dataset}

                eval_packing = packing if eval_packing is None else eval_packing

                for _eval_dataset_name, _eval_dataset in _eval_datasets.items():
                    _eval_datasets[_eval_dataset_name] = self._prepare_dataset(
                        _eval_dataset,
                        tokenizer,
                        eval_packing,
                        dataset_text_field,
                        max_seq_length,
                        formatting_func,
                        num_of_sequences,
                        chars_per_token,
                        remove_unused_columns=args.remove_unused_columns if args is not None else True,
                        **dataset_kwargs,
                    )
                if not _multiple:
                    eval_dataset = _eval_datasets["singleton"]

        # 检查tokenizer的padding_side设置是否为right
        # 如果不是,可能会在使用半精度(fp16)训练时出现溢出问题
        if tokenizer.padding_side is not None and tokenizer.padding_side != "right":
            warnings.warn(
                "You passed a tokenizer with `padding_side` not equal to `right` to the SFTTrainer. This might lead to some unexpected behaviour due to "
                "overflow issues when training a model in half-precision. You might consider adding `tokenizer.padding_side = 'right'` to your code."
            )

        # 初始化父类Trainer
        super().__init__(
            model=model,
            args=args,
            data_collator=data_collator,
            train_dataset=train_dataset,
            eval_dataset=eval_dataset,
            tokenizer=tokenizer,
            model_init=model_init,
            compute_metrics=compute_metrics,
            callbacks=callbacks,
            optimizers=optimizers,
            preprocess_logits_for_metrics=preprocess_logits_for_metrics,
        )

        # 为加载的模型添加标签
        if hasattr(self.model, "add_model_tags"):
            self.model.add_model_tags(self._tag_names)

        # 如果使用packing并且max_steps > 0,设置训练数据集为无限模式
        # 这样训练就可以一直循环数据集直到达到max_steps
        if self.args.max_steps > 0 and packing:
            warnings.warn(
                "You passed `packing=True` to the SFTTrainer, and you are training your model with `max_steps` strategy. The dataset will be iterated until the `max_steps` are reached."
            )
            self.train_dataset.infinite = True
        # 如果使用packing并且max_steps == -1,则不设置训练数据集为无限模式
        elif self.args.max_steps == -1 and packing:
            self.train_dataset.infinite = False

        # 如果已经有了RichProgressCallback,则移除默认的PrinterCallback以避免重复打印
        if any(isinstance(callback, RichProgressCallback) for callback in self.callback_handler.callbacks):
            for callback in self.callback_handler.callbacks:
                if callback.__class__.__name__ == "PrinterCallback":
                    self.callback_handler.pop_callback(callback)

    # 重写train方法,在训练前激活neftune
    @wraps(Trainer.train)
    def train(self, *args, **kwargs):
        # 如果设置了neftune_noise_alpha且当前Trainer不支持该参数,则激活neftune
        if self.neftune_noise_alpha is not None and not self._trainer_supports_neftune:
            self.model = self._trl_activate_neftune(self.model)

        output = super().train(*args, **kwargs)

        # 在训练结束后,如果激活了neftune,则将模型恢复为原始的前向传播方法
        if self.neftune_noise_alpha is not None and not self._trainer_supports_neftune:
            unwrapped_model = unwrap_model(self.model)
            if is_peft_available() and isinstance(unwrapped_model, PeftModel):
                embeddings = unwrapped_model.base_model.model.get_input_embeddings()
            else:
                embeddings = unwrapped_model.get_input_embeddings()

            self.neftune_hook_handle.remove()
            del embeddings.neftune_noise_alpha

        return output
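
    # 笔记(示意): NEFTune只需在构造SFTTrainer时传入neftune_noise_alpha即可开启, 例如
    #   trainer = SFTTrainer(model=model, args=training_args, train_dataset=train_dataset,
    #                        tokenizer=tokenizer, neftune_noise_alpha=5)
    #   (model/training_args/train_dataset为调用方自行构造的对象, 变量名仅为示意)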

    # 重写push_to_hub方法,在推送到Hub时强制添加"sft"标签
    @wraps(Trainer.push_to_hub)
    def push_to_hub(self, commit_message: Optional[str] = "End of training", blocking: bool = True, **kwargs) -> str:
        """
        覆写push_to_hub方法,在推送模型到Hub时强制添加"sft"标签。
        更多详细信息请参考transformers.Trainer.push_to_hub。
        """
        kwargs = trl_sanitze_kwargs_for_tagging(model=self.model, tag_names=self._tag_names, kwargs=kwargs)

        return super().push_to_hub(commit_message=commit_message, blocking=blocking, **kwargs)

    # 以下是一些内部方法,用于准备数据集

    def _prepare_dataset(
        self,
        dataset,
        tokenizer,
        packing,
        dataset_text_field,
        max_seq_length,
        formatting_func,
        num_of_sequences,
        chars_per_token,
        remove_unused_columns=True,
        append_concat_token=True,
        add_special_tokens=True,
        skip_prepare_dataset=False,
    ):
        # 如果数据集为None,抛出ValueError异常
        if dataset is None:
            raise ValueError("The dataset should not be None")

        # 如果指定了skip_prepare_dataset为True,则直接返回原始数据集
        if skip_prepare_dataset:
            return dataset

        # 如果数据集已经被预处理(tokenized),并且是datasets.Dataset或datasets.IterableDataset类型,则直接返回
        column_names = (
            dataset.column_names if isinstance(dataset, (datasets.Dataset, datasets.IterableDataset)) else None
        )
        if column_names and "input_ids" in column_names:
            return dataset

        # 如果数据集是torch.utils.data.IterableDataset、torch.utils.data.Dataset或ConstantLengthDataset类型
        # 且不是datasets.IterableDataset类型,则直接返回
        if isinstance(
            dataset, (torch.utils.data.IterableDataset, torch.utils.data.Dataset, ConstantLengthDataset)
        ) and not isinstance(dataset, datasets.IterableDataset):
            return dataset
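
        # 笔记(示意): 如果数据集已自行tokenize(含input_ids列), 上面会直接原样返回;
        # 也可以在构造SFTTrainer时传 dataset_kwargs={"skip_prepare_dataset": True} 完全跳过这里的预处理
        # (dataset_kwargs会作为**dataset_kwargs传入本函数, 见上面__init__中的调用)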

        # 如果不使用packing
        if not packing:
            return self._prepare_non_packed_dataloader(
                tokenizer,
                dataset,
                dataset_text_field,
                max_seq_length,
                formatting_func,
                add_special_tokens,
                remove_unused_columns,
            )

        # 如果使用packing
        else:
            return self._prepare_packed_dataloader(
                tokenizer,
                dataset,
                dataset_text_field,
                max_seq_length,
                num_of_sequences,
                chars_per_token,
                formatting_func,
                append_concat_token,
                add_special_tokens,
            )

    def _prepare_non_packed_dataloader(
        self,
        tokenizer,
        dataset,
        dataset_text_field,
        max_seq_length,
        formatting_func=None,
        add_special_tokens=True,
        remove_unused_columns=True,
    ):
        # 确定是否使用formatting_func
        use_formatting_func = formatting_func is not None and dataset_text_field is None
        self._dataset_sanity_checked = False

        # 定义tokenize函数,用于对样本进行tokenize
        def tokenize(element):
            outputs = tokenizer(
                element[dataset_text_field] if not use_formatting_func else formatting_func(element),
                add_special_tokens=add_special_tokens,
                truncation=True,
                padding=False,
                max_length=max_seq_length,
                return_overflowing_tokens=False,
                return_length=False,
            )

            # 检查formatting_func是否返回列表
            if use_formatting_func and not self._dataset_sanity_checked:
                if not isinstance(formatting_func(element), list):
                    raise ValueError(
                        "The `formatting_func` should return a list of processed strings since it can lead to silent bugs."
                    )
                else:
                    self._dataset_sanity_checked = True

            return {"input_ids": outputs["input_ids"], "attention_mask": outputs["attention_mask"]}

        # 定义需要保留的列名
        signature_columns = ["input_ids", "labels", "attention_mask"]

        # 获取非签名列名
        extra_columns = list(set(dataset.column_names) - set(signature_columns))

        # 如果不移除未使用的列且存在非签名列,则发出警告
        if not remove_unused_columns and len(extra_columns) > 0:
            warnings.warn(
                "You passed `remove_unused_columns=False` on a non-packed dataset. This might create some issues with the default collator and yield to errors. If you want to "
                f"inspect dataset other columns (in this case {extra_columns}), you can subclass `DataCollatorForLanguageModeling` in case you used the default collator and create your own data collator in order to inspect the unused dataset columns."
            )

        # 使用map函数对数据集进行tokenize,并移除未使用的列
        tokenized_dataset = dataset.map(
            tokenize,
            batched=True,
            remove_columns=dataset.column_names if remove_unused_columns else None,
            num_proc=self.dataset_num_proc,
            batch_size=self.dataset_batch_size,
        )

        return tokenized_dataset
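
    # 笔记(示意): 非packing且batched=True时, formatting_func需要返回字符串列表, 例如
    #   def formatting_func(examples):
    #       return [f"### Question: {q}\n### Answer: {a}" for q, a in zip(examples["question"], examples["answer"])]
    #   (字段名question/answer仅为假设, 以实际数据集字段为准)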

    def _prepare_packed_dataloader(
        self,
        tokenizer,
        dataset,
        dataset_text_field,
        max_seq_length,
        num_of_sequences,
        chars_per_token,
        formatting_func=None,
        append_concat_token=True,
        add_special_tokens=True,
    ):
        # packing模式必须提供dataset_text_field或formatting_func之一
        if dataset_text_field is not None or formatting_func is not None:
            if tokenizer is None:
                raise ValueError("You need to pass a tokenizer when using `dataset_text_field` with `SFTTrainer`.")

            # 创建ConstantLengthDataset:先按文本字段/格式化函数tokenize,
            # 再把多条样本拼接并切分成长度为max_seq_length的定长序列(packing)
            constant_length_iterator = ConstantLengthDataset(
                tokenizer,
                dataset,
                dataset_text_field=dataset_text_field,
                formatting_func=formatting_func,
                seq_length=max_seq_length,
                infinite=False,
                num_of_sequences=num_of_sequences,
                chars_per_token=chars_per_token,
                eos_token_id=tokenizer.eos_token_id,
                append_concat_token=append_concat_token,
                add_special_tokens=add_special_tokens,
            )

            # 把迭代器物化成datasets.Dataset,交给父类Trainer默认的DataLoader使用
            def data_generator(constant_length_iterator):
                yield from constant_length_iterator

            try:
                packed_dataset = Dataset.from_generator(
                    data_generator, gen_kwargs={"constant_length_iterator": constant_length_iterator}
                )
            except (DatasetGenerationError, SchemaInferenceError) as exc:
                raise ValueError(
                    "Error occurred while packing the dataset. "
                    "Make sure that your dataset has enough samples to at least yield one packed sequence."
                ) from exc
            return packed_dataset
        else:
            raise ValueError(
                "You need to pass a `dataset_text_field` or `formatting_func` argument to the SFTTrainer if you want to use the `ConstantLengthDataset`."
            )

    # 以下是一些内部方法,用于处理neftune噪声嵌入

    def _trl_activate_neftune(self, model):
        r"""
        激活NEFTune噪声嵌入。NEFTune通过在embedding层的输出上注入均匀分布噪声来提升指令微调效果,
        噪声幅度约为 neftune_noise_alpha / sqrt(seq_len * hidden_dim),参考: https://arxiv.org/abs/2310.05914
        该函数给输入嵌入层注册一个forward hook,训练时在embedding输出上加噪声。
        """
        unwrapped_model = unwrap_model(model)

        # 获取嵌入层(PeftModel需要先取到base_model)
        if is_peft_available() and isinstance(unwrapped_model, PeftModel):
            embeddings = unwrapped_model.base_model.model.get_input_embeddings()
        else:
            embeddings = unwrapped_model.get_input_embeddings()

        # 记录noise_alpha, 并注册forward hook
        # (neftune_post_forward_hook来自trl.trainer.utils, 只在module.training时
        #  把按 alpha/sqrt(L*d) 缩放的均匀噪声加到embedding输出上)
        embeddings.neftune_noise_alpha = self.neftune_noise_alpha
        hook_handle = embeddings.register_forward_hook(neftune_post_forward_hook)
        self.neftune_hook_handle = hook_handle

        return model

    def _trl_unwrap_neftune(self, model):
        """
        移除NEFTune噪声嵌入,恢复模型原始的前向传播逻辑。
        """

        unwrapped_model = unwrap_model(model)

        if is_peft_available() and isinstance(unwrapped_model, PeftModel):
            embeddings = unwrapped_model.base_model.model.get_input_embeddings()
        else:
            embeddings = unwrapped_model.get_input_embeddings()

        self.neftune_hook_handle.remove()
        del embeddings.neftune_noise_alpha

        return unwrapped_model
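
结合上面的注释,下面给出一个最小的SFTTrainer使用示意(不是本文2.3节的完整train.py,模型名、数据集名与字段名均为假设),用来对照packing=True时走 _prepare_packed_dataloader、packing=False+formatting_func时走 _prepare_non_packed_dataloader 这两条数据准备路径:

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_name = "meta-llama/Llama-2-7b-hf"                 # 假设的模型名
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"
model = AutoModelForCausalLM.from_pretrained(model_name)

# 假设的数据集, 含"text"字段
dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")
training_args = TrainingArguments(output_dir="./sft_out", per_device_train_batch_size=1, max_steps=10)

# 路径一: packing=True, 内部调用 _prepare_packed_dataloader(ConstantLengthDataset拼接定长序列)
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    dataset_text_field="text",
    max_seq_length=1024,
    packing=True,
)
trainer.train()

# 路径二: packing=False + formatting_func, 内部调用 _prepare_non_packed_dataloader(逐条tokenize)
# def formatting_func(examples):
#     return [t for t in examples["text"]]
# trainer = SFTTrainer(model=model, args=training_args, train_dataset=dataset, tokenizer=tokenizer,
#                      formatting_func=formatting_func, max_seq_length=1024, packing=False)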

  3.2、其他的代码

    3.2.1、datasets.map 使用 load_from_cache_file = False 方便调试

https://huggingface.co/docs/datasets/v2.18.0/en/package_reference/main_classes#datasets.Dataset.map

dataset = load_dataset( ... )
dataset = dataset.map(
    process_fn,                      # 自定义的数据处理map函数
    batched=True,                    # 使用批处理的方式进行预处理
    remove_columns=remove_columns,   # 在预处理之后删除原始数据集中不再需要的列
    num_proc=num_proc,               # 启用的进程数
    load_from_cache_file=False,      # 不读取之前预处理的缓存, 常用于数据或处理逻辑变动后, 也方便调试
)

四、小结

  4.1、在SFTTrainer初始化peft模型时,为什么开启了 QLoRA + FSDP / DS-Zero3 后不使用 prepare_model_for_kbit_training 和 peft_module_casting_to_bf16?这两个函数做了什么?QLoRA + FSDP / DS-Zero3 未开启offload时,模型加载后model为什么在cpu上?

首先,我们需要了解一些基本概念:

  • 量化 (Quantization):将模型权重从高精度浮点数(如32位浮点数)转换为低精度(如8位整数或4位整数)的过程。这可以显著减小模型尺寸,加速推理速度,但可能略微影响模型性能。
  • LoRA (Low-Rank Adaptation):一种参数高效微调技术,通过在模型的权重矩阵中添加低秩分解矩阵,实现以更少的参数对预训练语言模型进行微调。
  • PEFT (Parameter-Efficient Fine-Tuning):一类参数高效微调技术的统称,LoRA就是其中之一。这些技术旨在以更少的参数对大型预训练模型进行微调,以减少计算资源需求。
  • FSDP (Fully Sharded Data Parallelism):一种分布式训练技术,通过将模型权重分片到不同的GPU上,可以支持训练更大的模型。
  • DeepSpeed:由微软开发的深度学习优化库,提供了多种技术来加速和扩展模型训练,如ZeRO(Zero Redundancy Optimizer)。
  • bf16 (Brain Float 16):介于fp32和fp16之间的一种浮点数格式,在保留较大值域的同时,可以进一步节省内存和计算资源。

下面深入分析一下这个问题:首先解释 prepare_model_for_kbit_training 和 peft_module_casting_to_bf16 这两个函数的作用;然后说明为什么在使用 QLoRA 和 FSDP/DeepSpeed Zero-3 的情况下不需要再调用这两个函数;最后解释为什么在这种情况下模型加载后部分权重会先落在 CPU 上。

  • prepare_model_for_kbit_training 函数:这个函数的主要目的是在用 8 位或 4 位量化模型进行训练之前,对模型做一些必要的准备工作。它主要做了以下几件事:
    • 将LayerNorm层的权重转换为float32精度:在低精度(如8位或4位)训练时,LayerNorm层的权重需要保持较高精度,以确保数值稳定性。
    • 确保输出嵌入层(output embedding layer)的参数需要计算梯度:这是为了确保整个模型都能够正确地进行反向传播和更新。
    • 将语言模型头(lm head)的输出转换为float32精度:同样,这是为了在低精度训练时保持数值稳定性。
    • 启用梯度检查点(Gradient Checkpointing):这是一种节省内存的技术,通过在前向传播过程中丢弃一些中间激活值,并在反向传播时重新计算它们,以减少内存消耗。
  总的来说,prepare_model_for_kbit_training 的作用是让低精度量化模型在训练时获得更好的数值稳定性和训练效果。

#这个函数的主要作用是在运行低精度训练之前,对模型进行一些必要的准备工作,包括:
# 1.将LayerNorm层的权重转换为float32精度,以确保数值稳定性。
# 2.确保输出嵌入层(output embedding layer)的参数需要计算梯度,以便进行正确的反向传播和参数更新。
# 3.将语言模型头(lm head)的输出转换为float32精度,同样是为了保证数值稳定性。
# 4.启用梯度检查点(Gradient Checkpointing)技术,以减少内存消耗。
def prepare_model_for_kbit_training(model, use_gradient_checkpointing=True, gradient_checkpointing_kwargs=None):
    r"""
    Note this method only works for `transformers` models.

    This method wraps the entire protocol for preparing a model before running a training. This includes:
        1- Cast the layernorm in fp32 2- making output embedding layer require grads 3- Add the upcasting of the lm
        head to fp32

    Args:
        model (`transformers.PreTrainedModel`):
            The loaded model from `transformers`
        use_gradient_checkpointing (`bool`, *optional*, defaults to `True`):
            If True, use gradient checkpointing to save memory at the expense of slower backward pass.
        gradient_checkpointing_kwargs (`dict`, *optional*, defaults to `None`):
            Keyword arguments to pass to the gradient checkpointing function, please refer to the documentation of
            `torch.utils.checkpoint.checkpoint` for more details about the arguments that you can pass to that method.
            Note this is only available in the latest transformers versions (> 4.34.1).
    """
    # 检查模型是否已加载为8位或4位精度
    loaded_in_kbit = getattr(model, "is_loaded_in_8bit", False) or getattr(model, "is_loaded_in_4bit", False)
    
    # 检查模型是否使用GPTQ或AQLM量化方法
    is_gptq_quantized = getattr(model, "quantization_method", None) == "gptq"
    is_aqlm_quantized = getattr(model, "quantization_method", None) == "aqlm"
    
    # 如果未提供gradient_checkpointing_kwargs,则使用一个空字典
    if gradient_checkpointing_kwargs is None:
        gradient_checkpointing_kwargs = {}

    # 冻结基础模型的所有参数,防止在低精度训练时被更新
    for name, param in model.named_parameters():
        param.requires_grad = False

    # 如果模型未使用GPTQ或AQLM量化,则将非INT8参数转换为float32精度
    if not is_gptq_quantized and not is_aqlm_quantized:
        for param in model.parameters():
            if (param.dtype == torch.float16) or (param.dtype == torch.bfloat16):
                # 排除Params4bit类型的参数
                if param.__class__.__name__ != "Params4bit":
                    param.data = param.data.to(torch.float32)

    # 如果模型已加载为8位或4位精度,或使用了GPTQ/AQLM量化,且启用了gradient checkpointing
    if (loaded_in_kbit or is_gptq_quantized or is_aqlm_quantized) and use_gradient_checkpointing:
        # 检查是否支持gradient_checkpointing_kwargs参数
        _supports_gc_kwargs = "gradient_checkpointing_kwargs" in list(
            inspect.signature(model.gradient_checkpointing_enable).parameters
        )

        # 如果不支持gradient_checkpointing_kwargs参数,但传入了参数,则发出警告
        if not _supports_gc_kwargs and len(gradient_checkpointing_kwargs) > 0:
            warnings.warn(
                "gradient_checkpointing_kwargs is not supported in this version of transformers. The passed kwargs will be ignored."
                " if you want to use that feature, please upgrade to the latest version of transformers.",
                FutureWarning,
            )

        # 构建gradient_checkpointing_enable函数的参数字典
        gc_enable_kwargs = {} if not _supports_gc_kwargs else {"gradient_checkpointing_kwargs": gradient_checkpointing_kwargs}

        # 启用gradient checkpointing以提高内存利用率
        model.gradient_checkpointing_enable(**gc_enable_kwargs)

    return model
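
常规QLoRA(不与DS-Zero3/FSDP联用)时,这个函数一般在加载4bit模型之后、get_peft_model之前调用。下面是一个最小示意(模型名与LoRA参数均为假设):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", quantization_config=bnb_config)

# 低精度训练前的准备: LayerNorm升fp32、输入嵌入允许梯度、开启gradient checkpointing
model = prepare_model_for_kbit_training(
    model, use_gradient_checkpointing=True, gradient_checkpointing_kwargs={"use_reentrant": False}
)

lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
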
  • peft_module_casting_to_bf16 函数:
    • 它遍历模型的所有子模块,找到 PEFT 模块(如 LoRA 的低秩分解矩阵),并将其转换为 bfloat16 精度。
    • 对于LayerNorm层和其他normalization层,它将其权重转换为float32精度,以保持数值稳定性。
    • 对于语言模型头(lm head)、词嵌入(embed tokens)等部分,如果它们的权重是float32精度,也会被转换为bfloat16精度。
  总的来说,这个函数把 PEFT 模型(如 LoRA 模型)中需要训练的adapter参数转换为 bfloat16 精度:在数值范围足够的前提下进一步减少显存占用、提高训练效率。

def peft_module_casting_to_bf16(model):
    from peft.tuners.tuners_utils import BaseTunerLayer

    for name, module in model.named_modules():
        # PEFT的adapter层(如LoRA的A/B矩阵)转换为bfloat16
        if isinstance(module, BaseTunerLayer):
            module = module.to(torch.bfloat16)
        # LayerNorm/RMSNorm等normalization层保持float32, 保证数值稳定性
        elif isinstance(module, torch.nn.LayerNorm) or "norm" in name:
            module = module.to(torch.float32)
        # lm_head、词嵌入等层若仍是float32, 则转成bfloat16以节省显存
        elif any(x in name for x in ["lm_head", "embed_tokens", "wte", "wpe"]):
            if hasattr(module, "weight"):
                if module.weight.dtype == torch.float32:
                    module = module.to(torch.bfloat16)
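
在trl中,这个转换大致发生在非DS-Zero3/FSDP、且训练参数开启bf16的4bit模型上,于get_peft_model之后被调用,使用方式大致如下(示意):

# model 为 get_peft_model 返回的 PeftModel, 且 TrainingArguments 中 bf16=True(示意)
from trl.trainer.utils import peft_module_casting_to_bf16
peft_module_casting_to_bf16(model)
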
  • 为什么在使用 QLoRA 和 FSDP/DeepSpeed Zero-3 的情况下,不需要再调用这两个函数?
    • 一方面,模型在from_pretrained时已经通过torch_dtype/bnb_4bit_compute_dtype等参数指定了未量化层的数据类型(bf16),为了节省显存与计算资源,不再用prepare_model_for_kbit_training把这些层升回float32;另一方面,梯度检查点等准备工作由训练脚本和accelerate/deepspeed配置完成,Zero-3/FSDP会自行管理参数的精度与分片。因此QLoRA + FSDP/DS-Zero3这条路径下不需要再单独调用 prepare_model_for_kbit_training 和 peft_module_casting_to_bf16。
  • 为什么在使用 QLoRA 和 FSDP/DeepSpeed Zero-3 且没有开启offload的情况下,模型加载后部分权重会在 CPU 上?
    • 传入quantization_config加载模型时,transformers会自动启用 low_cpu_mem_usage=True:权重先被加载到CPU,直到Trainer初始化、DeepSpeed/FSDP对模型做包装时才把参数搬到各GPU并完成分片(见本小节末尾的加载示意)。
      • 虽然"先CPU后GPU"的搬运会带来一些加载开销,但对超大模型来说是值得的:它避免了在单卡上先完整物化一份bf16权重而直接OOM。
      • 当QLoRA和FSDP/DeepSpeed Zero-3协同工作时,QLoRA负责把大部分基座权重量化为4bit并冻结,FSDP/DeepSpeed Zero-3负责把参数(以及LoRA部分的梯度、优化器状态)分片到各GPU上。
      • 注意:未开启offload时,Zero-3/FSDP的分片存放在各GPU显存中,而不是常驻CPU;只有开启offload才会把参数/优化器状态下放到CPU内存,以进一步降低显存占用。
      • QLoRA量化后的基座权重在整个训练过程中保持冻结,不需要计算梯度和更新,真正需要训练的只有LoRA adapter参数。
  • 实验暂定结果:
    • QLoRA 和 FSDP/DeepSpeed Zero-3 同时启用、单机多卡训练时,开启offload会报错,不开启offload可以正常运行;单机单卡未进行测试。
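
针对"模型加载后权重在CPU上"的现象,可以用下面的最小加载示意直接观察(与2.3节train.py的思路一致,模型名与参数为假设):在DS-Zero3/FSDP下加载4bit模型时不传device_map,transformers会以low_cpu_mem_usage的方式先把权重放在CPU,直到Trainer初始化、DeepSpeed/FSDP包装模型时才把参数搬到各GPU并分片。

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_storage=torch.bfloat16,   # QLoRA + Zero3/FSDP 时量化权重的存储dtype需与计算dtype一致
)

# 注意: 不传device_map, 权重先落在CPU, 由后续的DeepSpeed/FSDP负责上卡与分片
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",              # 假设的模型名
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)
print(next(model.parameters()).device)       # 预期输出: cpu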

  4.2、bfloat16和float16的区别

bfloat16和float16的区别主要体现在以下几个方面:

  1. 精度:

    • bfloat16的动态范围更大,但精度略低于float16。
    • bfloat16的指数位有8个bit,尾数位有7个bit,而float16的指数位有5个bit,尾数位有10个bit。
    • 这意味着bfloat16能够表示更大范围的数值,但每个数值的精度略低于float16。
  2. 计算与训练稳定性:

    • bfloat16与float32的指数位相同(都是8位),二者互转只需截断/补零尾数,混合精度训练时溢出风险小,一般不需要像float16那样做loss scaling。
    • 硬件支持方面,NVIDIA Ampere及之后的GPU、Google TPU,以及支持AVX-512 BF16/AMX指令的新一代Intel Xeon处理器都对bfloat16有原生加速。
  3. 内存占用:

    • bfloat16和float16都占用16个bit,因此在内存占用上是相同的。
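
可以用torch直接对比两者的数值范围与精度(示意):

import torch

print(torch.finfo(torch.float16))    # max≈65504, eps≈9.8e-4(10位尾数)
print(torch.finfo(torch.bfloat16))   # max≈3.39e38, eps≈7.8e-3(7位尾数)

x = torch.tensor(70000.0)
print(x.to(torch.float16))           # inf   -> fp16动态范围不足, 混合精度训练常需loss scaling
print(x.to(torch.bfloat16))          # 70144 -> bf16范围足够大, 但单个数值的精度更粗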

  4.3、绝对位置编码与相对位置编码的区别,为什么现在的大模型都使用RoPE

绝对位置编码(Absolute Positional Encoding)和相对位置编码(Relative Positional Encoding)是两种不同的位置编码方式,各自有其优缺点。RoPE属于相对位置编码的一种,近年来被大型语言模型广泛采用,主要有以下原因:

  • 缓解长序列建模问题

绝对位置编码(尤其是可学习的绝对位置嵌入)为每个位置分配一个固定的编码向量,位置数量在预训练时就确定了上限,超出训练长度的位置没有对应编码,外推能力差。相对位置编码只依赖token之间的相对距离,对序列长度变化更鲁棒,配合位置插值等技巧也更容易扩展上下文长度。

  • 捕捉序列中的相对位置信息

相对位置编码能够更好地捕捉序列中的结构信息。例如,在自然语言处理任务中,相邻词语之间的相对位置对于理解句子结构和语义非常重要。RoPE通过按位置对query/key向量做旋转,使注意力分数只依赖两个token之间的相对距离,从而有效地编码这种相对位置信息。

  • 计算效率高、参数少

与可学习的绝对位置编码相比,RoPE不引入任何额外的可学习参数,只在计算注意力时对query/key做确定性的旋转,计算开销也很小。这种参数与计算上的高效性对大型语言模型来说非常重要,因为它们通常具有数十亿个参数,位置编码的额外成本会直接影响模型的计算和存储开销。

  • 实验上性能更好

许多研究表明,在各种自然语言处理任务上,采用RoPE的模型比使用绝对位置编码的模型表现更好。这种性能提升主要归因于RoPE更好地捕捉了序列数据中的结构信息。
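
RoPE的核心就是按token位置对query/key的每两维做一次二维旋转,使注意力打分只依赖相对位置。下面是一个简化的实现示意(采用交错两两配对的写法,与LLaMA源码中"维度折半"的写法只是配对方式不同):

import torch

def build_rope_cache(seq_len, head_dim, base=10000.0, device="cpu"):
    # 频率: theta_i = base^(-2i/d), i = 0..d/2-1
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2, device=device).float() / head_dim))
    t = torch.arange(seq_len, device=device).float()
    freqs = torch.outer(t, inv_freq)          # (seq_len, head_dim/2), 第m行是位置m的旋转角
    return freqs.cos(), freqs.sin()

def apply_rope(x, cos, sin):
    # x: (batch, seq_len, n_head, head_dim), 对第(0,1)、(2,3)...两两维度做旋转
    x1, x2 = x[..., 0::2], x[..., 1::2]
    cos, sin = cos[None, :, None, :], sin[None, :, None, :]
    out1 = x1 * cos - x2 * sin
    out2 = x1 * sin + x2 * cos
    return torch.stack((out1, out2), dim=-1).flatten(-2)

# 用法示意: q/k施加RoPE后再做注意力, 其内积只与两个位置的相对距离有关
q = torch.randn(1, 16, 8, 64)
k = torch.randn(1, 16, 8, 64)
cos, sin = build_rope_cache(seq_len=16, head_dim=64)
q_rot, k_rot = apply_rope(q, cos, sin), apply_rope(k, cos, sin)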

五、Trl 其他Trainer注释笔记

  5.1、DPOTrainer笔记

Trl中DPOTrainer注释解析(待完成)https://blog.csdn.net/qq_16555103/article/details/137743362?csdn_share_tail=%7B%22type%22%3A%22blog%22%2C%22rType%22%3A%22article%22%2C%22rId%22%3A%22137743362%22%2C%22source%22%3A%22qq_16555103%22%7D

 5.2、... 

待更新....
