Changing ChatGLM's Self-Cognition

假装我不帅

Last modified: 2024-04-10 13:45:53

Category: AI. Tags: chatglm, 通义千问 (Tongyi Qianwen)

First published: 2024-03-29 18:10:29

Article link: https://blog.csdn.net/qq_36437991/article/details/137020729

ChatGLM-Efficient-Tuning

Download the source code

Download ChatGLM-Efficient-Tuning and extract the archive.

Create a virtual environment

conda create --prefix=D:\CondaEnvs\chatglm6btrain python=3.10
cd D:\ChatGLM-Efficient-Tuning-main
conda activate D:\CondaEnvs\chatglm6btrain

Install the required packages

pip install -r requirements.txt

Modify the training data

Edit self_cognition.json in the data directory.
Replace NAME and AUTHOR with the names you want the model to use.
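The replacement can also be scripted. The following is a minimal sketch, assuming the placeholders appear in the file as the literal strings `<NAME>` and `<AUTHOR>` and that the file is a JSON list of objects with string fields; the function name `fill_identity` is hypothetical and not part of the project.

```python
import json
from pathlib import Path


def fill_identity(path, name, author):
    """Replace <NAME> and <AUTHOR> in every string field of a JSON list file.

    Assumption: the placeholders are the literal strings "<NAME>" and "<AUTHOR>".
    Returns the number of records in the file.
    """
    path = Path(path)
    records = json.loads(path.read_text(encoding="utf-8"))
    for record in records:
        for key, value in record.items():
            if isinstance(value, str):
                record[key] = value.replace("<NAME>", name).replace("<AUTHOR>", author)
    # ensure_ascii=False keeps Chinese text readable in the saved file.
    path.write_text(json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8")
    return len(records)
```

For example, `fill_identity("data/self_cognition.json", "MyBot", "MyTeam")` would rewrite the file in place, so keep a backup of the original if you want to reuse it.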

Training

To enable quantized LoRA (QLoRA) on Windows, you need a prebuilt bitsandbytes wheel, which supports CUDA 11.1 to 12.1.
Check the CUDA version:

nvcc --version
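If you prefer to check the version in a script, the sketch below parses the text printed by `nvcc --version` and compares it with the 11.1 to 12.1 range. It assumes the output contains a fragment of the form `release 11.7`; the function names are hypothetical.

```python
import re


def parse_cuda_release(nvcc_output):
    """Return (major, minor) from nvcc output, or None if no release is found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def in_supported_range(version, low=(11, 1), high=(12, 1)):
    """True if version lies within [low, high], compared as (major, minor) tuples."""
    return version is not None and low <= version <= high
```

To use it, capture the command output yourself, for example with `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout`, and pass that string to `parse_cuda_release`.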

The version requirement is met, so install the Windows build of bitsandbytes:

pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.39.1-py3-none-win_amd64.whl

Start training

Single-GPU fine-tuning

# Select the GPU; use whichever of the two forms below matches your operating system
# linux
# CUDA_VISIBLE_DEVICES=0 
# windows
# set CUDA_VISIBLE_DEVICES=0
python src/train_bash.py --stage sft --model_name_or_path path_to_your_chatglm_model --do_train --dataset alpaca_gpt4_zh --finetuning_type lora --output_dir path_to_sft_checkpoint --per_device_train_batch_size 4 --gradient_accumulation_steps 4 --lr_scheduler_type cosine --logging_steps 10 --save_steps 1000 --learning_rate 5e-5 --num_train_epochs 3.0 --plot_loss --fp16
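Because the syntax for setting CUDA_VISIBLE_DEVICES differs between Linux and Windows, one option is to launch the script from Python and pass the variable through the process environment. This is only a sketch; `gpu_env` and `run_training` are hypothetical helper names, and the script path is the one used in the command above.

```python
import os
import subprocess
import sys


def gpu_env(gpu_ids="0", base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = gpu_ids
    return env


def run_training(args, gpu_ids="0"):
    """Run src/train_bash.py with the current interpreter on the chosen GPU(s)."""
    cmd = [sys.executable, "src/train_bash.py", *args]
    return subprocess.run(cmd, env=gpu_env(gpu_ids), check=True)
```

A call such as `run_training(["--stage", "sft", "--do_train", "--fp16"])` would behave the same on both operating systems; the remaining arguments are passed exactly as on the command line.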

AttributeError: type object 'PPODecorators' has no attribute 'empty_cuda_cache'. Did you mean: 'empty_device_cache'?

Fix: pin trl to version 0.7.2.

pip install trl==0.7.2

ImportError: cannot import name 'top_k_top_p_filtering' from 'transformers'

pip install torch==1.13.1

pip install accelerate==0.21.0
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia

ImportError: cannot import name 'COMMON_SAFE_ASCII_CHARACTERS' from 'charset_normalizer.constant'

pip install chardet

ImportError: cannot import name 'LRScheduler' from 'torch.optim.lr_scheduler'

pip install transformers==4.29.1

Download the model:
https://huggingface.co/THUDM/chatglm-6b

python src/train_bash.py --stage sft --model_name_or_path chatglm-6b --do_train --dataset self_cognition --finetuning_type lora --output_dir path_to_sft_checkpoint --per_device_train_batch_size 4 --gradient_accumulation_steps 4 --lr_scheduler_type cosine --logging_steps 10 --save_steps 1000 --learning_rate 5e-5 --num_train_epochs 3.0 --plot_loss --fp16

ValueError: Attempting to unscale FP16 gradients

pip install peft==0.4.0
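Most of the errors above were resolved by pinning package versions. As a quick sanity check, the sketch below compares the installed versions against the pins used in this post. It relies only on the standard library `importlib.metadata`; the `PINS` table and the function names are hypothetical, and torch is left out because its reported version may carry a local suffix such as `+cu117`.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins taken from the pip commands in this post.
PINS = {
    "trl": "0.7.2",
    "accelerate": "0.21.0",
    "transformers": "4.29.1",
    "peft": "0.4.0",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for package, wanted in pins.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches
```

Running `print(find_mismatches(PINS))` inside the activated conda environment prints an empty dict when every pin matches.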

Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
Fix: edit train_bash.py and add the following lines:

import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

Alternatively, set the environment variable:

set KMP_DUPLICATE_LIB_OK=TRUE

Test the training result

python src/cli_demo.py --model_name_or_path chatglm-6b --checkpoint_dir path_to_sft_checkpoint

The training result does not seem ideal.

Try the v0.1.0 version of the model instead:

git lfs install
git clone -b v0.1.0 https://huggingface.co/THUDM/chatglm-6b

python src/train_bash.py --stage sft --model_name_or_path chatglm6b010 --do_train --dataset self_cognition --finetuning_type lora --output_dir path_to_sft_checkpoint --per_device_train_batch_size 4 --gradient_accumulation_steps 4 --lr_scheduler_type cosine --logging_steps 10 --save_steps 1000 --learning_rate 5e-5 --num_train_epochs 3.0 --plot_loss --fp16

python src/cli_demo.py --model_name_or_path chatglm6b010 --checkpoint_dir path_to_sft_checkpoint

LLaMA-Efficient-Tuning

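Download the source code

That attempt still did not work, so I tried LLaMA-Efficient-Tuning.
Download and extract the source code, then create a new virtual environment.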

conda create --prefix=D:\CondaEnvs\llama python=3.10
cd D:\LLaMA-Factory-main
conda activate D:\CondaEnvs\llama

Install the required packages

# PyTorch GPU build
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
# PyTorch CPU build (practically unusable here)
# pip install torch==1.13.1+cpu torchvision==0.14.1+cpu torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cpu
# PyTorch for CUDA 12.1
# conda install pytorch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install transformers==4.37.2
pip install datasets==2.14.3
pip install accelerate==0.27.2
pip install peft==0.9.0
pip install trl==0.8.1

pip install -r requirements.txt
pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl

If you have trouble downloading models and datasets from Hugging Face, you can use ModelScope (魔搭社区) as follows.

# linux
# export USE_MODELSCOPE_HUB=1 
# Windows 
set USE_MODELSCOPE_HUB=1

You can then train a model by specifying its model name:

CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --model_name_or_path modelscope/Llama-2-7b-ms \
    ... # remaining arguments are the same as below

Launch the web UI

# set CUDA_VISIBLE_DEVICES=0 
python src/train_web.py

Command-line usage

set CUDA_VISIBLE_DEVICES=0 
python src/train_bash.py --stage pt --do_train --model_name_or_path path_to_llama_model --dataset wiki_demo --finetuning_type lora --lora_target q_proj,v_proj --output_dir path_to_pt_checkpoint --overwrite_cache --per_device_train_batch_size 4 --gradient_accumulation_steps 4 --lr_scheduler_type cosine --logging_steps 10 --save_steps 1000 --learning_rate 5e-5 --num_train_epochs 3.0 --plot_loss --fp16

Qwen1.5-0.5B model on Hugging Face
Qwen1.5-0.5B model on ModelScope

Preview command

python src/train_bash.py --stage sft --do_train True --model_name_or_path Qwen/Qwen1.5-0.5B-Chat --finetuning_type lora --template qwen --dataset_dir data  --dataset identity,alpaca_gpt4_zh --cutoff_len 1024 --learning_rate 0.0002 --num_train_epochs 5.0 --max_samples 500 --per_device_train_batch_size 4  --gradient_accumulation_steps 4 --lr_scheduler_type cosine --max_grad_norm 1.0 --logging_steps 5 --save_steps 100 --warmup_steps 0 --optim adamw_torch --output_dir saves\Qwen1.5-0.5B-Chat\lora\test --fp16 True --lora_rank 8 --lora_alpha 16 --lora_dropout 0.1 --lora_target all --use_dora True --plot_loss True
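It helps to know roughly how many optimizer steps a command like this produces, since `--save_steps 100` and `--logging_steps 5` are counted in optimizer steps. The sketch below is a back-of-the-envelope estimate only; the trainer's exact rounding may differ, and the function names are hypothetical. With `--per_device_train_batch_size 4` and `--gradient_accumulation_steps 4` on one GPU, each optimizer step consumes 16 samples, and `--max_samples 500` caps each of the two datasets at 500 examples, so there are at most 1000 samples.

```python
import math


def effective_batch_size(per_device_batch, grad_accum_steps, num_gpus=1):
    """Samples consumed per optimizer step."""
    return per_device_batch * grad_accum_steps * num_gpus


def estimate_optimizer_steps(num_samples, per_device_batch, grad_accum_steps,
                             epochs, num_gpus=1):
    """Rough upper estimate of optimizer steps over the whole run."""
    batch = effective_batch_size(per_device_batch, grad_accum_steps, num_gpus)
    steps_per_epoch = math.ceil(num_samples / batch)
    return math.ceil(steps_per_epoch * epochs)


# Upper bound for the command above: 1000 samples, 5 epochs.
print(estimate_optimizer_steps(1000, 4, 4, 5.0))
```

If a dataset holds fewer examples than the cap, the real step count is lower, and a run shorter than 100 steps writes no intermediate checkpoint under `--save_steps 100`.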

NotImplementedError: Loading a dataset cached in a LocalFileSystem is not supported.

pip install fsspec==2023.9.2

Once training finishes, refresh the adapter list and load the adapter.

The procedure for ChatGLM is similar; the tool supports many models.

How to serve an API

View the API arguments

python src/api_demo.py -h 

  -h, --help            show this help message and exit
  --model_name_or_path MODEL_NAME_OR_PATH
                        Path to the model weight or identifier from huggingface.co/models or modelscope.cn/models. (default: None)
  --adapter_name_or_path ADAPTER_NAME_OR_PATH
                        Path to the adapter weight or identifier from huggingface.co/models. (default: None)
  --cache_dir CACHE_DIR
                        Where to store the pre-trained models downloaded from huggingface.co or modelscope.cn. (default: None)
  --use_fast_tokenizer [USE_FAST_TOKENIZER]
                        Whether or not to use one of the fast tokenizer (backed by the tokenizers library). (default: False)
  --resize_vocab [RESIZE_VOCAB]
                        Whether or not to resize the tokenizer vocab and the embedding layers. (default: False)
  --split_special_tokens [SPLIT_SPECIAL_TOKENS]
                        Whether or not the special tokens should be split during the tokenization process. (default: False)
  --model_revision MODEL_REVISION
                        The specific model version to use (can be a branch name, tag name or commit id). (default: main)
  --low_cpu_mem_usage [LOW_CPU_MEM_USAGE]
                        Whether or not to use memory-efficient model loading. (default: True)
  --no_low_cpu_mem_usage
                        Whether or not to use memory-efficient model loading. (default: False)
  --quantization_bit QUANTIZATION_BIT
                        The number of bits to quantize the model using bitsandbytes. (default: None)
  --quantization_type {fp4,nf4}
                        Quantization data type to use in int4 training. (default: nf4)
  --double_quantization [DOUBLE_QUANTIZATION]
                        Whether or not to use double quantization in int4 training. (default: True)
  --no_double_quantization
                        Whether or not to use double quantization in int4 training. (default: False)
  --rope_scaling {linear,dynamic}
                        Which scaling strategy should be adopted for the RoPE embeddings. (default: None)
  --flash_attn [FLASH_ATTN]
                        Enable FlashAttention-2 for faster training. (default: False)
  --shift_attn [SHIFT_ATTN]
                        Enable shift short attention (S^2-Attn) proposed by LongLoRA. (default: False)
  --use_unsloth [USE_UNSLOTH]
                        Whether or not to use unsloth's optimization for the LoRA training. (default: False)
  --disable_gradient_checkpointing [DISABLE_GRADIENT_CHECKPOINTING]
                        Whether or not to disable gradient checkpointing. (default: False)
  --upcast_layernorm [UPCAST_LAYERNORM]
                        Whether or not to upcast the layernorm weights in fp32. (default: False)
  --upcast_lmhead_output [UPCAST_LMHEAD_OUTPUT]
                        Whether or not to upcast the output of lm_head in fp32. (default: False)
  --infer_backend {huggingface,vllm}
                        Backend engine used at inference. (default: huggingface)
  --vllm_maxlen VLLM_MAXLEN
                        Maximum input length of the vLLM engine. (default: 2048)
  --vllm_gpu_util VLLM_GPU_UTIL
                        The fraction of GPU memory in (0,1) to be used for the vLLM engine. (default: 0.9)
  --vllm_enforce_eager [VLLM_ENFORCE_EAGER]
                        Whether or not to disable CUDA graph in the vLLM engine. (default: False)
  --offload_folder OFFLOAD_FOLDER
                        Path to offload model weights. (default: offload)
  --use_cache [USE_CACHE]
                        Whether or not to use KV cache in generation. (default: True)
  --no_use_cache        Whether or not to use KV cache in generation. (default: False)
  --hf_hub_token HF_HUB_TOKEN
                        Auth token to log in with Hugging Face Hub. (default: None)
  --ms_hub_token MS_HUB_TOKEN
                        Auth token to log in with ModelScope Hub. (default: None)
  --export_dir EXPORT_DIR
                        Path to the directory to save the exported model. (default: None)
  --export_size EXPORT_SIZE
                        The file shard size (in GB) of the exported model. (default: 1)
  --export_quantization_bit EXPORT_QUANTIZATION_BIT
                        The number of bits to quantize the exported model. (default: None)
  --export_quantization_dataset EXPORT_QUANTIZATION_DATASET
                        Path to the dataset or dataset name to use in quantizing the exported model. (default: None)
  --export_quantization_nsamples EXPORT_QUANTIZATION_NSAMPLES
                        The number of samples used for quantization. (default: 128)
  --export_quantization_maxlen EXPORT_QUANTIZATION_MAXLEN
                        The maximum length of the model inputs used for quantization. (default: 1024)
  --export_legacy_format [EXPORT_LEGACY_FORMAT]
                        Whether or not to save the `.bin` files instead of `.safetensors`. (default: False)
  --export_hub_model_id EXPORT_HUB_MODEL_ID
                        The name of the repository if push the model to the Hugging Face hub. (default: None)
  --print_param_status [PRINT_PARAM_STATUS]
                        For debugging purposes, print the status of the parameters in the model. (default: False)
  --template TEMPLATE   Which template to use for constructing prompts in training and inference. (default: None)
  --dataset DATASET     The name of provided dataset(s) to use. Use commas to separate multiple datasets. (default: None)
  --dataset_dir DATASET_DIR
                        Path to the folder containing the datasets. (default: data)
  --split SPLIT         Which dataset split to use for training and evaluation. (default: train)
  --cutoff_len CUTOFF_LEN
                        The cutoff length of the model inputs after tokenization. (default: 1024)
  --reserved_label_len RESERVED_LABEL_LEN
                        The minimum cutoff length reserved for label after tokenization. (default: 1)
  --train_on_prompt [TRAIN_ON_PROMPT]
                        Whether to disable the mask on the prompt or not. (default: False)
  --streaming [STREAMING]
                        Enable dataset streaming. (default: False)
  --buffer_size BUFFER_SIZE
                        Size of the buffer to randomly sample examples from in dataset streaming. (default: 16384)
  --mix_strategy {concat,interleave_under,interleave_over}
                        Strategy to use in dataset mixing (concat/interleave) (undersampling/oversampling). (default: concat)
  --interleave_probs INTERLEAVE_PROBS
                        Probabilities to sample data from datasets. Use commas to separate multiple datasets. (default: None)
  --overwrite_cache [OVERWRITE_CACHE]
                        Overwrite the cached training and evaluation sets. (default: False)
  --preprocessing_num_workers PREPROCESSING_NUM_WORKERS
                        The number of processes to use for the pre-processing. (default: None)
  --max_samples MAX_SAMPLES
                        For debugging purposes, truncate the number of examples for each dataset. (default: None)
  --eval_num_beams EVAL_NUM_BEAMS
                        Number of beams to use for evaluation. This argument will be passed to `model.generate` (default: None)
  --ignore_pad_token_for_loss [IGNORE_PAD_TOKEN_FOR_LOSS]
                        Whether or not to ignore the tokens corresponding to padded labels in the loss computation. (default: True)
  --no_ignore_pad_token_for_loss
                        Whether or not to ignore the tokens corresponding to padded labels in the loss computation. (default: False)
  --val_size VAL_SIZE   Size of the development set, should be an integer or a float in range `[0,1)`. (default: 0.0)
  --packing PACKING     Whether or not to pack the sequences in training. Will automatically enable in pre-training. (default: None)
  --cache_path CACHE_PATH
                        Path to save or load the pre-processed datasets. (default: None)
  --use_galore [USE_GALORE]
                        Whether or not to use gradient low-Rank projection. (default: False)
  --galore_target GALORE_TARGET
                        Name(s) of modules to apply GaLore. Use commas to separate multiple modules. Use "all" to specify all the linear modules. (default: all)
  --galore_rank GALORE_RANK
                        The rank of GaLore gradients. (default: 16)
  --galore_update_interval GALORE_UPDATE_INTERVAL
                        Number of steps to update the GaLore projection. (default: 200)
  --galore_scale GALORE_SCALE
                        GaLore scaling coefficient. (default: 0.25)
  --galore_proj_type {std,reverse_std,right,left,full}
                        Type of GaLore projection. (default: std)
  --galore_layerwise [GALORE_LAYERWISE]
                        Whether or not to enable layer-wise update to further save memory. (default: False)
  --dpo_beta DPO_BETA   The beta parameter for the DPO loss. (default: 0.1)
  --dpo_loss {sigmoid,hinge,ipo,kto_pair}
                        The type of DPO loss to use. (default: sigmoid)
  --dpo_label_smoothing DPO_LABEL_SMOOTHING
                        The robust DPO label smoothing parameter in cDPO that should be between 0 and 0.5. (default: 0.0)
  --dpo_ftx DPO_FTX     The supervised fine-tuning loss coefficient in DPO training. (default: 0.0)
  --ppo_buffer_size PPO_BUFFER_SIZE
                        The number of mini-batches to make experience buffer in a PPO optimization step. (default: 1)
  --ppo_epochs PPO_EPOCHS
                        The number of epochs to perform in a PPO optimization step. (default: 4)
  --ppo_score_norm [PPO_SCORE_NORM]
                        Use score normalization in PPO training. (default: False)
  --ppo_target PPO_TARGET
                        Target KL value for adaptive KL control in PPO training. (default: 6.0)
  --ppo_whiten_rewards [PPO_WHITEN_REWARDS]
                        Whiten the rewards before compute advantages in PPO training. (default: False)
  --ref_model REF_MODEL
                        Path to the reference model used for the PPO or DPO training. (default: None)
  --ref_model_adapters REF_MODEL_ADAPTERS
                        Path to the adapters of the reference model. (default: None)
  --ref_model_quantization_bit REF_MODEL_QUANTIZATION_BIT
                        The number of bits to quantize the reference model. (default: None)
  --reward_model REWARD_MODEL
                        Path to the reward model used for the PPO training. (default: None)
  --reward_model_adapters REWARD_MODEL_ADAPTERS
                        Path to the adapters of the reward model. (default: None)
  --reward_model_quantization_bit REWARD_MODEL_QUANTIZATION_BIT
                        The number of bits to quantize the reward model. (default: None)
  --reward_model_type {lora,full,api}
                        The type of the reward model in PPO training. Lora model only supports lora training. (default: lora)
  --additional_target ADDITIONAL_TARGET
                        Name(s) of modules apart from LoRA layers to be set as trainable and saved in the final checkpoint. (default: None)
  --lora_alpha LORA_ALPHA
                        The scale factor for LoRA fine-tuning (default: lora_rank * 2). (default: None)
  --lora_dropout LORA_DROPOUT
                        Dropout rate for the LoRA fine-tuning. (default: 0.0)
  --lora_rank LORA_RANK
                        The intrinsic dimension for LoRA fine-tuning. (default: 8)
  --lora_target LORA_TARGET
                        Name(s) of target modules to apply LoRA. Use commas to separate multiple modules. Use "all" to specify all the linear modules. LLaMA choices: ["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"], BLOOM & Falcon & ChatGLM choices: ["query_key_value", "dense", "dense_h_to_4h", "dense_4h_to_h"], Baichuan choices: ["W_pack", "o_proj",
                        "gate_proj", "up_proj", "down_proj"], Qwen choices: ["c_attn", "attn.c_proj", "w1", "w2", "mlp.c_proj"], InternLM2 choices: ["wqkv", "wo", "w1", "w2", "w3"], Others choices: the same as
                        LLaMA. (default: all)
  --loraplus_lr_ratio LORAPLUS_LR_RATIO
                        LoRA plus learning rate ratio (lr_B / lr_A). (default: None)
  --loraplus_lr_embedding LORAPLUS_LR_EMBEDDING
                        LoRA plus learning rate for lora embedding layers. (default: 1e-06)
  --use_rslora [USE_RSLORA]
                        Whether or not to use the rank stabilization scaling factor for LoRA layer. (default: False)
  --use_dora [USE_DORA]
                        Whether or not to use the weight-decomposed lora method (DoRA). (default: False)
  --create_new_adapter [CREATE_NEW_ADAPTER]
                        Whether or not to create a new adapter with randomly initialized weight. (default: False)
  --name_module_trainable NAME_MODULE_TRAINABLE
                        Name of trainable modules for partial-parameter (freeze) fine-tuning. Use commas to separate multiple modules. Use "all" to specify all the available modules. LLaMA choices: ["mlp",
                        "self_attn"], BLOOM & Falcon & ChatGLM choices: ["mlp", "self_attention"], Qwen choices: ["mlp", "attn"], InternLM2 choices: ["feed_forward", "attention"], Others choices: the same as
                        LLaMA. (default: all)
  --num_layer_trainable NUM_LAYER_TRAINABLE
                        The number of trainable layers for partial-parameter (freeze) fine-tuning. (default: 2)
  --pure_bf16 [PURE_BF16]
                        Whether or not to train model in purely bf16 precision (without AMP). (default: False)
  --stage {pt,sft,rm,ppo,dpo}
                        Which stage will be performed in training. (default: sft)
  --finetuning_type {lora,freeze,full}
                        Which fine-tuning method to use. (default: lora)
  --use_llama_pro [USE_LLAMA_PRO]
                        Whether or not to make only the parameters in the expanded blocks trainable. (default: False)
  --plot_loss [PLOT_LOSS]
                        Whether or not to save the training loss curves. (default: False)
  --do_sample [DO_SAMPLE]
                        Whether or not to use sampling, use greedy decoding otherwise. (default: True)
  --no_do_sample        Whether or not to use sampling, use greedy decoding otherwise. (default: False)
  --temperature TEMPERATURE
                        The value used to modulate the next token probabilities. (default: 0.95)
  --top_p TOP_P         The smallest set of most probable tokens with probabilities that add up to top_p or higher are kept. (default: 0.7)
  --top_k TOP_K         The number of highest probability vocabulary tokens to keep for top-k filtering. (default: 50)
  --num_beams NUM_BEAMS
                        Number of beams for beam search. 1 means no beam search. (default: 1)
  --max_length MAX_LENGTH
                        The maximum length the generated tokens can have. It can be overridden by max_new_tokens. (default: 512)
  --max_new_tokens MAX_NEW_TOKENS
                        The maximum numbers of tokens to generate, ignoring the number of tokens in the prompt. (default: 512)
  --repetition_penalty REPETITION_PENALTY
                        The parameter for repetition penalty. 1.0 means no penalty. (default: 1.0)
  --length_penalty LENGTH_PENALTY
                        Exponential penalty to the length that is used with beam-based generation. (default: 1.0)

Calling the API

# Old interface
# python src/api_demo.py --model_name_or_path Qwen/Qwen1.5-0.5B-Chat --template default --finetuning_type lora --checkpoint_dir saves/Qwen1.5-0.5B-Chat/lora/test

python src/api_demo.py --model_name_or_path Qwen/Qwen1.5-0.5B-Chat --template default --finetuning_type lora --adapter_name_or_path saves/Qwen1.5-0.5B-Chat/lora/test

Open http://localhost:8000/docs
Send an API request:

{
  "model": "string",
  "messages": [
    {
      "role": "user",
      "content": "你是谁"
    }
  ],
  "tools": [],
  "do_sample": true,
  "temperature": 0,
  "top_p": 0,
  "n": 1,
  "max_tokens": 0,
  "stream": false
}
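The same request can be sent from a script. The sketch below uses only the standard library and assumes the server exposes an OpenAI-style route at `/v1/chat/completions`; confirm the actual path on the /docs page. The non-zero values for `temperature`, `top_p` and `max_tokens` are hypothetical choices, not values from this post, and the helper names are made up for illustration.

```python
import json
import urllib.request

# Assumed route; check http://localhost:8000/docs for the real one.
API_URL = "http://localhost:8000/v1/chat/completions"


def build_payload(prompt, temperature=0.7, top_p=0.9, max_tokens=256):
    """Build a request body with the same fields as the JSON shown above."""
    return {
        "model": "string",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [],
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
        "n": 1,
        "max_tokens": max_tokens,
        "stream": False,
    }


def chat(prompt, url=API_URL):
    """POST the payload and return the decoded JSON response as a dict."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from the previous command running, `chat("你是谁")` returns the parsed response; inspect the returned dict to see where the reply text is stored.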

Issues

How to run the model after exporting it

 python .\src\web_demo.py --model_name_or_path D:\LLaMA-Factory-main\qwentest01 --template qwen

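I would like the LoRA weight files to be saved in .bin format again. Newer versions save .safetensors, and a parameter to control this would help, because llama.cpp's QLoRA support needs the .bin format.
Solution
Add one more argument when training: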

--save_safetensors False
# python src/train_bash.py --stage sft --do_train True --model_name_or_path Qwen/Qwen1.5-0.5B-Chat --finetuning_type lora --template qwen --dataset_dir data  --dataset identity,alpaca_gpt4_zh --cutoff_len 1024 --learning_rate 0.0002 --num_train_epochs 5.0 --max_samples 500 --per_device_train_batch_size 4  --gradient_accumulation_steps 4 --lr_scheduler_type cosine --max_grad_norm 1.0 --logging_steps 5 --save_steps 100 --warmup_steps 0 --optim adamw_torch --output_dir saves\Qwen1.5-0.5B-Chat\lora\testbin --fp16 True --lora_rank 8 --lora_alpha 16 --lora_dropout 0.1 --lora_target all --use_dora True --plot_loss True --save_safetensors False

Python 3.10 + CUDA 12.1

pip freeze > requirements.txt

Result

accelerate==0.27.2
aiofiles==23.2.1
aiohttp==3.9.3
aiosignal==1.3.1
altair==5.3.0
annotated-types==0.6.0
anyio==4.3.0
async-timeout==4.0.3
attrs==23.2.0
bitsandbytes @ file:///D:/development/python/LLaMA-Factory-main/bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl#sha256=472e7874de3a9237866951267721753dd4133bb4447a047a99a40f7e96b77e49
certifi==2024.2.2
charset-normalizer==3.3.2
click==8.1.7
colorama==0.4.6
contourpy==1.2.1
cycler==0.12.1
datasets==2.18.0
dill==0.3.7
docstring_parser==0.16
exceptiongroup==1.2.0
fastapi==0.110.1
ffmpy==0.3.2
filelock==3.13.3
fonttools==4.50.0
frozenlist==1.4.1
fsspec==2024.2.0
gradio==3.50.2
gradio_client==0.6.1
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
huggingface-hub==0.22.2
idna==3.6
importlib_resources==6.4.0
jieba==0.42.1
Jinja2==3.1.3
joblib==1.3.2
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
kiwisolver==1.4.5
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.8.3
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.15
networkx==3.2.1
nltk==3.8.1
numpy==1.26.4
orjson==3.10.0
packaging==24.0
pandas==2.2.1
peft==0.10.0
pillow==10.3.0
protobuf==5.26.1
psutil==5.9.8
pyarrow==15.0.2
pyarrow-hotfix==0.6
pydantic==2.6.4
pydantic_core==2.16.3
pydub==0.25.1
Pygments==2.17.2
pyparsing==3.1.2
python-dateutil==2.9.0.post0
python-multipart==0.0.9
pytz==2024.1
PyYAML==6.0.1
referencing==0.34.0
regex==2023.12.25
requests==2.31.0
rich==13.7.1
rouge-chinese==1.0.3
rpds-py==0.18.0
safetensors==0.4.2
scipy==1.13.0
semantic-version==2.10.0
sentencepiece==0.2.0
shtab==1.7.1
six==1.16.0
sniffio==1.3.1
sse-starlette==2.0.0
starlette==0.37.2
sympy==1.12
tiktoken==0.6.0
tokenizers==0.15.2
toolz==0.12.1
torch==2.1.0+cu121
torchaudio==2.1.0+cu121
torchvision==0.16.0+cu121
tqdm==4.66.2
transformers==4.37.2
trl==0.8.1
typing_extensions==4.10.0
tyro==0.7.3
tzdata==2024.1
urllib3==2.2.1
uvicorn==0.29.0
websockets==11.0.3
xxhash==3.4.1
yarl==1.9.4

References
https://github.com/hiyouga/ChatGLM-Efficient-Tuning/tree/main
https://github.com/hiyouga/ChatGLM-Efficient-Tuning/blob/main/examples/alter_self_cognition.md
Fine-tuning: https://github.com/THUDM/ChatGLM-6B/tree/main/ptuning