GLM-4V-9B LoRA Fine-Tuning: Hands-On Log and Details

GLM-4 GitHub repository

THUDM/GLM-4: GLM-4 series: Open Multilingual Multimodal Chat LMs (github.com)

GLM-4V model weights

glm-4v-9b · ModelScope Model Hub (modelscope.cn)

Multimodal capability

GLM-4V-9B is a multimodal language model with visual understanding. Its results on the standard multimodal benchmarks are as follows:

| Model | MMBench-EN-Test | MMBench-CN-Test | SEEDBench_IMG | MMStar | MMMU | MME | HallusionBench | AI2D | OCRBench |
|---|---|---|---|---|---|---|---|---|---|
| (task) | English overall | Chinese overall | General | General | Academic | Perception & reasoning | Hallucination | Diagram understanding | OCR |
| GPT-4o, 20240513 | 83.4 | 82.1 | 77.1 | 63.9 | 69.2 | 2310.3 | 55 | 84.6 | 736 |
| GPT-4v, 20240409 | 81 | 80.2 | 73 | 56 | 61.7 | 2070.2 | 43.9 | 78.6 | 656 |
| GPT-4v, 20231106 | 77 | 74.4 | 72.3 | 49.7 | 53.8 | 1771.5 | 46.5 | 75.9 | 516 |
| InternVL-Chat-V1.5 | 82.3 | 80.7 | 75.2 | 57.1 | 46.8 | 2189.6 | 47.4 | 80.6 | 720 |
| LLaVA-Next-Yi-34B | 81.1 | 79 | 75.7 | 51.6 | 48.8 | 2050.2 | 34.8 | 78.9 | 574 |
| Step-1V | 80.7 | 79.9 | 70.3 | 50 | 49.9 | 2206.4 | 48.4 | 79.2 | 625 |
| MiniCPM-Llama3-V2.5 | 77.6 | 73.8 | 72.3 | 51.8 | 45.8 | 2024.6 | 42.4 | 78.4 | 725 |
| Qwen-VL-Max | 77.6 | 75.7 | 72.7 | 49.5 | 52 | 2281.7 | 41.2 | 75.7 | 684 |
| GeminiProVision | 73.6 | 74.3 | 70.7 | 38.6 | 49 | 2148.9 | 45.7 | 72.9 | 680 |
| Claude-3V Opus | 63.3 | 59.2 | 64 | 45.7 | 54.9 | 1586.8 | 37.8 | 70.6 | 694 |
| GLM-4v-9B | 81.1 | 79.4 | 76.8 | 58.7 | 47.2 | 2163.8 | 46.6 | — | — |

Environment:

A800 80 GB
PyTorch 2.1.0
Python 3.10 (Ubuntu 22.04)
CUDA 12.1

Download the model

pip install modelscope
Run this snippet to download the weights:
# Download the model
from modelscope import snapshot_download
model_dir = snapshot_download('ZhipuAI/glm-4v-9b', cache_dir='/root/autodl-tmp/')
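As an optional sanity check, confirm where the snapshot landed; snapshot_download returns the local snapshot directory, and the expected path below assumes the cache_dir used above:

import os

# The downloaded snapshot directory, e.g. /root/autodl-tmp/ZhipuAI/glm-4v-9b
print(model_dir)
# It should contain config.json, the tokenizer files and the *.safetensors weight shards
print(sorted(os.listdir(model_dir)))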

Clone the repository

git clone https://github.com/THUDM/GLM-4.git

In the cloned repo, install the packages from requirements.txt under basic_demo; the Tsinghua PyPI mirror makes this much faster:

pip install -r autodl-tmp/GLM-4/basic_demo/requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

Next, install the dependencies under finetune_demo. At this step, change the peft entry in its requirements.txt to version 0.12.0:

pip install -r autodl-tmp/GLM-4/finetune_demo/requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
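Alternatively, instead of editing requirements.txt, you can pin peft directly after the install (assuming 0.12.0 is the release the demo expects):

pip install peft==0.12.0 -i https://pypi.tuna.tsinghua.edu.cn/simple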

Build the dataset in JSONL format (only JSONL is supported). Every sample needs an image, and the image path must be absolute. Example lines (a small validation sketch follows the examples):

{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0000.jpg"},{"role":"assistant","content":"图片中有一棵圣诞树站在人群中。"},{"role":"user","content":"图片中的圣诞树是什么颜色的?"},{"role":"assistant","content":"圣诞树是绿色的,上面装饰有各种颜色的装饰品。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0001.jpg"},{"role":"assistant","content":"图片中有一个白色的马桶和一个挂在墙上的淋浴喷头,还有几卷卫生纸挂在毛巾架上。"},{"role":"user","content":"图片中的马桶和淋浴喷头是什么颜色的?"},{"role":"assistant","content":"马桶和淋浴喷头都是白色的。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0002.jpg"},{"role":"assistant","content":"图片中有两个人坐在一头大象的背上。"},{"role":"user","content":"图片中的大象是什么颜色的?"},{"role":"assistant","content":"大象是灰色的。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0003.jpg"},{"role":"assistant","content":"图片中有一位年长的女性庆祝她的90岁生日,旁边有一个生日蛋糕。"},{"role":"user","content":"图片中的蛋糕是什么颜色的?"},{"role":"assistant","content":"蛋糕是白色的,可能还有其他装饰颜色。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0004.jpg"},{"role":"assistant","content":"图片中有一群人在高空中放风筝。"},{"role":"user","content":"图片中的风筝是什么颜色的?"},{"role":"assistant","content":"风筝的颜色多样,主要包括红色、蓝色和黄色等。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0005.jpg"},{"role":"assistant","content":"图片中是一个黑暗的公寓客厅,只有阳台上的节日灯光在照亮。"},{"role":"user","content":"图片中的灯光是什么颜色的?"},{"role":"assistant","content":"节日灯光是多彩的,包含了红色、绿色、黄色等颜色。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0006.jpg"},{"role":"assistant","content":"图片中有一堆香蕉,旁边有一张孩子的照片。"},{"role":"user","content":"图片中的香蕉是什么颜色的?"},{"role":"assistant","content":"香蕉是黄色的。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0007.jpg"},{"role":"assistant","content":"图片中有一个人把脚搭在桌子上,旁边有披萨和一台电脑。"},{"role":"user","content":"图片中的披萨是什么颜色的?"},{"role":"assistant","content":"披萨的颜色是棕色和金黄色,顶部有一些配料如奶酪和番茄酱。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0008.jpg"},{"role":"assistant","content":"图片中有一辆红色的越野摩托车停在支架上。"},{"role":"user","content":"图片中的摩托车是什么颜色的?"},{"role":"assistant","content":"摩托车是红色的。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0009.jpg"},{"role":"assistant","content":"图片中有一个穿橙色T恤的年轻男子正在准备击打网球。"},{"role":"user","content":"图片中的网球是什么颜色的?"},{"role":"assistant","content":"网球是黄色的。"}]}
{"messages": [{"role": "user", "content": "图片中有什么?", "image":"/root/GLM-4/图片存放/0010.jpg"}, {"role": "assistant", "content": "图片中有一只红色的消防栓被雪覆盖和围绕。"}, {"role": "user", "content": "图片中的物体是什么颜色的?"}, {"role": "assistant", "content": "消防栓是红色的,雪是白色的。"}]}
{"messages": [{"role": "user", "content": "图片中有什么?", "image": "/root/GLM-4/图片存放/0011.jpg"}, {"role": "assistant", "content": "图片中有一辆装满家电和家具的旧卡车在街上行驶。"}, {"role": "user", "content": "图片中的物体是什么颜色的?"}, {"role": "assistant", "content": "卡车是红色的,街道是灰色的。"}]}
{"messages": [{"role": "user", "content": "图片中有什么?", "image": "/root/GLM-4/图片存放/0012.jpg"}, {"role": "assistant", "content": "图片中有一名戴眼镜的男子,靠着拳头支撑着头部。"}, {"role": "user", "content": "图片中的物体是什么颜色的?"}, {"role": "assistant", "content": "男子穿着蓝色的衬衫,背景是模糊的,主要颜色是灰色的。"}]}
{"messages": [{"role": "user", "content": "图片中有什么?", "image": "/root/GLM-4/图片存放/0013.jpg"}, {"role": "assistant", "content": "图片中有两个男人站在田野里。"}, {"role": "user", "content": "图片中的物体是什么颜色的?"}, {"role": "assistant", "content": "两个人穿着深色的衣服,背景是绿色的草地。"}]}

Run the fine-tuning command. Hyperparameters are edited in configs/lora.yaml; the positional arguments are the data path, the model path and the config path (an abridged config excerpt follows the command):

python finetune_vision.py  data/  /root/autodl-tmp/ZhipuAI/glm-4v-9b configs/lora.yaml 
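For orientation, the LoRA-related part of configs/lora.yaml looks roughly like the excerpt below. This is an abridged sketch, not the shipped file; treat every key and value here as an assumption and check the real file in your checkout:

data_config:
  train_file: train.jsonl
  val_file: dev.jsonl
  test_file: dev.jsonl
training_args:
  output_dir: ./output
  max_steps: 300                  # this run logged 301 optimization steps
  learning_rate: 5e-4             # matches the lr schedule in the log below
  per_device_train_batch_size: 1
peft_config:
  peft_type: LORA
  task_type: CAUSAL_LM
  r: 8
  lora_alpha: 32
  lora_dropout: 0.1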

During fine-tuning, GPU memory usage fluctuates roughly between 28 GB and 56 GB; with the official lora.yaml defaults it is around 75 GB.

Training log (300 steps took about 15 minutes):

root@autodl-container-e94b4883e6-1f32b64b:~/autodl-tmp/GLM-4/finetune_demo# python finetune_vision.py  data/  /root/autodl-tmp/ZhipuAI/glm-4v-9b configs/lora.yaml 
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:01<00:00,  8.37it/s]
trainable params: 6,397,952 || all params: 13,912,718,848 || trainable%: 0.0460
train_dataset: Dataset({
    features: ['input_ids', 'attention_mask', 'position_ids', 'labels', 'images'],
    num_rows: 10
})
val_dataset: Dataset({
    features: ['input_ids', 'attention_mask', 'position_ids', 'output_ids', 'images'],
    num_rows: 12
})
test_dataset: Dataset({
    features: ['input_ids', 'attention_mask', 'position_ids', 'output_ids', 'images'],
    num_rows: 12
})
Detected kernel version 4.19.90, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
max_steps is given, it will override any value given in num_train_epochs
[2024-08-21 13:30:56,630] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
 [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.
 [WARNING]  async_io: please install the libaio-dev package with apt
 [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
 [WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
 [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.4
 [WARNING]  using untested triton version (3.0.0), only 1.0.0 is known to be compatible
/root/miniconda3/lib/python3.10/site-packages/deepspeed/runtime/zero/linear.py:49: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  def forward(ctx, input, weight, bias=None):
/root/miniconda3/lib/python3.10/site-packages/deepspeed/runtime/zero/linear.py:67: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
  def backward(ctx, grad_output):
***** Running training *****
  Num examples = 10
  Num Epochs = 31
  Instantaneous batch size per device = 1
  Total train batch size (w. parallel, distributed & accumulation) = 1
  Gradient Accumulation steps = 1
  Total optimization steps = 301
  Number of trainable parameters = 6,397,952
  0%|                                                                                                                                                   | 0/301 [00:00<?, ?it/s]/root/miniconda3/lib/python3.10/site-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead.
  with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context:  # type: ignore[attr-defined]
{'loss': 3.5938, 'grad_norm': 10.733323097229004, 'learning_rate': 0.0004833887043189369, 'epoch': 1.0}                                                                         
{'loss': 1.8109, 'grad_norm': 4.4661383628845215, 'learning_rate': 0.0004667774086378738, 'epoch': 2.0}                                                                         
{'loss': 1.1641, 'grad_norm': 7.350773811340332, 'learning_rate': 0.0004501661129568106, 'epoch': 3.0}                                                                          
{'loss': 0.8066, 'grad_norm': 8.626984596252441, 'learning_rate': 0.0004335548172757475, 'epoch': 4.0}                                                                          
{'loss': 0.5965, 'grad_norm': 1.2147183418273926, 'learning_rate': 0.0004169435215946844, 'epoch': 5.0}                                                                         
{'loss': 0.575, 'grad_norm': 6.869624137878418, 'learning_rate': 0.0004003322259136213, 'epoch': 6.0}                                                                           
{'loss': 0.6049, 'grad_norm': 5.884566783905029, 'learning_rate': 0.0003837209302325582, 'epoch': 7.0}                                                                          
{'loss': 0.5574, 'grad_norm': 0.1620490849018097, 'learning_rate': 0.000367109634551495, 'epoch': 8.0}                                                                          
{'loss': 0.5186, 'grad_norm': 1.8908841609954834, 'learning_rate': 0.0003504983388704319, 'epoch': 9.0}                                                                         
{'loss': 0.5385, 'grad_norm': 0.09366925805807114, 'learning_rate': 0.0003338870431893688, 'epoch': 10.0}                                                                       
 33%|█████████████████████████████████████████████▌                                                                                           | 100/301 [04:56<09:00,  2.69s/it]Saving model checkpoint to ./output/checkpoint-100
loading configuration file /root/autodl-tmp/ZhipuAI/glm-4v-9b/config.json
Model config ChatGLMConfig {
  "_name_or_path": "THUDM/glm-4v-9b",
  "add_bias_linear": false,
  "add_qkv_bias": true,
  "apply_query_key_layer_scaling": true,
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "ChatGLMModel"
  ],
  "attention_dropout": 0.0,
  "attention_softmax_in_fp32": true,
  "auto_map": {
    "AutoConfig": "configuration_chatglm.ChatGLMConfig",
    "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSequenceClassification": "modeling_chatglm.ChatGLMForSequenceClassification"
  },
  "bias_dropout_fusion": true,
  "boi_token_id": 151339,
  "classifier_dropout": null,
  "eoi_token_id": 151340,
  "eos_token_id": [
    151329,
    151336,
    151338
  ],
  "ffn_hidden_size": 13696,
  "fp32_residual_connection": false,
  "hidden_dropout": 0.0,
  "hidden_size": 4096,
  "kv_channels": 128,
  "layernorm_epsilon": 1.5625e-07,
  "model_type": "chatglm",
  "multi_query_attention": true,
  "multi_query_group_num": 2,
  "num_attention_heads": 32,
  "num_layers": 40,
  "original_rope": true,
  "pad_token_id": 151329,
  "padded_vocab_size": 151552,
  "post_layer_norm": true,
  "pre_seq_len": null,
  "prefix_projection": false,
  "rmsnorm": true,
  "rope_ratio": 1,
  "seq_length": 8192,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.44.0",
  "use_cache": true,
  "vision_config": {
    "dropout_prob": 0.0,
    "hidden_act": "gelu",
    "hidden_size": 1792,
    "image_size": 1120,
    "in_channels": 3,
    "intermediate_size": 15360,
    "layer_norm_eps": 1e-06,
    "num_heads": 16,
    "num_hidden_layers": 63,
    "num_positions": 6401,
    "patch_size": 14,
    "scaling_factor": 8
  },
  "vocab_size": 151552
}
/root/miniconda3/lib/python3.10/site-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead.
  with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context:  # type: ignore[attr-defined]
{'loss': 0.5539, 'grad_norm': 0.14932133257389069, 'learning_rate': 0.0003172757475083057, 'epoch': 11.0}                                                                       
{'loss': 0.5088, 'grad_norm': 1.0388888120651245, 'learning_rate': 0.0003006644518272426, 'epoch': 12.0}                                                                        
{'loss': 0.5055, 'grad_norm': 0.0503326915204525, 'learning_rate': 0.00028405315614617936, 'epoch': 13.0}                                                                       
{'loss': 0.5043, 'grad_norm': 0.04505591839551926, 'learning_rate': 0.00026744186046511625, 'epoch': 14.0}                                                                      
{'loss': 0.5035, 'grad_norm': 0.04017101600766182, 'learning_rate': 0.00025083056478405314, 'epoch': 15.0}                                                                      
{'loss': 0.5025, 'grad_norm': 0.0547008141875267, 'learning_rate': 0.00023421926910299006, 'epoch': 16.0}                                                                       
{'loss': 0.5025, 'grad_norm': 0.024539534002542496, 'learning_rate': 0.00021760797342192692, 'epoch': 17.0}                                                                     
{'loss': 0.5018, 'grad_norm': 0.027166176587343216, 'learning_rate': 0.0002009966777408638, 'epoch': 18.0}                                                                      
{'loss': 0.5016, 'grad_norm': 0.025752652436494827, 'learning_rate': 0.00018438538205980064, 'epoch': 19.0}                                                                     
{'loss': 0.5014, 'grad_norm': 0.021403193473815918, 'learning_rate': 0.00016777408637873753, 'epoch': 20.0}                                                                     
 66%|███████████████████████████████████████████████████████████████████████████████████████████                                              | 200/301 [09:48<04:33,  2.71s/it]Saving model checkpoint to ./output/checkpoint-200
loading configuration file /root/autodl-tmp/ZhipuAI/glm-4v-9b/config.json
Model config ChatGLMConfig { ... } [identical to the config dump above, omitted]
/root/miniconda3/lib/python3.10/site-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead.
  with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context:  # type: ignore[attr-defined]
{'loss': 0.5016, 'grad_norm': 0.053727854043245316, 'learning_rate': 0.00015116279069767442, 'epoch': 21.0}                                                                     
{'loss': 0.5014, 'grad_norm': 0.02131590060889721, 'learning_rate': 0.00013455149501661129, 'epoch': 22.0}                                                                      
{'loss': 0.5012, 'grad_norm': 0.038959991186857224, 'learning_rate': 0.00011794019933554818, 'epoch': 23.0}                                                                     
{'loss': 0.5012, 'grad_norm': 0.03352357819676399, 'learning_rate': 0.00010132890365448505, 'epoch': 24.0}                                                                      
{'loss': 0.5008, 'grad_norm': 0.037203721702098846, 'learning_rate': 8.471760797342193e-05, 'epoch': 25.0}                                                                      
{'loss': 0.5006, 'grad_norm': 0.01759534515440464, 'learning_rate': 6.81063122923588e-05, 'epoch': 26.0}                                                                        
{'loss': 0.501, 'grad_norm': 0.019959311932325363, 'learning_rate': 5.149501661129568e-05, 'epoch': 27.0}                                                                       
{'loss': 0.5004, 'grad_norm': 0.017147695645689964, 'learning_rate': 3.4883720930232556e-05, 'epoch': 28.0}                                                                     
{'loss': 0.5006, 'grad_norm': 0.01920023001730442, 'learning_rate': 1.827242524916944e-05, 'epoch': 29.0}                                                                       
{'loss': 0.5006, 'grad_norm': 0.024577999487519264, 'learning_rate': 1.6611295681063123e-06, 'epoch': 30.0}                                                                     
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌| 300/301 [14:41<00:02,  2.71s/it]Saving model checkpoint to ./output/checkpoint-300
loading configuration file /root/autodl-tmp/ZhipuAI/glm-4v-9b/config.json
Model config ChatGLMConfig { ... } [identical to the config dump above, omitted]
/root/miniconda3/lib/python3.10/site-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead.
  with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context:  # type: ignore[attr-defined]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 301/301 [14:46<00:00,  3.51s/it]Saving model checkpoint to ./output/checkpoint-301
loading configuration file /root/autodl-tmp/ZhipuAI/glm-4v-9b/config.json
Model config ChatGLMConfig { ... } [identical to the config dump above, omitted]
Training completed. Do not forget to share your model on huggingface.co/models =)
{'train_runtime': 887.0465, 'train_samples_per_second': 0.339, 'train_steps_per_second': 0.339, 'train_loss': 0.6946830876245847, 'epoch': 30.1}                                
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 301/301 [14:47<00:00,  2.95s/it]
***** Running Prediction *****
  Num examples = 12
  Batch size = 4
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:44<00:00, 15.75s/it]Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.683 seconds.
Prefix dict has been built successfully.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:45<00:00, 15.28s/it]

Inference test

python inference.py output/checkpoint-301

Inference uses about 21 GB of GPU memory.
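inference.py already wraps this; if you want to query the LoRA checkpoint by hand, here is a minimal sketch modeled on the official glm-4v-9b usage example. Paths and the test image are placeholders, and if the tokenizer was not saved alongside the checkpoint, load it from the base model directory instead:

import torch
from PIL import Image
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

CKPT = "output/checkpoint-301"  # LoRA checkpoint produced above
device = "cuda"

# The finetune script saves the tokenizer with the checkpoint
tokenizer = AutoTokenizer.from_pretrained(CKPT, trust_remote_code=True)
model = AutoPeftModelForCausalLM.from_pretrained(
    CKPT, torch_dtype=torch.bfloat16, trust_remote_code=True
).to(device).eval()

image = Image.open("/root/GLM-4/图片存放/0000.jpg").convert("RGB")  # any test image
messages = [{"role": "user", "image": image, "content": "图片中有什么?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_tensors="pt", return_dict=True,
).to(device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256)
    outputs = outputs[:, inputs["input_ids"].shape[1]:]  # keep only the new tokens
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))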
