GLM-4V-9B LoRA Fine-Tuning: Hands-On Log and Details

GLM-4 GitHub repository

THUDM/GLM-4: GLM-4 series: Open Multilingual Multimodal Chat LMs (github.com)

GLM-4V model weights

glm-4v-9b · ModelScope Model Hub (modelscope.cn)

Multimodal capability

GLM-4V-9B is a multimodal language model with visual understanding. Its results on the standard multimodal benchmarks are as follows:

| Model | MMBench-EN-Test | MMBench-CN-Test | SEEDBench_IMG | MMStar | MMMU | MME | HallusionBench | AI2D | OCRBench |
|---|---|---|---|---|---|---|---|---|---|
| (task) | English overall | Chinese overall | General | General | Academic | Perception & reasoning | Hallucination | Diagram understanding | OCR |
| GPT-4o, 20240513 | 83.4 | 82.1 | 77.1 | 63.9 | 69.2 | 2310.3 | 55 | 84.6 | 736 |
| GPT-4v, 20240409 | 81 | 80.2 | 73 | 56 | 61.7 | 2070.2 | 43.9 | 78.6 | 656 |
| GPT-4v, 20231106 | 77 | 74.4 | 72.3 | 49.7 | 53.8 | 1771.5 | 46.5 | 75.9 | 516 |
| InternVL-Chat-V1.5 | 82.3 | 80.7 | 75.2 | 57.1 | 46.8 | 2189.6 | 47.4 | 80.6 | 720 |
| LLaVA-Next-Yi-34B | 81.1 | 79 | 75.7 | 51.6 | 48.8 | 2050.2 | 34.8 | 78.9 | 574 |
| Step-1V | 80.7 | 79.9 | 70.3 | 50 | 49.9 | 2206.4 | 48.4 | 79.2 | 625 |
| MiniCPM-Llama3-V2.5 | 77.6 | 73.8 | 72.3 | 51.8 | 45.8 | 2024.6 | 42.4 | 78.4 | 725 |
| Qwen-VL-Max | 77.6 | 75.7 | 72.7 | 49.5 | 52 | 2281.7 | 41.2 | 75.7 | 684 |
| GeminiProVision | 73.6 | 74.3 | 70.7 | 38.6 | 49 | 2148.9 | 45.7 | 72.9 | 680 |
| Claude-3V Opus | 63.3 | 59.2 | 64 | 45.7 | 54.9 | 1586.8 | 37.8 | 70.6 | 694 |
| GLM-4v-9B | 81.1 | 79.4 | 76.8 | 58.7 | 47.2 | 2163.8 | 46.6 | — | — |

Environment:

A800 80 GB
PyTorch 2.1.0
Python 3.10 (Ubuntu 22.04)
CUDA 12.1

Download the model

pip install modelscope
Run this snippet to download the weights:
# Download the model
from modelscope import snapshot_download
model_dir = snapshot_download('ZhipuAI/glm-4v-9b', cache_dir='/root/autodl-tmp/')
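As an optional sanity check, confirm where the snapshot landed; snapshot_download returns the local snapshot directory, and the expected path below assumes the cache_dir used above:

import os

# The downloaded snapshot directory, e.g. /root/autodl-tmp/ZhipuAI/glm-4v-9b
print(model_dir)
# It should contain config.json, the tokenizer files and the *.safetensors weight shards
print(sorted(os.listdir(model_dir)))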

Clone the repository

git clone https://github.com/THUDM/GLM-4.git

In the cloned repo, install the packages from requirements.txt under basic_demo; the Tsinghua PyPI mirror makes this much faster:

pip install -r autodl-tmp/GLM-4/basic_demo/requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

Next, install the dependencies under finetune_demo. At this step, change the peft entry in its requirements.txt to version 0.12.0:

pip install -r autodl-tmp/GLM-4/finetune_demo/requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
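Alternatively, instead of editing requirements.txt, you can pin peft directly after the install (assuming 0.12.0 is the release the demo expects):

pip install peft==0.12.0 -i https://pypi.tuna.tsinghua.edu.cn/simple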

Build the dataset in JSONL format (only JSONL is supported). Every sample needs an image, and the image path must be absolute. Example lines (a small validation sketch follows the examples):

{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0000.jpg"},{"role":"assistant","content":"图片中有一棵圣诞树站在人群中。"},{"role":"user","content":"图片中的圣诞树是什么颜色的?"},{"role":"assistant","content":"圣诞树是绿色的,上面装饰有各种颜色的装饰品。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0001.jpg"},{"role":"assistant","content":"图片中有一个白色的马桶和一个挂在墙上的淋浴喷头,还有几卷卫生纸挂在毛巾架上。"},{"role":"user","content":"图片中的马桶和淋浴喷头是什么颜色的?"},{"role":"assistant","content":"马桶和淋浴喷头都是白色的。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0002.jpg"},{"role":"assistant","content":"图片中有两个人坐在一头大象的背上。"},{"role":"user","content":"图片中的大象是什么颜色的?"},{"role":"assistant","content":"大象是灰色的。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0003.jpg"},{"role":"assistant","content":"图片中有一位年长的女性庆祝她的90岁生日,旁边有一个生日蛋糕。"},{"role":"user","content":"图片中的蛋糕是什么颜色的?"},{"role":"assistant","content":"蛋糕是白色的,可能还有其他装饰颜色。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0004.jpg"},{"role":"assistant","content":"图片中有一群人在高空中放风筝。"},{"role":"user","content":"图片中的风筝是什么颜色的?"},{"role":"assistant","content":"风筝的颜色多样,主要包括红色、蓝色和黄色等。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0005.jpg"},{"role":"assistant","content":"图片中是一个黑暗的公寓客厅,只有阳台上的节日灯光在照亮。"},{"role":"user","content":"图片中的灯光是什么颜色的?"},{"role":"assistant","content":"节日灯光是多彩的,包含了红色、绿色、黄色等颜色。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0006.jpg"},{"role":"assistant","content":"图片中有一堆香蕉,旁边有一张孩子的照片。"},{"role":"user","content":"图片中的香蕉是什么颜色的?"},{"role":"assistant","content":"香蕉是黄色的。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0007.jpg"},{"role":"assistant","content":"图片中有一个人把脚搭在桌子上,旁边有披萨和一台电脑。"},{"role":"user","content":"图片中的披萨是什么颜色的?"},{"role":"assistant","content":"披萨的颜色是棕色和金黄色,顶部有一些配料如奶酪和番茄酱。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0008.jpg"},{"role":"assistant","content":"图片中有一辆红色的越野摩托车停在支架上。"},{"role":"user","content":"图片中的摩托车是什么颜色的?"},{"role":"assistant","content":"摩托车是红色的。"}]}
{"messages":[{"role":"user","content":"图片中有什么?","image": "/root/GLM-4/图片存放/0009.jpg"},{"role":"assistant","content":"图片中有一个穿橙色T恤的年轻男子正在准备击打网球。"},{"role":"user","content":"图片中的网球是什么颜色的?"},{"role":"assistant","content":"网球是黄色的。"}]}
{"messages": [{"role": "user", "content": "图片中有什么?", "image":"/root/GLM-4/图片存放/0010.jpg"}, {"role": "assistant", "content": "图片中有一只红色的消防栓被雪覆盖和围绕。"}, {"role": "user", "content": "图片中的物体是什么颜色的?"}, {"role": "assistant", "content": "消防栓是红色的,雪是白色的。"}]}
{"messages": [{"role": "user", "content": "图片中有什么?", "image": "/root/GLM-4/图片存放/0011.jpg"}, {"role": "assistant", "content": "图片中有一辆装满家电和家具的旧卡车在街上行驶。"}, {"role": "user", "content": "图片中的物体是什么颜色的?"}, {"role": "assistant", "content": "卡车是红色的,街道是灰色的。"}]}
{"messages": [{"role": "user", "content": "图片中有什么?", "image": "/root/GLM-4/图片存放/0012.jpg"}, {"role": "assistant", "content": "图片中有一名戴眼镜的男子,靠着拳头支撑着头部。"}, {"role": "user", "content": "图片中的物体是什么颜色的?"}, {"role": "assistant", "content": "男子穿着蓝色的衬衫,背景是模糊的,主要颜色是灰色的。"}]}
{"messages": [{"role": "user", "content": "图片中有什么?", "image": "/root/GLM-4/图片存放/0013.jpg"}, {"role": "assistant", "content": "图片中有两个男人站在田野里。"}, {"role": "user", "content": "图片中的物体是什么颜色的?"}, {"role": "assistant", "content": "两个人穿着深色的衣服,背景是绿色的草地。"}]}

Run the fine-tuning command. Hyperparameters are edited in configs/lora.yaml; the positional arguments are the data path, the model path and the config path (an abridged config excerpt follows the command):

python finetune_vision.py  data/  /root/autodl-tmp/ZhipuAI/glm-4v-9b configs/lora.yaml 
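For orientation, the LoRA-related part of configs/lora.yaml looks roughly like the excerpt below. This is an abridged sketch, not the shipped file; treat every key and value here as an assumption and check the real file in your checkout:

data_config:
  train_file: train.jsonl
  val_file: dev.jsonl
  test_file: dev.jsonl
training_args:
  output_dir: ./output
  max_steps: 300                  # this run logged 301 optimization steps
  learning_rate: 5e-4             # matches the lr schedule in the log below
  per_device_train_batch_size: 1
peft_config:
  peft_type: LORA
  task_type: CAUSAL_LM
  r: 8
  lora_alpha: 32
  lora_dropout: 0.1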

During fine-tuning, GPU memory usage fluctuates roughly between 28 GB and 56 GB; with the official lora.yaml defaults it is around 75 GB.

Training log (300 steps took about 15 minutes):

root@autodl-container-e94b4883e6-1f32b64b:~/autodl-tmp/GLM-4/finetune_demo# python finetune_vision.py  data/  /root/autodl-tmp/ZhipuAI/glm-4v-9b configs/lora.yaml 
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:01<00:00,  8.37it/s]
trainable params: 6,397,952 || all params: 13,912,718,848 || trainable%: 0.0460
train_dataset: Dataset({
    features: ['input_ids', 'attention_mask', 'position_ids', 'labels', 'images'],
    num_rows: 10
})
val_dataset: Dataset({
    features: ['input_ids', 'attention_mask', 'position_ids', 'output_ids', 'images'],
    num_rows: 12
})
test_dataset: Dataset({
    features: ['input_ids', 'attention_mask', 'position_ids', 'output_ids', 'images'],
    num_rows: 12
})
Detected kernel version 4.19.90, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
max_steps is given, it will override any value given in num_train_epochs
[2024-08-21 13:30:56,630] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
 [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.
 [WARNING]  async_io: please install the libaio-dev package with apt
 [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
 [WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
 [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.4
 [WARNING]  using untested triton version (3.0.0), only 1.0.0 is known to be compatible
/root/miniconda3/lib/python3.10/site-packages/deepspeed/runtime/zero/linear.py:49: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  def forward(ctx, input, weight, bias=None):
/root/miniconda3/lib/python3.10/site-packages/deepspeed/runtime/zero/linear.py:67: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
  def backward(ctx, grad_output):
***** Running training *****
  Num examples = 10
  Num Epochs = 31
  Instantaneous batch size per device = 1
  Total train batch size (w. parallel, distributed & accumulation) = 1
  Gradient Accumulation steps = 1
  Total optimization steps = 301
  Number of trainable parameters = 6,397,952
  0%|                                                                                                                                                   | 0/301 [00:00<?, ?it/s]/root/miniconda3/lib/python3.10/site-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead.
  with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context:  # type: ignore[attr-defined]
{'loss': 3.5938, 'grad_norm': 10.733323097229004, 'learning_rate': 0.0004833887043189369, 'epoch': 1.0}                                                                         
{'loss': 1.8109, 'grad_norm': 4.4661383628845215, 'learning_rate': 0.0004667774086378738, 'epoch': 2.0}                                                                         
{'loss': 1.1641, 'grad_norm': 7.350773811340332, 'learning_rate': 0.0004501661129568106, 'epoch': 3.0}                                                                          
{'loss': 0.8066, 'grad_norm': 8.626984596252441, 'learning_rate': 0.0004335548172757475, 'epoch': 4.0}                                                                          
{'loss': 0.5965, 'grad_norm': 1.2147183418273926, 'learning_rate': 0.0004169435215946844, 'epoch': 5.0}                                                                         
{'loss': 0.575, 'grad_norm': 6.869624137878418, 'learning_rate': 0.0004003322259136213, 'epoch': 6.0}                                                                           
{'loss': 0.6049, 'grad_norm': 5.884566783905029, 'learning_rate': 0.0003837209302325582, 'epoch': 7.0}                                                                          
{'loss': 0.5574, 'grad_norm': 0.1620490849018097, 'learning_rate': 0.000367109634551495, 'epoch': 8.0}                                                                          
{'loss': 0.5186, 'grad_norm': 1.8908841609954834, 'learning_rate': 0.0003504983388704319, 'epoch': 9.0}                                                                         
{'loss': 0.5385, 'grad_norm': 0.09366925805807114, 'learning_rate': 0.0003338870431893688, 'epoch': 10.0}                                                                       
 33%|█████████████████████████████████████████████▌                                                                                           | 100/301 [04:56<09:00,  2.69s/it]Saving model checkpoint to ./output/checkpoint-100
loading configuration file /root/autodl-tmp/ZhipuAI/glm-4v-9b/config.json
Model config ChatGLMConfig {
  "_name_or_path": "THUDM/glm-4v-9b",
  "add_bias_linear": false,
  "add_qkv_bias": true,
  "apply_query_key_layer_scaling": true,
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "ChatGLMModel"
  ],
  "attention_dropout": 0.0,
  "attention_softmax_in_fp32": true,
  "auto_map": {
    "AutoConfig": "configuration_chatglm.ChatGLMConfig",
    "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSequenceClassification": "modeling_chatglm.ChatGLMForSequenceClassification"
  },
  "bias_dropout_fusion": true,
  "boi_token_id": 151339,
  "classifier_dropout": null,
  "eoi_token_id": 151340,
  "eos_token_id": [
    151329,
    151336,
    151338
  ],
  "ffn_hidden_size": 13696,
  "fp32_residual_connection": false,
  "hidden_dropout": 0.0,
  "hidden_size": 4096,
  "kv_channels": 128,
  "layernorm_epsilon": 1.5625e-07,
  "model_type": "chatglm",
  "multi_query_attention": true,
  "multi_query_group_num": 2,
  "num_attention_heads": 32,
  "num_layers": 40,
  "original_rope": true,
  "pad_token_id": 151329,
  "padded_vocab_size": 151552,
  "post_layer_norm": true,
  "pre_seq_len": null,
  "prefix_projection": false,
  "rmsnorm": true,
  "rope_ratio": 1,
  "seq_length": 8192,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.44.0",
  "use_cache": true,
  "vision_config": {
    "dropout_prob": 0.0,
    "hidden_act": "gelu",
    "hidden_size": 1792,
    "image_size": 1120,
    "in_channels": 3,
    "intermediate_size": 15360,
    "layer_norm_eps": 1e-06,
    "num_heads": 16,
    "num_hidden_layers": 63,
    "num_positions": 6401,
    "patch_size": 14,
    "scaling_factor": 8
  },
  "vocab_size": 151552
}
/root/miniconda3/lib/python3.10/site-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead.
  with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context:  # type: ignore[attr-defined]
{'loss': 0.5539, 'grad_norm': 0.14932133257389069, 'learning_rate': 0.0003172757475083057, 'epoch': 11.0}                                                                       
{'loss': 0.5088, 'grad_norm': 1.0388888120651245, 'learning_rate': 0.0003006644518272426, 'epoch': 12.0}                                                                        
{'loss': 0.5055, 'grad_norm': 0.0503326915204525, 'learning_rate': 0.00028405315614617936, 'epoch': 13.0}                                                                       
{'loss': 0.5043, 'grad_norm': 0.04505591839551926, 'learning_rate': 0.00026744186046511625, 'epoch': 14.0}                                                                      
{'loss': 0.5035, 'grad_norm': 0.04017101600766182, 'learning_rate': 0.00025083056478405314, 'epoch': 15.0}                                                                      
{'loss': 0.5025, 'grad_norm': 0.0547008141875267, 'learning_rate': 0.00023421926910299006, 'epoch': 16.0}                                                                       
{'loss': 0.5025, 'grad_norm': 0.024539534002542496, 'learning_rate': 0.00021760797342192692, 'epoch': 17.0}                                                                     
{'loss': 0.5018, 'grad_norm': 0.027166176587343216, 'learning_rate': 0.0002009966777408638, 'epoch': 18.0}                                                                      
{'loss': 0.5016, 'grad_norm': 0.025752652436494827, 'learning_rate': 0.00018438538205980064, 'epoch': 19.0}                                                                     
{'loss': 0.5014, 'grad_norm': 0.021403193473815918, 'learning_rate': 0.00016777408637873753, 'epoch': 20.0}                                                                     
 66%|███████████████████████████████████████████████████████████████████████████████████████████                                              | 200/301 [09:48<04:33,  2.71s/it]Saving model checkpoint to ./output/checkpoint-200
loading configuration file /root/autodl-tmp/ZhipuAI/glm-4v-9b/config.json
Model config ChatGLMConfig { ... } [identical to the config dump above, omitted]
/root/miniconda3/lib/python3.10/site-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead.
  with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context:  # type: ignore[attr-defined]
{'loss': 0.5016, 'grad_norm': 0.053727854043245316, 'learning_rate': 0.00015116279069767442, 'epoch': 21.0}                                                                     
{'loss': 0.5014, 'grad_norm': 0.02131590060889721, 'learning_rate': 0.00013455149501661129, 'epoch': 22.0}                                                                      
{'loss': 0.5012, 'grad_norm': 0.038959991186857224, 'learning_rate': 0.00011794019933554818, 'epoch': 23.0}                                                                     
{'loss': 0.5012, 'grad_norm': 0.03352357819676399, 'learning_rate': 0.00010132890365448505, 'epoch': 24.0}                                                                      
{'loss': 0.5008, 'grad_norm': 0.037203721702098846, 'learning_rate': 8.471760797342193e-05, 'epoch': 25.0}                                                                      
{'loss': 0.5006, 'grad_norm': 0.01759534515440464, 'learning_rate': 6.81063122923588e-05, 'epoch': 26.0}                                                                        
{'loss': 0.501, 'grad_norm': 0.019959311932325363, 'learning_rate': 5.149501661129568e-05, 'epoch': 27.0}                                                                       
{'loss': 0.5004, 'grad_norm': 0.017147695645689964, 'learning_rate': 3.4883720930232556e-05, 'epoch': 28.0}                                                                     
{'loss': 0.5006, 'grad_norm': 0.01920023001730442, 'learning_rate': 1.827242524916944e-05, 'epoch': 29.0}                                                                       
{'loss': 0.5006, 'grad_norm': 0.024577999487519264, 'learning_rate': 1.6611295681063123e-06, 'epoch': 30.0}                                                                     
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌| 300/301 [14:41<00:02,  2.71s/it]Saving model checkpoint to ./output/checkpoint-300
loading configuration file /root/autodl-tmp/ZhipuAI/glm-4v-9b/config.json
Model config ChatGLMConfig { ... } [identical to the config dump above, omitted]
/root/miniconda3/lib/python3.10/site-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead.
  with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context:  # type: ignore[attr-defined]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 301/301 [14:46<00:00,  3.51s/it]Saving model checkpoint to ./output/checkpoint-301
loading configuration file /root/autodl-tmp/ZhipuAI/glm-4v-9b/config.json
Model config ChatGLMConfig { ... } [identical to the config dump above, omitted]
Training completed. Do not forget to share your model on huggingface.co/models =)
{'train_runtime': 887.0465, 'train_samples_per_second': 0.339, 'train_steps_per_second': 0.339, 'train_loss': 0.6946830876245847, 'epoch': 30.1}                                
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 301/301 [14:47<00:00,  2.95s/it]
***** Running Prediction *****
  Num examples = 12
  Batch size = 4
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:44<00:00, 15.75s/it]Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.683 seconds.
Prefix dict has been built successfully.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:45<00:00, 15.28s/it]

Inference test

python inference.py output/checkpoint-301

Inference uses about 21 GB of GPU memory.
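inference.py already wraps this; if you want to query the LoRA checkpoint by hand, here is a minimal sketch modeled on the official glm-4v-9b usage example. Paths and the test image are placeholders, and if the tokenizer was not saved alongside the checkpoint, load it from the base model directory instead:

import torch
from PIL import Image
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

CKPT = "output/checkpoint-301"  # LoRA checkpoint produced above
device = "cuda"

# The finetune script saves the tokenizer with the checkpoint
tokenizer = AutoTokenizer.from_pretrained(CKPT, trust_remote_code=True)
model = AutoPeftModelForCausalLM.from_pretrained(
    CKPT, torch_dtype=torch.bfloat16, trust_remote_code=True
).to(device).eval()

image = Image.open("/root/GLM-4/图片存放/0000.jpg").convert("RGB")  # any test image
messages = [{"role": "user", "image": image, "content": "图片中有什么?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_tensors="pt", return_dict=True,
).to(device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256)
    outputs = outputs[:, inputs["input_ids"].shape[1]:]  # keep only the new tokens
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))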
