Large Model Fine-Tuning Errors, Part 2

Latest recommended article published on 2024-07-19 16:50:04

reesn

Category: Neural Networks. Tags: deep learning, artificial intelligence

Article link: https://blog.csdn.net/rstroller/article/details/136789129


Fine-tuning the model Qwen15-05B-Chat-GPTQ-Int4
Training uses the Qwen1.5 SFT script.
Command:

    python finetune.py --model_name_or_path /llm/Qwen15-05B-Chat-GPTQ-Int4 \
        --output_dir ./checkpoints \
        --model_max_length 512 \
        --data_path /data/agi/dataset/train_0.5M_CN/output600.jsonl \
        --use_lora True \
        --per_device_train_batch_size 1 \
        --q_lora True \
        --learning_rate 5e-4

Running it fails with:
ValueError: Found modules on cpu/disk. Using Exllama backend requires all the modules to be on GPU.You can deactivate exllama backend by setting disable_exllama=True in the quantization config object
Fix:
1) Modify finetune.py:

    model = AutoModelForCausalLM.from_pretrained(
        model_args.model_name_or_path,
        config=config,
        cache_dir=training_args.cache_dir,
        device_map=device_map,
        quantization_config=GPTQConfig(
            bits=4,
            disable_exllama=True,  # added: do not use the exllama backend
        ),
        # any remaining arguments stay unchanged
    )
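
The change above passes `disable_exllama=True`, which is the name the error message suggests. As far as I know, newer transformers releases prefer `use_exllama=False` and treat `disable_exllama` as deprecated, so the right keyword depends on the installed version. The sketch below is one way to pick the keyword by inspecting the config class; `exllama_off_kwargs` is a hypothetical helper of mine, not part of transformers or of finetune.py.

```python
import inspect


def exllama_off_kwargs(config_cls):
    """Return the keyword argument that turns the exllama backend off.

    Prefers `use_exllama=False` when the config class accepts it by name,
    and falls back to the older `disable_exllama=True` otherwise.
    (Hypothetical helper; the two keyword names are the ones mentioned
    in this post.)
    """
    params = inspect.signature(config_cls.__init__).parameters
    if "use_exllama" in params:
        return {"use_exllama": False}
    return {"disable_exllama": True}
```

In finetune.py this could be used as `GPTQConfig(bits=4, **exllama_off_kwargs(GPTQConfig))`, assuming `GPTQConfig` is imported from transformers as in the original script.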

If it still does not run correctly, edit config.json in the model directory:

    "quantization_config": {
        ……
        "use_exllama": false
    }
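
If you would rather not edit the file by hand, the same change can be scripted. This is a minimal sketch using only the standard library; the function name `disable_exllama_in_config` is my own, and it assumes config.json already contains a `quantization_config` section, as a GPTQ checkpoint's config should. Back up the file first.

```python
import json
from pathlib import Path


def disable_exllama_in_config(config_path):
    """Set quantization_config.use_exllama to false in a config.json file.

    Returns True if the file was rewritten, False if the value was
    already false. Raises KeyError if there is no quantization_config
    section (for example, the model is not quantized).
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    quant = config["quantization_config"]
    if quant.get("use_exllama") is False:
        return False
    quant["use_exllama"] = False
    text = json.dumps(config, indent=2, ensure_ascii=False) + "\n"
    path.write_text(text, encoding="utf-8")
    return True
```

For the model in this post the call would be `disable_exllama_in_config("/llm/Qwen15-05B-Chat-GPTQ-Int4/config.json")`. All other keys in the file are preserved, though the indentation is normalized to two spaces.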

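After this, training should run normally.
In my case I then hit another error:
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
The message says that this half-precision (float16) operation is not implemented on the CPU, so the model has to be cast to float32.
To fix this, modify finetune.py: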

    model = get_peft_model(model, lora_config)
    # Keep model.float() when launching with plain python and training on the CPU.
    # Comment it out when launching with deepspeed.
    model.float()

Training then runs on the CPU, which takes far longer than training on a GPU.
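
Instead of commenting the line in and out by hand, the cast could be made conditional. The sketch below is only an illustration under my own assumptions: `cuda_available` and `prepare_for_device` are hypothetical helpers, and treating "a CUDA device is available" as the signal for skipping the cast is my simplification of the python-versus-deepspeed distinction above.

```python
def cuda_available():
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def prepare_for_device(model, on_gpu):
    """Cast the model to float32 only when it will be trained on the CPU.

    `model` is anything with a `.float()` method that returns the model,
    such as a torch.nn.Module. On GPU the dtype is left untouched.
    """
    if on_gpu:
        return model
    return model.float()
```

In finetune.py the call would replace the bare `model.float()` line, for example `model = prepare_for_device(model, on_gpu=cuda_available())`.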