[WARNING|logging.py:329] 2025-03-04 21:16:20,919 >> Unsloth: Dropout = 0 is supported for fast patching. You are using dropout = 0.01.
Unsloth will patch all other layers, except LoRA matrices, causing a performance hit.
Unsloth: Offloading input_embeddings to disk to save VRAM
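
A note on the dropout warning above, which is unrelated to the crash: Unsloth only fast-patches the LoRA matrices when dropout is exactly 0, so `lora_dropout = 0.01` silently falls back to a slower path. If the 0.01 was not intentional, zeroing it in the LLaMA-Factory training config restores fast patching. A minimal sketch of the relevant keys in the YAML passed to `llamafactory-cli train` (assuming the standard LoRA + Unsloth setup):

    finetuning_type: lora
    use_unsloth: true
    lora_dropout: 0.0   # exactly 0 lets Unsloth fast-patch the LoRA matrices

The actual failure happens later, while Unsloth offloads the input embeddings: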
Traceback (most recent call last):
  File "/root/miniconda3/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/root/autodl-tmp/ai/LLaMA-Factory/src/llamafactory/cli.py", line 112, in main
    run_exp()
  File "/root/autodl-tmp/ai/LLaMA-Factory/src/llamafactory/train/tuner.py", line 93, in run_exp
    _training_function(config={"args": args, "callbacks": callbacks})
  File "/root/autodl-tmp/ai/LLaMA-Factory/src/llamafactory/train/tuner.py", line 67, in _training_function
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/root/autodl-tmp/ai/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 52, in run_sft
    model = load_model(tokenizer, model_args, finetuning_args, training_args.do_train)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/autodl-tmp/ai/LLaMA-Factory/src/llamafactory/model/loader.py", line 169, in load_model
    model = init_adapter(config, model, model_args, finetuning_args, is_trainable)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/autodl-tmp/ai/LLaMA-Factory/src/llamafactory/model/adapter.py", line 299, in init_adapter
    model = _setup_lora_tuning(
            ^^^^^^^^^^^^^^^^^^^
  File "/root/autodl-tmp/ai/LLaMA-Factory/src/llamafactory/model/adapter.py", line 235, in _setup_lora_tuning
    model = get_unsloth_peft_model(model, model_args, peft_kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/autodl-tmp/ai/LLaMA-Factory/src/llamafactory/model/model_utils/unsloth.py", line 79, in get_unsloth_peft_model
    return FastLanguageModel.get_peft_model(**peft_kwargs, **unsloth_peft_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/unsloth/models/llama.py", line 2377, in get_peft_model
    offload_input_embeddings(model, temporary_location)
  File "/root/miniconda3/lib/python3.12/site-packages/unsloth/models/_utils.py", line 767, in offload_input_embeddings
    offloaded_W = offload_to_disk(model.get_input_embeddings(), model, "input_embeddings", temporary_location)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/unsloth/models/_utils.py", line 760, in offload_to_disk
    offloaded_W = torch.load(filename, map_location = "cpu", mmap = True)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/torch/serialization.py", line 1470, in load
    raise pickle.UnpicklingError(_get_wo_message(str(e))) from None
_pickle.UnpicklingError: Weights only load failed. In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.
Please file an issue with the following so that we can make `weights_only=True` compatible with your use case: WeightsUnpickler error: Unsupported operand 149
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.
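
The unpickling failure is a PyTorch 2.6 behavior change rather than a corrupted file: as the message says, `torch.load` now defaults to `weights_only=True`, and the object Unsloth writes in `offload_to_disk` is not loadable under the restricted weights-only unpickler (the "WeightsUnpickler error: Unsupported operand" above). Because the offending file is one Unsloth itself just wrote to a temporary location, it is trusted, so the practical options are: upgrade unsloth to a release that handles PyTorch 2.6, pin `torch<2.6`, or restore the old `torch.load` default before training starts. A minimal sketch of the last option; the wrapper name is my own, not part of either library, and it must run in the same Python process before the model is loaded:

    import torch

    # PyTorch >= 2.6 defaults torch.load to weights_only=True. Restore the old
    # default only for callers that do not pass weights_only explicitly, such as
    # unsloth's offload_to_disk. Only safe when every file torch.load touches is
    # trusted: weights_only=False permits arbitrary code execution on unpickling.
    _original_torch_load = torch.load

    def _patched_torch_load(*args, **kwargs):
        kwargs.setdefault("weights_only", False)  # explicit callers are untouched
        return _original_torch_load(*args, **kwargs)

    torch.load = _patched_torch_load

Since `llamafactory-cli` is a console script, the patch has to be applied inside that process, e.g. via a `sitecustomize.py` on the PYTHONPATH, or by launching training from a small driver script that applies the patch and then calls `run_exp()` from `llamafactory.train.tuner` (the entry point visible in the traceback). Pinning `pip install "torch<2.6"`, or upgrading unsloth if a fixed release is available, avoids the monkeypatch entirely and is the cleaner long-term fix.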