[Solved] tokenizer.chat_template is not set and no template argument was passed

Latest recommended article published on 2024-09-24 08:48:12

小饼干超人

Category: Large Models. Tags: programming languages, language models

Article link: https://blog.csdn.net/m0_37586991/article/details/141269419

I ran into this problem while fine-tuning:

Error in applying chat template from request: Cannot use apply_chat_template() because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating

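Cause:
After merging the model, I copied only tokenizer.json into the merged model's folder and did not copy tokenizer_config.json as well. The chat template is normally stored in tokenizer_config.json (under the chat_template key), so the tokenizer loaded from the merged folder had no template set.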


    import shutil

    # Merge the model and save it
    merged_model = model.merge_and_unload()
    # Save the weights in safetensors format, with each shard no larger than 2048MB (about 2GB)
    merged_model.save_pretrained(config.merge_model_dir, max_shard_size="2048MB", safe_serialization=True)

    # Copy only tokenizer.json to the new folder (this is what caused the error)
    shutil.copy(f'{config.model_local_path}tokenizer.json', config.merge_model_dir)
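
To confirm this diagnosis before changing any code, it can help to inspect the merged folder directly. The following is a minimal sketch, not part of the original fix: the helper name has_chat_template is hypothetical, and it assumes the template lives either under the chat_template key of tokenizer_config.json or, as some newer transformers releases appear to do, in a standalone chat_template.jinja file.

```python
import json
from pathlib import Path


def has_chat_template(model_dir: str) -> bool:
    """Return True if model_dir appears to contain a chat template.

    Hypothetical helper: it only inspects files on disk and does not
    load the tokenizer itself.
    """
    root = Path(model_dir)

    # Assumption: some newer transformers versions store the template
    # in a standalone chat_template.jinja file.
    if (root / "chat_template.jinja").is_file():
        return True

    # Otherwise look for the "chat_template" key in tokenizer_config.json.
    config_file = root / "tokenizer_config.json"
    if not config_file.is_file():
        return False
    with config_file.open(encoding="utf-8") as f:
        config = json.load(f)
    return bool(config.get("chat_template"))
```

Calling has_chat_template(config.merge_model_dir) on a folder that holds only tokenizer.json returns False, which matches the situation described above.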

Solution:


    # Merge the model and save it
    merged_model = model.merge_and_unload()
    # Save the weights in safetensors format, with each shard no larger than 2048MB (about 2GB)
    merged_model.save_pretrained(config.merge_model_dir, max_shard_size="2048MB", safe_serialization=True)

    # shutil.copy(f'{config.model_local_path}tokenizer.json', config.merge_model_dir)

    # Save the tokenizer to merge_model_dir as well (this also writes tokenizer_config.json)
    tokenizer.save_pretrained(config.merge_model_dir)
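
If the tokenizer object is not available at merge time, an alternative is to copy every tokenizer-related file from the base model folder instead of only tokenizer.json. The sketch below is an assumption-laden illustration rather than the method used above: copy_tokenizer_files and TOKENIZER_FILES are hypothetical names, and the file list covers names commonly seen in Hugging Face model folders, so it may not be exhaustive for every model.

```python
import shutil
from pathlib import Path
from typing import List

# Assumed list of tokenizer-related file names; adjust for your model.
TOKENIZER_FILES = (
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "tokenizer.model",
    "vocab.json",
    "merges.txt",
    "added_tokens.json",
    "chat_template.jinja",
)


def copy_tokenizer_files(src_dir: str, dst_dir: str) -> List[str]:
    """Copy the tokenizer files that exist in src_dir into dst_dir.

    Returns the names of the files that were copied, in list order.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in TOKENIZER_FILES:
        candidate = src / name
        if candidate.is_file():
            shutil.copy(candidate, dst / name)
            copied.append(name)
    return copied
```

Using Path objects also avoids depending on a trailing slash in the source path, which the f-string concatenation in the original copy call relies on. When the tokenizer object is at hand, tokenizer.save_pretrained remains the simpler choice.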