Notes on a HuggingFace model-loading failure that cost me an entire day.
Background: I needed to load a pretrained CLIP-based model that is only hosted on HuggingFace.
First attempt: load it directly with the code from the project's example:
from open_clip import create_model_from_pretrained, get_tokenizer # works on open-clip-torch>=2.23.0, timm>=0.9.8
model, preprocess = create_model_from_pretrained('hf-hub:model_name')
tokenizer = get_tokenizer('hf-hub:model_name')
Sure enough, this raised a ConnectionError: the server cannot reach HuggingFace, so the model cannot be fetched over the network.
I then deployed clash on the server, but even with huggingface.co reachable via ping, the model still failed to load (unresolved).
So I set network loading aside for the moment and tried loading the model manually from local files.
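One thing worth checking when the proxy is running but Python still cannot connect (a guess at the cause, not a confirmed fix): ping uses ICMP, which clash does not proxy, so a successful ping says nothing about whether Python's HTTP traffic goes through the proxy. Python only uses the proxy if the proxy environment variables are set. A minimal sketch, assuming clash listens on its common default port 7890 (check your own clash config):

```python
import os

# Point Python's HTTP stack at the local clash proxy before any download
# is attempted. 127.0.0.1:7890 is clash's common default mixed port --
# an assumption; verify the port in your clash configuration.
os.environ["HTTP_PROXY"] = "http://127.0.0.1:7890"
os.environ["HTTPS_PROXY"] = "http://127.0.0.1:7890"
```

These variables must be set before the library opens its first connection, so put this at the very top of the script (or export them in the shell instead).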
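For manual loading, the files have to get onto the server somehow. One route is to download the whole repo on a machine that can reach huggingface.co and then copy the directory over. A sketch, assuming huggingface_hub is installed and using "org/model_name" as a placeholder repo id:

```python
def repo_id_from_openclip_name(name: str) -> str:
    # open_clip marks Hub repos with an 'hf-hub:' prefix; strip it to
    # get the plain repo id that huggingface_hub expects.
    return name.removeprefix("hf-hub:")

def download_repo(openclip_name: str, local_dir: str = "./model_download") -> str:
    # Fetch every file in the repo as plain files (easy to scp/rsync to
    # the offline server). Needs network access and huggingface_hub.
    from huggingface_hub import snapshot_download
    return snapshot_download(
        repo_id=repo_id_from_openclip_name(openclip_name),
        local_dir=local_dir,
    )

# download_repo("hf-hub:org/model_name")  # placeholder repo id
```

After copying the downloaded directory to the server, the local-path loading below can point at it.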
The loading function used in the project's example is create_model_from_pretrained.
Its definition (from the open_clip source):
def create_model_from_pretrained(
        model_name: str,
        pretrained: Optional[str] = None,
        precision: str = 'fp32',
        device: Union[str, torch.device] = 'cpu',
        jit: bool = False,
        force_quick_gelu: bool = False,
        force_custom_text: bool = False,
        force_image_size: Optional[Union[int, Tuple[int, int]]] = None,
        image_mean: Optional[Tuple[float, ...]] = None,
        image_std: Optional[Tuple[float, ...]] = None,
        image_interpolation: Optional[str] = None,
        image_resize_mode: Optional[str] = None,  # only effective for inference
        return_transform: bool = True,
        cache_dir: Optional[str] = None,
        **model_kwargs,
):
    force_preprocess_cfg = merge_preprocess_kwargs(
        {}, mean=image_mean, std=image_std, interpolation=image_interpolation, resize_mode=image_resize_mode)

    model = create_model(
        model_name,
        pretrained,
        precision=precision,
        device=device,
        jit=jit,
        force_quick_gelu=force_quick_gelu,
        force_custom_text=force_custom_text,
        force_image_size=force_image_size,
        force_preprocess_cfg=force_preprocess_cfg,
        cache_dir=cache_dir,
        require_pretrained=True,
        **model_kwargs,
    )

    if not return_transform:
        return model

    preprocess = image_transform_v2(
        PreprocessCfg(**model.visual.preprocess_cfg),
        is_train=False,
    )

    return model, preprocess
The original code loads the model by passing a model name, i.e. the first parameter model_name in the definition above, so a local path cannot be passed in directly.
Solution: AutoModel and AutoTokenizer
AutoModel and AutoTokenizer are two very powerful utility classes in the Hugging Face transformers library. They provide a quick, convenient way to load and use a large number of different pretrained models and their matching tokenizers.
AutoModel can load a model automatically from either a name or a local path, without specifying the concrete model class, which greatly simplifies working with pretrained models.
from transformers import AutoModel, AutoTokenizer
# Assuming the model and tokenizer have been saved under these paths
local_model_path = "path_to_your_modeldirectory"
local_tokenizer_path = "path_to_your_modeldirectory"
# Load the model and tokenizer from the local paths
model = AutoModel.from_pretrained(local_model_path)
tokenizer = AutoTokenizer.from_pretrained(local_tokenizer_path)
Attention!
The directory must contain these two files:
Model config file: config.json
Model weights file: pytorch_model.bin
They must be named exactly as above! (For AutoTokenizer, the tokenizer's own files, e.g. tokenizer_config.json and the vocabulary, also need to be in the directory.)
And with that, the model finally loads!