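Hugging Face is blocked in China, but there are domestic mirrors (such as HF-Mirror) that still serve the model files. You can download the files by hand, but I find that tedious and want the download to happen automatically. Here is how.
Suppose I want to download a pretrained Arabic model. The code is: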
from transformers import AutoTokenizer, AutoModel
# Load the MARBERT model from Hugging Face
MARBERT_tokenizer = AutoTokenizer.from_pretrained("UBC-NLP/MARBERT")
MARBERT_model = AutoModel.from_pretrained("UBC-NLP/MARBERT")
But this raises the following error:
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like UBC-NLP/MARBERT is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
Solution (on Linux). In a terminal, run:
export HF_ENDPOINT=https://hf-mirror.com
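export HF_HUB_OFFLINE=True
(If that does not work, try the line below instead. True switches on offline mode, which blocks network requests, so nothing can be fetched from the mirror unless the model is already cached.)
export HF_HUB_OFFLINE=False  # works for me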
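Result: with HF_HUB_OFFLINE=False the model files are fetched through the mirror and the code above runs without the error.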
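If you would rather not touch the shell, the same two variables can probably be set from inside Python. The sketch below is only an illustration: the helper names configure_hf_mirror and load_marbert are my own (hypothetical, not part of transformers), and it assumes the variables are set before transformers or huggingface_hub is imported for the first time, since the library reads them at import time.

```python
import os


def configure_hf_mirror(endpoint: str = "https://hf-mirror.com") -> None:
    """Point the Hugging Face client at a mirror (hypothetical helper).

    Assumption: this runs before transformers / huggingface_hub is imported,
    because those libraries read the variables when they are first imported.
    """
    os.environ["HF_ENDPOINT"] = endpoint
    # Offline mode has to be off, otherwise nothing is fetched over the network.
    os.environ["HF_HUB_OFFLINE"] = "0"


def load_marbert(model_id: str = "UBC-NLP/MARBERT"):
    """Download (or load from cache) the tokenizer and model via the mirror."""
    configure_hf_mirror()
    # Imported here, after the environment is configured, on purpose.
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model
```

Calling tokenizer, model = load_marbert() at the top of a fresh script should then behave like the terminal version. If transformers was already imported earlier in the same process, the variables are set too late and the export commands are the safer route.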