When downloading pretrained models with transformers, checkpoints such as bert-base-cased load through AutoTokenizer and AutoModel without much trouble. Downloading deberta-v3-base, however, can trigger a series of errors.
First, run:
from transformers import AutoTokenizer, AutoModel, AutoConfig
checkpoint = 'microsoft/deberta-v3-base'
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
This raises an error:
ValueError: Couldn't instantiate the backend tokenizer from one of:
(1) a `tokenizers` library serialization file,
(2) a slow tokenizer instance to convert or
(3) an equivalent slow tokenizer class to instantiate and convert.
You need to have sentencepiece installed to convert a slow tokenizer to a fast one.
The fix is to install the missing sentencepiece dependency:
pip install transformers sentencepiece
Loading the tokenizer again then raises another error:
ImportError:
DebertaV2Converter requires the protobuf library but it was not found in your environment.
The fix is again to install the missing dependency:
pip install protobuf
(If you still hit a protobuf-related error afterwards, some older transformers releases are known to need a 3.x build, e.g. protobuf<=3.20.) After this, tokenizer = AutoTokenizer.from_pretrained(checkpoint) loads without errors.
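Both failures above are optional dependencies that only surface at load time. A small stdlib helper can check for them before calling from_pretrained; this is just a sketch, and the function name and the pip-name mapping are my own, not part of transformers:

```python
import importlib.util

# import name -> pip package name, where the two differ
PIP_NAMES = {"google.protobuf": "protobuf"}

def missing_deps(packages):
    """Return the subset of import names that cannot be resolved."""
    missing = []
    for name in packages:
        try:
            if importlib.util.find_spec(name) is None:
                missing.append(name)
        except ModuleNotFoundError:
            # find_spec raises if a parent package (e.g. "google") is absent
            missing.append(name)
    return missing

# deberta-v3's fast tokenizer conversion needs both of these
needed = ["sentencepiece", "google.protobuf"]
missing = missing_deps(needed)
if missing:
    print("pip install " + " ".join(PIP_NAMES.get(m, m) for m in missing))
```

Running this before the first from_pretrained call turns the two separate ImportError/ValueError round-trips into a single pip install command.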