The table below shows the current support in the library for each model: whether it has a Python ("slow") tokenizer, whether it has a "fast" tokenizer backed by the 🤗 Tokenizers library, and whether it is supported in PyTorch, TensorFlow, and/or Jax (via Flax). A short sketch of how to check this support in code follows the table.
Model | Tokenizer slow | Tokenizer fast | PyTorch support | TensorFlow support | Flax support |
---|:---:|:---:|:---:|:---:|:---:|
ALBERT | ✅ | ✅ | ✅ | ✅ | ✅ |
ALIGN | ❌ | ❌ | ✅ | ❌ | ❌ |
AltCLIP | ❌ | ❌ | ✅ | ❌ | ❌ |
Audio Spectrogram Transformer | ❌ | ❌ | ✅ | ❌ | ❌ |
BART | ✅ | ✅ | ✅ | ✅ | ✅ |
BEiT | ❌ | ❌ | ✅ | ❌ | ✅ |
BERT | ✅ | ✅ | ✅ | ✅ | ✅ |
Bert Generation | ✅ | ❌ | ✅ | ❌ | ❌ |
BigBird | ✅ | ✅ | ✅ | ❌ | ✅ |
BigBird-Pegasus | ❌ | ❌ | ✅ | ❌ | ❌ |
BioGpt | ✅ | ❌ | ✅ | ❌ | ❌ |
BiT | ❌ | ❌ | ✅ | ❌ | ❌ |
Blenderbot | ✅ | ✅ | ✅ | ✅ | ✅ |
BlenderbotSmall | ✅ | ✅ | ✅ | ✅ | ✅ |
BLIP | ❌ | ❌ | ✅ | ✅ | ❌ |
BLIP-2 | ❌ | ❌ | ✅ | ❌ | ❌ |
BLOOM | ❌ | ✅ | ✅ | ❌ | ❌ |
BridgeTower | ❌ | ❌ | ✅ | ❌ | ❌ |
CamemBERT | ✅ | ✅ | ✅ | ✅ | ❌ |
CANINE | ✅ | ❌ | ✅ | ❌ | ❌ |
Chinese-CLIP | ❌ | ❌ | ✅ | ❌ | ❌ |
CLAP | ❌ | ❌ | ✅ | ❌ | ❌ |
CLIP | ✅ | ✅ | ✅ | ✅ | ✅ |
CLIPSeg | ❌ | ❌ | ✅ | ❌ | ❌ |
CodeGen | ✅ | ✅ | ✅ | ❌ | ❌ |
Conditional DETR | ❌ | ❌ | ✅ | ❌ | ❌ |
ConvBERT | ✅ | ✅ | ✅ | ✅ | ❌ |
ConvNeXT | ❌ | ❌ | ✅ | ✅ | ❌ |
ConvNeXTV2 | ❌ | ❌ | ✅ | ❌ | ❌ |
CPM-Ant | ✅ | ❌ | ✅ | ❌ | ❌ |
CTRL | ✅ | ❌ | ✅ | ✅ | ❌ |
CvT | ❌ | ❌ | ✅ | ✅ | ❌ |
Data2VecAudio | ❌ | ❌ | ✅ | ❌ | ❌ |
Data2VecText | ❌ | ❌ | ✅ | ❌ | ❌ |
Data2VecVision | ❌ | ❌ | ✅ | ✅ | ❌ |
DeBERTa | ✅ | ✅ | ✅ | ✅ | ❌ |
DeBERTa-v2 | ✅ | ✅ | ✅ | ✅ | ❌ |
Decision Transformer | ❌ | ❌ | ✅ | ❌ | ❌ |
Deformable DETR | ❌ | ❌ | ✅ | ❌ | ❌ |
DeiT | ❌ | ❌ | ✅ | ✅ | ❌ |
DETA | ❌ | ❌ | ✅ | ❌ | ❌ |
DETR | ❌ | ❌ | ✅ | ❌ | ❌ |
DiNAT | ❌ | ❌ | ✅ | ❌ | ❌ |
DistilBERT | ✅ | ✅ | ✅ | ✅ | ✅ |
DonutSwin | ❌ | ❌ | ✅ | ❌ | ❌ |
DPR | ✅ | ✅ | ✅ | ✅ | ❌ |
DPT | ❌ | ❌ | ✅ | ❌ | ❌ |
EfficientFormer | ❌ | ❌ | ✅ | ❌ | ❌ |
EfficientNet | ❌ | ❌ | ✅ | ❌ | ❌ |
ELECTRA | ✅ | ✅ | ✅ | ✅ | ✅ |
Encoder decoder | ❌ | ❌ | ✅ | ✅ | ✅ |
ERNIE | ❌ | ❌ | ✅ | ❌ | ❌ |
ErnieM | ✅ | ❌ | ✅ | ❌ | ❌ |
ESM | ✅ | ❌ | ✅ | ✅ | ❌ |
FairSeq Machine-Translation | ✅ | ❌ | ✅ | ❌ | ❌ |
FlauBERT | ✅ | ❌ | ✅ | ✅ | ❌ |
FLAVA | ❌ | ❌ | ✅ | ❌ | ❌ |
FNet | ✅ | ✅ | ✅ | ❌ | ❌ |
Funnel Transformer | ✅ | ✅ | ✅ | ✅ | ❌ |
GIT | ❌ | ❌ | ✅ | ❌ | ❌ |
GLPN | ❌ | ❌ | ✅ | ❌ | ❌ |
GPT Neo | ❌ | ❌ | ✅ | ❌ | ✅ |
GPT NeoX | ❌ | ✅ | ✅ | ❌ | ❌ |
GPT NeoX Japanese | ✅ | ❌ | ✅ | ❌ | ❌ |
GPT-J | ❌ | ❌ | ✅ | ✅ | ✅ |
GPT-Sw3 | ✅ | ✅ | ✅ | ✅ | ✅ |
GPTBigCode | ❌ | ❌ | ✅ | ❌ | ❌ |
GPTSAN-japanese | ✅ | ❌ | ✅ | ❌ | ❌ |
Graphormer | ❌ | ❌ | ✅ | ❌ | ❌ |
GroupViT | ❌ | ❌ | ✅ | ✅ | ❌ |
Hubert | ❌ | ❌ | ✅ | ✅ | ❌ |
I-BERT | ❌ | ❌ | ✅ | ❌ | ❌ |
ImageGPT | ❌ | ❌ | ✅ | ❌ | ❌ |
Informer | ❌ | ❌ | ✅ | ❌ | ❌ |
Jukebox | ✅ | ❌ | ✅ | ❌ | ❌ |
LayoutLM | ✅ | ✅ | ✅ | ✅ | ❌ |
LayoutLMv2 | ✅ | ✅ | ✅ | ❌ | ❌ |
LayoutLMv3 | ✅ | ✅ | ✅ | ✅ | ❌ |
LED | ✅ | ✅ | ✅ | ✅ | ❌ |
LeViT | ❌ | ❌ | ✅ | ❌ | ❌ |
LiLT | ❌ | ❌ | ✅ | ❌ | ❌ |
LLaMA | ✅ | ✅ | ✅ | ❌ | ❌ |
Longformer | ✅ | ✅ | ✅ | ✅ | ❌ |
LongT5 | ❌ | ❌ | ✅ | ❌ | ✅ |
LUKE | ✅ | ❌ | ✅ | ❌ | ❌ |
LXMERT | ✅ | ✅ | ✅ | ✅ | ❌ |
M-CTC-T | ❌ | ❌ | ✅ | ❌ | ❌ |
M2M100 | ✅ | ❌ | ✅ | ❌ | ❌ |
Marian | ✅ | ❌ | ✅ | ✅ | ✅ |
MarkupLM | ✅ | ✅ | ✅ | ❌ | ❌ |
Mask2Former | ❌ | ❌ | ✅ | ❌ | ❌ |
MaskFormer | ❌ | ❌ | ✅ | ❌ | ❌ |
MaskFormerSwin | ❌ | ❌ | ❌ | ❌ | ❌ |
mBART | ✅ | ✅ | ✅ | ✅ | ✅ |
MEGA | ❌ | ❌ | ✅ | ❌ | ❌ |
Megatron-BERT | ❌ | ❌ | ✅ | ❌ | ❌ |
MGP-STR | ✅ | ❌ | ✅ | ❌ | ❌ |
MobileBERT | ✅ | ✅ | ✅ | ✅ | ❌ |
MobileNetV1 | ❌ | ❌ | ✅ | ❌ | ❌ |
MobileNetV2 | ❌ | ❌ | ✅ | ❌ | ❌ |
MobileViT | ❌ | ❌ | ✅ | ✅ | ❌ |
MPNet | ✅ | ✅ | ✅ | ✅ | ❌ |
MT5 | ✅ | ✅ | ✅ | ✅ | ✅ |
MVP | ✅ | ✅ | ✅ | ❌ | ❌ |
NAT | ❌ | ❌ | ✅ | ❌ | ❌ |
Nezha | ❌ | ❌ | ✅ | ❌ | ❌ |
NLLB-MOE | ❌ | ❌ | ✅ | ❌ | ❌ |
Nyströmformer | ❌ | ❌ | ✅ | ❌ | ❌ |
OneFormer | ❌ | ❌ | ✅ | ❌ | ❌ |
OpenAI GPT | ✅ | ✅ | ✅ | ✅ | ❌ |
OpenAI GPT-2 | ✅ | ✅ | ✅ | ✅ | ✅ |
OPT | ❌ | ❌ | ✅ | ✅ | ✅ |
OWL-ViT | ❌ | ❌ | ✅ | ❌ | ❌ |
Pegasus | ✅ | ✅ | ✅ | ✅ | ✅ |
PEGASUS-X | ❌ | ❌ | ✅ | ❌ | ❌ |
Perceiver | ✅ | ❌ | ✅ | ❌ | ❌ |
Pix2Struct | ❌ | ❌ | ✅ | ❌ | ❌ |
PLBart | ✅ | ❌ | ✅ | ❌ | ❌ |
PoolFormer | ❌ | ❌ | ✅ | ❌ | ❌ |
ProphetNet | ✅ | ❌ | ✅ | ❌ | ❌ |
QDQBert | ❌ | ❌ | ✅ | ❌ | ❌ |
RAG | ✅ | ❌ | ✅ | ✅ | ❌ |
REALM | ✅ | ✅ | ✅ | ❌ | ❌ |
Reformer | ✅ | ✅ | ✅ | ❌ | ❌ |
RegNet | ❌ | ❌ | ✅ | ✅ | ✅ |
RemBERT | ✅ | ✅ | ✅ | ✅ | ❌ |
ResNet | ❌ | ❌ | ✅ | ✅ | ✅ |
RetriBERT | ✅ | ✅ | ✅ | ❌ | ❌ |
RoBERTa | ✅ | ✅ | ✅ | ✅ | ✅ |
RoBERTa-PreLayerNorm | ❌ | ❌ | ✅ | ✅ | ✅ |
RoCBert | ✅ | ❌ | ✅ | ❌ | ❌ |
RoFormer | ✅ | ✅ | ✅ | ✅ | ✅ |
SegFormer | ❌ | ❌ | ✅ | ✅ | ❌ |
SEW | ❌ | ❌ | ✅ | ❌ | ❌ |
SEW-D | ❌ | ❌ | ✅ | ❌ | ❌ |
Speech Encoder decoder | ❌ | ❌ | ✅ | ❌ | ✅ |
Speech2Text | ✅ | ❌ | ✅ | ✅ | ❌ |
Speech2Text2 | ✅ | ❌ | ❌ | ❌ | ❌ |
SpeechT5 | ✅ | ❌ | ✅ | ❌ | ❌ |
Splinter | ✅ | ✅ | ✅ | ❌ | ❌ |
SqueezeBERT | ✅ | ✅ | ✅ | ❌ | ❌ |
Swin Transformer | ❌ | ❌ | ✅ | ✅ | ❌ |
Swin Transformer V2 | ❌ | ❌ | ✅ | ❌ | ❌ |
Swin2SR | ❌ | ❌ | ✅ | ❌ | ❌ |
SwitchTransformers | ❌ | ❌ | ✅ | ❌ | ❌ |
T5 | ✅ | ✅ | ✅ | ✅ | ✅ |
Table Transformer | ❌ | ❌ | ✅ | ❌ | ❌ |
TAPAS | ✅ | ❌ | ✅ | ✅ | ❌ |
Time Series Transformer | ❌ | ❌ | ✅ | ❌ | ❌ |
TimeSformer | ❌ | ❌ | ✅ | ❌ | ❌ |
Trajectory Transformer | ❌ | ❌ | ✅ | ❌ | ❌ |
Transformer-XL | ✅ | ❌ | ✅ | ✅ | ❌ |
TrOCR | ❌ | ❌ | ✅ | ❌ | ❌ |
TVLT | ❌ | ❌ | ✅ | ❌ | ❌ |
UniSpeech | ❌ | ❌ | ✅ | ❌ | ❌ |
UniSpeechSat | ❌ | ❌ | ✅ | ❌ | ❌ |
UPerNet | ❌ | ❌ | ✅ | ❌ | ❌ |
VAN | ❌ | ❌ | ✅ | ❌ | ❌ |
VideoMAE | ❌ | ❌ | ✅ | ❌ | ❌ |
ViLT | ❌ | ❌ | ✅ | ❌ | ❌ |
Vision Encoder decoder | ❌ | ❌ | ✅ | ✅ | ✅ |
VisionTextDualEncoder | ❌ | ❌ | ✅ | ✅ | ✅ |
VisualBERT | ❌ | ❌ | ✅ | ❌ | ❌ |
ViT | ❌ | ❌ | ✅ | ✅ | ✅ |
ViT Hybrid | ❌ | ❌ | ✅ | ❌ | ❌ |
ViTMAE | ❌ | ❌ | ✅ | ✅ | ❌ |
ViTMSN | ❌ | ❌ | ✅ | ❌ | ❌ |
Wav2Vec2 | ✅ | ❌ | ✅ | ✅ | ✅ |
Wav2Vec2-Conformer | ❌ | ❌ | ✅ | ❌ | ❌ |
WavLM | ❌ | ❌ | ✅ | ❌ | ❌ |
Whisper | ✅ | ✅ | ✅ | ✅ | ✅ |
X-CLIP | ❌ | ❌ | ✅ | ❌ | ❌ |
X-MOD | ❌ | ❌ | ✅ | ❌ | ❌ |
XGLM | ✅ | ✅ | ✅ | ✅ | ✅ |
XLM | ✅ | ❌ | ✅ | ✅ | ❌ |
XLM-ProphetNet | ✅ | ❌ | ✅ | ❌ | ❌ |
XLM-RoBERTa | ✅ | ✅ | ✅ | ✅ | ✅ |
XLM-RoBERTa-XL | ❌ | ❌ | ✅ | ❌ | ❌ |
XLNet | ✅ | ✅ | ✅ | ✅ | ❌ |
YOLOS | ❌ | ❌ | ✅ | ❌ | ❌ |
YOSO | ❌ | ❌ | ✅ | ❌ | ❌ |
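The support listed above can also be checked programmatically. The snippet below is a minimal sketch, assuming 🤗 Transformers is installed together with PyTorch, TensorFlow and Flax and that the Hugging Face Hub is reachable; the `albert-base-v2` checkpoint is used purely as an example of a model with full support.

```python
from transformers import AutoModel, AutoTokenizer, FlaxAutoModel, TFAutoModel

# "Tokenizer fast": the Rust-backed implementation from 🤗 Tokenizers.
fast_tokenizer = AutoTokenizer.from_pretrained("albert-base-v2", use_fast=True)
print(fast_tokenizer.is_fast)  # True

# "Tokenizer slow": the pure-Python implementation.
slow_tokenizer = AutoTokenizer.from_pretrained("albert-base-v2", use_fast=False)
print(slow_tokenizer.is_fast)  # False

# Framework support maps to the different auto classes:
pt_model = AutoModel.from_pretrained("albert-base-v2")        # PyTorch
tf_model = TFAutoModel.from_pretrained("albert-base-v2")      # TensorFlow
flax_model = FlaxAutoModel.from_pretrained("albert-base-v2")  # Jax / Flax
```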
You have seen at a high level how Transformer models work, and the importance of transfer learning and fine-tuning has been discussed. A key aspect is that you can use the full architecture or only the encoder or the decoder, depending on the kind of task you want to solve. The table below summarizes this, and a short usage sketch follows it:
Model | Examples | Tasks |
---|---|---|
Encoder | ALBERT, BERT, DistilBERT, ELECTRA, RoBERTa | Sentence classification, named entity recognition, extractive question answering |
Decoder | CTRL, GPT, GPT-2, Transformer XL | Text generation |
Encoder-decoder | BART, T5, Marian, mBART | Summarization, translation, generative question answering |
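As an illustration of picking an architecture by task, the sketch below uses the `pipeline` API with one checkpoint from each row of the table. It assumes 🤗 Transformers with PyTorch installed and Hub access; the checkpoint names are illustrative choices, not the only options.

```python
from transformers import pipeline

# Encoder-only model (DistilBERT) for sentence classification.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("Transformers makes this workflow straightforward."))

# Decoder-only model (GPT-2) for text generation.
generator = pipeline("text-generation", model="gpt2")
print(generator("Transfer learning lets us", max_new_tokens=20))

# Encoder-decoder model (BART) for summarization.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
print(summarizer(
    "Transformer models come in encoder-only, decoder-only and "
    "encoder-decoder variants, each suited to different kinds of tasks."
))
```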