This problem is discussed in https://github.com/pytorch/text/issues/481; my workaround was to switch the spaCy language model to a smaller one.
import torch
from torchtext import data

# Use the small (sm) model for tokenizer_language; otherwise loading IMDB takes far too long!
TEXT = data.Field(tokenize = 'spacy',
                  tokenizer_language = 'en_core_web_sm',
                  include_lengths = True)
LABEL = data.LabelField(dtype = torch.float)
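As a minimal sketch of how these fields are then used to load IMDB (assuming a torchtext version where `data.Field` is still available; in torchtext 0.9+ these classes moved to `torchtext.legacy.data`). The dataset is downloaded and tokenized on the first call, which is the slow step that the small model speeds up:

```python
# Sketch, assuming torchtext < 0.9 (otherwise import from torchtext.legacy).
import torch
from torchtext import data, datasets

TEXT = data.Field(tokenize='spacy',
                  tokenizer_language='en_core_web_sm',
                  include_lengths=True)
LABEL = data.LabelField(dtype=torch.float)

# Downloads IMDB and tokenizes every review with the small spaCy pipeline;
# with a larger pipeline this step is dramatically slower.
train_data, test_data = datasets.IMDB.splits(TEXT, LABEL)
```

`include_lengths=True` makes each batch return `(padded_tensor, lengths)`, which is what you need later for `nn.utils.rnn.pack_padded_sequence`.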
en_core_web_sm is a Python package; for installation instructions see my blog post: https://blog.csdn.net/weixin_43390599/article/details/116887184
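For quick reference, the standard way to install it uses spaCy's own download command (this is spaCy's documented CLI, offered here as an alternative to the blog post above):

```shell
# Install spaCy, then download the small English pipeline used above.
pip install spacy
python -m spacy download en_core_web_sm
```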