The sm/md/lg suffix indicates the size of the model (small, medium, large).
The models differ in accuracy and in loading time.
en_core_web_lg (788 MB) compared with en_core_web_sm (10 MB), lg vs sm:
LAS: 90.07% vs 89.66%
POS: 96.98% vs 96.78%
UAS: 91.83% vs 91.53%
NER F-score: 86.62% vs 85.86%
NER precision: 87.03% vs 86.33%
NER recall: 86.20% vs 85.39%
Every one of these gains is under one percentage point, yet en_core_web_lg is about 79 times larger and therefore loads much more slowly.
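To make the trade-off concrete, the figures quoted above can be turned into per-metric gains and a size ratio. This is only a small arithmetic sketch; the names SIZE_MB, SCORES and gains are made up for illustration, and the numbers are copied from the comparison above, not measured here.

```python
# Sizes and scores copied from the comparison above (lg first, sm second).
SIZE_MB = {"lg": 788, "sm": 10}
SCORES = {
    "LAS": (90.07, 89.66),
    "POS": (96.98, 96.78),
    "UAS": (91.83, 91.53),
    "NER F-score": (86.62, 85.86),
    "NER precision": (87.03, 86.33),
    "NER recall": (86.20, 85.39),
}


def gains(scores):
    """Return the lg-minus-sm gain per metric, in percentage points."""
    return {name: round(lg - sm, 2) for name, (lg, sm) in scores.items()}


SIZE_RATIO = SIZE_MB["lg"] / SIZE_MB["sm"]
GAINS = gains(SCORES)

if __name__ == "__main__":
    print(f"size ratio: {SIZE_RATIO:.1f}x")
    for name, gain in GAINS.items():
        print(f"{name}: +{gain:.2f} points")
```

The largest gain is NER recall at 0.81 points, against a size ratio of 78.8.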
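It is advisable to use en_core_web_sm during development and switch to a larger model in the application. Switching is easy: only the name of the model being loaded has to change.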
import spacy
nlp = spacy.load("en_core_web_lg")  # change the name here to switch models
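One possible way to make the switch without editing code is to read the model name from the environment. The sketch below assumes a hypothetical environment variable named SPACY_MODEL and hypothetical helpers pick_model_name and load_pipeline; none of these come from spaCy itself. Only spacy.load is a real spaCy call, and the chosen model package must already be installed.

```python
import os

# Small model by default, as suggested for development.
DEFAULT_MODEL = "en_core_web_sm"
KNOWN_MODELS = ("en_core_web_sm", "en_core_web_md", "en_core_web_lg")


def pick_model_name(env=None):
    """Return the model name from SPACY_MODEL (hypothetical variable), or the default."""
    env = os.environ if env is None else env
    name = env.get("SPACY_MODEL", DEFAULT_MODEL)
    if name not in KNOWN_MODELS:
        raise ValueError(f"unexpected model name: {name!r}")
    return name


def load_pipeline(env=None):
    """Load the selected pipeline; spaCy is imported lazily so name selection works without it."""
    import spacy

    return spacy.load(pick_model_name(env))
```

With this in place, development can run with the variable unset, while the deployed application sets SPACY_MODEL=en_core_web_lg.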
Reference: What is difference between en_core_web_sm, en_core_web_md and en_core_web_lg model of spacy?