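import spacy

# Product title to analyze (kept verbatim, including its original spelling)
title = 'Personalized Leather Case, Monogram Engraved Phone Case for iPhone 14 Pro, Crossbody Phone Case with Card Solt for iPhone13 12 11 Max, MINI'

# Load the small English pipeline and parse the title
nlp = spacy.load('en_core_web_sm')
doc = nlp(title)

# Collect the text of every noun chunk that spaCy detects
noun_phrases = []
for chunk in doc.noun_chunks:
    noun_phrases.append(chunk.text)
print(noun_phrases)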
result: ['Personalized Leather Case', 'Monogram Engraved Phone Case', 'iPhone', 'Pro', 'Crossbody Phone Case', 'Card Solt', 'iPhone13', '12 11 Max', 'MINI']
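The language package is en_core_web_sm. Download the release that matches your spaCy version from https://github.com/explosion/spacy-models/releases?q=en_core_web_sm&expanded=true, then install the downloaded wheel, for example: pip install en_core_web_sm-3.7.1-py3-none-any.whl
The model repository is https://github.com/explosion/spacy-models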
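Picking the right wheel by hand is easy to get wrong, so here is a minimal sketch of the bookkeeping involved. The helper names (wheel_filename, release_url, matches_spacy) are hypothetical and not part of spaCy. The download URL layout is an assumption based on how the release assets are currently published, and the compatibility check assumes the convention that a model's major.minor version mirrors the spaCy version it targets.

```python
def wheel_filename(model: str, version: str) -> str:
    """Build the wheel filename, e.g. en_core_web_sm-3.7.1-py3-none-any.whl."""
    return f"{model}-{version}-py3-none-any.whl"


def release_url(model: str, version: str) -> str:
    """Assumed asset URL layout on the spacy-models releases page."""
    base = "https://github.com/explosion/spacy-models/releases/download"
    return f"{base}/{model}-{version}/{wheel_filename(model, version)}"


def matches_spacy(model_version: str, spacy_version: str) -> bool:
    """Assumption: a model fits a spaCy install when major.minor agree."""
    return model_version.split(".")[:2] == spacy_version.split(".")[:2]


if __name__ == "__main__":
    print(wheel_filename("en_core_web_sm", "3.7.1"))
    print(release_url("en_core_web_sm", "3.7.1"))
    print(matches_spacy("3.7.1", "3.7.4"))
```

To use it against a real install, pass spacy.__version__ as the second argument of matches_spacy before downloading the wheel.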
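The suffix of en_core_web_sm indicates the package size:
sm: no word vectors
md: reduced word vector table with 20k unique vectors for about 500k words
lg: large word vector table with about 500k entries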
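As a small illustration of how the package name encodes the size, the sketch below splits a name such as en_core_web_sm into its parts and looks up the note for its suffix. The names parse_model_name, ModelName and SIZE_NOTES are hypothetical helpers, and the sketch assumes the four-part lang_type_genre_size naming pattern.

```python
from typing import NamedTuple

# Size notes taken from the list above
SIZE_NOTES = {
    "sm": "no word vectors",
    "md": "reduced word vector table, 20k unique vectors for about 500k words",
    "lg": "large word vector table with about 500k entries",
}


class ModelName(NamedTuple):
    lang: str
    kind: str
    genre: str
    size: str


def parse_model_name(name: str) -> ModelName:
    """Split a name like en_core_web_sm into its four components."""
    parts = name.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected lang_type_genre_size, got {name!r}")
    return ModelName(*parts)


if __name__ == "__main__":
    parsed = parse_model_name("en_core_web_sm")
    print(parsed.size, "->", SIZE_NOTES.get(parsed.size, "not covered here"))
```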