Installation:
import nltk
nltk.download('wordnet')
Usage:
from nltk.corpus import wordnet as wn
1. Look up all the synsets that a word belongs to
>>> wn.synsets('dog')  # optionally pass pos, one of wn.NOUN, wn.VERB, wn.ADJ, wn.ADV, …, e.g. pos=wn.NOUN
[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]
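Each synset name has three parts, word.pos.nn: word is one of the lemmas in the synset, pos is its part of speech (n for noun, v for verb, and so on), and nn is a two-digit sense number. For example, dog.n.01 is the first noun sense of "dog", and frank.n.02 is the second noun sense of "frank".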
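The naming scheme can be illustrated with a small helper that splits a synset name into its parts. This is only a sketch: parse_synset_name is a hypothetical helper, not part of NLTK, and it assumes the name follows the word.pos.nn pattern.

```python
def parse_synset_name(name):
    """Split a name such as 'dog.n.01' into (word, pos, sense_number)."""
    # Split from the right so that a word containing dots stays intact.
    word, pos, sense = name.rsplit(".", 2)
    return word, pos, int(sense)


print(parse_synset_name("dog.n.01"))    # ('dog', 'n', 1)
print(parse_synset_name("chase.v.01"))  # ('chase', 'v', 1)
```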
2. List all the words (lemmas) in a synset
>>> wn.synset('dog.n.01').lemma_names()
['dog', 'domestic_dog', 'Canis_familiaris']
3. Look up the hyponyms (more specific terms) of a synset
>>> wn.synset('car.n.01').hyponyms()
[Synset('ambulance.n.01'), Synset('beach_wagon.n.01'), Synset('bus.n.04'), Synset('cab.n.03'), Synset('compact.n.03'), Synset('convertible.n.01'), Synset('coupe.n.01'), Synset('cruiser.n.01'), Synset('electric.n.01'), Synset('gas_guzzler.n.01'), Synset('hardtop.n.01'), Synset('hatchback.n.01'), Synset('horseless_carriage.n.01'), Synset('hot_rod.n.01'), Synset('jeep.n.01'), Synset('limousine.n.01'), Synset('loaner.n.02'), Synset('minicar.n.01'), Synset('minivan.n.01'), Synset('model_t.n.01'), Synset('pace_car.n.01'), Synset('racer.n.02'), Synset('roadster.n.01'), Synset('sedan.n.01'), Synset('sport_utility.n.01'), Synset('sports_car.n.01'), Synset('stanley_steamer.n.01'), Synset('stock_car.n.01'), Synset('subcompact.n.01'), Synset('touring_car.n.01'), Synset('used-car.n.01')]
4. Look up the hypernyms (more general terms) of a synset
>>> wn.synset('car.n.01').hypernyms()
[Synset('motor_vehicle.n.01')]
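Hyponyms and hypernyms are two directions of the same relation: since motor_vehicle.n.01 is a hypernym of car.n.01, car.n.01 is in turn a hyponym of motor_vehicle.n.01. The sketch below shows this on a tiny hand-written map that reuses a few names from the outputs above. The HYPERNYMS dictionary and the hyponyms_of function are illustrative assumptions and do not query WordNet.

```python
from collections import defaultdict

# Hand-written fragment: synset name -> names of its hypernyms.
HYPERNYMS = {
    "ambulance.n.01": ["car.n.01"],
    "sedan.n.01": ["car.n.01"],
    "car.n.01": ["motor_vehicle.n.01"],
}


def hyponyms_of(hypernyms):
    """Invert a hypernym map to obtain the matching hyponym map."""
    inverted = defaultdict(list)
    for child, parents in hypernyms.items():
        for parent in parents:
            inverted[parent].append(child)
    return dict(inverted)


HYPONYMS = hyponyms_of(HYPERNYMS)
print(HYPONYMS["car.n.01"])            # ['ambulance.n.01', 'sedan.n.01']
print(HYPONYMS["motor_vehicle.n.01"])  # ['car.n.01']
```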
5. Compute the semantic similarity between two synsets
>>> dog = wn.synset('dog.n.01')
>>> cat = wn.synset('cat.n.01')
>>> dog.path_similarity(cat)
0.2
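path_similarity returns a score between 0 and 1, derived from the shortest path that connects the two synsets in the hypernym/hyponym hierarchy. The higher the score, the more similar the two synsets are; a synset compared with itself scores 1.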
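To see where 0.2 comes from: the score is 1 / (d + 1), where d is the number of edges on the shortest path between the two synsets. The sketch below reproduces that arithmetic on a hand-copied fragment of the hierarchy around dog.n.01 and cat.n.01. The EDGES list and both function names are assumptions made for illustration; the code does not call NLTK, and it simplifies the search by treating the links as an undirected graph.

```python
from collections import deque

# Hand-copied hypernym links (child, parent); an illustrative fragment only.
EDGES = [
    ("dog.n.01", "canine.n.02"),
    ("canine.n.02", "carnivore.n.01"),
    ("cat.n.01", "feline.n.01"),
    ("feline.n.01", "carnivore.n.01"),
]


def shortest_distance(edges, start, goal):
    """Breadth-first search over the links, treated as undirected."""
    neighbours = {}
    for child, parent in edges:
        neighbours.setdefault(child, set()).add(parent)
        neighbours.setdefault(parent, set()).add(child)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # no connecting path in this fragment


def toy_path_similarity(edges, a, b):
    """Return 1 / (distance + 1), or None when the nodes are not connected."""
    dist = shortest_distance(edges, a, b)
    return None if dist is None else 1 / (dist + 1)


# dog -> canine -> carnivore <- feline <- cat is 4 edges, so 1 / (4 + 1).
print(toy_path_similarity(EDGES, "dog.n.01", "cat.n.01"))  # 0.2
print(toy_path_similarity(EDGES, "dog.n.01", "dog.n.01"))  # 1.0
```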