Python: Passing a variable into the WordNet synsets method in NLTK


weixin_39626089


Article link: https://blog.csdn.net/weixin_39626089/article/details/111459437


I need to work on a project that requires NLTK, so I started learning Python two weeks ago, but I am struggling to understand both Python and NLTK.

From the NLTK documentation, I can understand the following code, and it works well if I manually put the words apple and pear into it.

from nltk.corpus import wordnet as wn

apple = wn.synset('apple.n.01')
pear = wn.synset('pear.n.01')
print(apple.lch_similarity(pear))

Output: 2.53897387106

However, I need to use NLTK with lists of items. For example, given the two lists below, I would like to compare every item in list1 with every item in list2: word1 from list1 against every word in list2, then word2 from list1 against every word in list2, and so on until all the words in list1 have been compared.

list1 = ["apple", "honey", "drinks", "flowers", "paper"]
list2 = ["pear", "shell", "movie", "fire", "tree", "candle"]

wordFromList1 = list1[0]
wordFromList2 = list2[0]

wordFromList1 = wn.synset(wordFromList1)
wordFromList2 = wn.synset(wordFromList2)
print(wordFromList1.lch_similarity(wordFromList2))

The code above will of course give an error. Can anyone show me how to pass a variable into the synset method [wn.synset(*pass_variable_in_here*)] so that I can use a double loop to get the lch_similarity values for each pair? Thank you.

Solution

wordnet.synset expects a 3-part name string of the form word.pos.nn. You did not specify the pos.nn part for each word in list1 and list2.
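To make the three parts concrete, here is a minimal sketch of building such a name from a variable. The helper make_synset_name is hypothetical (it is not part of NLTK), and the defaults of noun and first sense are assumptions chosen only for illustration.

```python
def make_synset_name(word, pos="n", sense=1):
    """Build a WordNet-style 'word.pos.nn' name, e.g. 'apple.n.01'.

    Hypothetical helper for illustration; not an NLTK function.
    """
    # {2:02d} zero-pads the sense number to two digits.
    return "{0}.{1}.{2:02d}".format(word, pos, sense)


names = [make_synset_name(w) for w in ["apple", "pear"]]
print(names)  # ['apple.n.01', 'pear.n.01']
```

The resulting string can then be passed along, for example wn.synset(make_synset_name(word)), although the lookup only succeeds if WordNet actually has a synset under that exact name.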

It seems reasonable to assume that all the words are nouns, so we could try appending the string '.n.01' to each string in list1 and list2:

for word1, word2 in IT.product(list1, list2):
    wordFromList1 = wordnet.synset(word1+'.n.01')
    wordFromList2 = wordnet.synset(word2+'.n.01')

That does not work, however. wordnet.synset('drinks.n.01') raises a WordNetError.

On the other hand, the same doc page shows that you can look up similar words using the synsets method.

For example, wordnet.synsets('drinks') returns the list:

[Synset('drink.n.01'),
 Synset('drink.n.02'),
 Synset('beverage.n.01'),
 Synset('drink.n.04'),
 Synset('swallow.n.02'),
 Synset('drink.v.01'),
 Synset('drink.v.02'),
 Synset('toast.v.02'),
 Synset('drink_in.v.01'),
 Synset('drink.v.05')]

So at this point, you need to give some thought to what you want the program to do. If you are okay with just picking the first item in this list as a proxy for drinks, then you could use

for word1, word2 in IT.product(list1, list2):
    wordFromList1 = wordnet.synsets(word1)[0]
    wordFromList2 = wordnet.synsets(word2)[0]

which would result in a program that looks like this:

import itertools as IT
import nltk.corpus as corpus

wordnet = corpus.wordnet

list1 = ["apple", "honey", "drinks", "flowers", "paper"]
list2 = ["pear", "shell", "movie", "fire", "tree", "candle"]

# Compare every word in list1 with every word in list2.
for word1, word2 in IT.product(list1, list2):
    # print(word1, word2)
    # Use the first synset WordNet returns as a proxy for each word.
    wordFromList1 = wordnet.synsets(word1)[0]
    wordFromList2 = wordnet.synsets(word2)[0]
    # name() is a method in NLTK 3; older releases exposed it as an attribute.
    print('{w1}, {w2}: {s}'.format(
        w1=wordFromList1.name(),
        w2=wordFromList2.name(),
        s=wordFromList1.lch_similarity(wordFromList2)))

which yields

apple.n.01, pear.n.01: 2.53897387106

apple.n.01, shell.n.01: 1.07263680226

apple.n.01, movie.n.01: 1.15267950994

apple.n.01, fire.n.01: 1.07263680226

...
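One caveat with indexing [0] is that wordnet.synsets returns an empty list for a word it does not know, so the lookup raises an IndexError. The following is a rough sketch of the same pairwise loop with that case skipped. The function pairwise_scores and its lookup and similarity parameters are hypothetical names introduced here, and the demo uses stand-in data rather than WordNet so that it runs without NLTK installed.

```python
import itertools


def pairwise_scores(words1, words2, lookup, similarity):
    """Score every (w1, w2) pair using the first sense of each word.

    lookup(word) must return a list of senses (possibly empty);
    similarity(a, b) must return a score for two senses.
    Pairs where either word has no senses are skipped.
    """
    results = []
    for w1, w2 in itertools.product(words1, words2):
        senses1, senses2 = lookup(w1), lookup(w2)
        if not senses1 or not senses2:
            continue  # unknown word: indexing [0] would raise IndexError
        results.append((w1, w2, similarity(senses1[0], senses2[0])))
    return results


# Stand-in data (an assumption for the demo, not real WordNet output).
fake_senses = {"apple": ["apple.n.01"], "pear": ["pear.n.01", "pear.n.02"]}
demo = pairwise_scores(
    ["apple", "qwzx"], ["pear"],
    lookup=lambda w: fake_senses.get(w, []),
    similarity=lambda a, b: (a, b),
)
print(demo)  # [('apple', 'pear', ('apple.n.01', 'pear.n.01'))]
```

With NLTK, the same function could presumably be called as pairwise_scores(list1, list2, wordnet.synsets, lambda a, b: a.lch_similarity(b)). Note that lch_similarity requires both synsets to have the same part of speech, so a word whose first synset is a verb would still need separate handling.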
