Python: Passing a variable into the Wordnet Synsets method in NLTK

I need to work on a project that requires NLTK, so I started learning Python two weeks ago, but I am struggling to understand both Python and NLTK.

From the NLTK documentation, I can understand the following code, and it works well if I manually put the words apple and pear into it.

from nltk.corpus import wordnet as wn

apple = wn.synset('apple.n.01')
pear = wn.synset('pear.n.01')

print(apple.lch_similarity(pear))

Output: 2.53897387106

However, I need to use NLTK to work with lists of items. For example, given the two lists below, I would like to compare each item in list1 with every item in list2: word1 from list1 with every word in list2, then word2 from list1 with every word in list2, and so on until all words in list1 have been compared.

list1 = ["apple", "honey", "drinks", "flowers", "paper"]
list2 = ["pear", "shell", "movie", "fire", "tree", "candle"]

wordFromList1 = list1[0]
wordFromList2 = list2[0]

wordFromList1 = wn.synset(wordFromList1)
wordFromList2 = wn.synset(wordFromList2)

print(wordFromList1.lch_similarity(wordFromList2))

The code above will, of course, raise an error. Can anyone show me how to pass a variable into the synset method [wn.synset(*pass_variable_in_here*)] so that I can use a double loop to get the lch_similarity values? Thank you.

Solution

wordnet.synset expects a 3-part name string of the form: word.pos.nn.

You did not specify the pos.nn part for each word in list1 and list2.
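To make the three-part name concrete, here is a minimal sketch that assembles it from its pieces. The helper name make_synset_name and its defaults (noun, first sense) are assumptions for illustration and are not part of NLTK; the resulting string is the kind of value wordnet.synset takes.

```python
def make_synset_name(word, pos="n", sense=1):
    """Build a 'word.pos.nn' string, e.g. 'apple.n.01'.

    Hypothetical helper (not an NLTK function). pos is a one-letter
    part-of-speech tag such as 'n' or 'v'; sense is the 1-based sense
    number, zero-padded to two digits.
    """
    return "{0}.{1}.{2:02d}".format(word, pos, sense)


print(make_synset_name("apple"))          # apple.n.01
print(make_synset_name("drink", "v", 2))  # drink.v.02
```

A well-formed name is not guaranteed to exist in WordNet, so the lookup itself can still fail.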

It seems reasonable to assume that all the words are nouns, so we could try appending the string '.n.01' to each string in list1 and list2:

for word1, word2 in IT.product(list1, list2):
    wordFromList1 = wordnet.synset(word1+'.n.01')
    wordFromList2 = wordnet.synset(word2+'.n.01')

That does not work, however. wordnet.synset('drinks.n.01') raises a WordNetError.

On the other hand, the same doc page shows you can look up similar words using the synsets method:

For example, wordnet.synsets('drinks') returns the list:

[Synset('drink.n.01'),
 Synset('drink.n.02'),
 Synset('beverage.n.01'),
 Synset('drink.n.04'),
 Synset('swallow.n.02'),
 Synset('drink.v.01'),
 Synset('drink.v.02'),
 Synset('toast.v.02'),
 Synset('drink_in.v.01'),
 Synset('drink.v.05')]
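The list above mixes noun and verb senses. If only nouns are wanted, the part-of-speech field of each name can serve as a filter. The sketch below works on plain name strings copied from the output above, so it runs without NLTK; split_synset_name and names_with_pos are hypothetical helpers, not NLTK functions. With NLTK itself, wordnet.synsets('drinks', pos=wordnet.NOUN) should return the noun senses directly.

```python
def split_synset_name(name):
    """Split 'word.pos.nn' into (word, pos, sense_number)."""
    # Split from the right so a word that itself contains dots stays whole.
    word, pos, sense = name.rsplit(".", 2)
    return word, pos, int(sense)


def names_with_pos(names, pos):
    """Keep only the names whose part-of-speech field equals pos."""
    return [n for n in names if split_synset_name(n)[1] == pos]


# Synset names taken from the wordnet.synsets('drinks') output shown above.
drinks = ["drink.n.01", "drink.n.02", "beverage.n.01", "drink.n.04",
          "swallow.n.02", "drink.v.01", "drink.v.02", "toast.v.02",
          "drink_in.v.01", "drink.v.05"]

print(names_with_pos(drinks, "n"))
```

Restricting both words to the same part of speech also matters for the comparison step, since, as far as I know, lch_similarity only compares synsets that share a part of speech.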

So at this point, you need to give some thought to what you want the program to do. If you are okay with just picking the first item in this list as a proxy for drinks, then you could use

for word1, word2 in IT.product(list1, list2):
    wordFromList1 = wordnet.synsets(word1)[0]
    wordFromList2 = wordnet.synsets(word2)[0]

which would result in a program that looks like this:

import nltk.corpus as corpus
import itertools as IT

wordnet = corpus.wordnet

list1 = ["apple", "honey", "drinks", "flowers", "paper"]
list2 = ["pear", "shell", "movie", "fire", "tree", "candle"]

# Compare every word in list1 with every word in list2.
for word1, word2 in IT.product(list1, list2):
    # print(word1, word2)
    # Use the first synset returned for each word as its representative sense.
    wordFromList1 = wordnet.synsets(word1)[0]
    wordFromList2 = wordnet.synsets(word2)[0]
    # name() is a method in NLTK 3 and later; older releases used the attribute .name
    print('{w1}, {w2}: {s}'.format(
        w1=wordFromList1.name(),
        w2=wordFromList2.name(),
        s=wordFromList1.lch_similarity(wordFromList2)))

which yields

apple.n.01, pear.n.01: 2.53897387106
apple.n.01, shell.n.01: 1.07263680226
apple.n.01, movie.n.01: 1.15267950994
apple.n.01, fire.n.01: 1.07263680226
...
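One caveat with indexing [0]: for a word that WordNet does not know, synsets returns an empty list and the index raises IndexError. A minimal sketch of a guarded version of the loop follows. The function pairwise_similarity and its lookup parameter are assumptions for illustration; lookup stands in for wordnet.synsets, and the objects it returns only need an lch_similarity method.

```python
import itertools


def pairwise_similarity(words1, words2, lookup):
    """Yield (word1, word2, score) for every pair with a known first synset.

    lookup is any callable that maps a word to a list of synset-like
    objects (for example wordnet.synsets). Pairs where either word has
    no synsets are skipped instead of raising IndexError.
    """
    for w1, w2 in itertools.product(words1, words2):
        s1, s2 = lookup(w1), lookup(w2)
        if not s1 or not s2:
            continue  # no synsets found for at least one of the words
        yield w1, w2, s1[0].lch_similarity(s2[0])
```

With NLTK installed, this could be called as pairwise_similarity(list1, list2, wordnet.synsets) and the results printed in a for loop.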
