The quick and dirty answer is that WordNet does this already:
attribute
derivationally related form
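The remaining question is how to do this programmatically (without web scraping).
Added:
The wn command-line wrapper for the WordNet library is quite powerful, and it exposes the breadth of the C library interface: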
$ wn happy
No information available for noun happy
No information available for verb happy
Information available for adj happy
-antsa Antonyms
-synsa Synonyms (ordered by estimated frequency)
-attra Attributes
-deria Derived Forms
-famla Familiarity & Polysemy Count
-grepa List of Compound Words
-over Overview of Senses
$ wn happy -deria -n1
Derived Forms of adj happy
Sense 1
happy (vs. unhappy)
RELATED TO->(noun) happiness#1
=> happiness,felicity
RELATED TO->(noun) happiness#2
=> happiness
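So, from Python, you could either run the wn command as a subprocess, which is a bit sloppy, or use the WordNet tools already built into NLTK.
On Ubuntu (and probably Debian), the WordNet library and tools are easy to install: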
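A rough sketch of the subprocess route, assuming the wn binary is on your PATH and that its output keeps the "RELATED TO->(pos) word#sense" layout shown above. The function names here are my own, not part of any library. If you go the NLTK route instead, the closest equivalent I know of is Lemma.derivationally_related_forms() in nltk.corpus.wordnet, which needs the WordNet corpus downloaded first.

```python
import re
import subprocess

# Matches lines such as "RELATED TO->(noun) happiness#1" in `wn <word> -deria` output.
RELATED_RE = re.compile(r"RELATED TO->\((\w+)\)\s+(\S+)#(\d+)")


def parse_derived_forms(wn_output):
    """Return (part_of_speech, word, sense_number) tuples found in wn output."""
    return [
        (pos, word, int(sense))
        for pos, word, sense in RELATED_RE.findall(wn_output)
    ]


def derived_forms(word, search_flag="-deria"):
    """Run the wn CLI (assumed to be installed) and parse its derived-forms report.

    The exit status is deliberately ignored in this sketch; only stdout is parsed.
    """
    result = subprocess.run(
        ["wn", word, search_flag], capture_output=True, text=True
    )
    return parse_derived_forms(result.stdout)
```

Calling derived_forms("happy") should then give the happiness entries from the transcript above, and an empty list for a word wn does not know.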
sudo apt-get install wordnet wordnet-dev
Alas:
$ wn pythonic
No information available for pythonic