[nlp-002] Reading notes 01: "Natural Language Processing with Python"

#!/usr/bin/env python


"""
Reference book: "Natural Language Processing with Python"
Install nltk: pip install nltk
"""

import nltk

print(dir(nltk))

# In the window that opens, select "book" on the Collections tab and download it
nltk.download()
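
The downloader window is not available on a machine without a display. A minimal sketch of a non-interactive alternative, assuming the collection identifier is `book` (the entry listed on the Collections tab); the helper name `ensure_book_collection` is made up for illustration and is not part of nltk:

```python
import nltk


def ensure_book_collection(download_dir=None):
    """Fetch the "book" collection without opening the downloader window.

    download_dir=None lets nltk pick its default data directory.
    Returns True if nltk reports success, False otherwise.
    """
    # Passing an identifier to nltk.download() skips the interactive UI;
    # quiet=True suppresses the per-package progress messages.
    return nltk.download("book", download_dir=download_dir, quiet=True)


if __name__ == "__main__":
    ok = ensure_book_collection()
    print("book collection ready:", ok)
```

The download needs network access, so the boolean result is worth checking before running `from nltk.book import *`.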
#!/usr/bin/env python


"""
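nltk source code:
./versions/anaconda3-5.0.1/lib/python3.6/site-packages/nltk


book.py is fairly simple: 90 lines of code that import the texts from the individual corpora.

The type of text1 is nltk.text.Text. The Text class is defined in text.py.
moby = Text(nltk.corpus.gutenberg.words('melville-moby_dick.txt'))
melville-moby_dick.txt is a human-readable ASCII text file.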

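Word sense disambiguation: a word can have several senses, and the appropriate one is determined from context. For example, "by" has several senses:
a. The lost children were found by the searchers (agentive)
b. The lost children were found by the mountain (locative)
c. The lost children were found by the afternoon (temporal)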


Anaphora resolution: a pronoun can refer to different antecedents depending on context:
a. The thieves stole the paintings. They were subsequently sold.
b. The thieves stole the paintings. They were