Python NLTK: Using NLTK's FreqDist

Latest recommended article published on 2024-03-13 20:51:05

蛋堡学长


Article tags: python nltk Chinese

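Copyright notice: This is an original article by the blogger, released under the CC 4.0 BY-SA license. Reposts must include a link to the original source and this notice.

Article link: https://blog.csdn.net/weixin_42311349/article/details/113492976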


I am trying to use Python to get the frequency distribution of a set of documents. For some reason my code does not work, and it produces the following error:

    Traceback (most recent call last):
      File "C:\Documents and Settings\aschein\Desktop\freqdist", line 32, in <module>
        fd = FreqDist(corpus_text)
      File "C:\Python26\lib\site-packages\nltk\probability.py", line 104, in __init__
        self.update(samples)
      File "C:\Python26\lib\site-packages\nltk\probability.py", line 472, in update
        self.inc(sample, count=count)
      File "C:\Python26\lib\site-packages\nltk\probability.py", line 120, in inc
        self[sample] = self.get(sample,0) + count
    TypeError: unhashable type: 'list'

Can you help?

Here is the code so far:

    import os
    import nltk
    from nltk.probability import FreqDist

    #The stop-words list
    stopwords_doc = open("C:\\Documents and Settings\\aschein\\My Documents\\stopwords.txt").read()
    stopwords_list = stopwords_doc.split()
    stopwords = nltk.Text(stopwords_list)

    corpus = []

    #Directory of documents
    directory = "C:\\Documents and Settings\\aschein\\My Documents\\comments"
    listing = os.listdir(directory)

    #append all documents in directory into a single 'document' (list)
    for doc in listing:
        doc_name = "C:\\Documents and Settings\\aschein\\My Documents\\comments\\" + doc
        input = open(doc_name).read()
        input = input.split()
        corpus.append(input)

    #Turn list into Text form for NLTK
    corpus_text = nltk.Text(corpus)

    #Remove stop-words
    for w in corpus_text:
        if w in stopwords:
            corpus_text.remove(w)

    fd = FreqDist(corpus_text)
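Judging from the traceback, the likely cause is `corpus.append(input)`. Because `input` is already a list of words, `append` makes `corpus` a list of lists, and `FreqDist` then tries to use each inner list as a dictionary key, which fails because lists are unhashable. Using `extend` keeps `corpus` a flat list of words. Separately, removing items from a sequence while iterating over it tends to skip elements, so filtering into a new list is safer. The following is a minimal sketch of that approach, not a drop-in replacement: the helper names `read_tokens` and `corpus_freqdist` and the tiny temporary corpus are made up for illustration, and the `Counter` fallback relies on the assumption that you only need counting (in NLTK 3, `FreqDist` is a subclass of `collections.Counter`).

```python
import os
import tempfile

try:
    from nltk.probability import FreqDist
except ImportError:
    # Assumption: counting is all we need, so Counter (the base class of
    # NLTK 3's FreqDist) stands in when NLTK is not installed.
    from collections import Counter as FreqDist


def read_tokens(path):
    """Return the whitespace-separated tokens of one text file."""
    with open(path) as handle:
        return handle.read().split()


def corpus_freqdist(directory, stopwords_path):
    """Count the non-stop-word tokens across every file in `directory`."""
    stopwords = set(read_tokens(stopwords_path))
    corpus = []
    for name in sorted(os.listdir(directory)):
        # extend() adds the words themselves; append() would add the whole
        # list as one unhashable sample, which is what raised the TypeError.
        corpus.extend(read_tokens(os.path.join(directory, name)))
    # Build a new list instead of removing items while iterating.
    kept = [word for word in corpus if word not in stopwords]
    return FreqDist(kept)


# Tiny made-up corpus so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    docs = os.path.join(tmp, "comments")
    os.mkdir(docs)
    with open(os.path.join(docs, "a.txt"), "w") as f:
        f.write("the cat sat on the mat")
    with open(os.path.join(docs, "b.txt"), "w") as f:
        f.write("the cat ran")
    stop_path = os.path.join(tmp, "stopwords.txt")
    with open(stop_path, "w") as f:
        f.write("the on")
    fd = corpus_freqdist(docs, stop_path)

print(fd["cat"], fd["sat"], fd["the"])  # prints: 2 1 0
```

With the real data, the call would presumably be `corpus_freqdist(directory, <path to stopwords.txt>)` using the paths from the question. Wrapping the lists in `nltk.Text` should not be necessary here, since `FreqDist` accepts any iterable of hashable samples.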
