Counting the words in a sentence with Python_ScienceNet: Counting word frequencies with Python - Lü Bo's blog post

import re
from collections import Counter

# Define a function that formats the result one tuple per line
def printByLine(tuples):
    return '\n'.join(' '.join(map(str, t)) for t in tuples)

# Define a function that sorts the counts alphabetically by word
def countsSortedAlphabetically(counter, **kw):
    return sorted(counter.items(), key=lambda item: item[0], **kw)

# Open the file, read it, and convert the text to lower case
with open("test.txt") as myfile:
    text = myfile.read().lower()

# Match words and save them in a list
words = re.findall(r"\w+", text)

# Count the words and keep the 10 most common as a list of (word, count) tuples
counter = Counter(words).most_common(10)

print(counter)
print()
print(printByLine(counter))
print()
print(printByLine(countsSortedAlphabetically(dict(counter))))

# write() requires a string, so the list is converted with str().
# printByLine() also returns a string, so its result could be written the same way.
with open("test_result.txt", "w") as f:
    f.write(str(counter))
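
Because printByLine returns an ordinary string, its output can also be written to a file opened in text mode. The following is a minimal sketch of that idea, assuming Python 3. The helper names `count_words` and `format_by_line`, the sample sentence, and the output path are hypothetical and not part of the original script.

```python
import os
import re
import tempfile
from collections import Counter


def count_words(text, top_n=10):
    """Return the top_n most common lowercase words as (word, count) tuples."""
    words = re.findall(r"\w+", text.lower())
    return Counter(words).most_common(top_n)


def format_by_line(pairs):
    """Join each (word, count) tuple with a space, one tuple per line."""
    return "\n".join(" ".join(map(str, pair)) for pair in pairs)


# Hypothetical sample sentence standing in for the contents of test.txt
sample = "The cat saw the dog. The dog ran."
top_words = count_words(sample, top_n=2)
report = format_by_line(top_words)

# Write the line-by-line report to a temporary file (hypothetical path)
out_path = os.path.join(tempfile.gettempdir(), "test_result_by_line.txt")
with open(out_path, "w", encoding="utf-8") as f:
    f.write(report)

print(report)
```

Opening the file with mode "w" rather than "wb" is what lets write() accept a str directly; in binary mode the text would first have to be encoded to bytes.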

To repost this article, please contact the original author for permission and state that it comes from Lü Bo's ScienceNet blog.

Link: http://blog.sciencenet.cn/blog-645111-1012675.html
