Python Study Notes

yumyfc

Article link: https://blog.csdn.net/yumyfc/article/details/104185366

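import jieba

# Read the full text of the novel
with open("threekingdoms.txt", "r", encoding="utf-8") as f:
    txt = f.read()

# Frequent words that are not character names
excludes = {"将军", "却说", "荆州", "二人", "不可", "不能", "如此"}
words = jieba.lcut(txt)
counts = {}
for word in words:
    if len(word) == 1:
        continue
    elif word == "诸葛亮" or word == "孔明曰":
        rword = "孔明"
    elif word == "关公" or word == "云长":
        rword = "关羽"
    elif word == "玄德" or word == "玄德曰":
        rword = "刘备"
    elif word == "孟德" or word == "丞相曰":
        rword = "曹操"
    else:
        rword = word
    counts[rword] = counts.get(rword, 0) + 1
for word in excludes:
    counts.pop(word, None)  # unlike del, this does not raise KeyError if the word is absent
items = list(counts.items())
# Sort the (word, count) pairs by count, largest first
items.sort(key=lambda x: x[1], reverse=True)
# Notes on ordering lists:
#   .reverse() reverses the current order of the elements (it does not sort)
#   .sort() sorts in ascending order, in place (permanent)
#   .sort(reverse=True) sorts in descending order, in place (permanent)
#   sorted(seq) returns a new list in ascending order and leaves the original unchanged
#   sorted(seq, reverse=True) returns a new list in descending order and leaves the original unchanged
# range(n) produces the integers from 0 up to but not including n;
# list() turns it into a list, e.g. list(range(5)) is [0, 1, 2, 3, 4]
for i in range(10):
    word, count = items[i]
    print("{0:<10}{1:>5}".format(word, count))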
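To make the notes on `.sort()`, `sorted()`, `.reverse()` and `range()` concrete, here is a small sketch. The counts in it are made-up sample values, not results from the novel.

```python
# Made-up sample counts, for illustration only
counts = {"曹操": 5, "孔明": 8, "刘备": 7}
items = list(counts.items())

# sorted() returns a new list; items itself keeps its original order
by_count = sorted(items, key=lambda x: x[1], reverse=True)

# .sort() reorders the list in place (ascending by count here)
items_copy = list(items)
items_copy.sort(key=lambda x: x[1])

# .reverse() only flips the current order; it does not sort
items_copy.reverse()

# range(5) covers the integers 0 to 4
first_five = list(range(5))

print(by_count)
print(first_five)
```

Because `items_copy` was sorted ascending before being reversed, it ends up in the same order as `by_count`. Calling `.reverse()` on an unsorted list would not give a sorted result.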
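In Python, single-line comments start with #. Multi-line comments are usually written with three single quotes (''') or three double quotes ("""); strictly speaking these are string literals that are evaluated and discarded, not true comments.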
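A short sketch of both comment styles. The function `greet` is a hypothetical name used only for this illustration.

```python
# A single-line comment: everything after the # is ignored.

def greet(name):
    """A triple-quoted string placed first in a function becomes its docstring."""
    return "Hello, " + name

'''
A bare triple-quoted string anywhere else is just an expression that is
evaluated and thrown away, which is why it can be used like a block comment.
'''

message = greet("world")
print(message)
```

The docstring can be read back through `greet.__doc__`, which a `#` comment cannot.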
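The code above counts how many times each character appears in Romance of the Three Kingdoms. As you can see, it takes more than a few lines and is fairly involved. Programming is not a simple cure-all: the code has to be adjusted again and again for the specific problem being studied. The word segmenter picks out some "names that are not names" (frequent words such as 将军 or 却说), and it treats two or more names for the same person as different people, so the raw counts are wrong. Manual corrections are still needed: the excludes set removes the non-names, and the elif branches merge the aliases.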
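As the alias and exclusion lists grow, the chain of elif branches gets hard to maintain. One possible alternative, sketched here as an assumption and not as the method used above, keeps the manual corrections in a dictionary and counts with `collections.Counter`. The names `ALIASES`, `EXCLUDES`, `count_names` and the `sample` token list are hypothetical. The sketch works on an already segmented word list, so it runs without jieba or the novel's text file.

```python
from collections import Counter

# Manual corrections kept as data: alias -> canonical name
# (same pairs as the elif branches in the code above)
ALIASES = {
    "诸葛亮": "孔明", "孔明曰": "孔明",
    "关公": "关羽", "云长": "关羽",
    "玄德": "刘备", "玄德曰": "刘备",
    "孟德": "曹操", "丞相曰": "曹操",
}
# Frequent words that are not character names
EXCLUDES = {"将军", "却说", "荆州", "二人", "不可", "不能", "如此"}


def count_names(words, aliases=ALIASES, excludes=EXCLUDES):
    """Count words, merging aliases and skipping excluded or one-character words."""
    counts = Counter()
    for word in words:
        if len(word) == 1:
            continue
        name = aliases.get(word, word)  # fall back to the word itself
        if name in excludes:
            continue
        counts[name] += 1
    return counts


# Hypothetical token list standing in for jieba.lcut(txt)
sample = ["玄德", "曰", "孔明", "诸葛亮", "将军", "云长", "刘备", "玄德曰", "却说"]
result = count_names(sample)
top = result.most_common(2)
print(top)
```

Adding a newly discovered alias or non-name then means editing one dictionary or set entry instead of adding another branch.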
