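Background
《命运》 and 《寻梦》 are both science fiction works by the well-known science fiction writer 倪匡. Online versions of the two works are provided here, in the files “命运-网络版.txt” and “寻梦-网络版.txt”.
Problem 1 and Its Solution
Problem
Write a program that counts the characters appearing in each of the two texts. Each character and its count are separated by a colon (:). Save the 100 most frequent characters of each file to “命运-字符统计.txt” and “寻梦-字符统计.txt” respectively. The output files must be stored in CSV format; a reference format is shown below (note that newline characters are not counted):
命:90, 运:80, 寻:70, 梦:60 (remaining entries omitted)
Solution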
Overall program flowchart
Subroutine 1
Subroutine 2
Subroutine 3
Data flow diagram
Program code
# This listing processes 寻梦; the task also needs the same steps for 命运
# (change both file names below).
with open("寻梦-网络版.txt", 'r', encoding="UTF-8") as fi:
    lines = fi.readlines()

# Count every character in the text.
stat = {}
for line in lines:
    for elm in line:
        stat[elm] = stat.get(elm, 0) + 1
# Newlines are not counted; pop() avoids a KeyError if the file has none.
stat.pop('\n', None)

# Sort by count, most frequent first.
ls0 = list(stat.items())
ls0.sort(key=lambda x: x[1], reverse=True)

# Format each entry as "character:count".
ls = []
for (k, v) in ls0:
    ls.append("{}:{}".format(k, v))

# Save the 100 most frequent characters as one comma-separated line.
with open("寻梦-字符统计.txt", 'w', encoding='UTF-8') as fo:
    fo.write(",".join(ls[0:100]))
print(",".join(ls[0:10]))
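As an alternative sketch of the same counting step, the standard library's collections.Counter can stand in for the hand-written dictionary loop, and a small function makes it easy to process both novels in one run. The names top_chars, format_stats, and main are illustrative and not part of the original program; the file-name pattern is taken from the problem statement.

```python
from collections import Counter


def top_chars(text, n=100):
    """Return the n most frequent characters in text as (char, count) pairs."""
    counts = Counter(text)
    counts.pop("\n", None)  # newline characters are not counted
    return counts.most_common(n)


def format_stats(pairs):
    """Join (char, count) pairs as 'char:count' entries separated by commas."""
    return ",".join("{}:{}".format(ch, cnt) for ch, cnt in pairs)


def main():
    # Assumes both input files sit in the current working directory.
    for title in ("命运", "寻梦"):
        with open(title + "-网络版.txt", "r", encoding="UTF-8") as fi:
            text = fi.read()
        with open(title + "-字符统计.txt", "w", encoding="UTF-8") as fo:
            fo.write(format_stats(top_chars(text)))


# Small in-memory check that needs no files.
print(format_stats(top_chars("abca\nab", 2)))  # prints a:3,b:2
```

Calling main() in the folder that holds the two text files would write both statistics files. Characters with equal counts keep the order in which they first appear, which matches the stable sort used in the listing above.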
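Problem 2 and Its Solution
Problem
Write a program that prints the characters appearing in both “命运-字符统计.txt” and “寻梦-字符统计.txt”. In the output file “相同字符.txt”, the characters are separated by commas.
Solution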
Overall program flowchart
Subroutine 1
Subroutine 2
Subroutine 3
Data flow diagram
Program code
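# Each statistics file holds one line of comma-separated "character:count" entries.
with open("命运-字符统计.txt", "r", encoding="UTF-8") as fi:
    ls = fi.readline().split(",")
with open("寻梦-字符统计.txt", "r", encoding="UTF-8") as fi1:
    ls1 = fi1.readline().split(",")


def GetWordList(InputList, length):
    """Return the text before ':' from at most the first `length` entries."""
    WordList = []
    # min() avoids an IndexError when a file holds fewer than `length` entries.
    for i in range(min(length, len(InputList))):
        flag = True
        for j in range(len(InputList[i])):
            if InputList[i][j] == ':':
                flag = False  # everything from the colon onward is the count
            if flag:
                WordList.append(InputList[i][j])
    return WordList


ls2 = GetWordList(ls, 100)
ls3 = GetWordList(ls1, 100)

# Keep the characters of the first file that also occur in the second.
ls4 = []
for char in ls2:
    if char in ls3:
        ls4.append(char)

with open("相同字符.txt", "w", encoding="UTF-8") as fo:
    fo.write(",".join(ls4))
print(",".join(ls4))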
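The listing above splits each statistics line on commas, which silently assumes that neither a comma nor a colon is itself one of the counted characters. The following is a minimal sketch that tolerates such entries, assuming every entry is exactly one character followed by a colon and a decimal count. The names parse_chars and common_chars and the sample lines are hypothetical, and the counts are made up rather than taken from the novels.

```python
import re


def parse_chars(line):
    """Return the characters from a line like '命:90,运:80', in file order."""
    # One entry = any single character, a colon, digits, then a comma or
    # the end of the line. Matching entry by entry keeps a literal ','
    # or ':' character from being mistaken for a separator.
    return [ch for ch, _count in re.findall(r"(.):(\d+)(?:,|$)", line)]


def common_chars(line_a, line_b):
    """Characters of line_a that also occur in line_b, in line_a's order."""
    in_b = set(parse_chars(line_b))  # set lookup instead of a list scan
    return [ch for ch in parse_chars(line_a) if ch in in_b]


# Made-up sample lines; the second entry of sample_a is a comma character.
sample_a = "的:90,,:80,了:70,我:60"
sample_b = "我:50,的:40,,:30,他:20"
print(common_chars(sample_a, sample_b))  # prints ['的', ',', '我']
```

Note that if a comma really is among the shared characters, joining the result with commas makes the output file ambiguous to read back, so a different separator might be preferable in that case.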