It is a bit simpler in Python:
def newdic(dicts, n):
    # Sort (word, count) pairs by count, ascending, then take the last n in reverse order.
    list1 = sorted(dicts.items(), key=lambda x: x[1])
    return list1[-1:-(n+1):-1]

with open(r"E:\VS Code\us_constitution.txt", "r", encoding="utf-8") as fh:
    f = fh.read()  # the with-block closes the file once it has been read
txt = f.lower().split()  # lowercase, then split on whitespace
dic = {}
for word in txt:
    if word in dic:
        dic[word] = dic[word] + 1
    else:
        dic[word] = 1
stop_words = ['the', 'of', 'be', 'or', 'my', 'i', 'and', 'in', 'a', 'by', 'for', 'which',
              'any', 'such', 'as', 'have', 'on', 'he', 'is', 'from']
for word in stop_words:
    dic.pop(word, None)  # unlike del, pop with a default does not raise KeyError for absent words
print(newdic(dic, 100))
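Common function words such as conjunctions are removed with a crude, brute-force approach: the words to drop are simply listed by hand.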
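A slightly less manual variant, as a sketch only: keep the stop words in a set and let `collections.Counter` from the standard library do the counting. The function name `top_words`, the `STOP_WORDS` constant, and the sample sentence are illustrative assumptions and not part of the script above; the stop-word list itself is the one used there.

```python
from collections import Counter

# Same stop words as in the script above, kept in a set for fast membership tests.
STOP_WORDS = {
    "the", "of", "be", "or", "my", "i", "and", "in", "a", "by",
    "for", "which", "any", "such", "as", "have", "on", "he", "is", "from",
}


def top_words(text, n, stop_words=STOP_WORDS):
    """Return the n most frequent words as (word, count) pairs, skipping stop words."""
    words = (w for w in text.lower().split() if w not in stop_words)
    return Counter(words).most_common(n)


if __name__ == "__main__":
    # Hypothetical sample text; the real script reads us_constitution.txt instead.
    sample = "We the People of the United States in Order to form a more perfect Union"
    print(top_words(sample, 3))
```

Like the script above, this splits on whitespace only, so a word followed by punctuation (for example `states,`) is counted separately from the bare word.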
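- Some thoughts on a C implementation:
Since the number of distinct words cannot be known in advance, dynamic allocation is necessary.
By analogy with Python's dict, create a similar struct that stores each word together with its frequency.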
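Before writing any C, the idea can be prototyped in Python. This is only a sketch under stated assumptions: `WordEntry` stands in for the C struct, and `WordTable` imitates a manually managed array whose capacity doubles when it fills up, which is roughly where a C version would call `realloc`. All names and the initial capacity of 4 are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WordEntry:
    """Stands in for a C struct holding a word and its frequency."""
    word: str
    count: int = 0


class WordTable:
    """Array of entries with an explicit capacity that doubles when full."""

    def __init__(self, capacity=4):
        self.capacity = capacity          # slots currently allocated
        self.size = 0                     # slots actually in use
        self.entries = [None] * capacity  # plays the role of the malloc'd block

    def add(self, word):
        # Linear search, as a first C version without hashing might do.
        for i in range(self.size):
            if self.entries[i].word == word:
                self.entries[i].count += 1
                return
        if self.size == self.capacity:
            # Grow geometrically; in C this is the point for realloc.
            self.entries.extend([None] * self.capacity)
            self.capacity *= 2
        self.entries[self.size] = WordEntry(word, 1)
        self.size += 1

    def top(self, n):
        """Return the n most frequent entries, highest count first."""
        used = self.entries[:self.size]
        return sorted(used, key=lambda e: e.count, reverse=True)[:n]


if __name__ == "__main__":
    table = WordTable()
    for w in "we the people of the united states".split():
        table.add(w)
    print(table.size, table.capacity)
    print(table.top(1))
```

The linear search makes each insertion proportional to the number of distinct words seen so far. A hash table would be the closer analogue of Python's dict, at the cost of more code in C.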