Code:
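import string

path = 'D:/桌面/wiki.txt'
with open(path, 'r', encoding="utf-8") as text:
    # Split on whitespace, strip surrounding punctuation, and lowercase each token.
    # Tokens made only of punctuation become empty strings and are counted too.
    words = [raw_word.strip(string.punctuation).lower() for raw_word in text.read().split()]
words_index = set(words)
# Count the occurrences of each distinct word.
counts_dict = {index: words.count(index) for index in words_index}
# Print the words from most to least frequent.
for word in sorted(counts_dict, key=lambda x: counts_dict[x], reverse=True):
    print('{} -- {} times'.format(word, counts_dict[word]))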
Output:
the -- 58 times
a -- 40 times
of -- 36 times
and -- 29 times
to -- 28 times
in -- 21 times
-- 18 times
an -- 16 times
that -- 15 times
scp-6599 -- 14 times
i -- 14 times
is -- 14 times
with -- 13 times
from -- 12 times
you -- 12 times
by -- 10 times
scp-6599-1 -- 10 times
was -- 9 times
account -- 9 times
for -- 9 times
are -- 8 times
it -- 8 times
on -- 7 times
have -- 7 times
been -- 7 times
or -- 7 times
all -- 6 times
has -- 6 times
following -- 6 times
about -- 6 times
03/28/2008 -- 6 times
mon -- 6 times
your -- 6 times
muppet -- 6 times
hogslice -- 6 times
accounts -- 6 times
dont -- 5 times
this -- 5 times
alt-f4 -- 5 times
event -- 5 times
no -- 5 times
as -- 5 times
do -- 5 times
what -- 5 times
posts -- 4 times
puppets -- 4 times
entity -- 4 times
they -- 4 times
gregthecarp -- 4 times
users -- 4 times
shit -- 4 times
mothman -- 4 times
forum -- 4 times
below -- 4 times
class -- 4 times
website -- 4 times
it's -- 4 times
like -- 4 times
thread -- 4 times
central -- 4 times
my -- 4 times
when -- 3 times
under -- 3 times
......
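
A possible variation, not the approach used above: the dictionary comprehension calls words.count() once per distinct word, so the whole word list is rescanned each time. The sketch below counts in a single pass with collections.Counter from the standard library. The function name count_words and the sample sentence are made up for illustration. It also skips tokens that consist only of punctuation, which the code above counts as an empty word (the "-- 18 times" line in the output), so its results would differ on that point.

```python
import string
from collections import Counter


def count_words(text):
    """Return a Counter of lowercased words with surrounding punctuation stripped."""
    words = (raw.strip(string.punctuation).lower() for raw in text.split())
    # Assumption: tokens that are empty after stripping (pure punctuation) are dropped.
    return Counter(word for word in words if word)


if __name__ == "__main__":
    # Hypothetical sample text; replace with the contents of your own file.
    sample = "The cat -- the CAT! -- sat on the mat."
    # most_common() yields (word, count) pairs from most to least frequent.
    for word, count in count_words(sample).most_common():
        print('{} -- {} times'.format(word, count))
```

Words with equal counts come out in the order they first appear in the text, so ties may be ordered differently than with the set-based version.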