def main():
    # Prompt the user to enter a filename
    filename = input("Enter a filename: ").strip()
    wordCounts = {}  # Create an empty dictionary to count words
    with open(filename, "r") as infile:  # Open the file; it is closed automatically
        for line in infile:
            processLine(line.lower(), wordCounts)
    pairs = list(wordCounts.items())  # Get (word, count) pairs from the dictionary
    items = [[x, y] for (y, x) in pairs]  # Reverse each pair to [count, word]
    items.sort(reverse=True)  # Sort by count, highest first
    for count, word in items[:10]:  # Show at most the ten most frequent words
        print(word + "\t" + str(count))


# Count each word in the line
def processLine(line, wordCounts):
    line = replacePunctuations(line)  # Replace punctuation with spaces
    words = line.split()
    for word in words:
        if word in wordCounts:
            wordCounts[word] += 1
        else:
            wordCounts[word] = 1


# Replace punctuation in the line with spaces
def replacePunctuations(line):
    for ch in line:
        if ch in "~@#$%^&*()_-+=~<>?/,.;:!{}[]|'\"":
            line = line.replace(ch, " ")
    return line


main()

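Write a program that counts how many times each word occurs in a text file and displays the most frequent words, with their counts, in descending order. Dictionary objects have no sort method, so how can one be sorted? Put each pair from the dictionary into a list, then sort that list. If you sort this list with the sort method, the program sorts by the first element of each pair, but we want to sort by the number of occurrences (the second element of each pair). We therefore need to create a new list by reversing each pair, and then apply the sort method.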
Excerpted from 《Python 程序语言设计》, translated by 李娜
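
The pair-reversal idea described in the excerpt can be tried on its own, without reading a file. The sketch below is only an illustration: the function names `top_words` and `top_words_by_key` and the sample dictionary are made up for this example and do not come from the book. The second function shows a common alternative that sorts on the count directly with a `key` function, so no reversed list is needed.

```python
def top_words(word_counts, n=10):
    """Return up to n (word, count) pairs, highest count first."""
    # Reverse each (word, count) pair so the count becomes the first element.
    inverted = [(count, word) for (word, count) in word_counts.items()]
    # Tuples compare element by element, so this sorts by count, then by word.
    inverted.sort(reverse=True)
    # Flip each pair back to (word, count) and keep only the first n.
    return [(word, count) for (count, word) in inverted[:n]]


def top_words_by_key(word_counts, n=10):
    """Alternative: sort the original pairs on their second element."""
    return sorted(word_counts.items(), key=lambda pair: pair[1], reverse=True)[:n]


# Hypothetical sample data, standing in for counts collected from a file.
sample_counts = {"the": 5, "python": 3, "file": 2, "sort": 1}
print(top_words(sample_counts, 3))
print(top_words_by_key(sample_counts, 3))
```

Both functions rank by count, but they can order tied words differently: the reversed-pair version breaks ties by comparing the words themselves, while the `key` version keeps tied words in the dictionary's original order.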