def count_words(filename):
    try:
        with open(filename) as file:
            contents = file.read()
    except FileNotFoundError:
        msg = 'Sorry, the file ' + filename + ' does not exist'
        print(msg)
    else:
        words = contents.split()
        n_words = len(words)
        print(n_words)

filenames = ['alice.txt', 'pi_digits.txt', 'hh.txt', 'little_women.txt', 'moby_dick.txt', 'siddhartha.txt']
for filename in filenames:
    count_words(filename)
The output is as follows:
29461
3
Sorry, the file hh.txt does not exist
189079
215136
42172
>>>
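Note that if the file contains numbers rather than letters (words), split() should not be used to count its contents, because the result will be misleading. split() separates text on whitespace, so each unbroken run of digits is treated as a single item rather than as individual characters. For example, the pi_digits.txt file used above contains the following text: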
3.1415926535
8979323846
2643383279
After splitting:
>>> with open('pi_digits.txt') as file:
    w = file.read()
    w.split()

['3.1415926535', '8979323846', '2643383279']
This is where the result 3 in the first output comes from: the contents were treated as three items.
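To make the behaviour of split() concrete, the short sketch below uses an in-memory string instead of a file. The string `text` is an assumption that mirrors the three lines shown above; the real file may contain additional whitespace.

```python
# Assumed sample text mirroring the three lines of pi_digits.txt shown above.
text = "3.1415926535\n8979323846\n2643383279\n"

# With no arguments, split() separates on any run of whitespace
# (spaces, tabs, newlines), so each line of digits becomes one item.
tokens = text.split()

print(tokens)       # ['3.1415926535', '8979323846', '2643383279']
print(len(tokens))  # 3 items, not the number of digits
print(len(text))    # every character, including '.' and the newlines
```

The count from len(tokens) depends only on how many whitespace-separated items there are, not on how long each item is.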
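If the file contains numeric text, simply drop the call to split(). len(contents) then counts every character in the file, including the decimal point, spaces, and newlines. Using pi_digits.txt again as the example: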
def count_words(filename):
    try:
        with open(filename) as file:
            contents = file.read()
    except FileNotFoundError:
        msg = 'Sorry, the file ' + filename + ' does not exist'
        print(msg)
    else:
        n_words = len(contents)
        print(n_words)

filename = 'pi_digits.txt'
count_words(filename)
The output is as follows:
38
>>>
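Because len() on the file contents counts every character, the figure above also includes the decimal point and any whitespace or newline characters. If only the digits are of interest, one possible approach is to count the characters for which str.isdigit() is true. This is a minimal sketch, not part of the original example: the helper name `count_digits` and the string `sample` are hypothetical, and the real pi_digits.txt may differ in its whitespace.

```python
def count_digits(text):
    """Return how many characters in text satisfy str.isdigit()."""
    # isdigit() is False for '.', spaces, and newlines, so they are skipped.
    return sum(1 for ch in text if ch.isdigit())


# Assumed sample text; the whitespace in the real file may differ.
sample = "3.1415926535\n  8979323846\n  2643383279"

print(len(sample))           # all characters, including '.' and whitespace
print(count_digits(sample))  # digit characters only
```

Keeping the counting logic in a function that takes a string, rather than a filename, makes it easy to try out without needing the file on disk.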