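Target: Implement a wordcount function that counts how many times each word occurs in an English string and returns a dictionary whose keys are the words and whose values are the corresponding counts.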
Input:
Hello world!This is an example.
Word count is fun.Is it fun to count words?Yes, it is fun!
Output:
{'hello': 1, 'world': 1, 'this': 1, 'is': 4, 'an': 1, 'example': 1, 'word': 1, 'count': 2,
'fun': 3, 'it': 2, 'to': 1, 'words': 1, 'yes': 1}
TIPS: Remember to strip the punctuation first, then convert every word to lowercase.
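One detail in the sample input is worth noting: "world!This" has no space after the punctuation mark, so deleting punctuation outright would merge the two words. The sketch below is one possible way to follow the tip, assuming ASCII punctuation only; the helper name normalize is hypothetical and not part of the task.

```python
import string

def normalize(text):
    # Hypothetical helper: map every ASCII punctuation character to a space,
    # so "world!This" becomes "world This" rather than "worldThis".
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return text.translate(table).lower()

print(normalize("Hello world!This is an example.").split())
# ['hello', 'world', 'this', 'is', 'an', 'example']
```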
Code:
import re
from collections import defaultdict
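def wordcount(text):
    # Convert to lowercase so that case differences are ignored
    text = text.lower()
    # Extract the words with a regular expression; punctuation is dropped
    # because it never matches \w
    words = re.findall(r'\b\w+\b', text)
    # Use a defaultdict as the counter so missing keys start at 0
    word_counts = defaultdict(int)
    # Walk the word list and count each occurrence
    for word in words:
        word_counts[word] += 1
    return dict(word_counts)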
text = """
Got this panda plush toy for my daughter's birthday,
who loves it and takes it everywhere. It's soft and
super cute, and its face has a friendly look. It's
a bit small for what I paid though. I think there
might be other options that are bigger for the
same price. It arrived a day earlier than expected,
so I got to play with it myself before I gave it
to her.
"""
result = wordcount(text)
print(result)
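With the pattern \b\w+\b, a contraction such as "It's" in the review text is split into "it" and "s", so the result contains an 's' key. If that is not wanted, a variant along the following lines may help. It is only a sketch: the name wordcount_counter and the choice to keep an internal apostrophe are assumptions, and it uses collections.Counter in place of defaultdict. It also checks the sample from the task statement.

```python
import re
from collections import Counter

def wordcount_counter(text):
    # Hypothetical variant: lowercase the text, then match runs of letters
    # that may contain one internal apostrophe part (e.g. "it's").
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return dict(Counter(words))

# Sample input and expected output copied from the task statement
sample = """Hello world!This is an example.
Word count is fun.Is it fun to count words?Yes, it is fun!"""
expected = {'hello': 1, 'world': 1, 'this': 1, 'is': 4, 'an': 1, 'example': 1,
            'word': 1, 'count': 2, 'fun': 3, 'it': 2, 'to': 1, 'words': 1, 'yes': 1}
assert wordcount_counter(sample) == expected

print(wordcount_counter("It's soft. It's cute."))
# {"it's": 2, 'soft': 1, 'cute': 1}
```

Whether "it's" should count as its own word or be merged with "it" is not specified by the task, so either behaviour can be defended.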
Connect PyCharm to the development machine, set a breakpoint, then right-click and run Debug.
Debug screenshot: