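python - How do I count the number of words in a sentence, ignoring numbers, punctuation and whitespace?
How would I go about counting the words in a sentence? I'm using Python.
For example, I might have the string: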
string = "I am having a very nice 23!@$ day. "
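That would be 7 words. I'm having trouble with the random amount of spaces after/before each word, as well as when numbers or symbols are involved.
8 Answers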
79 votes
str.split() without any arguments splits on runs of whitespace characters:
>>> s = 'I am having a very nice day.'
>>>
>>> len(s.split())
7
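From the linked documentation:
If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.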
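A minimal sketch of the quoted behaviour, using the question's sentence with extra leading spaces added for illustration (the variable names are illustrative only). Note that a bare split() still treats 23!@$ as a token, so on its own it reports 8 items for that sentence rather than 7.

```python
# Illustrative string: the question's sentence with extra leading spaces.
s = "  I am having a very nice 23!@$ day. "

# No argument: runs of whitespace act as one separator and no empty
# strings appear at either end of the result.
tokens_default = s.split()

# Explicit single-space separator: every space is a separator, so
# leading, trailing and repeated spaces produce empty strings.
tokens_single = s.split(" ")

print(len(tokens_default))   # 8, because "23!@$" is still a token
print("" in tokens_default)  # False
print("" in tokens_single)   # True
```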
arshajii answered 2020-01-03T02:37:26Z
48 votes
You can use re.findall():
import re
line = " I am having a very nice day."
count = len(re.findall(r'\w+', line))
print(count)
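One thing worth checking with this pattern: \w+ also matches digits and underscores, so on the question's original string the number 23 is counted as a word. The sketch below compares it with a letters-only pattern; the pattern [^\W\d_]+ is a suggestion added here, not part of the answer above.

```python
import re

line = "I am having a very nice 23!@$ day. "

# \w+ matches runs of letters, digits and underscores, so "23" is counted.
with_digits = re.findall(r"\w+", line)

# [^\W\d_]+ matches word characters that are neither digits nor "_",
# which in practice means letters (including non-ASCII letters).
letters_only = re.findall(r"[^\W\d_]+", line)

print(len(with_digits))   # 8
print(len(letters_only))  # 7
```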
karthikr answered 2020-01-03T02:37:46Z
4 votes
Here is a simple word counter using a regular expression. The script runs in a loop, which you can terminate once you are done.
# Word counter using a regex
import re
while True:
    line = input("Enter the string: ")  # use raw_input() on Python 2
    if line == "Done":  # command to terminate the loop
        break
    count = len(re.findall("[a-zA-Z_]+", line))
    print(count)
print("Terminated")
Aliyar answered 2020-01-03T02:38:06Z
4 votes
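import string
s = "I am having a very nice 23!@$ day. "
sum([i.strip(string.punctuation).isalpha() for i in s.split()])
The statement above goes through each chunk of text and strips the punctuation from its ends before checking whether the chunk is really a string of letters.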
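The same idea written out step by step, as a sketch; the function name count_alpha_words is made up for this illustration. It also shows one limitation: strip() only removes punctuation at the two ends of a chunk, so a contraction such as "don't" is not counted.

```python
import string

def count_alpha_words(text):
    """Count whitespace-separated chunks that are purely alphabetic
    once leading and trailing punctuation has been stripped."""
    count = 0
    for chunk in text.split():
        core = chunk.strip(string.punctuation)  # strips both ends only
        if core.isalpha():  # False for "", "23" and "don't"
            count += 1
    return count

print(count_alpha_words("I am having a very nice 23!@$ day. "))  # 7
print(count_alpha_words("I don't know"))  # 2, the inner apostrophe remains
```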
boon kwee answered 2020-01-03T02:38:26Z
2 votes
OK, here is my version of doing this. I noticed that you want the output to be 7, which means you don't want to count special characters and numbers. So here is the regex pattern:
re.findall("[a-zA-Z_]+", string)
where [a-zA-Z_] means it will match any character between a-z (lowercase) and A-Z (uppercase), as well as the underscore.
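A short sketch of what this character class does and does not match. The extra test strings are illustrative and not from the original answer.

```python
import re

pattern = re.compile(r"[a-zA-Z_]+")

# Digits and symbols are skipped, so the question's string yields 7 matches.
words = pattern.findall("I am having a very nice 23!@$ day. ")
print(words)  # ['I', 'am', 'having', 'a', 'very', 'nice', 'day']

# The underscore is part of the class, so it joins letters into one match.
print(pattern.findall("snake_case"))  # ['snake_case']

# Letters outside ASCII a-z / A-Z are not matched.
print(pattern.findall("café"))  # ['caf']
```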
About spaces: if you want to remove all the extra spaces, just do:
string = string.strip()  # Remove all extra spaces at the start and at the end of the string
while "  " in string:  # While there are 2 consecutive spaces between words in our string...
    string = string.replace("  ", " ")  # ... replace them with one space!
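An alternative sketch for the same clean-up, using split() and join() instead of a loop; the helper name collapse_spaces is hypothetical. Unlike the loop above, it also collapses tabs and newlines, because split() with no argument treats any whitespace as a separator.

```python
def collapse_spaces(text):
    """Trim both ends and squeeze every run of whitespace to one space."""
    return " ".join(text.split())

cleaned = collapse_spaces("  I am   having  a nice   day.  ")
print(repr(cleaned))  # 'I am having a nice day.'
```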
JadedTuna answered 2020-01-03T02:38:55Z
2 votes
Use a simple loop to count the occurrences of spaces!
txt = "Just an example here move along"
count = 1
for i in txt:
    if i == " ":
        count += 1
print(count)
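Counting spaces works only when there is exactly one space between words and none at either end; starting from 1 also means an empty string is reported as one word. A hedged variant that counts the start of each run of non-space characters instead (the function name is made up for this sketch):

```python
def count_words_by_transitions(text):
    """Count positions where a non-whitespace run begins."""
    count = 0
    in_word = False
    for ch in text:
        if ch.isspace():
            in_word = False      # the current word, if any, has ended
        elif not in_word:
            in_word = True       # first character of a new word
            count += 1
    return count

print(count_words_by_transitions("Just an example here move along"))  # 6
print(count_words_by_transitions("  two   words "))  # 2
print(count_words_by_transitions(""))  # 0
```

This still counts a token such as 23!@$ as a word, so it addresses only the spacing part of the question.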
Anto answered 2020-01-03T02:39:20Z
1 vote
def wordCount(mystring):
    tempcount = 0
    count = 1
    try:
        for character in mystring:
            if character == " ":
                tempcount += 1
                if tempcount == 1:
                    count += 1
                else:
                    tempcount += 1
            else:
                tempcount = 0
        return count
    except Exception:
        error = "Not a string"
        return error
mystring = "I am having a very nice 23!@$ day."
print(wordCount(mystring))
The output is 8.
Darrell White answered 2020-01-03T02:39:39Z
0 votes
import string
sentence = "I am having a very nice 23!@$ day. "
# Remove all punctuation
sentence = sentence.translate(str.maketrans('', '', string.punctuation))
# Remove all digits
sentence = ''.join([ch for ch in sentence if not ch.isdigit()])
count = 0
for index in range(len(sentence) - 1):
    # A word ends wherever a non-space character is followed by a space
    if sentence[index + 1].isspace() and not sentence[index].isspace():
        count += 1
print(count)
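A more compact sketch of the same approach, assuming it is acceptable to delete punctuation and ASCII digits in a single pass and let split() handle the spacing. The function name count_plain_words is hypothetical.

```python
import string

def count_plain_words(sentence):
    # One translation table deletes punctuation and ASCII digits together.
    table = str.maketrans("", "", string.punctuation + string.digits)
    cleaned = sentence.translate(table)
    # split() with no argument ignores repeated and trailing whitespace.
    return len(cleaned.split())

print(count_plain_words("I am having a very nice 23!@$ day. "))  # 7
```

Unlike the index loop, this also counts a final word that is not followed by a space.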
Adam answered 2020-01-03T02:39:55Z