Python: counting the characters in every word of every line of a file

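Article link: https://blog.csdn.net/weixin_42315701/article/details/112896876

This post shows how to read a text file in Python and count the characters on each line. By processing the file line by line and applying the built-in len function, the number of characters per line is easy to obtain. The code examples also compute the total number of lines, characters and words.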


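This code prints the total number of lines, words and characters in a text file. It works fine and gives the expected output. But I want to count the number of characters in each line and print them like this: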

Line No. 1 has 58 Characters

Line No. 2 has 24 Characters

Code:

    import string

    def fileCount(fname):
        #counting variables
        lineCount = 0
        wordCount = 0
        charCount = 0
        words = []

        #file is opened and assigned a variable
        infile = open(fname, 'r')

        #loop that finds the number of lines in the file
        for line in infile:
            lineCount = lineCount + 1
            word = line.split()
            words = words + word

        #loop that finds the number of words in the file
        for word in words:
            wordCount = wordCount + 1
            #loop that finds the number of characters in the file
            for char in word:
                charCount = charCount + 1

        #returns the variables so they can be called to the main function
        return(lineCount, wordCount, charCount)

    def main():
        fname = input('Enter the name of the file to be used: ')
        lineCount, wordCount, charCount = fileCount(fname)
        print ("There are", lineCount,"lines in the file.")
        print ("There are", charCount,"characters in the file.")
        print ("There are", wordCount,"words in the file.")

    main()

Since

    for line in infile:
        lineCount = lineCount + 1

counts whole lines, how do I carry out the character count for each individual line?

I am using Python 3.X

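You can use the len function.

But len also counts spaces and tabs. Also, how do I apply it to each line? I need another loop.

    len(re.findall(r'\S', line))

There is no need for a regular expression here.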
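To make the two suggestions in these comments concrete, here is a small side-by-side sketch. The helper names `count_with_regex` and `count_without_regex` and the sample string are assumptions made for illustration; they do not appear in the original thread.

```python
import re

def count_with_regex(line):
    """Count non-whitespace characters using the regex from the comment above."""
    return len(re.findall(r'\S', line))

def count_without_regex(line):
    """Count non-whitespace characters with plain string methods."""
    return sum(1 for ch in line if not ch.isspace())

sample = "Hello,\tbig world\n"   # hypothetical sample line

print(len(sample))                  # plain len() also counts the tab, space and newline
print(count_with_regex(sample))     # non-whitespace characters only
print(count_without_regex(sample))  # same idea without the re module
```

For ordinary text like this sample both helpers should agree, so the regex is a matter of taste rather than a requirement.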

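Python has a very useful built-in, collections.Counter, a specialized dict for counting its input. See my answer. The code is shorter and performs better, because there is no need to append to the list words on every iteration.

Store all the information in a dict, then access it by key.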

    def fileCount(fname):
        #counting variables
        d = {"lines":0,"words": 0,"lengths":[]}
        #file is opened and assigned a variable
        with open(fname, 'r') as f:
            for line in f:
                # split into words
                spl = line.split()
                # increase count for each line
                d["lines"] += 1
                # add length of split list which will give total words
                d["words"] += len(spl)
                # get the length of each word and sum
                d["lengths"].append(sum(len(word) for word in spl))
        return d

    def main():
        fname = input('Enter the name of the file to be used: ')
        data = fileCount(fname)
        print ("There are {lines} lines in the file.".format(**data))
        print ("There are {} characters in the file.".format(sum(data["lengths"])))
        print ("There are {words} words in the file.".format(**data))
        # enumerate over the lengths, outputting char count for each line
        for ind, s in enumerate(data["lengths"], 1):
            print("Line: {} has {} characters.".format(ind, s))

    main()

This code only works for words that are separated by whitespace, so keep that in mind.
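A quick way to see that limitation is to feed `str.split()` some text where words are joined by punctuation instead of spaces. The helper name `words_and_chars` and the sample strings below are made up for this illustration.

```python
def words_and_chars(line):
    """Return (word count, character count) the way str.split() sees the line."""
    parts = line.split()          # splits on runs of whitespace only
    return len(parts), sum(len(part) for part in parts)

# Punctuation stays attached to the neighbouring word and is counted as a character.
print(words_and_chars("well-known, state-of-the-art"))

# Without any whitespace the whole string is treated as a single word.
print(words_and_chars("one,two,three"))
```

If commas or hyphens should separate words, the splitting rule has to change; the counting logic itself can stay the same.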

collections.Counter is a special dictionary that counts its input.

Define a set of the allowed characters that you want to count; you can then use len to get most of the data.

Below, I have chosen this character set:

    ['!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/',
     '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '<', '=', '>', '?', '@',
     'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
     'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
     '[', '\\', ']', '^', '_', '`',
     'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
     'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
     '{', '|', '}', '~']

    #Define desired character set
    valid_chars = set([chr(i) for i in range(33,127)])

    total_lines = total_words = total_chars = 0
    line_details = []

    with open ('test.txt', 'r') as f:
        for line in f:
            total_lines += 1
            line_char_count = len([char for char in line if char in valid_chars])
            total_chars += line_char_count
            total_words += len(line.split())
            line_details.append("Line %d has %d characters" % (total_lines, line_char_count))

    print ("There are", total_lines,"lines in the file.")
    print ("There are", total_chars,"characters in the file.")
    print ("There are", total_words,"words in the file.")

    for line in line_details:
        print (line)
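As a side note, the same 94-character set can be spelled with the named constants in the standard `string` module instead of a numeric range, which some readers may find easier to audit. This is only an alternative sketch; the helper `count_valid` is a hypothetical name, not part of the answer above.

```python
import string

# Letters, digits and punctuation together cover the printable,
# non-whitespace ASCII characters, i.e. chr(33) through chr(126).
valid_chars = set(string.ascii_letters + string.digits + string.punctuation)

def count_valid(line, allowed=valid_chars):
    """Count only the characters of `line` that belong to `allowed`."""
    return sum(1 for ch in line if ch in allowed)

print(len(valid_chars))
print(count_valid("Line No. 1\n"))
```

Either way, keep in mind that a whitelist like this ignores anything outside ASCII, so accented letters or CJK text would not be counted.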

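Here is a simpler version using the built-in collections.Counter, a specialized dict for counting its input. We can use the Counter.update() method to take in all the words (unique or not) on each line: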

    from collections import Counter

    def file_count_2(fname):
        line_count = 0
        word_counter = Counter()

        infile = open(fname, 'r')
        for line in infile:
            line_count += 1
            word_counter.update( line.split() )

        word_count = 0
        char_count = 0

        for word, cnt in word_counter.items():
            word_count += cnt
            char_count += cnt * len(word)

        print(word_counter)

        return line_count, word_count, char_count

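Notes:

I tested this, and it gives the same counts as your code.

It will be faster because you do not have to append to the list words on every iteration (it is better to hash only the unique words and store their counts, which is what Counter does), and there is also no need to iterate and increment charCount every time we see an occurrence of a word.

If you only want word_count and not char_count, you can use word_count = sum(word_counter.values()) directly, without iterating over word_counter.

PS: names like word_count and line_count are more Pythonic (PEP 8 style) than wordCount and lineCount; we only use CamelCase for class names, not for variables, functions or methods.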
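To illustrate how the Counter totals in these notes come about, here is a tiny sketch that uses an in-memory list of lines as a stand-in for the file (the sample lines are invented for the example).

```python
from collections import Counter

word_counter = Counter()
for line in ["the cat sat", "the cat ran"]:   # hypothetical stand-in for file lines
    word_counter.update(line.split())         # adds 1 per occurrence of each word

# Total words: the sum of all stored counts.
word_count = sum(word_counter.values())

# Total characters: each unique word's length, weighted by how often it occurred.
char_count = sum(cnt * len(word) for word, cnt in word_counter.items())

print(word_counter["the"], word_count, char_count)
```

Each unique word is measured once and multiplied by its count, which is where the saving over the character-by-character loop comes from.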

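Although this answer may be more efficient than the original code, it does not answer the question of "how to count and print the number of characters in each line".

@RolfofSaxony: Actually it does, going by the OP's original title and code sample. The title edit was mine, not theirs, trying to capture their intent. I have now fixed it to make the "each word of each line" wording clearer.

And the part of the question that says "Line No. 1 has 58 Characters, Line No. 2 has 24 Characters"?

@RolfofSaxony: Ah, I took the OP's code as the specification of what they wanted and cleaned it up. But they want to extend it to counts within each line. Let me correct my code...

Note that, according to the comments, spaces and tabs should be excluded.

@RolfofSaxony: Yes, I posted a similar comment a few days ago.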
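For readers following this exchange, one possible shape of the per-line extension is sketched below. It is not the answerer's corrected code, which is not shown here; the function names `chars_per_line` and `report` are assumptions, and `io.StringIO` stands in for an opened file.

```python
import io

def chars_per_line(lines):
    """Yield (line number, character count excluding whitespace) for each line."""
    for number, line in enumerate(lines, 1):
        # split() drops spaces, tabs and the newline, so only word characters remain
        yield number, sum(len(word) for word in line.split())

def report(lines):
    """Format the counts in the style requested in the question."""
    return ["Line No. %d has %d Characters" % pair for pair in chars_per_line(lines)]

# Any iterable of lines works here, including a real file object.
demo = io.StringIO("count every word\n\tand skip   blanks\n")
for text in report(demo):
    print(text)
```

Because the function only iterates over lines, it can be called as `report(open_file)` inside a `with open(...)` block without further changes.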

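I was given the task of writing a program that prints the number of characters in a line.

As a programming newbie, I found this difficult :(.

Here is what I came up with, along with his response:

This is the core part of your program: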

    with open ('data_vis_tips.txt', 'r') as inFile:
        with open ('count_chars_per_line.txt', 'w') as outFile:
            chars = 0
            for line in inFile:
                line = line.strip('\n')
                chars = len(line)
                outFile.write(str(len(line))+'\n')

It can be simplified to:

    with open ('data_vis_tips.txt', 'r') as inFile:
        for line in inFile:
            line = line.strip()
            num_chars = len(line)
            print(num_chars)

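Note that strip() does not need an argument; by default it removes whitespace, and '\n' is whitespace. Be aware that the default also removes leading and trailing spaces and tabs, so the count can come out smaller than with strip('\n').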
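The difference between the two calls is easy to check on a single string. The sample line below is hypothetical and chosen so that it has spaces at both ends.

```python
line = "  indented text \n"   # hypothetical line as read from a file

raw_length = len(line)                  # includes the trailing newline
newline_only = len(line.strip('\n'))    # removes only newline characters at the ends
all_whitespace = len(line.strip())      # also removes the outer spaces

print(raw_length, newline_only, all_whitespace)
```

Which of the three numbers is "the" character count depends on whether leading and trailing blanks are meant to be counted, so it is worth deciding that before choosing between `strip()` and `strip('\n')`.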