Python: counting the number of characters in each word on every line of a file

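This post shows how to read a text file in Python and count the characters on each line. By processing the file line by line and using the built-in len function, the character count of every line is easy to obtain. The code examples also compute the total number of lines, characters and words.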

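This code prints the total number of lines, words and characters in a text file. It works fine and gives the expected output. But I want to count the number of characters on each line and print it like this: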

Line No. 1 has 58 Characters

Line No. 2 has 24 Characters

Code:

import string

def fileCount(fname):
    #counting variables
    lineCount = 0
    wordCount = 0
    charCount = 0
    words = []

    #file is opened and assigned a variable
    infile = open(fname, 'r')

    #loop that finds the number of lines in the file
    for line in infile:
        lineCount = lineCount + 1
        word = line.split()
        words = words + word

    #loop that finds the number of words in the file
    for word in words:
        wordCount = wordCount + 1
        #loop that finds the number of characters in the file
        for char in word:
            charCount = charCount + 1

    #returns the variables so they can be called to the main function
    return(lineCount, wordCount, charCount)

def main():
    fname = input('Enter the name of the file to be used: ')
    lineCount, wordCount, charCount = fileCount(fname)
    print ("There are", lineCount,"lines in the file.")
    print ("There are", charCount,"characters in the file.")
    print ("There are", wordCount,"words in the file.")

main()

for line in infile:
    lineCount = lineCount + 1

This loop counts whole lines, but how do I perform the character count for each individual line?

I am using Python 3.X

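You can use the len function.

But len also counts spaces and tabs. Also, how do I apply it to each line? I would need another loop.

len(re.findall(r'\S', line))

There is no need for a regular expression for that.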
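The len suggestion from these comments can be applied per line with a single loop and no regex. The following is only a minimal sketch, assuming the output format from the question; the helper name `chars_per_line` and its `skip_whitespace` flag are made up for illustration and do not come from the thread.

```python
def chars_per_line(lines, skip_whitespace=False):
    """Return a list holding the character count of each line.

    The trailing newline is never counted. With skip_whitespace=True,
    spaces, tabs and any other whitespace are left out as well.
    """
    counts = []
    for line in lines:
        line = line.rstrip('\n')
        if skip_whitespace:
            counts.append(sum(1 for ch in line if not ch.isspace()))
        else:
            counts.append(len(line))
    return counts


# Any iterable of lines works, including an open file object.
sample = ["hello world\n", "a\tb\n", "\n"]
for number, count in enumerate(chars_per_line(sample), 1):
    print("Line No. {} has {} Characters".format(number, count))
```

Passing `skip_whitespace=True` gives the count that the regex `\S` version would give, since both ignore every whitespace character.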

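Python has a super-useful builtin, collections.Counter, which is a specialized dict for counting its input. See my answer. The code is shorter and performs better, because there is no need to iteratively add to the list words.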

Store all the information in a dict and then access it by key.

def fileCount(fname):
    #counting variables
    d = {"lines":0,"words": 0,"lengths":[]}
    #file is opened and assigned a variable
    with open(fname, 'r') as f:
        for line in f:
            # split into words
            spl = line.split()
            # increase count for each line
            d["lines"] += 1
            # add length of split list which will give total words
            d["words"] += len(spl)
            # get the length of each word and sum
            d["lengths"].append(sum(len(word) for word in spl))
    return d

def main():
    fname = input('Enter the name of the file to be used: ')
    data = fileCount(fname)
    print ("There are {lines} lines in the file.".format(**data))
    print ("There are {} characters in the file.".format(sum(data["lengths"])))
    print ("There are {words} words in the file.".format(**data))
    # enumerate over the lengths, outputting char count for each line
    for ind, s in enumerate(data["lengths"], 1):
        print("Line: {} has {} characters.".format(ind, s))

main()

This code only works for words separated by whitespace, so you need to keep that in mind.
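To make that caveat concrete, here is a small sketch with a made-up sample line. It shows that str.split() with no argument splits on whitespace only, so punctuation stays attached to the words and is included in the character count.

```python
line = "Hello, world! It's a well-known fact.\n"

# split() with no argument splits on runs of whitespace only
words = line.split()
lengths = [len(word) for word in words]

print(words)         # punctuation is still attached to each word
print(lengths)
print(sum(lengths))  # every non-whitespace character on the line
```

If punctuation should not count, the words would need extra cleaning (for example with str.strip and a set of punctuation characters) before their lengths are taken.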

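collections.Counter is a specialized dict that counts its input.

Define a set of the allowed characters that you want to count; you can then use len to get most of the data.

Below, I have chosen the character set: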

['!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '<', '=', '>', '?', '@', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '[', '\\', ']', '^', '_', '`', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '{', '|', '}', '~']

#Define desired character set
valid_chars = set([chr(i) for i in range(33,127)])
total_lines = total_words = total_chars = 0
line_details = []
with open ('test.txt', 'r') as f:
    for line in f:
        total_lines += 1
        line_char_count = len([char for char in line if char in valid_chars])
        total_chars += line_char_count
        total_words += len(line.split())
        line_details.append("Line %d has %d characters" % (total_lines, line_char_count))
print ("There are", total_lines,"lines in the file.")
print ("There are", total_chars,"characters in the file.")
print ("There are", total_words,"words in the file.")
for line in line_details:
    print (line)
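As a side note, the character set built from range(33, 127) can also be spelled with constants from the standard string module, which some readers may find easier to follow. This is just a sketch comparing the two spellings; the variable name `named_chars` is an assumption for illustration.

```python
import string

# The set used in the answer: code points 33 ('!') through 126 ('~')
valid_chars = set(chr(i) for i in range(33, 127))

# The same characters, spelled with the string module's constants
named_chars = set(string.ascii_letters + string.digits + string.punctuation)

print(valid_chars == named_chars)
print(len(valid_chars))
```

Both sets hold the 94 printable ASCII characters other than the space, so non-ASCII letters in a file would not be counted by either one.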

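Here is a simpler version using the builtin collections.Counter, a specialized dict that counts its input. We can use the Counter.update() method to feed in all the words (unique or not) on each line: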

from collections import Counter

def file_count_2(fname):
    line_count = 0
    word_counter = Counter()
    infile = open(fname, 'r')
    for line in infile:
        line_count += 1
        word_counter.update( line.split() )

    word_count = 0
    char_count = 0
    for word, cnt in word_counter.items():
        word_count += cnt
        char_count += cnt * len(word)

    print(word_counter)
    return line_count, word_count, char_count

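Notes:

I tested this and it gives the same counts as your code.

It will be faster, because you do not have to iteratively append to the list words (it is better to hash only the unique words and store their counts, which is what Counter does), and there is no need to iterate and increment charCount every time we see an occurrence of a word.

If you only want word_count and not char_count, you can use word_count = sum(word_counter.values()) directly, without iterating over word_counter.

PS: the names word_count, line_count, etc. are more Pythonic (PEP-8 style) than wordCount, lineCount; we only use CamelCase for class names, not for variables, functions or methods.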
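The totals mentioned in these notes can be illustrated on a tiny in-memory sample instead of a file. This is only a sketch; the sample lines are made up, and the generator expression for `char_count` is one possible shorthand for the explicit loop in the answer.

```python
from collections import Counter

word_counter = Counter()
for line in ["the cat saw the dog\n", "the end\n"]:
    # update() adds one count per word, repeated words included
    word_counter.update(line.split())

# Total words: sum of all stored counts, no explicit loop needed
word_count = sum(word_counter.values())

# Total characters: each unique word's length times its count
char_count = sum(cnt * len(word) for word, cnt in word_counter.items())

print(word_counter["the"], word_count, char_count)
```

Because the Counter merges all lines into one tally, it yields file-wide totals only; per-line figures have to be collected separately inside the loop.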

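Although this answer may be more efficient than the original code, it does not answer the question of how to count and print the number of characters on each line.

@RolfofSaxony: Actually, it follows the OP's original title and code example. The title edit was mine, not theirs, in an attempt to capture their intent. I have now fixed it so that the "each word on each line" wording is clearer.

The question: "Line No. 1 has 58 Characters, Line No. 2 has 24 Characters"?

@RolfofSaxony: Ah, I took the OP's code as the specification of what they wanted and cleaned it up. But they want it extended to counts within each line. Let me correct my code...

Note from the comments that spaces and tabs should be excluded.

@RolfofSaxony: Yes, I posted a similar comment a few days ago.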
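For the "each word on each line" reading discussed in these comments, one possible shape is a mapping from line number to the lengths of that line's words. The sketch below is not the answer author's corrected code; the function name `word_lengths_per_line` and the sample lines are hypothetical.

```python
def word_lengths_per_line(lines):
    """Map each 1-based line number to the lengths of its words.

    Words are split on whitespace, so spaces and tabs are excluded
    from every count.
    """
    return {
        number: [len(word) for word in line.split()]
        for number, line in enumerate(lines, 1)
    }


report = word_lengths_per_line(["count the characters\n", "\n", "per word\n"])
for number, lengths in report.items():
    print("Line No. {} has {} Characters, word lengths {}".format(
        number, sum(lengths), lengths))
```

Summing a line's list gives its character count without whitespace, so the same structure serves both the per-word and the per-line report.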

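I was assigned the task of creating a program that prints the number of characters in a line.

As a programming noob, I found this difficult :(.

Here is what I came up with, along with his response:

This is the core part of your program: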

with open ('data_vis_tips.txt', 'r') as inFile:
    with open ('count_chars_per_line.txt', 'w') as outFile:
        chars = 0
        for line in inFile:
            line = line.strip('\n')
            chars = len(line)
            outFile.write(str(len(line))+'\n')

It can be simplified to:

with open ('data_vis_tips.txt', 'r') as inFile:
    for line in inFile:
        line = line.strip()
        num_chars = len(line)
        print(num_chars)

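Note that strip() does not need an argument; by default it strips whitespace, and '\n' is whitespace. Unlike strip('\n'), the no-argument form also removes leading and trailing spaces and tabs, so the two versions can report different counts for lines that start or end with them.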
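The difference between the stripping variants is easy to check on a single string. A short sketch with a made-up sample line that has leading spaces and a trailing tab:

```python
line = "  indented text\t\n"

raw = len(line)                     # counts the newline too
no_newline = len(line.rstrip('\n')) # drops only the trailing newline
both_ends = len(line.strip('\n'))   # drops newlines at either end
stripped = len(line.strip())        # drops all leading/trailing whitespace

print(raw, no_newline, both_ends, stripped)
```

Which variant is right depends on whether leading and trailing spaces should count as characters of the line; rstrip('\n') is the most conservative choice when only the line terminator should be ignored.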
