Averaging values in text files with Python: how do I compute the average of several .dat files?

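This post describes an efficient way to use Python to compute the mean and standard deviation of the values in several .dat files. The files are read in parallel, one line at a time, to compute the means, and Bessel's correction is applied to obtain the sample standard deviation. The means are then written to the file 'means.txt'.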

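This is a fairly time- and resource-efficient approach: it reads the values and computes the averages for all the files in parallel, reading only one line from each file at a time. It does, however, temporarily read the entire first .dat file into memory to determine how many rows and columns of numbers each file will contain.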
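To illustrate the idea of reading several files in lockstep, here is a minimal sketch. It is not the code from this answer: the helper name cell_means is hypothetical, the sample data is made up, and io.StringIO objects stand in for open .dat files so the snippet runs without touching the disk. It assumes Python 3 and that every file has the same shape.

```python
import io

def cell_means(files):
    """Yield one list of per-column means for each row, reading files in lockstep."""
    # zip() pulls one line from every file per iteration, so only one
    # row per file is held in memory at any time
    for lines in zip(*files):
        rows = [[float(word) for word in line.split()] for line in lines]
        # zip(*rows) groups the values that share a column across the files
        yield [sum(column) / len(rows) for column in zip(*rows)]

# in-memory stand-ins for two small .dat files (made-up sample data)
file_a = io.StringIO("1 2\n3 4\n")
file_b = io.StringIO("3 6\n5 8\n")

result = list(cell_means([file_a, file_b]))
print(result)  # [[2.0, 4.0], [4.0, 6.0]]
```

Unlike the full listing, this variant does not need to know the number of rows and columns in advance, but it also does no checking that the files really are the same size: zip() simply stops at the shortest one.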

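You didn't say whether your "numbers" are integers, floats, or something else, so this reads them in as floating point (which works even if they are integers). Either way, the averages are calculated and output as floating-point numbers.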
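As a quick check of the claim that float parsing also handles integer-looking text, here is a small sketch with a made-up input line (Python 3 assumed):

```python
# a made-up line mixing integer, decimal, negative, and exponent notation
line = "3 4.5 -2 1e3"

# float() accepts all of these spellings, so integer data parses without error
values = [float(word) for word in line.split()]
print(values)  # [3.0, 4.5, -2.0, 1000.0]
```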

Update

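I have modified my original answer to also compute, as requested in your comment, the standard deviation (sigma) of the values at each row and column position. The code divides by num_files - 1 (Bessel's correction), so this is the sample standard deviation rather than the population one. Each standard deviation is computed immediately after the corresponding mean, so the data never has to be read a second time. In addition, in response to a suggestion in the comments, a context manager was added to ensure that all the input files get closed.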
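The listing in this post divides the sum of squared differences by num_files - 1, which is Bessel's correction. The following sketch, using made-up values for a single cell across four files, shows how that differs from dividing by the number of files, and cross-checks both results against the standard library's statistics module (Python 3 assumed):

```python
from math import isclose, sqrt
from statistics import pstdev, stdev

values = (2.0, 4.0, 4.0, 6.0)  # made-up values of one cell across four files

mean = sum(values) / len(values)
squared_diffs = sum((value - mean) ** 2 for value in values)

# Bessel's correction: divide by n - 1 to get the sample standard deviation
sample_sd = sqrt(squared_diffs / (len(values) - 1))
# no correction: divide by n to get the population standard deviation
population_sd = sqrt(squared_diffs / len(values))

print('{:.4f} {:.4f}'.format(sample_sd, population_sd))
```

With only a handful of files the two results differ noticeably, so it is worth deciding which one you actually want.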

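Note that the standard deviations are only printed and are not written to the output file, but writing them to the same file or to a separate one should be easy to add.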

from contextlib import contextmanager
from glob import iglob
from math import sqrt
from sys import exit

# context manager that opens several files and guarantees they all get closed
@contextmanager
def multi_file_manager(files, mode='rt'):
    files = [open(file, mode) for file in files]
    try:
        yield files
    finally:
        for file in files:
            file.close()

# generator function to read, convert, and yield each value from a text file
def read_values(file, datatype=float):
    for line in file:
        for value in (datatype(word) for word in line.split()):
            yield value

# enumerate multiple equal length iterables simultaneously as (i, n0, n1, ...)
def multi_enumerate(*iterables, **kwds):
    start = kwds.get('start', 0)
    return ((n,)+t for n, t in enumerate(zip(*iterables), start))

DATA_FILE_PATTERN = 'data*.dat'
MIN_DATA_FILES = 2

with multi_file_manager(iglob(DATA_FILE_PATTERN)) as datfiles:
    num_files = len(datfiles)
    if num_files < MIN_DATA_FILES:
        print('Less than {} .dat files were found to process, '
              'terminating.'.format(MIN_DATA_FILES))
        exit(1)

    # determine number of rows and cols from first file
    temp = [line.split() for line in datfiles[0]]
    num_rows = len(temp)
    num_cols = len(temp[0])
    datfiles[0].seek(0)  # rewind first file
    del temp  # no longer needed

    print('{} .dat files found, each must have {} rows x {} cols\n'.format(
        num_files, num_rows, num_cols))

    means = []
    std_devs = []
    divisor = float(num_files-1)  # Bessel's correction for sample standard dev
    generators = [read_values(file) for file in datfiles]
    for _ in range(num_rows):  # main processing loop
        for _ in range(num_cols):
            # create a sequence of next cell values from each file
            values = tuple(next(g) for g in generators)
            mean = float(sum(values)) / num_files
            means.append(mean)
            means_diff_sq = ((value-mean)**2 for value in values)
            std_dev = sqrt(sum(means_diff_sq) / divisor)
            std_devs.append(std_dev)

print('Average and (standard deviation) of values:')
with open('means.txt', 'wt') as averages:
    for i, mean, std_dev in multi_enumerate(means, std_devs):
        print('{:.2f} ({:.2f})'.format(mean, std_dev), end=' ')
        averages.write('{:.2f}'.format(mean))  # note std dev not written
        if i % num_cols != num_cols-1:  # not last column?
            averages.write(' ')  # delimiter between values on line
        else:
            print()  # newline
            averages.write('\n')
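The note before the listing says that writing the standard deviations to a file should be easy to add. One possible way to do it is sketched below. The helper name write_grid and the file name 'std_devs.txt' are assumptions made for illustration, the numbers are made up, and the files are written into a temporary directory so the snippet can be run on its own (Python 3 assumed).

```python
import os
import tempfile

def write_grid(path, values, num_cols, fmt='{:.2f}'):
    """Write a flat list of cell values as rows of num_cols space-separated numbers."""
    with open(path, 'wt') as out:
        for start in range(0, len(values), num_cols):
            row = values[start:start + num_cols]
            out.write(' '.join(fmt.format(value) for value in row) + '\n')

# made-up results for a 2 x 2 grid, in the same flat row-major order
# that the means and std_devs lists use in the listing
means = [2.0, 4.0, 4.0, 6.0]
std_devs = [1.41, 2.83, 1.41, 2.83]

out_dir = tempfile.mkdtemp()
write_grid(os.path.join(out_dir, 'means.txt'), means, 2)
write_grid(os.path.join(out_dir, 'std_devs.txt'), std_devs, 2)

with open(os.path.join(out_dir, 'std_devs.txt')) as f:
    written = f.read()
print(written, end='')
```

In the full script, the same helper could be called with the real means, std_devs, and num_cols once the main processing loop has finished.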
