Python: read a file in a certain format

Latest recommended article published on 2023-07-05 08:41:18

小胖娃


Article tags: python read file format

Article link: https://blog.csdn.net/weixin_28953937/article/details/113990387


I have files in the following format:

36.1 37.1 A: Hi, how are you?

39.1 40.1 B: I am ok!

I am reading the file with numpy.loadtxt() and dtype = np.dtype([('start', '|S1'), ('end', 'f8'),('person','|S1'),('content','|S100')])

The first three columns are fine, but the string part always fails: the format does not match. I guess this is because each speaker says a variable number of words. Does anyone know a better way to solve this?

Many thanks,
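
A quick way to check the guess in the question, as a minimal sketch: numpy.loadtxt() splits each row on whitespace by default, and plain str.split() is used here to mimic that behaviour. The names rows and field_counts are made up for this illustration.

```python
# The two sample lines from the question.
rows = [
    "36.1 37.1 A: Hi, how are you?",
    "39.1 40.1 B: I am ok!",
]

# Count the whitespace-separated fields on each line.
field_counts = [len(row.split()) for row in rows]
print(field_counts)  # [7, 6]
```

The two lines produce a different number of fields, which a fixed four-column dtype cannot describe.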

Source: https://stackoverflow.com/questions/41965458

Updated: 2019-12-25 12:38

Best answer

I would recommend reading the text manually, without numpy, and simply iterating over the lines in the file. numpy.loadtxt() splits each row on whitespace by default, so the free-text message breaks into a different number of columns on each line.

chats = []
with open("read.txt", "r") as infile:
    for line in infile:
        if not line.strip():
            continue  # skip blank lines
        # Split on the first colon only, so colons inside the message are kept.
        header, content = line.split(":", 1)
        start, end, name = header.split()
        chats.append([start, end, name, content.strip()])

The file is opened and read line by line; start, end, name and content are appended as a sublist to the list chats.
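
The same idea can be sketched with the timestamps converted to floats. This is only an illustration under assumptions: the function name parse_chat and the in-memory sample are hypothetical (the answer itself reads from read.txt), and every non-blank line is assumed to follow the "start end speaker: message" layout from the question.

```python
import io


def parse_chat(lines):
    """Parse lines shaped like '<start> <end> <speaker>: <message>'."""
    chats = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # maxsplit=2 keeps the message in one piece although it contains spaces.
        start, end, rest = line.split(maxsplit=2)
        # partition() cuts at the first colon only.
        speaker, _, content = rest.partition(":")
        chats.append((float(start), float(end), speaker, content.strip()))
    return chats


# In-memory stand-in for the file, using the sample lines from the question.
sample = io.StringIO("36.1 37.1 A: Hi, how are you?\n\n39.1 40.1 B: I am ok!\n")
chats = parse_chat(sample)
print(chats)
```

Because parse_chat accepts any iterable of lines, an open file object can be passed in the same way as the StringIO sample.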

2017-01-31

