I apologize for raising this question again, but it still isn't solved.
It's not a very complicated problem, and I'm sure it's fairly straightforward, but I simply can't see it.
The code I use to parse the XML file opens and reads it in exactly the format I want — the print statement in the last for-loop proves this.
For example, it outputs the following:

Pivoting support handle D0584129 20090106 US
Hinge D0584130 20090106 US
Deadbolt turnpiece D0584131 20090106 US
This is exactly how I want my data written to the CSV file. However, when I try to write these as rows to the CSV itself, it only prints a single row, from the last entry in the XML file, like this: Flashlight package,D0584138,20090106,US
Below is my entire code, since it may help in understanding the whole process; the area of interest starts at the for xml_string in separated_xml(infile): line:

from bs4 import BeautifulSoup
import csv
import unicodecsv as csv

infile = "C:\\Users\\Grisha\\Documents\\Inventor\\2009_Data\\Jan\\ipg090106.xml"

# separated_xml breaks the XML apart from each root element (<?xml ...) to the next occurrence of it
def separated_xml(infile):
    file = open(infile, "r")       # Open the xml file
    buffer = [file.readline()]     # Read each line and place it in a list
    # This for-loop slices the USPTO file into sections that can be read and parsed individually.
    # It is necessary because Python expects only one root element per document, but this element
    # appears many times in each file, which causes reading errors.
    for line in file:              # Scan the opened file for root elements
        if line.startswith("<?xml "):
            yield "".join(buffer)  # "yield" emits one document per root element; .join connects the list "buffer" into a single string
            buffer = []            # Start a blank list for the next 'set' of data, beginning with the root element
        buffer.append(line)        # Pass lines into the list
    yield "".join(buffer)          # Output the final document
    file.close()

# The second, nested set of for-loops parses the reformatted data into a new list
for xml_string in separated_xml(infile):      # Iterate over each separated document
    soup = BeautifulSoup(xml_string, "lxml")  # BeautifulSoup parses the string, converting the XML to Unicode
    pub_ref = soup.findAll("publication-reference")  # Begin parsing at every instance of a publication
    lst = []                                  # Empty list to append into
    with open('./output.csv', 'wb') as f:
        writer = csv.writer(f, dialect='excel')
        for info in pub_ref:                  # Loop over all instances of publication
            # The final loop finds every instance of invention name, patent number, date, and country
            for inv_name, pat_num, date_num, country in zip(soup.findAll("invention-title"), soup.findAll("doc-number"), soup.findAll("date"), soup.findAll("country")):
                print(inv_name.text, pat_num.text, date_num.text, country.text)
                lst.append((inv_name.text, pat_num.text, date_num.text, country.text))
                writer.writerow([inv_name.text, pat_num.text, date_num.text, country.text])
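As a sanity check, the splitting generator itself does behave as described. Here is the same idea applied to an in-memory string (a sketch; my separated_xml above reads from a file path, so this uses an io.StringIO stand-in and a made-up two-document sample):

```python
import io

def separated_xml_from(handle):
    # Same splitting idea as separated_xml above, but taking any file-like object:
    # start a new document whenever a line begins with an <?xml declaration.
    buffer = [handle.readline()]
    for line in handle:
        if line.startswith("<?xml "):
            yield "".join(buffer)
            buffer = []
        buffer.append(line)
    yield "".join(buffer)

sample = (
    '<?xml version="1.0"?>\n<a>first</a>\n'
    '<?xml version="1.0"?>\n<a>second</a>\n'
)
docs = list(separated_xml_from(io.StringIO(sample)))
print(len(docs))  # 2 documents, one per <?xml declaration
```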
I have also tried placing the open and the writer outside the for-loop to check where things go wrong, but no luck. I know the file is being written one row at a time and the same row is being rewritten over and over (which is why only one row is left in the CSV file); I just can't see where.
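To show the symptom without needing the USPTO XML file, here is a minimal, self-contained reproduction of the same writing pattern (made-up batch data, and Python 3's built-in csv module for simplicity instead of unicodecsv):

```python
import csv
import os
import tempfile

# Stand-in for the per-document rows my outer loop produces.
batches = [
    [("Pivoting support handle", "D0584129")],
    [("Hinge", "D0584130")],
    [("Flashlight package", "D0584138")],
]

path = os.path.join(tempfile.mkdtemp(), "output.csv")
for batch in batches:
    # Re-opened in write mode on every pass of the outer loop,
    # just like the with-open inside my for xml_string loop above.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, dialect="excel")
        for row in batch:
            writer.writerow(row)

with open(path, newline="") as f:
    rows = list(csv.reader(f))
print(rows)  # only the last batch survives: [['Flashlight package', 'D0584138']]
```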
Thanks in advance for your help.