Python CSV: commas, single quotes and double quotes inside a column

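I am trying to write a CSV file with DictWriter, but a column value like this:

2,2',2"-(hexahydro-1,3,5-triazine-1,3,5-triyl)triethanol|1,3,5-tris(2-hydroxyethyl)hexahydro-1,3,5-triazine

breaks everything. The header is: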

"#","Index no.","EC / List no.","CAS no.","Name","Page ID","Link"

The value above belongs in the Name column, but this is what I get when I try to write the row:

OrderedDict([('\ufeff "#"', '756'), ('Index no.', '613-114-00-6'),
             ('EC / List no.', '225-208-0'), ('CAS no.', '4719-04-4'),
             # most of the following should be the value of 'Name'
             # 'Page ID' should be '122039' and 'Link' should be the 'https...' text
             ('Name', "2,2',2-(hexahydro-1"), ('Page ID', '3'),
             ('Link', '5-triazine-1'),
             (None, ['3', '5-triyl)triethanol|1', '3',
                     '5-tris(2-hydroxyethyl)hexahydro-1', '3',
                     '5-triazine"', '122039',
                     'https://echa.europa.eu/information-on-chemicals/cl-inventory-database/-/discli/details/122039'])])

I have tried every possible combination of the DictWriter parameters

quotechar='"', doublequote=False, delimiter=',', quoting=csv.QUOTE_ALL, skipinitialspace=True, escapechar='\\'

and nothing helped.

Minimal, complete and verifiable example

old.csv

"#","Index no.","EC / List no.","CAS no.","Name","Page ID"

"756","613-114-00-6","225-208-0","4719-04-4","2,2',2"-(hexahydro-1,3,5-triazine-1,3,5-triyl)triethanol|1,3,5-tris(2-hydroxyethyl)hexahydro-1,3,5-triazine","122039"

Code:

import csv

with open('old.csv') as f, open('new.csv', 'w') as ff:
    reader = csv.DictReader(f)
    result = csv.DictWriter(ff, fieldnames=reader.fieldnames)
    for line in reader:
        result.writerow(line)

Solution

Your old.csv is malformed: the " inside the Name field is neither escaped nor doubled:

"756","613-114-00-6","225-208-0","4719-04-4","2,2',2"-(hexahydro-1,3,5-triazine-1,3,5-triyl)triethanol|1,3,5-tris(2-hydroxyethyl)hexahydro-1,3,5-triazine","122039"

----------------------------------------------------^ here is the not escaped "
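To see why this single character matters, the following sketch feeds a shortened copy of the malformed row to csv.reader with default settings. The shortened row is an assumption made for readability; the stray " sits in the same place as in the real data. The reader takes the stray " as the closing quote of the field, so every later comma in the chemical name becomes a field separator.

```python
import csv

# Shortened, hypothetical copy of the malformed row: the `"` after
# `2,2',2` is neither doubled nor escaped.
bad = '"756","4719-04-4","2,2\',2"-(hexahydro-1,3,5-triazine)","122039"'

row = next(csv.reader([bad]))

print(len(row))  # more fields than the 4 that were intended
for field in row:
    print(repr(field))
```

The third field stops at the first comma after the stray quote, which matches the truncated 'Name' value shown in the question.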

The line should look like this:

"756","613-114-00-6","225-208-0","4719-04-4","2,2',2\"-(hexahydro-1,3,5-triazine-1,3,5-triyl)triethanol|1,3,5-tris(2-hydroxyethyl)hexahydro-1,3,5-triazine","122039","https://echa.europa.eu/information-on-chemicals/cl-inventory-database/-/discli/details/122039"

----------------------------------------------------^^ escaped "
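If the source file cannot be regenerated, one possible repair is sketched below. It rests on two assumptions that are not guaranteed by the original data: every field is wrapped in double quotes, and the three-character sequence `","` never occurs inside a field. The helper name split_fully_quoted and the shortened sample row are hypothetical. Under those assumptions the raw line can be split by hand and rewritten with proper escaping.

```python
import csv
import io

def split_fully_quoted(line):
    """Split a line whose fields are all quoted, assuming `","` only
    ever appears between fields (a heuristic, not a real CSV parser)."""
    body = line.rstrip("\r\n")
    if not (body.startswith('"') and body.endswith('"')):
        raise ValueError("expected a fully quoted line")
    return body[1:-1].split('","')

# Shortened, hypothetical copy of the malformed row.
raw = '"756","4719-04-4","2,2\',2"-(hexahydro-1,3,5-triazine)","122039"\n'
fields = split_fully_quoted(raw)

# Rewrite the row with the inner quote escaped by a backslash.
fmt = dict(quoting=csv.QUOTE_ALL, doublequote=False, escapechar="\\")
out = io.StringIO()
csv.writer(out, lineterminator="\n", **fmt).writerow(fields)
repaired = out.getvalue()

print(fields)
print(repaired)
```

The repaired text can be read back with the same doublequote=False and escapechar='\\' settings. The heuristic fails as soon as a field legitimately contains `","`, so it is worth checking the field count of every repaired row against the header.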

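Using doublequote=True would instead require the " inside the field to be doubled: "tata""tata" for tata"tata. Your source data does neither: the quote is not doubled and not escaped.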
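The two conventions can be compared side by side. This is a small sketch that writes the same value to in-memory buffers; the helper write_one is hypothetical and only wraps csv.writer.

```python
import csv
import io

def write_one(value, **fmt):
    """Write a single one-field row and return the resulting CSV text."""
    buf = io.StringIO()
    csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n", **fmt).writerow([value])
    return buf.getvalue()

# Default dialect: doublequote=True, the inner quote is doubled.
doubled = write_one('tata"tata')

# doublequote=False needs an escapechar, the inner quote is escaped.
escaped = write_one('tata"tata', doublequote=False, escapechar="\\")

print(doubled, end="")  # "tata""tata"
print(escaped, end="")  # "tata\"tata"
```

Whichever convention is used for writing, the reader has to be configured the same way to get tata"tata back.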

This works as expected:

from collections import OrderedDict
import csv

fieldn = ["#", "Index no.", "EC / List no.", "CAS no.", "Name", "Page ID", "Link"]

od = OrderedDict([
    ('#', '756'), ('Index no.', '613-114-00-6'),
    ('EC / List no.', '225-208-0'), ('CAS no.', '4719-04-4'),
    ('Name', '''2,2',2"-(hexahydro-1,3,5-triazine-1,3,5-triyl)triethanol|1,3,5-tris(2-hydroxyethyl)hexahydro-1,3,5-triazine'''),
    ('Page ID', '122039'),
    ('Link', 'https://echa.europa.eu/information-on-chemicals/cl-inventory-database/-/discli/details/122039')])

print(od)  # see: Input to writer:

# write the ordered dict
with open("file.txt", "w", newline="") as f:
    writer = csv.DictWriter(f, quotechar='"', doublequote=False, delimiter=',',
                            quoting=csv.QUOTE_ALL, skipinitialspace=True,
                            escapechar='\\', fieldnames=fieldn)
    writer.writeheader()  # remove if you do not want the header in as well
    writer.writerow(od)

# read it back in and print it
with open("file.txt") as r:
    reader = csv.DictReader(r, quotechar='"', doublequote=False, delimiter=',',
                            quoting=csv.QUOTE_ALL, skipinitialspace=True,
                            escapechar='\\', fieldnames=fieldn)
    for row in reader:
        print(row)  # see: Output after reading in written stuff

Input to writer:

OrderedDict([('#', '756'), ('Index no.', '613-114-00-6'), ('EC / List no.', '225-208-0'), ('CAS no.', '4719-04-4'), ('Name', '2,2\',2"-(hexahydro-1,3,5-triazine-1,3,5-triyl)triethanol|1,3,5-tris(2-hydroxyethyl)hexahydro-1,3,5-triazine'), ('Page ID', '122039'), ('Link', 'https://echa.europa.eu/information-on-chemicals/cl-inventory-database/-/discli/details/122039')])

Output after reading in the written file (the header is written too, hence two rows of output):

OrderedDict([('#', '#'), ('Index no.', 'Index no.'), ('EC / List no.', 'EC / List no.'), ('CAS no.', 'CAS no.'), ('Name', 'Name'), ('Page ID', 'Page ID'), ('Link', 'Link')])

OrderedDict([('#', '756'), ('Index no.', '613-114-00-6'), ('EC / List no.', '225-208-0'), ('CAS no.', '4719-04-4'), ('Name', '2,2\',2"-(hexahydro-1,3,5-triazine-1,3,5-triyl)triethanol|1,3,5-tris(2-hydroxyethyl)hexahydro-1,3,5-triazine'), ('Page ID', '122039'), ('Link', 'https://echa.europa.eu/information-on-chemicals/cl-inventory-database/-/discli/details/122039')])
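The header comes back as a data row because fieldnames is passed to DictReader, which then treats the first line as ordinary data. A minimal sketch of the alternative, assuming a shortened in-memory copy of the file (the three-column sample is hypothetical): when fieldnames is omitted, DictReader consumes the first line as the header.

```python
import csv
import io

# Shortened, hypothetical copy of the written file, using backslash escaping.
text = (
    '"#","Name","Page ID"\n'
    '"756","2,2\',2\\"-(hexahydro-1,3,5-triazine)","122039"\n'
)

# No fieldnames argument: the first line supplies the keys.
reader = csv.DictReader(io.StringIO(text), doublequote=False, escapechar="\\")
rows = [dict(row) for row in reader]

print(reader.fieldnames)
print(rows)
```

With this variant only the data row is returned, and the Name value contains both the single and the double quote intact.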

File contents:

"#","Index no.","EC / List no.","CAS no.","Name","Page ID","Link"

"756","613-114-00-6","225-208-0","4719-04-4","2,2',2\"-(hexahydro-1,3,5-triazine-1,3,5-triyl)triethanol|1,3,5-tris(2-hydroxyethyl)hexahydro-1,3,5-triazine","122039","https://echa.europa.eu/information-on-chemicals/cl-inventory-database/-/discli/details/122039"
