Commas are not split when Python generates a CSV file: Python cannot correctly split CSV data with quoted commas

3DSSQAS

import csv

def csv_split():
    raw = [
        '"1,2,3" , "4,5,6" , "456,789"',
        '"text":"a,b,c,d", "gate":"456,789"'
    ]
    # skipinitialspace=True ignores whitespace right after each delimiter
    cr = csv.reader(raw, skipinitialspace=True)
    for l in cr:
        print(len(l), l)

This function outputs the following:

3 ['1,2,3 ', '4,5,6 ', '456,789']

6 ['text:"a', 'b', 'c', 'd"', 'gate:"456', '789"']

As you can tell, the first line is correctly split into 3 entries. But the second line is NOT. I would expect the csv reader to split it into two; instead we get 6 here. I have also thought about regex approaches, but they assume some specific quoting dialect.

Basically what I want is: just split the string wherever there is a "," that is not enclosed in a pair of "".

Is there any quick and general way to do this? I have seen some regex hacks which assume that every field is ALWAYS quoted, etc. I think I can write a small loop that does this very inefficiently, but I would definitely appreciate some more expert advice. Thanks a lot!

Solution

CSV isn't a strictly standardized format (RFC 4180 is only an informational description), but it's common to escape quotation marks by doubling them ("") when they appear inside a quoted field (e.g. "text"":""a,b,c,d"). Python's CSV reader is doing the right thing here, because it assumes this convention. I'm not quite sure what you expect as output, but here is my attempt at a very simple CSV reader which might suit your format. Feel free to adapt it accordingly.

raw = [
    '"1,2,3" , "4,5,6" , "456,789"',
    '"text":"a,b,c,d", "gate":"456,789"',
    '1,2, 3,'
]

for line in raw:
    i, quoted, row = 0, False, []
    for j, c in enumerate(line):
        if c == ',' and not quoted:
            # an unquoted comma ends the current field
            row.append(line[i:j].strip())
            i = j + 1
        elif c == '"':
            # toggle the quoted state at every quotation mark
            quoted = not quoted
    # the last field runs from the last unquoted comma to the end of the line
    row.append(line[i:].strip())
    for i in range(len(row)):
        if len(row[i]) >= 2 and row[i][0] == '"' and row[i][-1] == '"':
            row[i] = row[i][1:-1]  # remove quotation marks
    print(row)

Output:

['1,2,3', '4,5,6', '456,789']

['text":"a,b,c,d', 'gate":"456,789']

['1', '2', '3', '']
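
For comparison, the doubled-quote convention mentioned in the answer can be checked with the standard csv module itself. The following is only a small sketch, not part of the original answer: the `fields` list is an illustrative assumption that reuses the strings from the question, and the code targets Python 3.

```python
import csv
import io

# Illustrative data: fields containing both commas and literal quotation marks.
fields = ['text":"a,b,c,d', 'gate":"456,789']

# csv.writer quotes such fields and doubles every embedded quotation mark.
buffer = io.StringIO()
csv.writer(buffer).writerow(fields)
encoded = buffer.getvalue()
print(encoded.rstrip("\r\n"))

# csv.reader undoes the escaping and recovers exactly two fields.
decoded = next(csv.reader([encoded]))
print(len(decoded), decoded)
```

If the input can be produced in this form, with embedded quotes doubled, csv.reader should split it into two fields without a hand-written loop. The loop above remains useful when the input format cannot be changed.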
