Python comma separation: splitting a comma-separated string in Python

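If the string contains recursively nested expressions, you can split on the commas and use pyparsing to verify that the nested expressions are well-formed:

import pyparsing as pp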

def CommaSplit(txt):
    ''' Replicate the function of str.split(',') but do not split on nested expressions or in quoted strings'''
    com_lok = []
    comma = pp.Suppress(',')
    # note the location of each comma outside an ignored expression:
    comma.setParseAction(lambda s, lok, toks: com_lok.append(lok))
    ident = pp.Word(pp.alphas+"_", pp.alphanums+"_")       # python identifier
    ex1 = (ident + pp.nestedExpr(opener='<', closer='>'))  # Ignore everything inside nested '< >'
    ex2 = (ident + pp.nestedExpr())                        # Ignore everything inside nested '( )'
    ex3 = pp.Regex(r'("|\').*?\1')                         # Ignore everything inside "'" or '"'
    atom = ex1 | ex2 | ex3 | comma
    expr = pp.OneOrMore(atom) + pp.ZeroOrMore(comma + atom)
    try:
        result = expr.parseString(txt)
    except pp.ParseException:
        # nothing could be parsed: return the input unsplit
        return [txt]
    else:
        # slice txt between consecutive top-level comma positions
        return [txt[st:end] for st,end in zip([0]+[e+1 for e in com_lok],com_lok+[len(txt)])]

tests='''\
obj<1, 2, 3>, x(4, 5), "msg, with comma"
nestedobj<1, sub<6, 7>, 3>, nestedx(4, y(8, 9), 5), "msg, with comma"
nestedobj<1, sub<6, 7>, 3>, nestedx(4, y(8, 9), 5), 'msg, with comma', additional<1, sub<6, 7>, 3>
bare_comma<1, sub(6, 7), 3>, x(4, y(8, 9), 5), , 'msg, with comma', obj<1, sub<6, 7>, 3>
bad_close<1, sub<6, 7>, 3), x(4, y(8, 9), 5), 'msg, with comma', obj<1, sub<6, 7>, 3)
'''

for te in tests.splitlines():
    result = CommaSplit(te)
    print(te, '==>\n\t', result)

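Output: for each test line, this prints the line followed by the list of fields it was split into. If pyparsing raises a ParseException, which the bad_close line with its mismatched closers is meant to provoke, the whole line is returned unsplit as a single-element list.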
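The final list comprehension in CommaSplit is easier to follow in isolation. The sketch below assumes the offsets of the top-level commas have already been collected; here they are hard-coded for illustration rather than found by pyparsing, and slice_at is a hypothetical helper name that does not appear in the original code.

```python
def slice_at(txt, positions):
    """Cut txt at each offset in positions, dropping the character at that offset."""
    starts = [0] + [p + 1 for p in positions]  # each field starts just after a comma
    ends = positions + [len(txt)]              # each field ends at the next comma (or at the end)
    return [txt[st:end] for st, end in zip(starts, ends)]


txt = 'x(4, 5), "a, b"'
# Offset of the only top-level comma (assumed known); the commas inside
# ( ) and inside the quotes are deliberately not listed.
comma_positions = [7]
fields = slice_at(txt, comma_positions)
print(fields)
```

Pairing each start offset with the following end offset is what keeps the leading space of every field after the first: the slice begins one character past the comma, not past the whitespace that follows it.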

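The current behavior is just like '(something does not split), b, "in quotes", c'.split(','), including keeping the leading spaces and the quotes. Stripping the quotes and leading spaces from the fields is trivial.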

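Change the else under try to the following, where strip_fields is assumed to be a boolean flag (for example an extra keyword argument of CommaSplit, which the code above does not define):

    else:
        rtr = [txt[st:end] for st,end in zip([0]+[e+1 for e in com_lok],com_lok+[len(txt)])]
        if strip_fields:
            rtr=[e.strip().strip('\'"') for e in rtr]
        return rtr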
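To see what the stripping step does on its own, here is a minimal sketch applied to a hand-written list of fields of the kind CommaSplit returns. The sample list and the helper name strip_field_list are illustrative assumptions, not part of the original answer.

```python
def strip_field_list(fields):
    """Remove surrounding whitespace, then any leading/trailing quote characters."""
    return [f.strip().strip('\'"') for f in fields]


# Hand-written sample input (an assumption for illustration).
raw = ['obj<1, 2, 3>', ' x(4, 5)', ' "msg, with comma"']
cleaned = strip_field_list(raw)
print(cleaned)
```

One caveat of this approach: str.strip with a character set removes every leading and trailing character in that set, so a field whose content itself ends in a quote character, such as "say 'hi'", also loses that inner quote.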
