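When splitting a string with a regular expression, one option is to use \s to match whitespace characters:
re.split(r'[;,\s]', line)  # \s matches any whitespace character (space, tab, form feed, and so on); for ASCII text it is equivalent to [ \f\n\r\t\v]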
Written this way, however, splitting the following string gives this result:
line = 'asdf fjhk; ijhi, acdks,khcvds, foo'
re.split(r'[;,\s]', line)
>>['asdf', 'fjhk', '', 'ijhi', '', 'acdks', 'khcvds', '', 'foo']
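The empty strings appear because the character class matches exactly one delimiter character at a time, so a ; or , followed by a space counts as two separate delimiters with nothing between them. Letting the pattern also consume any whitespace that follows the delimiter gives a cleaner result: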
line = 'asdf fjhk; ijhi, acdks,khcvds, foo'
re.split(r'[;,\s]\s*', line)
>>['asdf', 'fjhk', 'ijhi', 'acdks', 'khcvds', 'foo']
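
For comparison, here is a small self-contained sketch that runs both patterns on the same string. It also tries a third pattern, [;,\s]+, which is not part of the original note and is included only as a possible alternative; the variable names naive, fixed and run are made up for illustration.

```python
import re

line = 'asdf fjhk; ijhi, acdks,khcvds, foo'

# One delimiter character per match: "; " and ", " each leave an empty field.
naive = re.split(r'[;,\s]', line)

# One delimiter plus any whitespace that follows it, as in the fix above.
fixed = re.split(r'[;,\s]\s*', line)

# Alternative (an assumption, not from the original note): treat any run of
# delimiter characters as a single separator.
run = re.split(r'[;,\s]+', line)

print(naive)
print(fixed)
print(run)
```

On this input the last two patterns produce the same list, but they are not interchangeable: [;,\s]\s* still returns an empty field for input such as 'a,,b', while [;,\s]+ collapses the two commas into one separator. Which behaviour is preferable depends on whether empty fields carry meaning in the data.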