Python String Processing

1. Splitting a string on multiple delimiters:
>>> import re
>>> line = 'a f; a, f,a, f'
>>> re.split(r'[;,\s]\s*', line)  # a delimiter is a comma, a semicolon, or whitespace, followed by any amount of whitespace
['a', 'f', 'a', 'f', 'a', 'f']
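
As a small extension of the split above, wrapping the delimiter pattern in a capture group makes re.split() keep the matched delimiters in the result. This is only a sketch; the names `fields`, `values`, and `delimiters` are illustrative and not part of the original example.

```python
import re

line = 'a f; a, f,a, f'

# A capturing group keeps each matched delimiter in the result list.
fields = re.split(r'(;|,|\s)\s*', line)

values = fields[::2]       # even indexes hold the fields
delimiters = fields[1::2]  # odd indexes hold the delimiter that followed each field

print(values)
print(delimiters)
```

Keeping the delimiters is useful when the string has to be reassembled later with its original separators.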

2. Matching the start or end of a string:
>>> s = 'http://www.abcd.asp'
>>> s.endswith('.asp')
True
>>> s.startswith('http:')
True

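3. Checking whether a directory contains files of given types:
>>> import os
>>> any(name.endswith(('.txt', '.py')) for name in os.listdir(dirname))  # dirname: path of the directory to check
True  # or False, depending on the directory's contents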
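
The check above depends on a `dirname` that is not defined in the snippet. A self-contained sketch, assuming a temporary directory and a hypothetical helper named `has_file_type`, could look like this:

```python
import os
import tempfile

def has_file_type(dirname, suffixes):
    """Return True if any entry in dirname ends with one of the suffixes."""
    # str.endswith() accepts a tuple of suffixes, not a list.
    return any(name.endswith(tuple(suffixes)) for name in os.listdir(dirname))

with tempfile.TemporaryDirectory() as tmp:
    # Create one sample file so the check has something to find.
    open(os.path.join(tmp, 'notes.txt'), 'w').close()
    found_text = has_file_type(tmp, ['.txt', '.py'])
    found_csv = has_file_type(tmp, ['.csv'])

print(found_text, found_csv)
```

Converting to a tuple inside the helper avoids the TypeError that endswith() raises when given a list.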

4. Matching strings with shell wildcards:
>>> from fnmatch import fnmatch
>>> names = ['Dat1.csv', 'Dat2.csv', 'config.ini', 'foo.py']
>>> [name for name in names if fnmatch(name, 'Dat*.csv')]
['Dat1.csv', 'Dat2.csv']
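
One detail worth knowing: fnmatch() follows the case-sensitivity rules of the underlying operating system, while fnmatchcase() always compares case-sensitively. The sketch below also uses fnmatch.filter(), which does the same job as the list comprehension; the variable names are illustrative.

```python
from fnmatch import filter as fnfilter, fnmatchcase

names = ['Dat1.csv', 'Dat2.csv', 'config.ini', 'foo.py']

# fnmatch.filter() returns the names that match the pattern.
csv_files = fnfilter(names, 'Dat*.csv')

# fnmatchcase() compares case-sensitively regardless of the OS.
lower_match = fnmatchcase('Dat1.csv', 'dat*.csv')
exact_match = fnmatchcase('Dat1.csv', 'Dat*.csv')

print(csv_files, lower_match, exact_match)
```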

5. Matching and searching strings:
>>> text = 'yeah, but no, but yeah, but no, but yeah'
>>> text.startswith('yeah')
True
>>> text.endswith('yeah')
True
>>> text.find('no')  # index (counting from 0) of the first occurrence of 'no'
10
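# match() always tries to match at the start of the string;
# it returns a Match object on success and None otherwise
>>> date = '11/27/2012'
>>> import re
>>> bool(re.match(r'\d+/\d+/\d+', date))
True
>>> datepat = re.compile(r'\d+/\d+/\d+')
>>> bool(datepat.match(date))
True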
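# To find every occurrence of a pattern anywhere in a string, use findall() or finditer()
>>> text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'
>>> datepat = re.compile(r'\d+/\d+/\d+')
>>> datepat.findall(text)
['11/27/2012', '3/13/2013']
>>> datepat = re.compile(r'(\d+)/(\d+)/(\d+)')  # capture groups are required for m.groups()
>>> for m in datepat.finditer(text):
...     print(m.groups())
...
('11', '27', '2012')
('3', '13', '2013')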
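
To make the difference between the matching functions concrete, here is a short sketch comparing match(), search(), and fullmatch() on a date pattern with capture groups. The variable names are assumptions chosen for illustration.

```python
import re

datepat = re.compile(r'(\d+)/(\d+)/(\d+)')
text = 'Today is 11/27/2012.'

at_start = datepat.match(text)              # None: the text does not start with a date
anywhere = datepat.search(text)             # first date found anywhere in the text
whole = datepat.fullmatch('11/27/2012abc')  # None: trailing characters are left over

# groups() returns the captured month, day, and year as strings.
month, day, year = anywhere.groups()
print(at_start, anywhere.group(0), whole, (month, day, year))
```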

6. Replacing text in strings:
>>> text = 'yeah, but no, but yeah, but no, but yeah'
>>> text.replace('yeah', 'yep')
'yep, but no, but yep, but no, but yep'
>>> import re
>>> text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'
>>> re.sub(r'(\d+)/(\d+)/(\d+)', r'\3-\1-\2', text)
'Today is 2012-11-27. PyCon starts 2013-3-13.'
# Case-insensitive search and replace
>>> text = 'UPPER PYTHON, lower python, Mixed Python'
>>> re.findall('python', text, flags=re.IGNORECASE)
['PYTHON', 'python', 'Python']
>>> re.sub('python', 'snake', text, flags=re.IGNORECASE)
'UPPER snake, lower snake, Mixed snake'
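
For replacements that a fixed template cannot express, re.sub() and re.subn() also accept a function that receives each Match object. The sketch below is illustrative only: `MONTHS` and `change_date` are hypothetical names, and a hard-coded month table is used to avoid locale-dependent output.

```python
import re

# Hypothetical lookup table of abbreviated English month names.
MONTHS = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')

datepat = re.compile(r'(\d+)/(\d+)/(\d+)')

def change_date(m):
    # m is the Match object for one date; build the replacement from its groups.
    mon_name = MONTHS[int(m.group(1)) - 1]
    return '{} {} {}'.format(m.group(2), mon_name, m.group(3))

text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'

# subn() returns the new string together with the number of substitutions made.
new_text, count = datepat.subn(change_date, text)
print(new_text, count)
```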

7. Non-greedy (shortest) matching:
>>> text2 = 'Computer says "no." Phone says "yes."'
>>> str_pat = re.compile(r'\"(.*?)\"')
>>> str_pat.findall(text2)
['no.', 'yes.']

8. Matching across multiple lines:
>>> text2 = '''/* this is a
...  multiline comment */
... '''
>>> comment = re.compile(r'/\*(.*?)\*/', re.DOTALL)
>>> comment.findall(text2)
[' this is a\n multiline comment ']

9. Stripping unwanted characters from a string:
>>> s = ' hello world \n'
>>> s.strip()
'hello world'
>>> s.lstrip()
'hello world \n'
>>> s.rstrip()
' hello world'
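
The strip methods also accept a set of characters to remove, and they only ever touch the ends of the string. A brief sketch, with an input string made up for illustration, showing how inner whitespace can be collapsed separately:

```python
import re

t = '-----hello   world====='

# strip() with an argument removes any of those characters from both ends.
trimmed = t.strip('-=')

# strip() never touches the middle of the string; collapse inner runs with re.sub().
collapsed = re.sub(r'\s+', ' ', trimmed)

print(repr(trimmed), repr(collapsed))
```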

10. Aligning strings:
>>> text = 'Hello World'
>>> text.ljust(20)
'Hello World         '
>>> text.rjust(20)
'         Hello World'
>>> text.center(20)
'    Hello World     '
>>> text.rjust(20,'=')
'=========Hello World'
>>> text.center(20,'*')
'****Hello World*****'
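
The built-in format() function offers the same alignment through its format specification, and unlike ljust() and rjust() it also works on numbers. A minimal sketch with illustrative variable names:

```python
text = 'Hello World'

left = format(text, '<20')       # left-aligned in a field of width 20
right = format(text, '>20')      # right-aligned
centered = format(text, '*^20')  # centered, padded with '*'

# The same syntax applies to non-string values such as floats.
number = format(1.2345, '>10.2f')

print(repr(left), repr(right), repr(centered), repr(number))
```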

11. Joining and concatenating strings:
>>> parts = ['Is', 'Chicago', 'Not', 'Chicago?']
>>> ' '.join(parts)
'Is Chicago Not Chicago?'
>>> ','.join(parts)
'Is,Chicago,Not,Chicago?'
>>> ''.join(parts)
'IsChicagoNotChicago?'
>>> a = 'Is Chicago'
>>> b = 'Not Chicago?'
>>> a + ' ' + b
'Is Chicago Not Chicago?'
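
join() only accepts strings, so mixed data has to be converted first. A small sketch, using made-up sample data, that converts with a generator expression:

```python
data = ['ACME', 50, 91.1]

# Convert every item to str on the fly; join() raises TypeError on non-strings.
row = ','.join(str(d) for d in data)

print(row)
```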

12. Interpolating variables into strings:
>>> s = '{name} has {n} messages.'
>>> s.format(name='Guido', n=37)
'Guido has 37 messages.'
>>> name = 'Guido'
>>> n = 37
>>> '%(name)s has %(n)d messages.' % vars()
'Guido has 37 messages.'
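
Two related options are format_map(), which takes a mapping directly, and f-strings (Python 3.6+). The sketch below also shows one possible way to tolerate missing values; the `SafeSub` class is a hypothetical helper, not something defined in the examples above.

```python
name = 'Guido'
n = 37

class SafeSub(dict):
    """Hypothetical helper: leave unknown placeholders untouched."""
    def __missing__(self, key):
        return '{' + key + '}'

s = '{name} has {n} messages.'

from_map = s.format_map({'name': name, 'n': n})  # substitute from any mapping
partial = s.format_map(SafeSub(name=name))       # 'n' is missing and stays as '{n}'
f_string = f'{name} has {n} messages.'           # values are taken from the current scope

print(from_map, partial, f_string, sep='\n')
```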

13. Handling HTML and XML in strings:
>>> import html
>>> s = 'Elements are written as "text".'
>>> print(html.escape(s))
Elements are written as &quot;text&quot;.
>>> s = 'Spicy &quot;Jalape&#241;o&quot.'
>>> html.unescape(s)  # replaces HTMLParser().unescape(), which was removed in Python 3.9
'Spicy "Jalapeño".'
>>> t = 'The prompt is &gt;&gt;&gt;'
>>> from xml.sax.saxutils import unescape
>>> unescape(t)
'The prompt is >>>'
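
The same module also provides escape() for the opposite direction. A short round-trip sketch, with a sample string invented for illustration:

```python
from xml.sax.saxutils import escape, unescape

raw = 'The prompt is >>> and 1 < 2 & 3'

# escape() replaces &, <, and > with their XML entities.
escaped = escape(raw)

# unescape() reverses the substitution.
restored = unescape(escaped)

print(escaped)
print(restored)
```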