Python Hands-On Check-in: Day 6

Article link: https://blog.csdn.net/liang0502/article/details/125646101

Part 2: Python strings and regular expressions

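Strings are everywhere, and string processing is one of the most common operations. This chapter summarizes string handling in two groups: basic string operations, and advanced string operations with regular expressions. It currently contains 25 short examples.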

Reversing a string

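st='python'
# Method 1: reversed() yields the characters back to front; join glues them together
''.join(reversed(st)) # 'nohtyp'
# Method 2: a slice with step -1
st[::-1] # 'nohtyp'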

String slicing

# Replace multiples of 3 with 'java' and multiples of 5 with 'python'
[str("java"[i%3*4:]+"python"[i%5*6:] or i) for i in range(1,15)]
'''
['1',
 '2',
 'java',
 '4',
 'python',
 'java',
 '7',
 '8',
 'java',
 'python',
 '11',
 'java',
 '13',
 '14']
'''
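The one-liner works because slicing past the end of a string returns an empty string, and an empty string is falsy. Here is a minimal sketch that unpacks the same logic step by step. The function name `label` is made up for illustration, and the range is extended to 15 to show the case where both words appear.

```python
def label(i):
    # i % 3 is 0 only for multiples of 3, so the slice starts at 0 and keeps "java".
    # Otherwise the start is 4 or 8, which is past the end, so the slice is ''.
    java_part = "java"[i % 3 * 4:]
    # Same idea for multiples of 5: the start is 0, or 6/12/18/24 (past the end).
    python_part = "python"[i % 5 * 6:]
    # An empty string is falsy, so `or` falls back to the number itself.
    return str(java_part + python_part or i)

labels = [label(i) for i in range(1, 16)]
print(labels[2], labels[4], labels[14])  # java python javapython
```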

Joining strings with join

my=['1','2','java']
','.join(my) # join the strings with commas: '1,2,java'
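Note that join only accepts strings. If the list mixes numbers and strings, one common workaround is to convert each item first. This is a small sketch; the list `mixed` is an invented example.

```python
mixed = [1, 2, 'java']

# ','.join(mixed) would raise TypeError because 1 and 2 are not strings,
# so convert every item with str() before joining.
joined = ','.join(map(str, mixed))
print(joined)  # 1,2,java
```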

Byte length of a string

def str_byte_len(mystr):
    return (len(mystr.encode('utf-8')))
str_byte_len('i love python') # 13 (bytes)
str_byte_len('字符') # 6 (bytes): each of these characters takes 3 bytes in UTF-8
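The byte length depends on the encoding, not only on the text. The sketch below compares a few encodings for the same two-character string. The choice of `gbk` and `utf-16-le` is only for illustration and is not taken from the original example.

```python
text = '字符'

# Encode the same text with several codecs and record the byte counts.
sizes = {enc: len(text.encode(enc)) for enc in ('utf-8', 'gbk', 'utf-16-le')}
print(sizes)  # {'utf-8': 6, 'gbk': 4, 'utf-16-le': 4}
```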

The remaining examples cover regular expressions and require the re module:

import re

Finding the first match

s = 'i love python very much'
pat = 'python'
r = re.search(pat,s)
print(r.span()) #(7,13)
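re.search returns None when nothing matches, so calling .span() directly on the result can raise AttributeError. A defensive sketch follows; the helper name `find_span` is hypothetical.

```python
import re

def find_span(pattern, text):
    """Return the (start, end) span of the first match, or None if there is none."""
    m = re.search(pattern, text)
    return m.span() if m is not None else None

print(find_span('python', 'i love python very much'))  # (7, 13)
print(find_span('java', 'i love python very much'))    # None
```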

Finding the index of every 1

s = '山东省潍坊市青州第1中学高三1班'
pat = '1'
r = re.finditer(pat,s)
for i in r:
    print(i)
'''
<re.Match object; span=(9, 10), match='1'>
<re.Match object; span=(14, 15), match='1'>
'''
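If only the positions are needed rather than the match objects, the start of each match can be collected in a list comprehension. A minimal sketch using the same sample string:

```python
import re

s = '山东省潍坊市青州第1中学高三1班'

# m.start() is the index where each match begins.
indices = [m.start() for m in re.finditer('1', s)]
print(indices)  # [9, 14]
```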

\d matches digits: [0-9], plus other Unicode decimal digits unless re.ASCII is set

findall returns every match in the string

s = '一共20行代码运行时间13.59s'
pat = r'\d+' # + matches the preceding item one or more times (\d is the character class for digits)
r = re.findall(pat,s)
print(r)
'''
['20', '13', '59']
'''
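For str patterns in Python 3, \d is Unicode-aware, so it also matches digits such as full-width ones. The sketch below shows the difference the re.ASCII flag makes; the sample string is invented for illustration.

```python
import re

s = '１２３ and 45'  # the first number uses full-width digits

unicode_digits = re.findall(r'\d+', s)
ascii_digits = re.findall(r'\d+', s, re.ASCII)

print(unicode_digits)  # ['１２３', '45']
print(ascii_digits)    # ['45']
```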

Matching floats and integers

? matches the preceding character zero or one time

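s = '一共20行代码运行时间13.59s'
pat = r'\d+\.?\d+' # ? matches the decimal point (\.) zero or one time; this pattern has a small bug: it needs at least two digits, so it cannot match a single-digit integer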
r = re.findall(pat,s)
print(r)
'''
['20', '13.59']
'''

A better pattern:

pat = r'\d+\.\d+|\d+' # A|B: B is tried only if A fails to match; result: ['20', '13.59']
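To see the single-digit bug concretely, the two patterns can be run side by side on a string that contains a one-digit integer. The sample string below is made up for this comparison.

```python
import re

s = '5 apples cost 3.5 yuan, 12 pears cost 20.75 yuan'

naive = re.findall(r'\d+\.?\d+', s)      # requires at least two digits overall
better = re.findall(r'\d+\.\d+|\d+', s)  # float first, then plain integer

print(naive)   # ['3.5', '12', '20.75']
print(better)  # ['5', '3.5', '12', '20.75']
```

The order of the alternatives matters: if \d+ came first, '3.5' would be split into '3' and '5'.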

^ matches the start of the string

s = 'This module provides regular expression matching operations similar to those found in Perl'
pat = r'^[emrt]' # match a string that starts with e, m, r or t
r = re.findall(pat,s)
print(r)
'''
[]
The result is empty because the string starts with T, which is not in the set [emrt].
'''

s2 = 'email for me is guozhennianhua@163.com'
re.findall(r'^[emrt].*',s2) # match a string that starts with e, m, r or t, followed by any number of characters
'''
['email for me is guozhennianhua@163.com']
'''
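By default ^ anchors only at the very start of the string. With the re.MULTILINE flag it also matches right after each newline. A short sketch with an invented three-line string:

```python
import re

text = 'email me\nsend it today\ntomorrow is fine'

first_only = re.findall(r'^[emrt]\w*', text)
each_line = re.findall(r'^[emrt]\w*', text, re.MULTILINE)

print(first_only)  # ['email']
print(each_line)   # ['email', 'tomorrow']
```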

re.I ignores case

s = 'That'
pat = r't'
r = re.findall(pat,s,re.I) # ['T', 't']
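The same effect can be had with the long flag name re.IGNORECASE or with the inline flag (?i) at the start of the pattern. A minimal sketch comparing the variants:

```python
import re

s = 'That'

default = re.findall(r't', s)                    # case-sensitive
inline = re.findall(r'(?i)t', s)                 # inline flag inside the pattern
compiled = re.compile(r't', re.IGNORECASE).findall(s)

print(default)   # ['t']
print(inline)    # ['T', 't']
print(compiled)  # ['T', 't']
```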

Understanding what compile does

If the same pattern will be matched many times, compile it first:

import re
pat = re.compile(r'\W+') # \W matches any character that is not a letter, digit, or underscore
has_special_chars = pat.search('ed#2@edc')
if has_special_chars:
    print(f'str contains special characters:{has_special_chars.group(0)}')
'''
str contains special characters:#
'''

### Reuse the compiled pat object for another match
again_pattern = pat.findall('guozhennianhua@163.com')
if '@' in again_pattern:
    print('possibly it is an email')
'''
possibly it is an email
'''
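A compiled pattern is convenient when the same check runs over many inputs. The sketch below keeps one compiled object at module level and reuses it in a helper. The names `SPECIAL`, `has_special`, and the sample list are assumptions made for illustration.

```python
import re

# Compiled once, then reused for every candidate string.
SPECIAL = re.compile(r'\W')

def has_special(text):
    """Return True if text contains a character other than a letter, digit, or underscore."""
    return SPECIAL.search(text) is not None

names = ['abc123', 'snake_case', 'a-b', 'x@y']
flagged = [n for n in names if has_special(n)]
print(flagged)  # ['a-b', 'x@y']
```

Note that the underscore counts as a word character, so 'snake_case' is not flagged.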

Capturing words with () so that the spaces are left out

s = 'This module provides regular expression matching operations similar to those found in Perl'
pat = r'\s([a-zA-Z]+)'
r = re.findall(pat,s)
print(r)
'''
['module', 'provides', 'regular', 'expression', 'matching', 'operations', 'similar', 'to', 'those', 'found', 'in', 'Perl']
'''

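Notice that the extracted words do not include the first word, because nothing precedes it for \s to match. Adding ? makes the preceding whitespace optional (zero or one occurrence). Be careful, though: when ? follows another quantifier (as in *? or +?), it switches that quantifier from greedy to non-greedy matching.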

s = 'This module provides regular expression matching operations similar to those found in Perl'
pat = r'\s?([a-zA-Z]+)'
r = re.findall(pat,s)
print(r)
'''
['This', 'module', 'provides', 'regular', 'expression', 'matching', 'operations', 'similar', 'to', 'those', 'found', 'in', 'Perl']
'''
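The two roles of ? are easy to confuse, so here is a small sketch of the greedy versus non-greedy case. The tag-like sample string is invented for illustration.

```python
import re

html = '<a><b>'

greedy = re.findall(r'<.+>', html)   # .+ grabs as much as it can
lazy = re.findall(r'<.+?>', html)    # .+? stops at the first possible '>'

print(greedy)  # ['<a><b>']
print(lazy)    # ['<a>', '<b>']
```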

Splitting words with split

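The approaches above are not a concise way to split words; they are for demonstration only. The simplest way to split words is the split function.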

s = 'This module provides regular expression matching operations similar to those found in Perl'
pat = r'\s+'
r = re.split(pat,s)
print(r)
'''
['This', 'module', 'provides', 'regular', 'expression', 'matching', 'operations', 'similar', 'to', 'those', 'found', 'in', 'Perl']
'''

### The sentence above can also be split with the split method built into str:
s.split(' ') # split on a single space
'''
['This',
 'module',
 'provides',
 'regular',
 'expression',
 'matching',
 'operations',
 'similar',
 'to',
 'those',
 'found',
 'in',
 'Perl']
'''

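### However, when the delimiters are more complex, str.split is not enough and a regular expression is needed
s = 'This,,, module ; \t provides|| regular ; '
words = re.split(r'[,\s;|]+',s) # splitting this way leaves an empty string at the end
words = [i for i in words if len(i)>0] # ['This', 'module', 'provides', 'regular']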
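An alternative that avoids the trailing empty string is to match the words themselves instead of splitting on the delimiters. The sketch below also shows how str.split behaves with and without an explicit separator when spaces repeat; the short string 'a  b' is an invented example.

```python
import re

s = 'This,,, module ; \t provides|| regular ; '

# Match runs of characters that are NOT delimiters, so no empty strings appear.
words = re.findall(r'[^,\s;|]+', s)
print(words)  # ['This', 'module', 'provides', 'regular']

# str.split(' ') keeps empty strings between repeated spaces;
# str.split() with no argument collapses any run of whitespace.
with_sep = 'a  b'.split(' ')
no_sep = 'a  b'.split()
print(with_sep)  # ['a', '', 'b']
print(no_sep)    # ['a', 'b']
```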