Article link: https://blog.csdn.net/Zkang520/article/details/109693084

Strings and Regular Expressions

I. Fill in the Blanks
1. The expression 'abc' in 'abcdefg' is True, and 'abc' in ['abcdefg'] is False.
2. The Python statement ''.join(list('hello world!')) evaluates to 'hello world!'.
3. Given the list x = ['11', '2', '3'], max(x) is '3', min(x) is '11', and max(x, key=len) is '11'.
4. Given path = r'c:\test.html', the expression path[:-4]+'htm' is r'c:\test.htm'.
5. str([1, 2, 3]) is '[1, 2, 3]', and list(str([1,2,3])) == [1,2,3] is False.
6. '%c'%65 is 'A', '%s'%65 is '65', and '%d,%c' % (65, 65) is '65,A'.
7. 'The first:{1}, the second is {0}'.format(65,97) is 'The first:97, the second is 65'.
8. '{0:#d},{0:#X},{0:#o}'.format(65) is '65,0X41,0o101'.
9. isinstance('abcdefg', str) is True.
10. 'abcabcabc'.rindex('abc') is 6, 'abcabcabc'.count('abc') is 3, and 'I like Python'.rfind('python') is -1.

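x = "apple,peach,apple,pear"  # sample string, chosen here only to make the calls runnable
print("find(): first index of 'apple':", x.find("apple"))
print("rfind(): last index of 'apple':", x.rfind("apple"))
print("find() returns -1 when nothing is found:", x.find("xxx"))
print("index(): first index of 'apple':", x.index("apple"))
print("rindex(): last index of 'apple':", x.rindex("apple"))
try:
    x.index("xxx")
except ValueError as err:
    print("index() raises ValueError when nothing is found:", err)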

11. 'apple.peach,banana,pear'.find('p') is 1, and 'apple.peach,banana,pear'.find('ppp') is -1.
12. ':'.join('1,2,3,4,5'.split(',')) is '1:2:3:4:5'.
13. ','.join('a b ccc\n\n\nddd '.split()) is 'a,b,ccc,ddd'.
14. 'Hello world'.upper() is 'HELLO WORLD', 'Hello world'.lower() is 'hello world', and 'Hello world'.swapcase() is 'hELLO WORLD'.
15. Given s = r'c:\windows\notepad.exe', s.endswith(('.com', '.exe')) is True and s.startswith('c:') is True.
16. len('Hello world!'.ljust(20)) is 20, and len('abcdefg'.rjust(3)) is 7.

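S.ljust(width[, fillchar])
# Returns a string of length width with S left-aligned; the remainder is padded with fillchar, a space by default.
# If S is already at least width characters long, it is returned unchanged.
S.rjust(width[, fillchar])  # right-aligned
S.center(width[, fillchar])  # centered
S.zfill(width)  # right-aligns S in a string of length width, padding on the left with '0'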
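
As a quick check of the notes above, the following minimal sketch applies each padding method to a sample string. The sample values and the fill character '*' are arbitrary choices made for illustration.

```python
# Sample string chosen for illustration only.
s = 'abc'

left = s.ljust(7, '*')          # pad on the right
right = s.rjust(7, '*')         # pad on the left
middle = s.center(7, '*')       # pad on both sides
zeros = '42'.zfill(5)           # pad on the left with '0'
signed = '-42'.zfill(5)         # the sign stays in front of the zeros
unchanged = 'abcdefg'.rjust(3)  # width is smaller than len(s): no padding

for value in (left, right, middle, zeros, signed, unchanged):
    print(repr(value), len(value))
```

The last call mirrors question 16: a width smaller than the string length never truncates the string.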

17. Given x = '123' and y = '456', x + y is '123456', eval(x+y) is 123456, and eval('x+y') is '123456'. If instead x = 123 and y = 456, eval('x+y') is 579.
18. 'abc'.partition('a') is ('', 'a', 'bc'), and 'a'.join('abc'.partition('a')) is 'aaabc'.
19. Assuming the re module has been imported, re.split(r'\.+', 'alpha.beta...gamma...delta') is ['alpha', 'beta', 'gamma', 'delta'].
20. Given x = 'a234b123c' and the re module imported, re.split(r'\d+', x) is ['a', 'b', 'c'].
21. ''.join('asdssfff'.split('sd')) is 'assfff'.
22. ''.join(re.split('[sd]', 'asdssfff')) is 'afff'.
23. Assuming the re module has been imported, re.findall(r'(\d)\1+', '33abcd112') is ['3', '1'].
24. The statement print(re.match('abc', 'adeabcfg')) prints None.
25. The statement print(re.match('^[a-zA-Z]+$', 'abcDEFG000')) prints None.
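26. Prefixing a string literal with a lowercase r or an uppercase R makes it a raw string, in which backslashes are not treated as escape characters.
27. In a regular expression, when the character ? immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), matching becomes non-greedy: the pattern matches the shortest string that satisfies it.
28. Assuming the re module has been imported, re.sub(r'\d+', '1', 'a12345bb67c890d0e') is 'a1bb1c1d1e'.
29. UTF-8 uses 3 bytes for a commonly used Chinese character and 1 byte for an English letter; GBK uses 2 bytes for a Chinese character and 1 byte for an English letter.
30. len('中国China') is 7, len('中国China'.encode()) is 11, len('中国China'.encode('utf-8')) is 11, and len('中国China'.encode('gbk')) is 9.
31. Given table = ''.maketrans('abcdew', 'xyzstc'), 'Hellow world'.translate(table) is 'Htlloc corls'.
32. Given the string x = 'hello world', after executing x.replace('hello', 'hi') the value of x is still 'hello world'.
33. The regex metacharacter + means the preceding character or subpattern occurs one or more times; the metacharacter * means it occurs zero or more times.
34. Given x = 'a b c d', ','.join(x.split()) is 'a,b,c,d'.
35. 'abcab'.strip('ab') is 'c'.
36. Assuming the standard library math has been imported, eval('math.sqrt(4)') is 2.0, and eval('[1, 2, 3]') is [1, 2, 3].
37. 'abc10'.isalnum() is True, 'abc10'.isalpha() is False, and 'abc10'.isdigit() is False.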

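isalnum() method
Description: checks whether the string consists only of letters and digits.
Syntax:
str.isalnum()
isalpha() method
Syntax:
str.isalpha()
Returns:
True if the string has at least one character and all characters are letters; otherwise False.
isdigit() method
Syntax:
str.isdigit()
Returns:
True if the string has at least one character and all characters are digits; otherwise False.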
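
To compare the three predicates side by side, here is a small sketch that runs them over a handful of sample strings. The samples are made up for illustration; note that a space or an empty string makes all three return False.

```python
# Sample inputs chosen for illustration only.
samples = ['abc10', 'abc', '123', 'abc 10', '']

# Map each sample to (isalnum, isalpha, isdigit).
results = {
    text: (text.isalnum(), text.isalpha(), text.isdigit())
    for text in samples
}

for text, flags in results.items():
    print(repr(text), flags)
```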

38. 'aaasdf'.lstrip('as') is 'df', 'aaasdf'.lstrip('af') is 'sdf', 'aaasdf'.strip('af') is 'sd', 'aaasdf'.rstrip('af') is 'aaasd', and 'aaaassddf'.strip('afds') is ''.
39. Given formatter = 'good {0}'.format, list(map(formatter, ['morning', 'noon'])) is ['good morning', 'good noon'].
40. ':'.join('a b c d'.split(maxsplit=2)) is 'a:b:c d'.
41. Given x = 'hello world', x.replace('l', 'g') is 'heggo worgd'.
42. Assuming the standard library string has been imported, len(string.digits) is 10.
43. Given x = 'aa b ccc dddd', ''.join([v for i, v in enumerate(x[:-1]) if v == x[i+1]]) is 'accddd'.
44. Assuming the re module has been imported, re.findall(r'\d+?', 'abcd1234') is ['1', '2', '3', '4'], and re.findall(r'\d+', 'abcd1234') is ['1234'].
45. Assuming the re module has been imported, re.sub(r'(.\s)\1+', r'\1', 'a a a a a bb') is 'a bb'.
46. eval('*'.join(map(str, range(1, 6)))) is 120.
47. In the regular expression module re, compile() compiles a pattern into a regular expression object, match() matches a pattern at the beginning of the string, and search() and findall() look for the pattern anywhere in the string.
48. re.search(r'\w?(?P<f>\b\w+\b)\s+(?P=f)\w*?', 'Beautiful is is better than ugly.').group(0) is 'is is'.
49. 'Beautiful is better than ugly.'.startswith('Be', 5) is False.
50. Given the dictionary x = {i:str(i+3) for i in range(3)}, ''.join(x.values()) is '345'.
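
Questions 27 and 44 both turn on greedy versus non-greedy quantifiers. The sketch below contrasts the two on the string from question 44 and on a made-up snippet of markup (the `markup` value is an assumption for illustration, not part of the exercises).

```python
import re

# Made-up sample text for illustration only.
markup = '<b>bold</b> and <i>italic</i>'

# Greedy: .+ grabs as much as possible, so one match spans the whole string.
greedy = re.findall(r'<.+>', markup)
# Non-greedy: .+? stops at the first '>' that lets the match succeed.
lazy = re.findall(r'<.+?>', markup)

# The same contrast with digits, as in question 44.
digits_greedy = re.findall(r'\d+', 'abcd1234')
digits_lazy = re.findall(r'\d+?', 'abcd1234')

print(greedy)
print(lazy)
print(digits_greedy, digits_lazy)
```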
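II. True or False
1. In UTF-8, one Chinese character takes 3 bytes. (True)
2. In the GBK and CP936 encodings, one Chinese character takes 2 bytes. (True)
3. In Python, strings of any length are subject to interning. (False)
4. The Python operator % can be used to format strings as well as to compute remainders. (True)
5. The Python string method replace() modifies the string in place. (False)
6. When a large number of strings must be concatenated into one, the string method join() is more efficient than the + operator. (True)
7. In the regular expression module re, match() matches a pattern at the beginning of the string, while search() looks for the pattern anywhere in the string; both return a match object on success and None on failure. (True)
8. Given that x is a non-empty string, ''.join(x.split()) == x is always True. (False)
9. Given that x is a non-empty string, ','.join(x.split(',')) == x is always True. (True)
10. The match() method of a compiled regular expression object can start matching at a specified position in the string. (True)
11. Splitting a string with a regular expression allows several delimiters to be specified at once, which the string method split() cannot do. (True)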
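12. The regex metacharacter "^" normally anchors the match at the beginning of the string; inside a pair of square brackets it negates the set, so that the characters in the brackets are not matched. (True)
13. The regex metacharacter "\s" matches any whitespace character. (True)
14. The regex metacharacter "\d" matches any digit character. (True)
15. Given two strings x and y, sum((1 for i, j in zip(x, y) if i == j)) counts the positions at which the two strings have equal characters. (True)
16. In Python 3.x, the string method encode() uses utf8 as its default encoding. (True)
17. Given x = 'hellow world.'.encode(), x.decode('gbk') is 'hellow world.'. (True)
18. Given x = 'Python是一种非常好的编程语言'.encode(), x.decode('gbk') is 'Python是一种非常好的编程语言'. (False)
19. The regular expression '^http' matches only strings that begin with 'http'. (True)
20. The regular expression '^\d{18}|\d{15}$' can only check whether the given string consists of 18 or 15 digit characters; it cannot guarantee that the string is a valid ID card number. (True)
21. The regular expression '[^abc]' matches any single character other than 'a', 'b', and 'c'. (True)
22. The regular expressions 'python|perl' and 'p(ython|erl)' both match 'python' or 'perl'. (True)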
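
The byte counts in fill-in question 30 and the decoding behaviour in statements 16 to 18 can be checked with a short sketch. The helper `decode_as_gbk` is a hypothetical name introduced here; it returns None when the bytes are not valid GBK, so the example does not depend on whether the wrong decoding raises an error or yields garbled text.

```python
def decode_as_gbk(data):
    """Try to decode bytes as GBK; return None if they are not valid GBK."""
    try:
        return data.decode('gbk')
    except UnicodeDecodeError:
        return None

text = '中国China'
utf8_bytes = text.encode()       # UTF-8 is the default in Python 3
gbk_bytes = text.encode('gbk')
print(len(text), len(utf8_bytes), len(gbk_bytes))

# Pure ASCII bytes decode to the same text under GBK.
ascii_roundtrip = decode_as_gbk('hellow world.'.encode())

# UTF-8 bytes of Chinese text do not come back intact when read as GBK.
mixed = 'Python是一种非常好的编程语言'
mixed_result = decode_as_gbk(mixed.encode())
print(ascii_roundtrip == 'hellow world.', mixed_result == mixed)
```

Decoding with the same codec that was used for encoding, for example mixed.encode('gbk').decode('gbk'), restores the original string.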
III. Program Analysis
1. Fill in the regular expressions in the program below so that it produces the output shown.
import re
st = 'http://www.whpu.edu.cn,http://www.sohu.com,http://www.taobao.com'
pat1 =
pat2 =
print(re.findall(pat1, st))
print(re.findall(pat2, st))
Output:
['http://www.whpu.edu.cn,http://www.sohu.com,http://www.taobao.com']
['http://www.whpu.edu.cn', 'http://www.sohu.com', 'http://www.taobao.com']

pat1=  r'http://.*'
pat2=  r'http://[\w+\.]+\w+'

2. Fill in the regular expressions in the program below so that it produces the output shown.
import re
st = 'a=100 and b=200, sum is 300'
pat1 =
pat2 =
print(re.findall(pat1, st))
print(re.findall(pat2, st))
Output:
['100', '200', '300']
['a', 'and', 'b', 'sum', 'is']

pat1 =  r'\d{3}'
pat2 =  r'[a-z]+'

3. Fill in the regular expression in the program below so that it produces the output shown.
import re
st = 'The address of f0/1 is 192.168.1.101 Gateway is 192.168.1.254'
pat =
print(re.findall(pat, st))
Output:
['192.168.1.101', '192.168.1.254']

pat = r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'
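
The pattern above accepts any group of one to three digits, so a string such as '999.1.1.1' would also match. One possible way to tighten the result, sketched below, is to filter the regex candidates through the standard library's `ipaddress` module. The function name `find_valid_ipv4` and the second test string are assumptions introduced for illustration.

```python
import ipaddress
import re

pat = r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'

def find_valid_ipv4(text):
    """Return the regex candidates that ipaddress accepts as IPv4 addresses."""
    valid = []
    for candidate in re.findall(pat, text):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            # Out-of-range octets (for example 999) are rejected here.
            continue
        valid.append(candidate)
    return valid

st = 'The address of f0/1 is 192.168.1.101 Gateway is 192.168.1.254'
print(find_valid_ipv4(st))
print(find_valid_ipv4('bad 999.1.1.1 good 10.0.0.1'))
```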

4. Read the program below and write down its output.
import re
st = 'http://www.stiei.edu.cn, http://www.taobao.com, www.stiei.edu.cn'
pat1 = r'http://.*'
pat2 = r'[^http://.]'
pat3 = r'http://[\w+\.]+\w+'
pat4 = r'http://(\w+\.)+\w+'
print(re.findall(pat1, st))
print(re.findall(pat2, st))
print(re.findall(pat3, st))
print(re.findall(pat4, st))

['http://www.stiei.edu.cn, http://www.taobao.com, www.stiei.edu.cn']
['w', 'w', 'w', 's', 'i', 'e', 'i', 'e', 'd', 'u', 'c', 'n', ',', ' ', 'w', 'w', 'w', 'a', 'o', 'b', 'a', 'o', 'c', 'o', 'm', ',', ' ', 'w', 'w', 'w', 's', 'i', 'e', 'i', 'e', 'd', 'u', 'c', 'n']  
['http://www.stiei.edu.cn', 'http://www.taobao.com']
['edu.', 'taobao.']
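
The second result is the one most likely to surprise: '[^http://.]' is a negated character set, not "anything other than the text http://.". The sketch below checks that claim by comparing it with a deduplicated set and with a plain Python filter; the names `excluded` and `by_filter` are introduced here for illustration.

```python
import re

st = 'http://www.stiei.edu.cn, http://www.taobao.com, www.stiei.edu.cn'

# Order and repetition inside the brackets do not matter,
# and '.' is a literal dot inside a character set.
by_original = re.findall(r'[^http://.]', st)
by_dedup = re.findall(r'[^hpt:/.]', st)

# The same selection written as a plain filter over the characters.
excluded = set('http://.')
by_filter = [ch for ch in st if ch not in excluded]

print(by_original == by_dedup == by_filter)
```

This is why every 't' of 'stiei' and 'taobao' is missing from the output even though it is not part of a URL scheme.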

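5. Use the re module to solve the following problem (write the program fragment): extract all mobile phone numbers and email addresses from the string below.
'My name is Dean.I come from Wuhan.My Tel is 027-87654321,13912345678 and 18943218765. My Email is dean@163.com and susan@whpu.edu.cn'

6. Use a regular expression to extract the user name and the domain from an email address, for example the user name (tom) and the domain (whpu.edu.cn) in tom@whpu.edu.cn. Write the regular expression.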

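import re
s = 'My name is Dean.I come from Wuhan.My Tel is 027-87654321,13912345678 and 18943218765.My Email is dean@163.com and susan@whpu.edu.cn'
# First alternative: 3-digit area code, hyphen, 8 digits (this also picks up the landline 027-87654321).
# Second alternative: an 11-digit mobile number.
patPhone = r'\d{3}-\d{8}|\d{11}'
# User name, '@', then a dotted domain.
patEMail = r'\w+@[\w+\.]+\w+'
print(re.findall(patPhone, s))  # ['027-87654321', '13912345678', '18943218765']
print(re.findall(patEMail, s))  # ['dean@163.com', 'susan@whpu.edu.cn']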
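
The program above covers question 5 only. For question 6, one possible pattern (a sketch, not the only valid answer) uses two capturing groups, one for the user name and one for the domain. The names `pat_parts` and `split_email` are hypothetical and introduced here for illustration.

```python
import re

# Group 1: user name; group 2: domain made of dot-separated labels.
pat_parts = re.compile(r'(\w+)@((?:\w+\.)+\w+)')

def split_email(address):
    """Return (user, domain) for the first address found, or None."""
    m = pat_parts.search(address)
    return (m.group(1), m.group(2)) if m else None

s = 'My Email is dean@163.com and susan@whpu.edu.cn'
print(split_email('tom@whpu.edu.cn'))
# With two capturing groups, findall() returns a list of (user, domain) tuples.
print(pat_parts.findall(s))
```

The inner group is written as (?:...) so that only the user name and the whole domain are captured.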

7. Read the program below and write down its output. Compare the two results and explain why they differ.
import re
st = 'http://www.stiei.edu.cn http://goole.com https://www.google.cn'
pat1 = r'http://(\w+\.)*com'
pat2 = r'http://(?:\w+\.)*com'
re.findall(pat1, st)
re.findall(pat2, st)

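import re
st = 'http://www.stiei.edu.cn http://goole.com https://www.google.cn'
pat1 = r'http://(\w+\.)*com'    # capturing group
pat2 = r'http://(?:\w+\.)*com'  # non-capturing group
# With a capturing group, findall() returns only what the group captured last.
print(re.findall(pat1, st))  # ['goole.']
# With a non-capturing group, findall() returns the whole match.
print(re.findall(pat2, st))  # ['http://goole.com']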
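
To see that both patterns match exactly the same text and differ only in what findall() reports, the sketch below adds re.finditer(), which always exposes the full match through group(0). The variable names are introduced here for illustration.

```python
import re

st = 'http://www.stiei.edu.cn http://goole.com https://www.google.cn'

# findall() reports the group when the pattern has exactly one capturing group.
capturing = re.findall(r'http://(\w+\.)*com', st)
# With no capturing group, findall() reports the whole match.
non_capturing = re.findall(r'http://(?:\w+\.)*com', st)
# finditer() yields match objects, so the full match is available either way.
full_matches = [m.group(0) for m in re.finditer(r'http://(\w+\.)*com', st)]

print(capturing)
print(non_capturing)
print(full_matches)
```

Only 'http://goole.com' qualifies: the first URL ends in 'cn', and the third starts with 'https://'.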

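8. Use the methods of regular expression objects to solve the following problems (write the program fragments):
(1) Separate the words in the string st = 'python.java.windows/linux\nc++'.

(2) Convert every letter 'a', 'b', and 'e' in the string st = 'Beautiful is better than ugly.' to uppercase.

(3) Delete every '_' from st = 'tony@stive_for__life.net'.

(4) Extract the year, month, and day from the date string st = '2019-10-1' and print them in the format "2019年10月1日".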

import re
st1 = 'python.java.windows/linux\nc++'
st2 = 'Beautiful is better than ugly.'
st3 = 'tony@stive_for__life.net'
st4 = '2019-10-1'

pat1 = re.compile(r'[\w+]+')  # word characters plus '+', so that 'c++' stays whole
pat2 = re.compile(r'[abe]')   # any one of the letters a, b, e
pat3 = re.compile(r'_')
pat4 = re.compile(r'(\d{4})-(\d{1,2})-(\d{1,2})')  # year, month, day

print(pat1.findall(st1))  # ['python', 'java', 'windows', 'linux', 'c++']
print(pat2.sub(lambda x: x.group(0).upper(), st2))  # BEAutiful is BEttEr thAn ugly.
print(pat3.sub('', st3))  # tony@stiveforlife.net
matchResult = pat4.search(st4)
if matchResult:
    print('{}年{}月{}日'.format(matchResult.group(1), matchResult.group(2), matchResult.group(3)))
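
The regular expression in part (4) checks only the shape of the date, so a string such as '2019-13-1' would pass. If validity matters, one alternative (a sketch that swaps the regex for the standard library's datetime.strptime) is shown below; `format_cn_date` is a hypothetical helper name.

```python
from datetime import datetime

def format_cn_date(text):
    """Parse 'YYYY-M-D' and return 'YYYY年M月D日', or None for an invalid date."""
    try:
        d = datetime.strptime(text, '%Y-%m-%d')
    except ValueError:
        return None
    return '{}年{}月{}日'.format(d.year, d.month, d.day)

print(format_cn_date('2019-10-1'))
print(format_cn_date('2019-13-1'))
```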