2024 Python 3.10 from Beginner to Advanced (7): Strings and Their Common Operations, Part 1

Amo Xiang

Published 2024-08-23 22:35:56


Column: Fluent Python (流畅的Python). Tags: programming languages, python

Article link: https://blog.csdn.net/xw1680/article/details/141442723


1. Initialization

Example code:

s1 = 'string'
s2 = "string2"
s3 = '''this's a "String" '''
s4 = 'hello \n blog.csdn.net'
s5 = r"hello \n blog.csdn.net"
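s6 = 'c:\windows\nt'  # pitfall: \n becomes a newline, and \w is an invalid escape (DeprecationWarning in 3.10)
s7 = R"c:\windows\nt"  # the R prefix behaves the same as r
s8 = 'c:\\windows\\nt'  # escape each backslash explicitly
name = 'tom'; age = 20  # several statements on one line, separated by semicolons; not recommended
s9 = f'{name}, {age}'  # the f prefix is supported from 3.6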
sql = """select * from user where name='amo' """

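r prefix: every character is taken literally; backslash escapes are not processed.
f prefix: available since 3.6; interpolates variables and expressions into the string.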

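2. Indexing

A string is a sequence, so it supports access by index. It is immutable, however, so its elements cannot be modified. Example code:

sql = "select * from user where name='amo'"
print(sql[4])  # the string 'c'
# sql[4] = 'o'  # not allowed: TypeError: 'str' object does not support item assignment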
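
Because a string cannot be changed in place, the usual workaround is to build a new string from slices of the old one. A minimal sketch (the names first_word, last_char and patched are illustrative and not part of the original example):

```python
sql = "select * from user where name='amo'"

# Indexing and slicing only read characters; they never modify the string.
first_word = sql[:6]   # the first six characters
last_char = sql[-1]    # negative indexes count from the end

# To "change" the character at index 4, assemble a new string from slices.
patched = sql[:4] + 'C' + sql[5:]

print(first_word)  # select
print(last_char)   # '
print(patched)     # seleCt * from user where name='amo'
print(sql[4])      # c  (the original string is untouched)
```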

3. Common Operations

3.1 Concatenating strings

3.1.1 The "+" operator

The "+" operator joins two or more strings and produces a new string object. Example code:

mot_en = 'Remembrance is a form of meeting. Forgetfulness is a form of freedom.'
mot_cn = '记忆是一种相遇。遗忘是一种自由。'
print(mot_en + '――' + mot_cn)
str1 = '我今天一共走了'  # a string
num = 12098  # an integer
str2 = '步'  # a string
# Error: a string cannot be concatenated directly with other types: TypeError: can only concatenate str (not "int") to str
# print(str1 + num + str2)  # concatenating a string and an integer
# Correct: convert num to a string first, then concatenate
print(str1 + str(num) + str2)

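3.1.2 The join() method

join() concatenates the elements of an iterable of strings (a string, a tuple, a list, and so on) into one new string, inserting the given separator between the elements. The syntax of join() is as follows: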

In [1]: str.join?
Signature: str.join(self, iterable, /)
Docstring:
Concatenate any number of strings.

The string whose method is called is inserted in between each given string.
The result is returned as a new string.

Example: '.'.join(['ab', 'pq', 'rs']) -> 'ab.pq.rs'
Type:      method_descriptor
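Parameters:
str: the separator placed between the elements, for example a comma ",", a colon ":", a semicolon ";" or a slash "/"; it may also be the empty string.
iterable: the iterable whose elements are joined. It can be a string, a list or tuple of strings, a dictionary (its keys are joined), and so on.
Note: join() requires every element of the iterable to be a string. Elements of any other type (such as numbers) must be converted to strings first.

Example code: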

# Join the characters of a string with spaces
word = 'Python'
spaced_word = ' '.join(word)
print(spaced_word)  # P y t h o n
# Join the characters of a string with commas
word = 'Hello'
comma_separated = ','.join(word)
print(comma_separated)  # H,e,l,l,o
# Join the characters of a string with underscores
word = 'OpenAI'
underscore_separated = '_'.join(word)
print(underscore_separated)  # O_p_e_n_A_I
# Join the characters with an empty separator (the string comes out unchanged)
word = 'ChatGPT'
no_separator = ''.join(word)
print(no_separator)  # ChatGPT
# print(''.join([1, 2, 3, 4, 5]))  # TypeError: sequence item 0: expected str instance, int found
print(''.join(map(str, [1, 2, 3, 4, 5])))  # 12345
# Join a list of strings
s = ['四', '川', '大', '学']
print(''.join(s))  # 四川大学
print('-'.join(s))  # 四-川-大-学
print('/'.join(s))  # 四/川/大/学
music = ['辞九门回忆', '会不会', '单身情歌', '错位时空', '红色高跟鞋']
print(music)
print(' '.join(music))
print('\n'.join(music))
print('\t'.join(music))

# A dictionary: iterating over it yields its keys, so the keys are joined
my_str = {'四': 1, '川': 2, '大': 3, '学': 4}
print(':'.join(my_str))  # 四:川:大:学

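3.2 Searching strings

3.2.1 The find() method: index of the first occurrence (plus rfind(), index(), rindex())

find() returns the index at which a substring first occurs in the string. The search can be restricted to a range, for example from start position 11 to end position 17. If the substring is not found, find() returns -1. The syntax of find() is as follows: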

In [2]: str.find?
Docstring:
S.find(sub[, start[, end]]) -> int

Return the lowest index in S where substring sub is found,
such that sub is contained within S[start:end].  Optional
arguments start and end are interpreted as in slice notation.

Return -1 on failure.
Type:      method_descriptor
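Parameters:
S: the original string.
sub: the substring to search for.
start: optional; the index where the search range begins. If omitted, the search starts at the beginning.
end: optional; the index where the search range ends (exclusive). If omitted, the search runs to the end of the string.

For example, within the range from start position 11 to end position 17, the substring o first occurs in www.mingrisoft.com at index 11.
Note: string objects also provide rfind(), which works like find() but returns the last (rightmost) occurrence. They also provide index(), which does the same job as find(); the difference is that find() returns -1 when the substring is not found, whereas index() raises ValueError.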
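
The range example above can be checked directly. The following is a small sketch using the same www.mingrisoft.com string; the variable name url is only illustrative:

```python
url = 'www.mingrisoft.com'

print(url.find('o'))          # 11  first "o" in the whole string
print(url.find('o', 11, 17))  # 11  first "o" within indexes 11..16
print(url.find('o', 12, 17))  # 16  the search now starts after the first "o"
print(url.find('o', 12, 16))  # -1  index 16 is excluded, so nothing is found
print(url.rfind('o'))         # 16  last "o" in the whole string

# index() behaves like find() but raises instead of returning -1.
try:
    url.index('z')
except ValueError as exc:
    print('index() raised ValueError:', exc)
```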

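Example code:

# 1. Find the position of the first "@" in an email address
qq_email = '123456789@qq.com'
print(qq_email.find('@'))  # 9
print(qq_email.find('Q'))  # -1
# 2. Extract the data inside the brackets
str1 = '张三（13566688888）'
l1 = str1.find('（')
l2 = str1.find('）')
print(str1[l1 + 1:l2])  # 13566688888
# 3. Extract the ID from each email address and capitalize its first letter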
email_list = ['gcytom@sohu.com', 'jackeer@qq.com', 'mingrisoft@mingrisoft.com', 'mrkj_2019@qq.com']
for _email in email_list:
    print(_email[:_email.find('@')].capitalize())

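# 4. Find every index of a given character in a string
def get_multiple_indexes(string, s):
    str_index_list = []  # holds the indexes found; local, so repeated calls do not accumulate
    str2 = string  # keep the original string so its total length stays available
    while True:
        if s in string:  # is the character still present in what remains?
            first_index = string.index(s)  # index of its first occurrence in the remainder
            string = string[first_index + 1:]  # cut off everything up to and including the match
            result = len(str2) - len(string)  # length of the part cut off so far
            str_index_list.append(result - 1)  # that length minus 1 is the index in the original string
        else:
            break  # no more matches, so leave the loop
    print(str_index_list)  # print every index at which the character occurs
    return str_index_list


# [0, 1, 2, 8]
get_multiple_indexes("aaabbdddabb", 'a')  # call the function to get all indexes of the character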

# Examples of the other methods
s1 = 'www.mingrisoft.com'
substr = 'm'
print(s1.rfind(substr))  # 17
print(s1.rfind(substr, 0, 5))  # 4
print(s1.rfind(substr, 5, 0))  # -1
str1 = '790129881@qq.com'
print('First position of the @ sign:', str1.index('@'))  # 9
s2 = 'www.mingrisoft.com'
substr = 'm'
print(s2.rindex(substr))  # 17
print(s2.rindex(substr, 0, 5))  # 4
# print(s2.rindex(substr, 5, 0))  # ValueError: substring not found

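rfind() returns the position of the last occurrence of the substring (it searches from right to left) and returns -1 if there is no match. rindex() mirrors index(): it returns the position of the last occurrence of the substring and raises an exception if there is no match. Both also accept a start and an end position to restrict the search range. None of these methods is cheap: each scans the string linearly, so the cost grows with the length of the string. When a substring search is genuinely needed there is no way around that, but avoid repeating such searches unnecessarily.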
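
Since find() accepts a start position, all occurrences of a substring can be collected without slicing the string over and over. The sketch below is one possible way to do it; find_all is a hypothetical helper name, not a built-in:

```python
def find_all(text, sub):
    """Return the start index of every occurrence of sub (overlaps included)."""
    indexes = []
    start = text.find(sub)
    while start != -1:
        indexes.append(start)
        # Resume one position after the previous match.
        start = text.find(sub, start + 1)
    return indexes


print(find_all('aaabbdddabb', 'a'))  # [0, 1, 2, 8]
print(find_all('aaaa', 'aa'))        # [0, 1, 2]
print(find_all('abc', 'x'))          # []
```

Each call to find() continues from where the last one stopped, so no intermediate copies of the string are created.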

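3.2.2 The count() method: counting occurrences

count() returns the number of times a substring occurs in the string. The count can be restricted to a range, for example from start position 11 to end position 17. The syntax of count() is as follows: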

In [3]: str.count?
Docstring:
S.count(sub[, start[, end]]) -> int

Return the number of non-overlapping occurrences of substring sub in
string S[start:end].  Optional arguments start and end are
interpreted as in slice notation.
Type:      method_descriptor
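Parameters:
S: the original string.
sub: the substring to count.
start: optional; the index where the search range begins. Defaults to the first character (index 0) and can be given on its own.
end: optional; the index where the search range ends (exclusive). Defaults to the end of the string and, being positional, can only be given together with start.

For example, within the range from start position 11 to end position 17, the substring o occurs 2 times in www.mingrisoft.com.
Note: the end position is 17, but the character at index 17 itself is not included in the count. With an end position of 16, for instance, o occurs only once.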
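
A short sketch of the range behaviour described above, again on www.mingrisoft.com (the variable name url is illustrative). The last line also shows that count() counts non-overlapping occurrences, as its docstring states:

```python
url = 'www.mingrisoft.com'

print(url.count('o', 11, 17))  # 2  indexes 11..16 contain "o" at 11 and 16
print(url.count('o', 11, 16))  # 1  index 16 is now excluded
print(url.count('o', 11))      # 2  start given on its own
print('aaaa'.count('aa'))      # 2  matches do not overlap
```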

Example code:

import string

# Count how many times the keyword "人民" occurs in a string
str1 = ('五四运动，爆发于民族危难之际，是一场以先进青年知识分子为先锋、'
        '广大人民群众参加的彻底反帝反封建的伟大爱国革命运动，是一场中国人民为拯救民族危亡、'
        '捍卫民族尊严、凝聚民族力量而掀起的伟大社会革命运动，是一场传播新思想新文化新知识的伟大思想启蒙运动和新文化运动'
        '，以磅礴之力鼓动了中国人民和中华民族实现民族复兴的志向和信心。')

print('Occurrences of "人民" in this sentence:', str1.count('人民'))
# Count a keyword over different ranges of a string
cn = '没什么是你能做却办不到的事。'
en = "There's nothing you can do that can't be done."
print(cn)
print('Original string:', en)
# Occurrences of the letter "o" over different ranges
print(en.count('o', 0, 17))  # 1
print(en.count('o', 0, 27))  # 3
print(en.count('o', 0, 47))  # 4

count = 0
test_str = "https://blog.csdn.net/xw1680%$&,*,@!"
# Build a dictionary from the string
c = {}.fromkeys(test_str, 0)  # every distinct character becomes a key with the value 0
for key in c:
    if key in string.punctuation:  # count punctuation characters
        count = test_str.count(key) + count
# The string contains 14 punctuation characters
print('The string contains', count, 'punctuation characters')

# Count the digits in a text file (assumes ./digits.txt exists)
with open('./digits.txt', 'r', encoding='utf-8') as f:  # the with block closes the file afterwards
    chars = f.read()
count = 0
# Build a dictionary from the text
c = {}.fromkeys(chars, 0)
for key in c:
    if key in string.digits:  # count digits
        count = chars.count(key) + count
print('The text contains', count, 'digits')  # 14 for the author's file

find(), index() and count() all run in O(n) time, so they slow down as the string grows.
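
When many different characters have to be counted, calling count() once per character scans the string again each time. One alternative, sketched here with collections.Counter from the standard library, tallies every character in a single pass (the names freq, punct_total and digit_total are illustrative):

```python
import string
from collections import Counter

test_str = "https://blog.csdn.net/xw1680%$&,*,@!"

# One pass over the string: character -> number of occurrences.
freq = Counter(test_str)

punct_total = sum(n for ch, n in freq.items() if ch in string.punctuation)
digit_total = sum(n for ch, n in freq.items() if ch in string.digits)

print(punct_total)  # 14
print(digit_total)  # 4
```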

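3.2.3 The len() function: length of a string or number of items

len() returns the length, or number of items, of a container such as a string, a list or a tuple. Its syntax is as follows: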

In [4]: len?
Signature: len(obj, /)
Docstring: Return the number of items in a container.
Type:      builtin_function_or_method
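Parameters:
obj: the object whose length or item count is wanted, such as a string, tuple, list or dictionary.
Return value: the length of the object, or its number of items.

Example code:

# 1. Get the length of a string
# Every character, punctuation included, takes one position, so this string has length 34
str1 = '今天会很残酷，明天会更残酷，后天会很美好，但大部分人会死在明天晚上。'
# A space also takes one position, so this string has length 11
str2 = 'hello world'
print('Length of str1:', len(str1))  # 34
print('Length of str2:', len(str2))  # 11
# Length of str2 after removing the space
print('Length of str2 without spaces:', len(str2.replace(' ', '')))  # 10


# 2. Compute the length of a string in bytes
def byte_size(string):
    return len(string.encode('utf-8'))  # encode() converts the string to bytes in the given encoding


print(byte_size('Hello World'))  # 11
print(byte_size('人生苦短，我用Python'))  # 27
"""
Note: in UTF-8, each of the Chinese characters used here takes 3 bytes.
"""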

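3.3 Splitting

3.3.1 The split() method: splitting a string

split() cuts a string into a list of strings at the given separator. The separator itself is not included in the elements of the list. The syntax of split() is as follows: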

In [6]: str.split?
Signature: str.split(self, /, sep=None, maxsplit=-1)
Docstring:
Return a list of the substrings in the string, using sep as the separator string.

  sep
    The separator used to split the string.

    When set to None (the default value), will split on any whitespace
    character (including \\n \\r \\t \\f and spaces) and will discard
    empty strings from the result.
  maxsplit
    Maximum number of splits (starting from the left).
    -1 (the default value) means no limit.

Note, str.split() is mainly useful for data that has been intentionally
delimited.  With natural text that includes punctuation, consider using
the regular expression module.
Type:      method_descriptor
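Parameters:
str: the string to split.
sep: the separator, which may consist of several characters. Defaults to None, meaning any whitespace (spaces, newlines "\n", tabs "\t", and so on).
maxsplit: optional; the maximum number of splits. If omitted or -1, the number of splits is unlimited; otherwise the resulting list has at most maxsplit+1 elements.
Return value: the list of strings produced by the split.
When split() is called without arguments it splits on whitespace, and any run of consecutive spaces or other whitespace characters counts as a single separator.

Example code:

# 1. Split a string on different separators
str1 = 'www.baidu.com'
list1 = str1.split()  # default separator (whitespace)
list2 = str1.split('.')  # split on "."
list3 = str1.split(' ', 1)  # split on a space, at most once
print(list1)  # ['www.baidu.com']
print(list2, list2[1])  # ['www', 'baidu', 'com'] baidu
print(list3)  # ['www.baidu.com']
# 2. Collapse runs of spaces into a single space
line = '吉林省     长春市     二道区     东方广场中意之尊888'
# 吉林省 长春市 二道区 东方广场中意之尊888
print(' '.join(line.split()))

s = ','.join('abcd')  # 'a,b,c,d'
print(s.split(','))  # ['a', 'b', 'c', 'd']
print(s.split())  # ['a,b,c,d']
print(s.split(',', 2))  # ['a', 'b', 'c,d']
s1 = '\na b  \tc\nd\n'  # compare how the calls below split this string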
s2 = ',a,b,c,d,'
print(s1.split())  # ['a', 'b', 'c', 'd']
print(s2.split(','))  # ['', 'a', 'b', 'c', 'd', '']
print(s1.split(' '))  # ['\na', 'b', '', '\tc\nd\n']
print(s1.split('\n'))  # ['', 'a b  \tc', 'd', '']
print(s1.split('b'))  # ['\na ', '  \tc\nd\n']
print(s1.splitlines())  # ['', 'a b  \tc', 'd']

rsplit() also splits a string and works like split(), except that it starts splitting from the right-hand end and returns a list. The difference only shows when maxsplit limits the number of splits. Compared with split(), example code:

# 1. Without maxsplit
text = "apple, banana, cherry, date"
split_result = text.split(", ")
rsplit_result = text.rsplit(", ")
print("split:", split_result)  # split: ['apple', 'banana', 'cherry', 'date']
print("rsplit:", rsplit_result)  # rsplit: ['apple', 'banana', 'cherry', 'date']
# 2. With maxsplit=2
text = "apple, banana, cherry, date"
split_result = text.split(", ", 2)
rsplit_result = text.rsplit(", ", 2)
print("split:", split_result)  # split: ['apple', 'banana', 'cherry, date']
print("rsplit:", rsplit_result)  # rsplit: ['apple, banana', 'cherry', 'date']
# 3. A string with trailing whitespace
text = "a b c d "
split_result = text.split(" ", 2)
rsplit_result = text.rsplit(" ", 2)

print("split:", split_result)  # split: ['a', 'b', 'c d ']
print("rsplit:", rsplit_result)  # rsplit: ['a b c', 'd', '']
# 4. Runs of spaces as the separator
text = "  one   two   three  "
split_result = text.split()
rsplit_result = text.rsplit()

# ps: when sep is omitted, split() and rsplit() behave the same on a string with spaces at both ends: both ignore them.
print("split:", split_result)  # split: ['one', 'two', 'three']
print("rsplit:", rsplit_result)  # rsplit: ['one', 'two', 'three']
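
In practice maxsplit is what makes the choice between split() and rsplit() matter. The sketch below uses made-up sample data (the file name and the record are purely illustrative) to show one typical use of each:

```python
# rsplit with maxsplit=1: separate only the last extension from a file name.
filename = 'archive.tar.gz'
stem, ext = filename.rsplit('.', 1)
print(stem, ext)  # archive.tar gz

# split with maxsplit=2: the last field may itself contain the separator.
record = 'tom,20,Changchun, Jilin'
name, age, address = record.split(',', 2)
print(name)     # tom
print(age)      # 20
print(address)  # Changchun, Jilin
```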

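3.3.2 The splitlines() method: splitting a string into lines

splitlines() splits a string at line boundaries such as \r, \r\n and \n and returns the lines as a list. If the keepends argument is False, the line breaks are dropped from the result; if it is True, they are kept. The syntax of splitlines() is as follows: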

In [11]: str.splitlines?
Signature: str.splitlines(self, /, keepends=False)
Docstring:
Return a list of the lines in the string, breaking at line boundaries.

Line breaks are not included in the resulting list unless keepends is given and
true.
Type:      method_descriptor
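Parameters:
keepends: whether the resulting lines keep their line breaks (\r, \r\n, \n). Defaults to False (line breaks dropped); if True, they are kept.
Return value: a list of the lines in the string.

Example code: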

str1 = 'Amo\r\nPaul\r\nJerry'
s1 = '\na b  \tc\nd\n'
list1 = str1.splitlines()  # list without line breaks
print(list1)  # ['Amo', 'Paul', 'Jerry']
print(list1[0], list1[1], list1[2])  # Amo Paul Jerry
list2 = str1.splitlines(True)  # list with line breaks kept
print(list2)  # ['Amo\r\n', 'Paul\r\n', 'Jerry']
print(list2[0], list2[1], list2[2], sep='')  # sep='' suppresses the spaces print() would insert
print(s1.splitlines())  # ['', 'a b  \tc', 'd']

# Some slightly more involved cases
# 1. A string with several kinds of line break (\n, \r\n, \r)
text = '  Line 1\nLine 2\r\nLine 3\rLine 4  '
lines = text.splitlines()
print(lines)  # ['  Line 1', 'Line 2', 'Line 3', 'Line 4  ']
# 2. Empty lines and mixed whitespace (tabs and spaces)
text = '\tLine 1\n\n \r\nLine 2\t\n\r\tLine 3  '
lines = text.splitlines()  # ['\tLine 1', '', ' ', 'Line 2\t', '', '\tLine 3  ']
print(lines)
# 3. Splitting while keeping the line breaks
text = '  Line 1\n\tLine 2\r\n \tLine 3\rLine 4  '
lines = text.splitlines(keepends=True)
print(lines)  # ['  Line 1\n', '\tLine 2\r\n', ' \tLine 3\r', 'Line 4  ']
# 4. A string with whitespace at both ends
text = '\n\n  \t  Line 1\n  Line 2  \n  \n'
lines = text.splitlines()
print(lines)  # ['', '', '  \t  Line 1', '  Line 2  ', '  ']
# 5. A string made up of whitespace only (no line breaks)
text = '   \t  '
lines = text.splitlines()
print(lines)  # ['   \t  ']

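3.3.3 The partition() method: splitting a string into a tuple

partition() splits a string at the first occurrence of the given separator and returns a 3-tuple: the part before the separator, the separator itself, and the part after it. If the separator does not occur, the tuple holds the original string followed by two empty strings. The syntax of partition() is as follows: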

In [12]: str.partition?
Signature: str.partition(self, sep, /)
Docstring:
Partition the string into three parts using the given separator.

This will search for the separator in the string.  If the separator is found,
returns a 3-tuple containing the part before the separator, the separator
itself, and the part after it.

If the separator is not found, returns a 3-tuple containing the original string
and two empty strings.
Type:      method_descriptor

Example code:

python_url = 'https://blog.csdn.net/xw1680?'  # define a string
t1 = python_url.partition('.')  # split at the first "."
# ('https://blog', '.', 'csdn.net/xw1680?')
print(t1)
s = ','.join('abcd')  # 'a,b,c,d'
print(s.partition(','))  # ('a', ',', 'b,c,d')
print(s.partition('.'))  # ('a,b,c,d', '', '')
print(s.rpartition(','))  # ('a,b,c', ',', 'd')
print(s.rpartition('.'))  # ('', '', 'a,b,c,d')

rpartition() works almost exactly like partition(); the small difference is that it searches for the separator from the end of the string, that is, from the right.
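
Because partition() always returns exactly three parts, it is convenient for parsing "key=value" text without checking the length of a list first. A minimal sketch, where parse_setting is a hypothetical helper and the input lines are made up:

```python
def parse_setting(line):
    """Split 'key=value' at the first '='; return None if there is no '='."""
    key, sep, value = line.partition('=')
    if not sep:  # separator missing: partition returned (line, '', '')
        return None
    return key.strip(), value.strip()


print(parse_setting('name = amo'))    # ('name', 'amo')
print(parse_setting('query=a=b'))     # ('query', 'a=b')
print(parse_setting('no separator'))  # None
```

Only the first "=" is treated as the separator, so a value may itself contain "=" characters.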

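3.4 Replacing: the replace() method

replace() returns a copy of the string in which occurrences of one substring are replaced by another. The new substring is a required argument; to delete the old substring outright, pass the empty string '' as the replacement. The syntax of replace() is as follows: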

In [13]: str.replace?
Signature: str.replace(self, old, new, count=-1, /)
Docstring:
Return a copy with all occurrences of substring old replaced by new.

  count
    Maximum number of occurrences to replace.
    -1 (the default value) means replace all occurrences.

If the optional argument count is given, only the first count occurrences are
replaced.
Type:      method_descriptor
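Parameters:
str: the string in which to replace.
old: the substring to be replaced.
new: the string that replaces old.
count: optional; the maximum number of replacements. If omitted, every occurrence is replaced; when a count is given, occurrences are replaced from left to right.

Example code: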

str1 = 'www.baidu.com'
# www.douban.com
print(str1.replace('baidu', 'douban'))
str1 = '333333201501012222'
s1 = str1[6:14]
# 333333********2222
print(str1.replace(s1, '********'))

s = ','.join('abcd')  # 'a,b,c,d'
print(s.replace(',', ' '))  # 'a b c d'
print(s.replace(',', ' ', 2))  # 'a b c,d'
s1 = 'www.baidu.com'
print(s1.replace('w', 'a'))  # 'aaa.baidu.com'
print(s1.replace('ww', 'a'))  # 'aw.baidu.com'
print(s1.replace('ww', 'w'))  # 'ww.baidu.com'
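
Two details are worth a quick check: deleting a substring means replacing it with the empty string, and replace() never modifies the original string. A small sketch with a made-up phone number (variable names are illustrative):

```python
phone = '135-6668-8888'

digits_only = phone.replace('-', '')       # delete every "-"
first_only = phone.replace('-', ' ', 1)    # replace only the first "-"

print(digits_only)  # 13566688888
print(first_only)   # 135 6668-8888
print(phone)        # 135-6668-8888  (the original is unchanged)
```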

3.5 Stripping: the strip() method

strip() removes whitespace, or a given set of characters, from both ends of a string. The syntax of strip() is as follows:

In [15]: str.strip?
Signature: str.strip(self, chars=None, /)
Docstring:
Return a copy of the string with leading and trailing whitespace removed.

If chars is given and not None, remove characters in chars instead.
Type:      method_descriptor
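Parameters:
str: the original string.
chars: optional; the characters to remove. Several can be given, and they are treated as a set of characters rather than as a prefix or suffix.
For example, with chars set to `*`, the `*` characters at the left and right ends are removed.
If chars is omitted, whitespace at both ends is removed: spaces, tabs `\t`, carriage returns `\r`, newlines `\n`, and so on.

Example code:

# Remove spaces from a string
s1 = ' hello world '
print(s1.strip(' '))  # removes the leading and trailing spaces

s2 = ('               add_sheet会返回一个Worksheet类。创建的时候有可选参数cell_overwrite_ok，'
      '表示是否可以覆盖单元格，其实是Worksheet实例化的一个参数，默认值是False。去除字符串中的特殊字符\t\n\r')
print(s2.strip(' \t\n\r'))  # removes the leading and trailing whitespace
s3 = ' hello world*'
print(s3.strip('*'))  # with chars given, only those characters are removed; the leading space stays

# Remove special symbols and spaces from both ends
str1 = '-+-+-《黑神话：悟空》+--+- '
# Remove -, + and spaces from both ends
print(str1.strip().strip('-+'))
# Remove every leading and trailing character that belongs to the given set
s4 = '八百标兵 八百标兵奔北坡,炮兵并排北边跑 炮兵怕把标兵碰,标兵怕碰炮兵炮'
print(s4.strip('百八炮兵'))
# Explanation:
# 1. Starting from the head: the first character "八" is in "百八炮兵", so it is removed.
# The second character "百" is in "百八炮兵", so it is removed too. This goes on until a character is not in "百八炮兵".
# 2. Then from the tail: the first character "炮" is in "百八炮兵", so it is removed. The second character "兵" is in
# "百八炮兵", so it is removed. This goes on until a character is not in "百八炮兵".
s = '\t\r\na b  c,d\ne\n\t'
print(s.strip())
print(s.strip('\t\n'))
print(s.strip('\t\ne\r'))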
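3.6 Testing

3.6.1 The startswith() method: does the string start with a given substring?

startswith() checks whether a string starts with the given substring. It returns True if so, and False otherwise. The syntax of startswith() is as follows: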

In [16]: str.startswith?
Docstring:
S.startswith(prefix[, start[, end]]) -> bool

Return True if S starts with the specified prefix, False otherwise.
With optional start, test S beginning at that position.
With optional end, stop comparing S at that position.
prefix can also be a tuple of strings to try.
Type:      method_descriptor
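Parameters:
1.S: the original string.
2.prefix: the substring to test for.
3.start: optional; the index where the tested range begins. If omitted, the test starts at the beginning.
4.end: optional; the index where the tested range ends. If omitted, the range runs to the end of the string.

Example code: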

s = "www.baidu.com"
print(s.startswith('ww'))  # True
print(s.startswith('d', 7))  # True
print(s.startswith('d', 10))  # False
print(s.startswith('com', 11))  # False
print(s.startswith(''))  # True
print(s.startswith(()))  # False

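# Check whether a phone number starts with a China Unicom prefix such as "130", "131" or "186"
# A tuple of prefixes, built once outside the loop
tuples = ('130', '131', '132', '155', '156', '175', '176', '185', '186', '166')
while True:
    str1 = input('Enter a China Unicom phone number: ')
    # Check whether the number starts with one of the given prefixes.
    # If prefix is a tuple of strings, startswith() checks whether the string starts with any element of the tuple.
    # Other iterables, such as lists, are not accepted and raise TypeError.
    my_val = str1.startswith(tuples, 0)
    if my_val:
        print(my_val)
        break
    else:
        print('That is not a China Unicom number, please try again!')

# Filtering categorized data: selecting train numbers
# Train number data
train = ['D74', 'D20', 'G240', 'D102', 'Z158', 'G394', 'K1304', 'D30',
         'D22', 'G384', 'G382', 'D4606', 'K350', 'K340',
         'Z62', 'Z64', 'K266', 'Z118']
f_letter = input('Enter the first letter of the train number: ')
for t in train:
    if t.startswith(f_letter):  # does the train number start with the given character?
        print(t)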

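3.6.2 The endswith() method: does the string end with a given substring?

endswith() checks whether a string ends with the given substring. It returns True if so, and False otherwise. The syntax of endswith() is as follows: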

In [21]: str.endswith?
Docstring:
S.endswith(suffix[, start[, end]]) -> bool

Return True if S ends with the specified suffix, False otherwise.
With optional start, test S beginning at that position.
With optional end, stop comparing S at that position.
suffix can also be a tuple of strings to try.
Type:      method_descriptor
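Parameters:
1.S: the original string.
2.suffix: the substring to test for.
3.start: optional; the index where the tested range begins. If omitted, the test starts at the beginning.
4.end: optional; the index where the tested range ends. If omitted, the range runs to the end of the string.

ps:
1.If suffix is a tuple, endswith() checks whether the string ends with any element of the tuple.
2.If no suffix matches, it returns False.
3.If suffix is an empty tuple, endswith() always returns False. If it is the empty string '', endswith() always returns True. A string holding one space (' ') is not empty; it matches only when the text really ends with a space.

Example code:

# A string as the suffix
text = "example.txt"
result = text.endswith(".txt")
print(result)  # True
# A tuple as the set of suffixes
text = "example.txt"
result = text.endswith((".txt", ".doc", ".pdf"))
# endswith() checks whether text ends with any suffix in the tuple. text ends with ".txt", so the result is True.
print(result)  # True
# With an end position
text = "example.txt"
result = text.endswith("example", 0, 7)
print(result)  # True
# A tuple of several suffixes, with a range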
text = "document.pdf"
result = text.endswith((".txt", ".doc", ".pdf"), 0, 11)
print(result)  # False
print(text.endswith(' '))  # False
print(text.endswith(''))  # True
print(text.endswith(()))  # False
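
A common use of the tuple form is filtering file names by extension. The sketch below uses a made-up list of names; it also shows that endswith() is case-sensitive and that a list is not accepted in place of a tuple:

```python
files = ['report.pdf', 'notes.txt', 'photo.JPG', 'data.csv', 'README']

# Keep names ending with any of the given suffixes.
docs = [name for name in files if name.endswith(('.txt', '.pdf'))]
# endswith() is case-sensitive, so normalise the case first.
images = [name for name in files if name.lower().endswith(('.jpg', '.png'))]

print(docs)    # ['report.pdf', 'notes.txt']
print(images)  # ['photo.JPG']

# A list instead of a tuple raises TypeError.
try:
    'notes.txt'.endswith(['.txt', '.pdf'])
    list_accepted = True
except TypeError:
    list_accepted = False
print(list_accepted)  # False
```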

3.6.3 The isalnum() method: is the string made up of letters and digits?

isalnum() checks whether a string consists only of letters and digits. The syntax of isalnum() is as follows:

In [22]: str.isalnum?
Signature: str.isalnum(self, /)
Docstring:
Return True if the string is an alpha-numeric string, False otherwise.

A string is alpha-numeric if all characters in the string are alpha-numeric and
there is at least one character in the string.
Type:      method_descriptor
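Returns True if the string has at least one character and every character is a letter or a digit, and False otherwise. Letters and digits here mean Unicode letters and digits, not just a-z, A-Z and 0-9.

Example code:

s1 = 'amo 123'
s2 = s1.replace(' ', '')  # remove the space
r1 = s1.isalnum()
r2 = s2.isalnum()
print('Original string:', s1)
print('After removing the space:', s2)
print(r1)  # False
print(r2)  # True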

3.6.4 The isalpha() method: is the string made up of letters only?

isalpha() checks whether a string consists only of letters. The syntax of isalpha() is as follows:

In [23]: str.isalpha?
Signature: str.isalpha(self, /)
Docstring:
Return True if the string is an alphabetic string, False otherwise.

A string is alphabetic if all characters in the string are alphabetic and there
is at least one character in the string.
Type:      method_descriptor
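Notes:
1.Returns True if the string has at least one character and every character is a letter, and False otherwise.
2.Letters include every Unicode letter character, so characters from many languages are supported.

Example code:

text = "HelloWorld"
result = text.isalpha()
print(result)  # True
text = "Hello World"
result = text.isalpha()
print(result)  # False: the space is not a letter
text = "Hello123"
result = text.isalpha()
print(result)  # False
text = "Hello@World"
result = text.isalpha()
print(result)  # False
text = "Привет"
result = text.isalpha()
print(result)  # True: isalpha() accepts letters of any language, so non-Latin strings can return True as well
text = ""
result = text.isalpha()
print(result)  # False: the empty string has no characters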

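3.6.5 The isdecimal() method: does the string contain only decimal characters?

isdecimal() checks whether a string contains only decimal characters. In Python 3 every str is a Unicode string, so the method works on any string; the 'u' prefix seen in older code is accepted but unnecessary. The syntax of isdecimal() is as follows: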

In [24]: str.isdecimal?
Signature: str.isdecimal(self, /)
Docstring:
Return True if the string is a decimal string, False otherwise.

A string is a decimal string if all characters in the string are decimal and
there is at least one character in the string.
Type:      method_descriptor
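Returns True if the string has at least one character and every character is a decimal digit, and False otherwise.

Example code:

text = "12345"
result = text.isdecimal()
print(result)  # True
text = "123a45"
result = text.isdecimal()
print(result)  # False
text = "123-45"
result = text.isdecimal()
print(result)  # False
text = "１２３４５"  # full-width digit characters
result = text.isdecimal()
print(result)  # True
text = "²³¼"
result = text.isdecimal()
print(result)  # False
text = ""
result = text.isdecimal()
print(result)  # False
"""
isdecimal() returns True only for strings made up of decimal digit characters, such as ASCII digits and full-width digits.
It does not treat superscripts, subscripts, fractions or Roman numerals as decimal digits.
For the empty string, isdecimal() always returns False.
"""
s1 = u'amo12468'
print(s1.isdecimal())  # False
s2 = u'12468'
print(s2.isdecimal())  # True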

3.6.6 The isdigit() method: is the string made up of digits only?

isdigit() checks whether a string consists only of digits. The syntax of isdigit() is as follows:

In [25]: str.isdigit?
Signature: str.isdigit(self, /)
Docstring:
Return True if the string is a digit string, False otherwise.

A string is a digit string if all characters in the string are digits and there
is at least one character in the string.
Type:      method_descriptor
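Returns True if the string has at least one character and every character is a digit, and False otherwise.

Example code:

while True:
    str1 = input('Enter a number: ')
    # Use isdigit() to check that every character is a digit
    my_val = str1.isdigit()
    if my_val:
        strint = int(str1)  # convert the digits to an integer
        print(strint)  # print it
        print(type(strint))  # show its type
        break
    else:
        print('That is not a number, please try again!')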

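3.6.7 The isidentifier() method: is the string a valid Python identifier or variable name?

isidentifier() checks whether a string is a valid Python identifier, so it can also be used to test whether a variable name is legal. The syntax of isidentifier() is as follows: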

In [26]: str.isidentifier?
Signature: str.isidentifier(self, /)
Docstring:
Return True if the string is a valid Python identifier, False otherwise.

Call keyword.iskeyword(s) to test whether string s is a reserved identifier,
such as "def" or "class".
Type:      method_descriptor
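Returns True if the string is a valid Python identifier, and False otherwise.
isidentifier() only checks whether the string has the form of an identifier; it does not check whether the string is a reserved keyword.

Example code:

print('if'.isidentifier())  # True, although "if" is a keyword
print('break'.isidentifier())  # True
print('class'.isidentifier())  # True
print('def'.isidentifier())  # True
print('while'.isidentifier())  # True
print('_b'.isidentifier())  # True
print('云从科技888m'.isidentifier())  # True
print('886'.isidentifier())  # False: an identifier cannot start with a digit
print('8a'.isidentifier())  # False
print(''.isidentifier())  # False
print('A2345@'.isidentifier())  # False: "@" is not allowed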
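
As the docstring suggests, isidentifier() can be combined with keyword.iskeyword() from the standard library to test whether a name is actually usable as a variable name. A minimal sketch, where is_valid_name is a hypothetical helper:

```python
import keyword


def is_valid_name(name):
    """True if name has identifier form and is not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


print(is_valid_name('_b'))     # True
print(is_valid_name('class'))  # False: valid form, but a keyword
print(is_valid_name('8a'))     # False: starts with a digit
```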

3.6.8 The islower() method: are all letters in the string lowercase?

islower() checks whether the letters in a string are all lowercase. The syntax of islower() is as follows:

In [27]: str.islower?
Signature: str.islower(self, /)
Docstring:
Return True if the string is a lowercase string, False otherwise.

A string is lowercase if all cased characters in the string are lowercase and
there is at least one cased character in the string.
Type:      method_descriptor
Returns True if the string contains at least one cased character and all cased characters are lowercase, and False otherwise.

Example code:

s1 = 'hello world'
print(s1.islower())  # True
s2 = 'Hello World'
print(s2.islower())  # False

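3.6.9 The isnumeric() method: is the string made up of numeric characters only (Roman numerals, Chinese numerals and so on included)?

isnumeric() checks whether a string consists only of numeric characters. In Python 3 every str is a Unicode string, so the method works on any string; the 'u' prefix seen in older code is accepted but unnecessary. The syntax of isnumeric() is as follows: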

In [28]: str.isnumeric?
Signature: str.isnumeric(self, /)
Docstring:
Return True if the string is a numeric string, False otherwise.

A string is numeric if all characters in the string are numeric and there is at
least one character in the string.
Type:      method_descriptor
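Returns True if the string has at least one character and every character is numeric, and False otherwise.

Example code:

s1 = u'amo12468'  # the u prefix is optional: Python 3 strings are Unicode by default
print(s1.isnumeric())  # False
s1 = u'12468'
print(s1.isnumeric())  # True
s1 = u'ⅠⅡⅣⅦⅨ'
print(s1.isnumeric())  # True
s1 = u'㈠㈡㈣㈥㈧'
print(s1.isnumeric())  # True
s1 = u'①②④⑥⑧'
print(s1.isnumeric())  # True
s1 = u'⑴⑵⑷⑹⑻'
print(s1.isnumeric())  # True
s1 = u'⒈⒉⒋⒍⒏'
print(s1.isnumeric())  # True
s1 = u'壹贰肆陆捌uuu'
print(s1.isnumeric())  # False: the trailing letters are not numeric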
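
isdecimal(), isdigit() and isnumeric() accept progressively wider sets of characters, which matters when the result is passed to int(). The sketch below compares them on a few sample strings; to_int is a hypothetical helper, and using isdecimal() as the guard is one reasonable choice because int() rejects characters such as a superscript two even though isdigit() accepts them:

```python
samples = ['123', '１２３', '²', '①', 'Ⅷ', '四', '1.5', '-1']

# Map each sample to (isdecimal, isdigit, isnumeric).
rows = {s: (s.isdecimal(), s.isdigit(), s.isnumeric()) for s in samples}
for s, flags in rows.items():
    print(repr(s), flags)


def to_int(text):
    """Hypothetical helper: int(text) if text is all decimal digits, else None."""
    return int(text) if text.isdecimal() else None


print(to_int('１２３'))  # 123
print(to_int('²'))      # None
```

Signs and decimal points fail all three tests, so strings like '-1' or '1.5' need separate handling.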

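3.6.10 The isprintable() method: are all characters printable?

isprintable() checks whether every character in a string is printable, or the string is empty. Characters in the Unicode categories Other and Separator are non-printable, with the exception of the ASCII space (0x20). The method can therefore be used to detect control characters written as escapes, such as \n and \t. The syntax of isprintable() is as follows: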

In [29]: str.isprintable?
Signature: str.isprintable(self, /)
Docstring:
Return True if the string is printable, False otherwise.

A string is printable if all of its characters are considered printable in
repr() or if it is empty.
Type:      method_descriptor
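Returns True if every character in the string is printable, or if the string is empty, and False otherwise.

Note: in ASCII, codes 0-31 and code 127 are control characters; code 32 is the space, which isprintable() treats as printable; codes 33-126 are the visible printable characters, of which 48-57 are the ten digits 0-9, 65-90 are the 26 uppercase letters and 97-122 are the 26 lowercase letters. Example code:

s1 = '\n\t'
print(s1.isprintable())  # False
s1 = 'amo_cx'
print(s1.isprintable())  # True
s1 = '12345'
print(s1.isprintable())  # True
s1 = '蜘蛛侠'
print(s1.isprintable())  # True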

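3.6.11 The isspace() method: is the string made up of whitespace only?

isspace() checks whether a string consists only of whitespace characters. The syntax of isspace() is as follows: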

In [30]: str.isspace?
Signature: str.isspace(self, /)
Docstring:
Return True if the string is a whitespace string, False otherwise.

A string is whitespace if all characters in the string are whitespace and there
is at least one character in the string.
Type:      method_descriptor
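Returns True if the string has at least one character and contains only whitespace (spaces, tabs, newlines and so on), and False otherwise.

Example code:

s1 = '\n  \r\n   \t  \f'
print(s1.isspace())  # True
s1 = 'amo xw 1996'
print(s1.isspace())  # False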

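3.6.12 The istitle() method: is each word capitalized with the remaining letters lowercase?

istitle() checks whether every word in a string starts with an uppercase letter and has all its other letters in lowercase. The syntax of istitle() is as follows: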

In [32]: str.istitle?
Signature: str.istitle(self, /)
Docstring:
Return True if the string is a title-cased string, False otherwise.

In a title-cased string, upper- and title-case characters may only
follow uncased characters and lowercase characters only cased ones.
Type:      method_descriptor
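Returns True if every word in the string starts with an uppercase letter and its other letters are lowercase, and False otherwise.
istitle() only examines the cased letters. Characters without case (punctuation, spaces, digits) do not fail the test by themselves, but they do mark the start of a new word.

Example code:

text = "This Is A Title"
result = text.istitle()
print(result)  # True
text = "This is a Title"
result = text.istitle()
print(result)  # False
text = "THIS IS A TITLE"
result = text.istitle()
print(result)  # False
text = "this Is A Title"
result = text.istitle()
print(result)  # False
text = "Hello, World!"
# Even when the string contains non-letter characters, istitle() still judges whether the letters follow title case.
result = text.istitle()
print(result)  # True
text = "Python"
result = text.istitle()
print(result)  # True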
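
Because any uncased character starts a new "word", apostrophes produce results that can look surprising. A short sketch of this edge case (the sample sentences are made up):

```python
# The "r" after the apostrophe counts as the start of a new word,
# so a lowercase "r" breaks title case.
print("They're Here".istitle())  # False
print("They'Re Here".istitle())  # True

# title() applies the same rule when it converts a string.
converted = "they're here".title()
print(converted)  # They'Re Here

# An uppercase letter directly after a cased letter is not title case.
print('PyThon'.istitle())  # False
```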

3.6.13 The isupper() method: are all letters in the string uppercase?

isupper() checks whether all the letters in a string are uppercase. The syntax of isupper() is as follows:

In [33]: str.isupper?
Signature: str.isupper(self, /)
Docstring:
Return True if the string is an uppercase string, False otherwise.

A string is uppercase if all cased characters in the string are uppercase and
there is at least one cased character in the string.
Type:      method_descriptor
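Returns True if the string contains at least one cased character and all cased characters are uppercase, and False otherwise.
isupper() only checks that the letters are uppercase; non-letter characters (digits, symbols, spaces) do not affect the result.

Example code:

# A string whose letters are all uppercase
text = "HELLO WORLD"
result = text.isupper()
print(result)  # Output: True

# A string containing lowercase letters
text = "Hello World"
result = text.isupper()
print(result)  # Output: False

# A string containing digits
text = "123ABC"
result = text.isupper()
print(result)  # Output: True

# A string containing non-letter characters
text = "HELLO, WORLD!"
result = text.isupper()
print(result)  # Output: True

# All letters uppercase, and the string contains a space
text = "HELLO WORLD"
result = text.isupper()
print(result)  # Output: True

# A mix of uppercase and lowercase letters
text = "HELLO world"
result = text.isupper()
print(result)  # Output: False

# The empty string
text = ""
result = text.isupper()
print(result)  # Output: False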
