Python Strings and a Summary of Their Methods

Y.Yooin

Article link: https://blog.csdn.net/qq_44955505/article/details/133157359

Python Strings

capitalize() returns a copy of the string with its first character uppercased and the remaining characters lowercased. It takes no arguments.
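
A minimal sketch of capitalize() on a few sample strings (the sample values are made up for illustration). It shows that only the very first character is uppercased, and that a leading non-letter leaves the rest simply lowercased:

```python
# capitalize() uppercases the first character and lowercases everything else.
samples = ["hello WORLD", "python", "123abc"]
capitalized = [s.capitalize() for s in samples]
print(capitalized)  # ['Hello world', 'Python', '123abc']
```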

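center() centers a string within a given width and pads both sides with a fill character.
Syntax:
string.center(width[, fillchar])
width is the total width of the result and may be even or odd; if it is not larger than the length of the string, the original string is returned unchanged. fillchar is the padding character and defaults to a space.
Example: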

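# Define a string:
myString = "Hello, World!"
# Compute the length of the string (13):
stringLength = len(myString)
# Requested width of 20:
width = 20
# Pad the string with "*":
fillChar = "*"
# Center the string and pad it with "*":
newString = myString.center(width, fillChar)
# Print the new string:
print("\n" + newString)
# Center the string and pad it with spaces:
newString = myString.center(width)
# Print the new string:
print("\n" + newString)
Output (each result follows a blank line because of the leading "\n"; the 7 padding characters are split into 3 on the left and 4 on the right):
***Hello, World!****
   Hello, World!    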

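count() returns the number of non-overlapping occurrences of a character or substring in a string.
Syntax:
string.count(substring, start, end)
The method takes three parameters:
substring: the string or character to search for.
start: optional; the index at which the search begins. Defaults to 0.
end: optional; the index at which the search ends (exclusive). Defaults to the length of the string.
# A sample string
string = "I lOve Python!"
print(string.count('o'))  # prints 1
print(string.count('O'))  # prints 1
print(string.lower().count('o'))  # prints 2
print(string.upper().count('O'))  # prints 2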

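decode() is a method of bytes objects (in Python 3, str has no decode() method). It decodes a bytes sequence into a str using the character encoding you specify. Its syntax is:
bytes.decode(encoding='utf-8', errors='strict')
The encoding parameter is optional. It names the encoding of the bytes and defaults to UTF-8.
The errors parameter is also optional and controls how decoding errors are handled. The default, strict, raises UnicodeDecodeError (a subclass of UnicodeError) when invalid bytes are encountered. Other handlers include:
ignore: skip the bytes that cannot be decoded.
replace: substitute the Unicode replacement character U+FFFD for bytes that cannot be decoded (a "?" is used only when encoding).
xmlcharrefreplace: applies to encoding only; with str.encode() it replaces characters that cannot be encoded with XML character references.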
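
The error handlers described above can be sketched as follows. The byte strings are invented for illustration; b"\xff" is used because that byte is never valid in UTF-8:

```python
# Round trip: encode a str to UTF-8 bytes, then decode it back.
raw = "héllo".encode("utf-8")
decoded = raw.decode("utf-8")

# A byte sequence that is not valid UTF-8.
broken = b"abc\xffdef"

# errors='strict' (the default) raises UnicodeDecodeError.
try:
    broken.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors='ignore' drops the invalid byte.
ignored = broken.decode("utf-8", errors="ignore")

# errors='replace' inserts U+FFFD in place of the invalid byte.
replaced = broken.decode("utf-8", errors="replace")

print(decoded, strict_failed, ignored, replaced)
```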

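encode() converts a string into a bytes sequence in the specified encoding. Python 3 uses UTF-8 by default. Its syntax is:
str.encode(encoding='utf-8', errors='strict')
Here encoding is the name of the encoding to use, and errors specifies how encoding errors are handled.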
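
As a rough illustration of the errors parameter, the sketch below encodes a sample string (chosen arbitrarily) to ASCII, which cannot represent two of its characters, using three different handlers:

```python
text = "café ☕"

# With no arguments, encode() produces UTF-8 bytes.
utf8_bytes = text.encode()

# ASCII cannot represent 'é' or '☕', so the handler decides what happens.
ascii_ignore = text.encode("ascii", errors="ignore")              # drop them
ascii_replace = text.encode("ascii", errors="replace")            # use '?'
ascii_xml = text.encode("ascii", errors="xmlcharrefreplace")      # use &#NNN;

print(utf8_bytes)
print(ascii_ignore, ascii_replace, ascii_xml)
```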

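endswith() checks whether a string ends with the specified suffix. The optional start and end arguments restrict the check to the slice string[start:end].
Example:

# Check whether the string ends with the given suffix
str1 = "Hello World"
print(str1.endswith("rld"))    # True
print(str1.endswith("ld", 5, 9))    # False: str1[5:9] is " Wor"
print(str1.endswith("ld", 5))    # True: str1[5:] is " World"
print(str1.endswith("o", 2, 4))    # False: str1[2:4] is "ll"
# Check the extension of a file name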
filename = "test.txt"
if filename.endswith(".txt"):
   print("The file is a text file.")
elif filename.endswith(".jpg"):
   print("The file is an image.")

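expandtabs() replaces each tab character with enough spaces to reach the next tab stop. Tab stops occur every tabsize columns, and tabsize defaults to 8.
str.expandtabs(tabsize=8)
Example:

# Example 1
str1 = "hello\tworld"
print(str1.expandtabs())  # "hello   world"

# Example 2
str2 = "hello\tworld\t\t!!!"
print(str2.expandtabs(4))  # "hello   world       !!!"

# Example 3: expand the tabs in a file (assumes test.txt exists)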
with open("test.txt", "r") as f:
    content = f.read()

processed_content = content.expandtabs(4)

with open("processed_test.txt", "w") as f:
    f.write(processed_content)

find() locates a substring within a string. If the substring occurs in the string, find() returns the index of its first occurrence; if it does not, find() returns -1.
Example:

text = "I love Python programming language!"
position = text.find("Python")
if position != -1:
    print("Python found at position:", position)
else:
    print("Python not found")
Output:
Python found at position: 7

# Restrict the search range
text = "I love Python programming language!"
position = text.find("Python", 10, 20)
if position != -1:
    print("Python found at position:", position)
else:
    print("Python not found")
Output:
Python not found

# A more involved example that counts how often a given substring occurs in a string:
text = "She sells sea shells by the sea shore, " \
       "The shells she sells are surely seashells, " \
       "So if she sells shells on the seashore, " \
       "I'm sure she sells seashore shells."
word = "shells"
count = 0
position = 0
while True:
    position = text.find(word, position)
    if position == -1:
        break
    count += 1
    position += len(word)
print("The word", word, "appeared", count, "times.")
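Output:
The word shells appeared 5 times.

# This example uses a while loop that searches for the substring on every iteration.
# The position variable records where the previous match ended, so the next search starts from there.
# Note that find() matches substrings, so the "shells" inside "seashells" is counted as well.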

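rfind() returns the highest index at which the substring is found, or -1 if there is no match.

string = "Hello, World!"
# Search for "o" in the part of the string that starts at index 7
print(string.rfind("o", 7))
# Search for "l" between index 7 (inclusive) and index 10 (exclusive)
print(string.rfind("l", 7, 10))
# Look for the last occurrence of "abc"
print(string.rfind("abc"))
Output: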
8
-1
-1

string = "I love Python Programming!"
# Get all the text after the first space.
index = string.find(" ")
if index != -1:
    text = string[index+1:]
    print(text)
# Find the position of the last occurrence of "Programming".
index = string.rfind("Programming")
print(index)
Output:
love Python Programming!
14

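index() returns the position of a substring within a string. It works like find(), but raises ValueError instead of returning -1 when the substring is not found.
text = "Hello World"
print(text.index("World"))  # prints 6
print(text.index("l", 4, 10))  # prints 9
print(text.index("Python"))  # raises ValueError: substring not found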
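
Because index() raises an exception on a miss, callers usually wrap it in try/except or fall back to find(). A minimal sketch, where safe_index is a hypothetical helper name introduced only for this illustration:

```python
def safe_index(text, sub):
    """Return text.index(sub), or None when sub is absent (hypothetical helper)."""
    try:
        return text.index(sub)
    except ValueError:
        return None

greeting = "Hello World"
found = safe_index(greeting, "World")     # index of the match
missing = safe_index(greeting, "Python")  # None instead of an exception
via_find = greeting.find("Python")        # find() signals a miss with -1

print(found, missing, via_find)
```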

isalnum() checks whether a string consists only of letters and digits and returns a bool.
It returns True if every character is a letter or a digit and the string is not empty; otherwise it returns False.
str1 = "hello123"
str2 = "hello world"
str3 = "123"
str4 = " "
str5 = "helloworld"
print(str1.isalnum())  # prints True
print(str2.isalnum())  # prints False
print(str3.isalnum())  # prints True
print(str4.isalnum())  # prints False
print(str5.isalnum())  # prints True

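isalpha() returns a bool: True if the string is non-empty and contains only letters, otherwise False. In Python 3 this covers every character that Unicode classifies as a letter, not just ASCII, so strings such as "café" or "中文" also return True.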
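
A short sketch of isalpha() on ASCII and non-ASCII input. The helper is_ascii_letters is a hypothetical name; it assumes Python 3.7 or later, where str.isascii() is available, and shows one way to restrict the check to ASCII letters:

```python
words = ["Python", "café", "中文", "abc123", "hello world", ""]
alpha_flags = {w: w.isalpha() for w in words}

def is_ascii_letters(s):
    """True only for non-empty strings made of ASCII letters (hypothetical helper)."""
    return s.isascii() and s.isalpha()

print(alpha_flags)
print(is_ascii_letters("Python"), is_ascii_letters("café"))
```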

isdigit() checks whether a string is non-empty and contains only digit characters.
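
A minimal sketch with arbitrarily chosen inputs. It shows that a decimal point or a sign makes isdigit() return False, while a superscript digit such as "²" still counts as a digit:

```python
candidates = ["12345", "12.5", "-3", "²", ""]
digit_flags = [s.isdigit() for s in candidates]
print(digit_flags)  # [True, False, False, True, False]
```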

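isnumeric() checks whether a string contains only numeric characters. Besides the decimal digits, this includes other characters that have a Unicode numeric value, such as Chinese numerals.
str1 = '12345'
str2 = '10a'
str3 = '一二三四五'  # the Chinese numerals for 1 to 5
print(str1.isnumeric())  # Output: True
print(str2.isnumeric())  # Output: False
print(str3.isnumeric())  # Output: True
isnumeric() can also be used to count the numeric characters in a given string.
str1 = '1,2,3,4,5'
str2 = '10Vijay'
str3 = '一二三四五'
print(sum(char.isnumeric() for char in str1))  # Output: 5
print(sum(char.isnumeric() for char in str2))  # Output: 2
print(sum(char.isnumeric() for char in str3))  # Output: 5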

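islower() checks whether all the cased characters (letters) in a string are lowercase and returns True or False.
It returns False if the string contains at least one uppercase letter or no cased characters at all; digits, spaces, and punctuation are ignored.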

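isupper() checks whether all the cased characters in a string are uppercase; it likewise returns False when the string contains no cased characters.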
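
The following sketch runs islower() and isupper() over the same sample strings (invented for illustration). Non-letters are ignored, and strings without any cased character yield False for both:

```python
samples = ["hello", "hello world 123", "Hello", "HELLO!", "1234", ""]

lower_flags = [s.islower() for s in samples]
upper_flags = [s.isupper() for s in samples]

print(lower_flags)  # [True, True, False, False, False, False]
print(upper_flags)  # [False, False, False, True, False, False]
```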

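isspace() checks whether a string is non-empty and contains only whitespace characters (spaces, tabs, newlines, and so on).

# The string contains only spaces
str1 = "   "
print(str1.isspace())   # True

# The string contains spaces and letters
str2 = "  Hello World  "
print(str2.isspace())   # False

# The string contains a tab
str3 = "\t"
print(str3.isspace())   # True

# The string contains a newline
str4 = "\n"
print(str4.isspace())   # True

# The string contains a space, a tab, and a newline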
str5 = " \t\n"
print(str5.isspace())   # True

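title() returns a copy of the string in which the first letter of every word is uppercased and the remaining letters are lowercased.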
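
A small sketch of title() with made-up input. The second case shows a known quirk: title() treats an apostrophe as a word boundary, so contractions are capitalized in the middle:

```python
headline = "the quick brown FOX"
titled = headline.title()

# An apostrophe starts a new "word" as far as title() is concerned.
contraction = "they're here".title()

print(titled)       # The Quick Brown Fox
print(contraction)  # They'Re Here
```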

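istitle() checks whether every word in a string starts with an uppercase letter that is followed only by lowercase letters. A string in which every word satisfies this is a title-cased string.
One thing to keep in mind when using istitle():
istitle() only checks the capitalization of each word; it does not verify that the words actually exist or are spelled correctly.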

str1 = "Hello World"
str2 = "Hello world"
str3 = "hello World"
str4 = "Hello World 123"
str5 = "Hi, How Are You Today?"

print(str1.istitle())  # True
print(str2.istitle())  # False
print(str3.istitle())  # False
print(str4.istitle())  # True
print(str5.istitle())  # True

join() concatenates the elements of a list, a tuple, or any other iterable of strings into a single string, using the string it is called on as the separator.

# Use join() to turn a sequence of characters into a string
name = ['J', 'o', 'h', 'n']
name_string = ''.join(name)
print(name_string)
Output:
John

# Use join() to turn a sequence of items into a string
colors = ['Red', 'Green', 'Blue']
color_string = ', '.join(colors)
print(color_string)
Output:
Red, Green, Blue

# Sum a list of numeric strings and use join() to display the addition:
number_str_list = ['1', '2', '3', '4', '5']
number_list = [int(number) for number in number_str_list]
sum_str = str(sum(number_list))
result_str = 'The sum of ' + ' + '.join(number_str_list) + ' is ' + sum_str
print(result_str)
Output:
The sum of 1 + 2 + 3 + 4 + 5 is 15

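len() is a built-in function that takes a string as its argument and returns its length, that is, the number of characters it contains.
The following example uses len() to count the words in a string:
my_string = "Learning Python is fun."
word_count = len(my_string.split())
print("There are " + str(word_count) + " words in the string.")
Output: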
There are 4 words in the string.

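ljust() left-aligns a string and pads the remaining space on the right, with spaces by default.
string.ljust(width[, fillchar])
width: an integer specifying the desired width;
fillchar: an optional padding character. It defaults to a space (' ').

# Declare a string
text = "Hello World"
# Left-align the text and pad the remaining space with asterisks
result = text.ljust(20, '*')
# Print the result
print(result)
Output: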
Hello World*********

rjust() returns a right-aligned string, padding the left side with the specified character (a space by default) until the string reaches the given width.
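
A minimal sketch of rjust(), assuming a column of made-up prices that should line up on the right. It also shows that a width shorter than the string leaves the string unchanged:

```python
prices = [5, 42, 1234]

# Right-align each number in a 6-character column.
column = [str(p).rjust(6) for p in prices]
for cell in column:
    print(cell)

# A custom fill character.
padded = "7".rjust(3, "0")

# A width smaller than the string length returns the string as is.
unchanged = "Hello World".rjust(5)
print(padded, unchanged)
```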

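lower() converts the uppercase letters in a string to lowercase.

upper() converts all lowercase letters to uppercase.

swapcase() swaps the case of the letters in a string: every lowercase letter becomes uppercase and every uppercase letter becomes lowercase.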
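
The three case-conversion methods can be compared side by side. The sample string is arbitrary, and the last line shows a common use of lower(): a simple case-insensitive comparison:

```python
original = "Hello, World 123"

lowered = original.lower()
uppered = original.upper()
swapped = original.swapcase()

print(lowered)  # hello, world 123
print(uppered)  # HELLO, WORLD 123
print(swapped)  # hELLO, wORLD 123

# Normalizing both sides with lower() gives a simple case-insensitive comparison.
same = "PyThOn".lower() == "PYTHON".lower()
print(same)  # True
```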

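lstrip() removes the specified characters from the beginning of a string; by default it removes leading whitespace. The argument is treated as a set of characters, not as a prefix.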

string = "###hello world###"
print(string.lstrip("#"))
Output:
hello world###
# Use lstrip() to remove several different characters from the beginning of a string.
string = "%#%&Hello World!%#%&"
print(string.lstrip("%&#$"))
Output:
Hello World!%#%&
# Removing a character set that includes the space
string = '**  **Hello World!****'
print(string.lstrip('* '))  # Hello World!****

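rstrip() removes trailing whitespace, or the specified characters, from the right end of a string. It does not modify the string itself; it returns a new string.

# rstrip() can also be used to remove a trailing newline.
text = 'Hello World!\n'
print(text.rstrip())  # prints: Hello World!

# When a program reads a text file, rstrip() can remove the newline at the end of each line.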
with open('filename.txt') as file:
    for line in file:
        print(line.rstrip())

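maketrans() is a static method of str that builds a translation table for use with translate().
Its syntax is:
str.maketrans(x[, y[, z]])
where
x: when two or three arguments are given, a string of the characters to be replaced
y: a string of the same length as x; each character in x is mapped to the character at the same position in y
z: a string of characters to delete (each one is mapped to None)
If only one argument is given, it must be a dictionary that maps Unicode ordinals (integers) or single characters to Unicode ordinals, strings, or None.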

# Replace the vowels in a string with digits
str1 = "hello world"
str2 = "aeiou"
str3 = "12345"
trans_table = str.maketrans(str2, str3)
print(str1.translate(trans_table))
Output:
h2ll4 w4rld

# Remove all the digits from a string
str1 = "1a2b3c4d5e6f"
trans_table = str.maketrans("", "", "0123456789")
print(str1.translate(trans_table))
Output:
abcdef
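
The single-argument (dictionary) form of maketrans() is not shown in the examples above, so here is a minimal sketch of it. The mapping is invented for illustration: it turns a few characters into HTML entities and deletes newlines by mapping them to None:

```python
# Keys are single characters; values may be longer strings or None (delete).
table = str.maketrans({"<": "&lt;", ">": "&gt;", "&": "&amp;", "\n": None})

escaped = "a < b && c > d\n".translate(table)
print(escaped)  # a &lt; b &amp;&amp; c &gt; d
```

Because translate() works character by character on the original string, the "&" inside an inserted "&lt;" is not escaped a second time.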

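max() returns the largest element of a sequence. For a string, this is the character with the highest Unicode code point (for ASCII text, the highest ASCII value).
min() finds the smallest character: it returns the character in the string with the lowest Unicode code point.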
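
A short sketch of max() and min() on a string, with ord() used to show the code points that drive the comparison. The key=str.lower variant is one possible way to compare letters without regard to case:

```python
word = "Python"

largest = max(word)    # highest code point
smallest = min(word)   # lowest code point: uppercase letters sort before lowercase

# ord() reveals the code point behind each character.
code_points = {ch: ord(ch) for ch in word}

# Compare case-insensitively by lowering each character first.
smallest_ci = min(word, key=str.lower)

print(largest, smallest, smallest_ci)
print(code_points)
```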

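replace() replaces occurrences of a substring within a string and returns the result as a new string.
str.replace(old, new[, count])
where:
str is the string on which the replacement is performed;
old is the substring to be replaced;
new is the replacement string;
count is an optional argument that limits the number of replacements.
If count is not given, all occurrences of the substring are replaced.

# Example 1: simple replacement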
text = "Hello, World!"
new_text = text.replace("World", "Python")
print(new_text)
# Running the code above prints:
Hello, Python!

# Example 2: limiting the number of replacements
text = "one two three four three three three"
new_text = text.replace("three", "3", 2)
print(new_text)
# Running the code above prints:
one two 3 four 3 three three

# Example 3: multiple replacements
text = "one two three four three three three"
new_text = text.replace("one", "1").replace("two", "2")\
        .replace("three", "3").replace("four", "4")
print(new_text)
# Running the code above prints:
1 2 3 4 3 3 3

# Example 4: complex replacement
# Suppose every URL in a string should be turned into a hyperlink. str.replace() cannot do this, but re.sub() can: a regular expression matches the substrings, and a function generates each replacement. For example:
import re
def replace_url(matched):
    url = matched.group("url")
    return "<a href='{url}'>{url}</a>".format(url=url)

text = "This text contains URLs, for example: http://www.google.com , http://www.baidu.com ."
new_text = re.sub(r"(?P<url>https?://\S+)", replace_url, text)
print(new_text)
# Running the code above prints:
This text contains URLs, for example: <a href='http://www.google.com'>http://www.google.com</a> , <a href='http://www.baidu.com'>http://www.baidu.com</a> .

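rindex() searches for the specified substring and returns the index of its last occurrence. If the substring is not in the string, it raises a ValueError.

text = 'hello world'
# Find the position of the last occurrence of the substring
index = text.rindex('o')
print("The last 'o' is at index:", index)  # 7
# Restrict the search range
index = text.rindex('o', 0, 5)
print("The last 'o' is at index:", index)  # 4
# Specify a start position without an end position
index = text.rindex('o', 7)
print("The last 'o' is at index:", index)  # 7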

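split() splits a string into a list of substrings.
Syntax
string.split(separator, maxsplit)
Parameters
separator (optional): the delimiter to split on. If it is omitted or None, the string is split on runs of whitespace, and leading and trailing whitespace is ignored.
maxsplit (optional): the maximum number of splits to perform, so the result has at most maxsplit + 1 elements. The default, -1, means no limit.
A negative maxsplit does not split from the end of the string; to split starting from the right, use rsplit().
Return value
A list of substrings.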
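
The parameters described above can be sketched as follows. All sample strings, including the log-style record, are made up for illustration:

```python
# No separator: split on runs of whitespace and ignore leading/trailing whitespace.
line = "  alpha beta   gamma  "
default_parts = line.split()

# An explicit separator keeps the empty strings between adjacent separators.
space_parts = "a  b".split(" ")

# maxsplit=3 performs at most 3 splits, so the rest of the text stays intact.
record = "2024-01-12,INFO,disk,usage at 91%, check soon"
limited = record.split(",", 3)

# rsplit() counts splits from the right-hand side.
from_right = "usr/local/lib/python".rsplit("/", 1)

print(default_parts)
print(space_parts)
print(limited)
print(from_right)
```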

splitlines() splits a string at line boundaries and returns a list of the lines. It is a quick way to break multi-line text into individual lines.

# Example 1: splitting multi-line text
# (the continuation lines are not indented, because any indentation would become part of each line)
string = """Hello World!
How are you today?
I am feeling fine."""
print(string.splitlines())
Output:
['Hello World!', 'How are you today?', 'I am feeling fine.']

# Example 2: parsing text
string = "apple, banana\norange, plum\npeach, pear"
parsed = [line.split(',') for line in string.splitlines()]
print(parsed)
Output:
[['apple', ' banana'], ['orange', ' plum'], ['peach', ' pear']]

# Example 3: keeping the line breaks
string = "Hello\nWorld\n"
print(string.splitlines(True))
Output:
['Hello\n', 'World\n']

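startswith() checks whether a string starts with a given substring.
startswith() is defined as follows:
str.startswith(prefix[, start[, end]])
prefix: the substring to match at the start of the string; it may also be a tuple of prefixes to try.
start: the index at which the check begins. Defaults to 0.
end: the index at which the check ends. Defaults to the length of the string.
# Basic usage
text = "Hello, World!"
print(text.startswith("Hello"))  # True
print(text.startswith("H"))  # True
print(text.startswith("o"))  # False
# Calling with start and end
text = "this is string example...wow!!!"
print(text.startswith('string', 8))  # True
print(text.startswith('is', 2, 4))  # True
print(text.startswith('this', 1, 2))  # False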
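
startswith() also accepts a tuple of prefixes, which avoids chaining several calls with or. A minimal sketch, using placeholder URLs that exist only for this illustration:

```python
urls = ["https://example.com", "ftp://example.com", "http://example.org"]

# True if the string starts with any prefix in the tuple.
web_urls = [u for u in urls if u.startswith(("http://", "https://"))]
print(web_urls)  # ['https://example.com', 'http://example.org']
```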

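strip() removes whitespace, or other specified characters, from both the beginning and the end of a string.
strip() takes a single optional parameter, chars, which is the set of characters to remove from both ends. It has no maxsplit parameter; maxsplit belongs to split(), which some of the examples below combine with strip().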

# Remove the whitespace at the beginning and end of a string
str1 = "    hello world    "
str2 = str1.strip()
print(str2)
# Output: 'hello world'

# Remove a specified character from the beginning and end of a string
str3 = "___hello world___"
str4 = str3.strip('_')
print(str4)
# Output: 'hello world'

# Remove any of several specified characters from the beginning and end
string = "hello, world!!!"
result_string = string.strip(" !")
print(result_string)
# Output: 'hello, world'

# Split a string on newlines (maxsplit is omitted, because maxsplit=0 would perform no split at all)
string = "Hello\nWorld"
result_string = string.strip().split('\n')
print(result_string)
# Output: ['Hello', 'World']

# Split a string on commas, at most twice
string = "apple,banana,orange,melon"
result_string = string.strip().split(',',2)
print(result_string)
# Output: ['apple', 'banana', 'orange,melon']

# Strip the whitespace from every item in a data set
data = ["\t100", "\n5 ", " 500  ", "\t\t 20 \n"]
result_data = [x.strip() for x in data]
print(result_data)
# Output: ['100', '5', '500', '20']

# Validate a domain name by checking specific parts of it
def verify_domain_name(domain_name):
    tld_list = ["com", "net", "ca", "org"]
    domain_splitted = domain_name.split(".")
    if domain_splitted[-1].strip() not in tld_list:
        return "Invalid TLD"
    if len(domain_splitted) < 2:
        return "Invalid domain name"
    if len(domain_splitted[-2].strip()) < 2:
        return "Invalid domain name"
    return True

domain = "www.example.com"
result = verify_domain_name(domain)
print(result)
# Output: True

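translate() replaces or deletes specified characters in a string according to a translation table, which makes it very handy for text processing.

# 1
# Create a character translation table
transTable = str.maketrans("aeiou", "12345")
# Use the table to replace characters in the string
string = "hello world"
translatedString = string.translate(transTable)
print(translatedString) # h2ll4 w4rld

# 2
# Create a translation table whose third argument lists the characters to delete
transTable = str.maketrans("", "", "0123456789")
# Use the table to remove the digits from the string
string = "a1b2c3d4e5f6g7"
translatedString = string.translate(transTable)
print(translatedString) # abcdefg

# 3
# Create a character translation table
transTable = str.maketrans("ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz")
# Use the table to convert every uppercase letter in the string to lowercase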
string = "Hello World"
translatedString = string.translate(transTable)
print(translatedString) # hello world

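# 4
# For comparison, the same digit removal done with a regular expression instead of translate()
import re
# Use a regular expression to find the digits in the string and remove them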
string = "a1b2c3d4e5f6g7"
regex = re.compile(r'\d')
replacedString = regex.sub("", string)
print(replacedString) # abcdefg

zfill() is a Python 3 string method that pads a string on the left with zeros until it reaches the specified length.
string = "spam"
print(string.zfill(10))  # 000000spam
print(string.ljust(10, "-"))  # spam------
print(string.rjust(10, "+"))  # ++++++spam
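
zfill() is mostly used with numeric strings, where it differs from rjust(width, "0") in one respect: a leading sign stays in front of the zeros. A small sketch with invented values and hypothetical file names:

```python
plain = "42".zfill(5)
negative = "-42".zfill(5)     # the sign stays in front of the padding
positive = "+7".zfill(4)
too_long = "123456".zfill(3)  # strings that are already long enough are unchanged

# A typical use: zero-padded sequence numbers in file names (names are hypothetical).
file_names = ["frame_" + str(i).zfill(3) + ".png" for i in (1, 25, 300)]

print(plain, negative, positive, too_long)
print(file_names)
```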

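isdecimal() is a built-in string method. It returns True or False to indicate whether the string contains only decimal digit characters.
isdecimal() is useful in many situations. When validating user input, for example, it can check that only digits were supplied, and it can also be used to detect invalid characters in a string.

# Example 1: a string containing only decimal digits
string = "1234567890"
print(string.isdecimal())
# Output:
# True

# Example 2: a string containing a non-digit character
string = "12345 67890"
print(string.isdecimal())
# Output:
# False

# Example 3: the empty string
string = ""
print(string.isdecimal())
# Output:
# False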
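
isdecimal(), isdigit(), and isnumeric() are easy to confuse, so the sketch below applies all three to a few hand-picked characters. The helper name classify is hypothetical. Each check accepts everything the previous one does, plus more:

```python
def classify(ch):
    """Return (isdecimal, isdigit, isnumeric) for a string (hypothetical helper)."""
    return (ch.isdecimal(), ch.isdigit(), ch.isnumeric())

# An ASCII digit, a superscript two, the fraction one half, and the Chinese numeral five.
results = {ch: classify(ch) for ch in ["5", "²", "½", "五"]}

for ch, flags in results.items():
    print(ch, flags)
```

For validating input that will be passed to int(), isdecimal() is usually the safest of the three, since int() rejects characters such as "²" and "½".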