The Great Mysteries of Strings

小白菜，是真的菜

Column: 小白的python升级打怪; tags: python, strings

Article link: https://blog.csdn.net/qq_45691413/article/details/119255770

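How a Beginner Becomes a Python Data Analyst

Day 8 ----> Strings

Definition

In a Python program, enclosing one or more characters in single or double quotes produces a string. The characters can be special symbols, English letters, Chinese characters, Japanese hiragana or katakana, Greek letters, emoji, and so on.

A string is an immutable type (a read-only container), so its characters cannot be changed through index assignment.

Put simply, a string is a finite sequence of zero or more characters, for example: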

a = 'hello, world'
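
To make the immutability point concrete, here is a minimal sketch. The variable names `s`, `t`, and `error_message` are illustrative only. Assigning to an index raises TypeError, so the usual approach is to build a new string instead.

```python
# A minimal sketch: strings cannot be modified in place.
s = 'hello, world'
error_message = ''

try:
    s[0] = 'H'                 # item assignment is not supported on str
except TypeError as err:
    error_message = str(err)
    print(error_message)

# Build a new string instead of changing the old one.
t = 'H' + s[1:]
print(t)   # Hello, world
print(s)   # hello, world (the original is unchanged)
```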

String Operations

a = 'hello, world'
b = 'hello, world'

# Get the length of the string
print(len(a))

# Iterate over the string (print each character)
for i in a:
    print(i)

# Repetition
print(a * 5)

# Membership tests
print('or' in a)
print('wf' not in b)


# Comparison (compares the contents, character by character)
print(a == b)
print(a != b)

c = 'goodbye, world'
print(b > c)

# Indexing
print(a[0], a[-len(a)])
print(a[-1], a[len(a)-1])
print(a[4], a[-8])

# Slicing
print(a[::-1])  # reversed
print(a[2:6])

# Concatenation with +
print(a + b)
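
The `b > c` comparison above may look surprising at first. As a rough sketch of what happens: Python compares strings from left to right by Unicode code point, and the first pair of differing characters decides the result.

```python
b = 'hello, world'
c = 'goodbye, world'

# The first characters already differ, so they decide the result.
print(ord('h'), ord('g'))   # 104 103
print(b > c)                # True, because 'h' (104) > 'g' (103)

# Uppercase letters have smaller code points than lowercase ones.
print('Z' < 'a')            # True (90 < 97)

# When one string is a prefix of the other, the shorter one is smaller.
print('abc' < 'abcd')       # True
```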

String Methods

Searching

To search a string from front to back for another string, use the string's find or index method.

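s = 'hello, world!'

# find looks for the position of another string inside this string
# If found, it returns the index of the first character of the match
print(s.find('or'))        # 8
# If not found, it returns -1
print(s.find('shit'))      # -1
# index behaves like find
# If found, it returns the index of the first character of the match
print(s.index('or'))       # 8
# If not found, it raises an exception
try:
    print(s.index('shit'))
except ValueError as err:
    print(err)             # substring not found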

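find and index also accept arguments that limit the search range, so the search does not have to start at index 0. Both methods have reverse versions that search from back to front, rfind and rindex, as shown below.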

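s = 'hello good world!'

# Search front to back for the character o (its first occurrence)
print(s.find('o'))       # 4
# Search for the character o starting at index 5
print(s.find('o', 5))    # 7
# Search back to front for the character o (its last occurrence)
print(s.rfind('o'))      # 12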
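
The start argument also makes it possible to collect every occurrence by searching repeatedly. The sketch below assumes a hypothetical helper named `find_all` (not a built-in method) and counts non-overlapping matches only.

```python
def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub."""
    if not sub:
        raise ValueError('sub must not be empty')
    positions = []
    start = 0
    while True:
        pos = text.find(sub, start)   # search from `start` onwards
        if pos == -1:                 # no further match
            break
        positions.append(pos)
        start = pos + len(sub)        # continue after this match
    return positions


s = 'hello good world!'
print(find_all(s, 'o'))      # [4, 7, 8, 12]
print(s.find('o', 5, 7))     # -1: indexes 5 and 6 contain no 'o'
print(s.rindex('o'))         # 12
```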

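Property Checks and Case Conversion

The capitalize, title, upper, and lower methods return a new string with the case changed.

The startswith and endswith methods check whether a string begins or ends with a given string. Methods whose names start with is check other properties of the string. All of these methods return a Boolean value, as shown below.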

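a = 'hello, world'
# Case conversion
print(a.upper())  # all uppercase
print(a.lower())  # all lowercase
print(a.capitalize())  # first letter of the string capitalized
print(a.title())  # first letter of each word capitalized

# Property checks
b = 'abcdef123456'
print(b.isdigit())  # are all characters digits? False
print(b.isalpha())  # are all characters letters? False
print(b.isalnum())  # are all characters letters or digits? True
print(b.isascii())  # are all characters ASCII? True (Python 3.7+)

# Check the start and end
c = '世界与我唯一'
print(c.startswith('世界'))  # check the start
print(c.endswith('一'))  # check the end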
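
These Boolean checks are often combined to validate input. The following is only a sketch: `is_valid_pin` and its rule of exactly six ASCII digits are assumptions made for illustration. It also shows that startswith and endswith accept a tuple of candidates.

```python
def is_valid_pin(text):
    """Hypothetical rule: a PIN is exactly 6 ASCII digits."""
    return len(text) == 6 and text.isascii() and text.isdigit()


print(is_valid_pin('123456'))   # True
print(is_valid_pin('12345a'))   # False
# Full-width digits pass isdigit() but are not ASCII.
fullwidth = '\uff11\uff12\uff13\uff14\uff15\uff16'
print(fullwidth.isdigit(), is_valid_pin(fullwidth))   # True False

# An empty string fails every is* check.
print(''.isdigit())             # False

# startswith / endswith accept a tuple of alternatives.
print('report.csv'.endswith(('.csv', '.tsv')))   # True
```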

Formatting Strings

In Python, the center, ljust, and rjust methods center, left-align, and right-align a string within a given width.

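a = 'hello,world'
# Center
print(a.center(80, '='))
# Right-align
print(a.rjust(80, '$'))
# Left-align
print(a.ljust(80, '*'))

b = '123'
# Zero fill (pad with 0 on the left)
print(b.zfill(6))  # 000123

c = 1237
d = 345256
# f-strings, a convenient formatting syntax introduced in Python 3.6
print(f'{c} + {d} = {c + d:,}')       # thousands separator: 346,493
print(f'{c} / {d} = {c / d:.2%}')     # percentage, 2 decimals: 0.36%
print(f'{c} + {d} = {c + d:.2e}')     # scientific notation: 3.46e+05
print(f'{c} + {d} = {c + d:.2f}')     # 2 decimals: 346493.00
print(f'{c} + {d} = {c + d:+.2f}')    # explicit sign: +346493.00
print(f'{c} + {d} = {c + d:0>10d}')   # right-align, pad with 0: 0000346493
print(f'{c} + {d} = {c + d:0<10d}')   # left-align, pad with 0: 3464930000
print(f'{c} + {d} = {c + d:x>10d}')   # right-align, pad with x: xxxx346493
print(f'{c} + {d} = {c + d:x<10d}')   # left-align, pad with x: 346493xxxx
print('{}+{}={}'.format(c, d, c + d))  # str.format: 1237+345256=346493
print('%d+%d=%d' % (c, d, c + d))      # % operator: 1237+345256=346493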


To control the form of a formatted value more precisely, refer to the table of format specifiers below.

[Figure: table of string format specifiers]
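
As a rough guide, a format specifier is written after the colon and, in simplified form, reads fill, alignment, sign, width, grouping, precision, type. The sketch below tries a few common combinations; the sample values are arbitrary.

```python
value = 3.14159

results = [
    f'{value:8.2f}',      # width 8, 2 decimals, right-aligned by default
    f'{value:<8.2f}',     # left-aligned
    f'{value:^8.2f}',     # centered
    f'{42:08d}',          # zero-padded integer
    f'{255:x}',           # hexadecimal
    f'{255:#b}',          # binary with 0b prefix
    f'{1234567:,}',       # thousands separator
    f'{0.256:.1%}',       # percentage with 1 decimal
]
for text in results:
    print(repr(text))
# '    3.14'
# '3.14    '
# '  3.14  '
# '00000042'
# 'ff'
# '0b11111111'
# '1,234,567'
# '25.6%'
```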

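Trimming

The strip method returns a copy of the original string with the whitespace at both ends removed. It is very practical and is commonly used to remove spaces accidentally typed at the start or end of user input. strip also has two variants, lstrip and rstrip, which trim only the left end or only the right end.

Note that trimming only removes whitespace at the two ends of a string; it cannot remove spaces in the middle.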

Email = ' 1235854@qq.com  '
# Trim whitespace at both ends
print(Email.strip())
# Trim whitespace on the left
print(Email.lstrip())
# Trim whitespace on the right
print(Email.rstrip())
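
Since strip leaves inner spaces alone, removing or collapsing them needs another tool. The following sketch shows two common options, replace and split plus join, and also that strip accepts a set of characters to remove. The sample strings are made up for illustration.

```python
raw = '  hello   world  '

trimmed = raw.strip()                  # only the two ends are trimmed
no_spaces = raw.replace(' ', '')       # remove every space
collapsed = ' '.join(raw.split())      # collapse runs of whitespace to one space

print(repr(trimmed))     # 'hello   world'
print(repr(no_spaces))   # 'helloworld'
print(repr(collapsed))   # 'hello world'

# strip can also remove other characters from both ends.
starred = '***hi***'.strip('*')
print(starred)           # hi
```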

Replacing Characters

When sensitive words or special characters must not be output, replace can substitute them with something else. For example:

# Replace a specific substring
content = '你TMD，WF'
print(content.strip().replace('TMD', '*'))
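
For more than one sensitive word, replace can be applied in a loop. This is only a sketch: `mask_words` and the word list are hypothetical, and a real filter would need to handle case and word boundaries. The last line shows the optional count argument of replace.

```python
def mask_words(text, banned, mask='*'):
    """Replace each banned word with a run of mask characters of equal length."""
    for word in banned:
        text = text.replace(word, mask * len(word))
    return text


masked = mask_words('this is darn bad, darn it', ['darn', 'bad'])
print(masked)            # this is **** ***, **** it

# The third argument limits how many occurrences are replaced.
limited = 'a-b-c-d'.replace('-', '+', 2)
print(limited)           # a+b+c-d
```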

Splitting and Joining

split: splits a string into a list
join: joins the elements of a list into a single string

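content = '你 是 好 人 ，会 更 好 的.'
content1 = content.replace('，', '').replace('.', '')
print(content1)
# Split on whitespace
words = content1.split()
print(words, len(words))


# Split the original text on the full-width comma
# (content1 no longer contains one, so split content instead)
words = content.split('，')
for word in words:
    print(word)


items = [
    '我',
    '是',
    '谁'
]
print(items)
# Join the elements with an empty string (no separator between them)
print(''.join(items))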
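
A few details of split and join are easy to trip over, so here is a short sketch with made-up sample strings: split() without arguments differs from split(' '), a second argument limits the number of splits, and join only accepts strings.

```python
# split() with no argument drops empty pieces; split(' ') keeps them.
by_whitespace = ' a  b '.split()
by_space = ' a  b '.split(' ')
print(by_whitespace)   # ['a', 'b']
print(by_space)        # ['', 'a', '', 'b', '']

# The second argument limits the number of splits.
key, rest = 'key=value=more'.split('=', 1)
print(key, rest)       # key value=more

# join puts the separator between the elements.
date = '-'.join(['2024', '07', '21'])
print(date)            # 2024-07-21

# join needs strings, so convert other types first.
numbers = ','.join(str(n) for n in [1, 2, 3])
print(numbers)         # 1,2,3
```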

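Encoding and Decoding

1. When choosing a character set (encoding), UTF-8 is the best choice (and also the default).
2. Encoding and decoding must use the same character set; otherwise the text comes out garbled.
3. Do not store Chinese text as ISO-8859-1. It cannot represent Chinese characters, so they fall into an "encoding black hole" and turn into ?.
4. UTF-8 is one implementation of Unicode and is a variable-length encoding:
   at least 1 byte (English letters and digits) and at most 4 bytes (emoji); Chinese characters take 3 bytes.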

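a = '我爱你中国'
# Encode a as GBK
b = a.encode('gbk')
print(type(b))  # <class 'bytes'>
print(b)
c = b'\xce\xd2\xb0\xae\xc4\xe3\xd6\xd0\xb9\xfa'
# Decode with the same character set that was used for encoding
print(c.decode('gbk'))
# If the character sets do not match, Python usually raises UnicodeDecodeError
# or produces garbled text.
try:
    print(c.decode('utf-8'))
except UnicodeDecodeError as err:
    print(err)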
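
The byte counts in point 4 and the ? behaviour in point 3 can be checked directly. This is a small sketch; the sample characters are arbitrary, and the ? result relies on passing errors='replace', since by default Python raises UnicodeEncodeError instead.

```python
# One sample character per UTF-8 length, written as escapes for portability.
samples = ['a', '\u00e9', '\u4e2d', '\U0001F600']   # a, e-acute, a Chinese character, an emoji

byte_lengths = [len(ch.encode('utf-8')) for ch in samples]
print(byte_lengths)        # [1, 2, 3, 4]

chinese = '\u4e2d\u6587'   # two Chinese characters

# ISO-8859-1 cannot represent Chinese; with errors='replace' each
# character becomes ?, and the original text cannot be recovered.
lossy = chinese.encode('iso-8859-1', errors='replace')
print(lossy)               # b'??'

# Without errors='replace', the default strict mode raises an error.
try:
    chinese.encode('iso-8859-1')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
print(strict_failed)       # True
```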

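Caesar cipher - a way of encrypting plaintext by substituting each character with a corresponding one.

"""
Caesar cipher - a way of encrypting plaintext by substituting each character with a corresponding one.

abcdefghijklmnopqrstuvwxyz
defghijklmnopqrstuvwxyzabc

Plaintext:  attack at dawn.
Ciphertext: dwwdfn dw gdzq.

Symmetric encryption: encryption and decryption use the same key ---> AES.
Asymmetric encryption: encryption and decryption use different keys (public key, private key) ---> RSA ---> well suited to Internet applications.

"""

nums = 'attack at dawn'
# Build the translation table
table = str.maketrans(
    'abcdefghijklmnopqrstuvwxyz',
    'defghijklmnopqrstuvwxyzabc'
)
# Encode the text with the translation table
print(nums.translate(table))   # dwwdfn dw gdzq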
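
The same idea extends to any shift, and decryption is simply the opposite shift. The sketch below assumes a hypothetical helper `caesar(text, shift)` that handles lowercase ASCII letters only and leaves every other character untouched.

```python
import string


def caesar(text, shift):
    """Shift lowercase ASCII letters by `shift` positions; leave the rest as is."""
    letters = string.ascii_lowercase
    shift %= 26                                   # also handles negative shifts
    rotated = letters[shift:] + letters[:shift]   # e.g. shift 3 -> 'defg...abc'
    return text.translate(str.maketrans(letters, rotated))


cipher = caesar('attack at dawn', 3)
print(cipher)               # dwwdfn dw gdzq
plain = caesar(cipher, -3)  # shifting back decrypts
print(plain)                # attack at dawn
```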
