Python Basics, Chapter 10: Strings

Latest recommended article published on 2024-08-17 19:55:25

记得加;

Article link: https://blog.csdn.net/qq_45142203/article/details/131870396

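Definition

In Python, a string is a built-in basic data type: an immutable sequence of characters.
A string literal can be written with single quotes ', double quotes ", or triple quotes ''' (or """).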

a='python'
b="python"
c='''python'''
print(a,id(a))        #python 2633500350704
print(b,id(b))        #python 2633500350704
print(c,id(c))        #python 2633500350704
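
To make "immutable" concrete, here is a minimal sketch. The variable names are only illustrative. Assigning to a single position fails, so a changed string always has to be built as a new object.

```python
s = 'python'

try:
    s[0] = 'P'          # strings do not support item assignment
except TypeError as exc:
    error = str(exc)
    print('TypeError:', error)

# Build a new string instead; the original is left untouched.
capitalized = 'P' + s[1:]
print(s, capitalized)   # python Python
```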

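String interning

Interning keeps only one copy of identical immutable strings: when an equal string is created later, no new memory is allocated and the address of the existing string is assigned to the new variable. This is a CPython implementation detail, not a language guarantee.
Cases in which interning happens (observed in interactive mode; when a whole file is run, for example from PyCharm, the file is compiled in one go and equal string constants are usually shared, so the results can differ):

Strings of length 0 or 1;
Strings that look like identifiers (only letters, digits and underscores);
Strings are interned only at compile time, not at run time;
Integers in [-5, 256] are cached in a similar way (this is the small-integer cache rather than string interning).

The intern function in the sys module forces two equal strings to refer to the same object: a=sys.intern(b)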
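
A small sketch of sys.intern, assuming CPython. The two strings are built at run time so that the compiler cannot share them as constants. Whether they are the same object before interning is implementation-dependent, so only the result after sys.intern is shown.

```python
import sys

# Build two equal strings at run time.
parts = ['py', 'thon']
a = ''.join(parts)
b = ''.join(parts)

print(a == b)   # True: same characters

# sys.intern returns the single canonical copy for a given string value.
a = sys.intern(a)
b = sys.intern(b)
print(a is b)   # True
```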

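Common string operations

Searching

**index()** returns the position of the first occurrence of the substring substr; if the substring is not found, it raises ValueError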

a='hello,hello'
print(a.index('e'))          #1
# print(a.index('a'))        #ValueError: substring not found

**rindex()** returns the position of the last occurrence of the substring substr; if the substring is not found, it raises ValueError

print(a.rindex('e'))         #7
# print(a.rindex('a'))         #ValueError: substring not found

**find()** returns the position of the first occurrence of the substring substr; if the substring is not found, it returns -1

print(a.find('he'))          #0
print(a.find('a'))          #-1

**rfind()** returns the position of the last occurrence of the substring substr; if the substring is not found, it returns -1

print(a.rfind('he'))         #6
print(a.rfind('a'))         #-1
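
Because find() signals "not found" with -1 instead of an exception, it fits well in a loop. The helper below, find_all, is a hypothetical name used for illustration only; it collects every non-overlapping position by passing a start offset as find()'s second argument.

```python
def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub."""
    if not sub:
        raise ValueError('sub must not be empty')
    positions = []
    start = 0
    while True:
        pos = text.find(sub, start)   # search from `start` onwards
        if pos == -1:                 # no further occurrence
            break
        positions.append(pos)
        start = pos + len(sub)
    return positions

print(find_all('hello,hello', 'he'))   # [0, 6]
print(find_all('hello,hello', 'l'))    # [2, 3, 8, 9]
print(find_all('hello,hello', 'a'))    # []
```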

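Case conversion

(A string is an immutable sequence, so each operation returns a new string object.)
**upper()** converts every character in the string to uppercase

s='hEllo,woRld'
print('Original string:',s,id(s))
s1=s.upper()
print('All characters uppercase:',s1,id(s1))

**lower()** converts every character in the string to lowercase

s2=s.lower()
print('All characters lowercase:',s2)

**swapcase()** converts every uppercase letter to lowercase and every lowercase letter to uppercase

s3=s.swapcase()
print('Case swapped:',s3)

**capitalize()** converts the first character to uppercase and all remaining characters to lowercase

s4=s.capitalize()
print('First character uppercase:',s4)            #First character uppercase: Hello,world

**title()** converts the first character of each word to uppercase and the remaining characters of each word to lowercase

s5=s.title()
print('Each word capitalized:',s5)         #Each word capitalized: Hello,World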
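
One common use of these methods is comparing text without regard to case. The function same_ignoring_case below is a hypothetical helper, sketched here with lower(). It also shows that the original string is left unchanged.

```python
def same_ignoring_case(left, right):
    """Compare two strings after converting both to lowercase."""
    return left.lower() == right.lower()

s = 'hEllo,woRld'
print(same_ignoring_case(s, 'HELLO,WORLD'))   # True
print(same_ignoring_case(s, 'hello world'))   # False
print(s)                                      # hEllo,woRld
```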

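Alignment

**center()** centers the string. The first argument is the width and the second is the fill character (optional, a space by default). If the width is not greater than the string's length, the original string is returned

t='hello,world'            #the original string is 11 characters wide
print(t.center(20,'*'))     #****hello,world*****
print(t.center(10))         #hello,world (original string returned)

**ljust()** left-aligns the string. The first argument is the width and the second is the fill character (optional, a space by default). If the width is not greater than the string's length, the original string is returned

print(t.ljust(20,'*'))     #hello,world*********
print(t.ljust(5,'*'))      #hello,world

**rjust()** right-aligns the string. The first argument is the width and the second is the fill character (optional, a space by default). If the width is not greater than the string's length, the original string is returned

print(t.rjust(20))        #         hello,world
print(t.rjust(1))         #hello,world

**zfill()** right-aligns the string and pads it on the left with 0. It takes a single argument, the width; if the width is less than or equal to the string's length, the original string is returned. A leading sign (+ or -) stays in front of the padding

print(t.zfill(20))        #000000000hello,world
print(t.zfill(3))         #hello,world
print('-987'.zfill(8))      #-0000987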
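
As a rough illustration of how the alignment methods combine, the sketch below lays out a two-column text table. The item names and quantities are made-up sample data.

```python
items = [('apple', 3), ('watermelon', 12)]   # made-up sample data

# Names are left-aligned in 12 columns, quantities right-aligned in 4.
lines = [name.ljust(12, '.') + str(qty).rjust(4) for name, qty in items]
header = 'INVENTORY'.center(16, '=')

print(header)
for line in lines:
    print(line)
# apple.......   3
# watermelon..  12
```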

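Splitting

split()

Splits the string starting from the left. By default it splits on runs of whitespace, and the return value is always a list;
The sep parameter specifies the separator to split on;
The maxsplit parameter specifies the maximum number of splits; once that many splits have been made, the rest of the string is kept as a single final element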

ss1='hello world python'
print(ss1.split())                    #['hello', 'world', 'python']

ss2='hello|world|python'
print(ss2.split())                          #['hello|world|python']
print(ss2.split(sep='|'))                   #['hello', 'world', 'python']

print(ss2.split(sep='|',maxsplit=1))         #['hello', 'world|python']

rsplit()
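Splits the string starting from the right. By default it splits on runs of whitespace, and the return value is always a list;
The sep parameter specifies the separator to split on;
The maxsplit parameter specifies the maximum number of splits; once that many splits have been made, the rest of the string is kept as a single first element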

print(ss1.rsplit())                            #['hello', 'world', 'python']
print(ss2.rsplit(sep='|'))                     #['hello', 'world', 'python']
print(ss2.rsplit(sep='|',maxsplit=1))          #['hello|world', 'python']
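
The maxsplit parameter is most useful when the separator can also appear in the part you want to keep whole. A short sketch with made-up input strings follows. The last two lines also show that the default whitespace split differs from an explicit sep=' '.

```python
# Split only at the first '=' so the value may itself contain '='.
setting = 'path=/usr/bin:/opt/a=b'
key, value = setting.split('=', maxsplit=1)
print(key, value)          # path /usr/bin:/opt/a=b

# Split only at the last '.' to separate the final extension.
filename = 'backup.2024.tar.gz'
stem, ext = filename.rsplit('.', maxsplit=1)
print(stem, ext)           # backup.2024.tar gz

# Default splitting collapses runs of whitespace; sep=' ' does not.
spaced = 'a  b'
print(spaced.split())      # ['a', 'b']
print(spaced.split(' '))   # ['a', '', 'b']
```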

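String test methods

**isidentifier()** checks whether the string is a valid identifier
**isspace()** checks whether the string consists only of whitespace characters (space, carriage return, newline, horizontal tab, and so on)
**isalpha()** checks whether the string consists only of letters (any Unicode letter, including Chinese characters)
**isdecimal()** checks whether the string consists only of decimal digits
**isnumeric()** checks whether the string consists only of numeric characters (this also covers characters such as Chinese numerals)
**isalnum()** checks whether the string consists only of letters and digits

i1='1!df'
print(i1.isidentifier())           #False
i2=' \t'
print(i2.isspace())                #True
print(i1.isalpha())              #False
print('Is 张三 alphabetic?','张三'.isalpha())       #Is 张三 alphabetic? True
i3='245'
print(i3.isdecimal())            #True
print(i3.isnumeric())            #True
print('22一'.isnumeric())       #True
print(i3.isalnum())              #True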
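
The difference between isdecimal() and isnumeric() matters when the text is going to be passed to int(). The helper to_int below is a hypothetical name, sketched on the assumption that only plain unsigned digits should be accepted.

```python
def to_int(text, default=None):
    """Convert text to int only if it consists solely of decimal digits."""
    return int(text) if text.isdecimal() else default

print(to_int('245'))            # 245
print('22一'.isnumeric())        # True, yet int('22一') would raise ValueError
print(to_int('22一'))            # None
print(to_int('-3'))             # None: the sign is not a digit
print(to_int(''))               # None: an empty string is never decimal
```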

String replacement: replace(old substring, new substring, maximum number of replacements)

Returns the string produced by the replacement; the original string is not changed

r='hello world world'
print('Before:',r,id(r))                     #Before: hello world world 2285799559040
r1=r.replace('world','python')
print('After:',r1,id(r1))                   #After: hello python python 2285799559120
r2=r.replace('world','python',1)
print('After one replacement:',r2,id(r2))                 #After one replacement: hello python world 2285799559280

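Joining strings: join()

Joins the strings in a list or tuple (or any other iterable of strings) into a single string, using the string it is called on as the separator

lst=['hello','world','python']
print('Before joining:',lst)                       #Before joining: ['hello', 'world', 'python']
print('|'.join(lst))                       #hello|world|python
print(''.join(lst))                        #helloworldpython

t=('hello','world','python')
print('Before joining',t)                         #Before joining ('hello', 'world', 'python')
print('|'.join(t))                       #hello|world|python
print(''.join(t))                        #helloworldpython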

print('#'.join('hello'))                 #h#e#l#l#o
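
join() accepts only strings, so other values have to be converted first. A minimal sketch with an illustrative list of integers:

```python
numbers = [1, 2, 3]

try:
    ','.join(numbers)              # the elements are ints, not strings
except TypeError as exc:
    print('TypeError:', exc)

# Convert each element to str before joining.
joined = ','.join(str(n) for n in numbers)
print(joined)                      # 1,2,3
print(joined.split(','))           # ['1', '2', '3']
```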

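Comparing strings

Operators: >, >=, <, <=, ==, !=

How comparison works:

Strings are compared character by character from the left. The first pair of characters that differ decides the result; if one string is a prefix of the other, the longer string is greater
Two characters are compared by their ordinal value (the Unicode code point)
The built-in function ord returns the ordinal value of a character
The built-in function chr is the inverse of ord: given an ordinal value, it returns the corresponding character

print('apple'>'app')      #True
print('apple'>'banana')   #False
print(ord('a'),ord('b'))  #97 98
print(ord('张'))           #24352

print(chr(97),chr(98))    #a b
print(chr(24352))         #张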
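
The comparison rule can be spelled out by hand. The function compare below is a hypothetical helper written only to illustrate the rule; in real code the built-in operators do this work.

```python
def compare(left, right):
    """Return -1, 0 or 1, following the order used by < and >."""
    for x, y in zip(left, right):
        if ord(x) != ord(y):
            # The first differing pair of characters decides.
            return -1 if ord(x) < ord(y) else 1
    # All shared positions are equal: the shorter string sorts first.
    return (len(left) > len(right)) - (len(left) < len(right))

print(compare('apple', 'app'))      # 1
print(compare('apple', 'banana'))   # -1
print(compare('Zebra', 'apple'))    # -1, because ord('Z') is 90 and ord('a') is 97
print(compare('python', 'python'))  # 0
```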

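Difference between == and is:

== compares values
is compares identity (id)

a=b='python'
c='python'
print(a==c)        #True
print(a is c)      #True (the equal literals share one object because of interning)
print(id(a))      #2237814740336
print(id(b))      #2237814740336
print(id(c))      #2237814740336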
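
The sketch below separates the two checks, using illustrative variable names. A plain assignment always yields the same object. An equal string built at run time always has the same value, but whether it is the same object depends on the implementation, so no output is claimed for that line.

```python
a = 'python'
b = a                               # b refers to the very same object
c = ''.join(['py', 'thon'])         # equal value, built at run time

print(a is b)                       # True
print(id(a) == id(b))               # True: `is` is an id comparison
print(a == c)                       # True: same characters
print(a is c)                       # implementation-dependent
```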

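String slicing [start:stop:step]

Strings are immutable, so they cannot be added to, deleted from, or modified in place; a slice operation produces a new object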

s='hello,world'
s1=s[:5]
print(s1)          #hello
s2=s[6:] 
print(s2)          #world
s3='!'
new_str=s1+s3+s2
print(new_str)          #hello!world

print('s',id(s))            #s 1444329183600
print('s1',id(s1))          #s1 1444330083440
print('s2',id(s2))          #s2 1444330083568
print('s3',id(s3))          #s3 140719491942656
print('new_str',id(new_str))         #new_str 1444330083632
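
The step part of [start:stop:step] and negative indices are not exercised above, so here is a brief sketch on the same sample string:

```python
s = 'hello,world'

print(s[::2])     # hlowrd       every second character
print(s[::-1])    # dlrow,olleh  a negative step walks backwards
print(s[-5:])     # world        negative indices count from the end
print(s[2:100])   # llo,world    out-of-range bounds are clipped
```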

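String formatting

% as a placeholder: 'format string containing % placeholders' % (values)

%s string
%i or %d integer
%f floating-point number

name='张三'
age=18
print('My name is %s and I am %d years old.' % (name,age))

{} as a placeholder: 'format string containing {0}, {1}, ...'.format(values)

print('I am {0} and I am {1} years old.'.format(name,age))

f-string: f'format string containing {variable name}'

print(f'My name is {name} and I am {age} years old.')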

Width and precision

'%width.precisionf' % value (or '%widthd' % value for an integer)

print('%d' % 99)             #99
print('%10d' % 99)           #        99

print('%f' % 3.1415926)         #3.141593
print('%.2f' % 3.1415926)       #3.14

print('%10.2f' % 3.1415926)     #      3.14

'{0:width.precisionf}'.format(value)

Without the f type, the precision is the total number of significant digits

print('{0}'.format(3.1425926))          #3.1425926
print('{0:.2}'.format(3.1425926))       #3.1               #.2 means 2 significant digits in total
print('{0:.2f}'.format(3.1425926))      #3.14
print('{0:10.2f}'.format(3.1425926))    #      3.14
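
The same width and precision syntax works inside an f-string, after a colon. A short sketch with illustrative variable names follows. It also uses the < alignment flag and zero padding from the format specification.

```python
pi = 3.1415926

right = f'{pi:10.2f}'      # width 10, 2 decimal places, right-aligned
left = f'{pi:<10.2f}|'     # '<' left-aligns within the width
padded = f'{99:010d}'      # a leading 0 pads the integer with zeros
digits = f'{pi:.2}'        # no type: 2 significant digits

print(right)    #      3.14
print(left)     # 3.14      |
print(padded)   # 0000000099
print(digits)   # 3.1
```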

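Encoding and decoding strings

Encoding: variable.encode(encoding='GBK') or variable.encode(encoding='UTF-8')

Converts a string into binary data (bytes)

s='解铃还须系铃人'

s1=s.encode(encoding='GBK')           #in GBK, one Chinese character takes two bytes
print(s1)
s2=s.encode(encoding='UTF-8')         #in UTF-8, a common Chinese character takes three bytes
print(s2)

Decoding: variable.decode(encoding='GBK') or variable.decode(encoding='UTF-8')

Converts bytes data back into a string; the encoding used for decoding must match the one used for encoding

print(s1.decode(encoding='GBK'))
print(s2.decode(encoding='UTF-8'))
# print(s1.decode(encoding='UTF-8'))    #UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbd in position 0: invalid start byte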
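
The byte counts in the comments above can be checked with len(). The sketch below also shows one way to handle a mismatched encoding: decode()'s errors parameter, here set to 'replace', substitutes the replacement character U+FFFD instead of raising. Variable names are illustrative.

```python
s = '解铃还须系铃人'

gbk = s.encode('GBK')
utf8 = s.encode('UTF-8')
print(len(s), len(gbk), len(utf8))   # 7 14 21

# Decoding with the wrong codec raises UnicodeDecodeError by default.
try:
    gbk.decode('UTF-8')
    failed = False
except UnicodeDecodeError:
    failed = True
print(failed)                        # True

# errors='replace' keeps going and marks undecodable bytes with U+FFFD.
lossy = gbk.decode('UTF-8', errors='replace')
print('\ufffd' in lossy)             # True
```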