Python String Processing

J.Reno

Article link: https://blog.csdn.net/jreno/article/details/94639489

Python syntax style

>>> x = y = 10
>>> a, b = 10, 20
>>> x, y = (100, 200)
>>> m, n = [1, 2]
>>> a, b = b, a   # swap the values of a and b

# Python keywords
>>> import keyword
>>> keyword.kwlist
>>> 'pass' in keyword.kwlist
True
>>> keyword.iskeyword('pass')
True

Built-in functions are not keywords, so their names can be rebound, but it is best not to shadow them.

Built-in functions: https://docs.python.org/zh-cn/3/library/functions.html
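To make the difference between keywords and built-in names concrete, here is a minimal sketch. The function name shadow_demo is hypothetical and exists only for illustration; it shows what goes wrong once a built-in name such as len is rebound.

```python
import builtins
import keyword

# 'len' is a built-in name, not a keyword, so Python allows rebinding it.
is_kw = keyword.iskeyword('len')         # False
is_builtin = hasattr(builtins, 'len')    # True


def shadow_demo():
    # Hypothetical function for illustration: the local name hides the built-in.
    len = 3
    try:
        return len('abc')                # tries to call the integer 3
    except TypeError as exc:
        return type(exc).__name__


result = shadow_demo()
print(is_kw, is_builtin, result)
```

The shadowing only lasts for the scope in which the name was rebound, which is why such bugs can be hard to spot in longer functions.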

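Valid identifiers

Python's identifier rules are similar to those of most other high-level languages written in C
The first character must be a letter or an underscore (_)
The remaining characters may be letters, digits, or underscores
Identifiers are case-sensitive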
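The rules above can be checked from code. This is a small sketch, assuming a helper named is_valid_name (a hypothetical name, not part of the standard library) that combines str.isidentifier() with the keyword module. Note that Python 3 also accepts many non-ASCII letters in identifiers.

```python
import keyword


def is_valid_name(name):
    # Hypothetical helper: a usable variable name must be an identifier
    # and must not be a reserved keyword.
    return name.isidentifier() and not keyword.iskeyword(name)


candidates = ['_count', 'user2', '2user', 'my-var', 'pass']
checks = {name: is_valid_name(name) for name in candidates}
for name, ok in checks.items():
    print(name, ok)
```

Here '2user' fails because it starts with a digit, 'my-var' fails because of the hyphen, and 'pass' fails because it is a keyword.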

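Keywords

Like other high-level languages, Python has a set of reserved words known as keywords
The keywords of any language should stay relatively stable, but because Python keeps growing and evolving, its keywords are occasionally updated
The keyword list and the iskeyword() function both live in the keyword module for easy lookup

Built-ins

Besides keywords, Python has a set of "built-in" names that are available at every level of code; these names may be set or used by the interpreter
Although built-ins are not keywords, they should be treated as "reserved by the system"

Module structure and layout

When writing programs, establish a consistent, easy-to-read structure and apply it to every file

Programming approach

First decide how the program will run: interactively or non-interactively?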

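# filename: /etc/hosts
File already exists, please try again
# filename: /tmp/abc.txt
Enter the content; type end to finish
(end to quit)> Hello World!
(end to quit)> the end.
(end to quit)> bye bye.
(end to quit)> end

Work out which features the program needs, turn each feature into a function, and write a rough skeleton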

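def get_fname():
    'Get the file name'
    pass

def get_content():
    'Get the file content and return it as a list'
    pass

def wfile(fname, content):
    'Write the content to the file'
    pass

Write the main body of the program, calling each function in turn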

if __name__ == '__main__':
    fname = get_fname()
    content = get_content()
    wfile(fname, content)

Finally, implement each function.
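One possible way to fill in the skeleton is sketched below. The read parameter is an assumption added here so that the functions can be exercised without a keyboard; it defaults to input, so calling the functions without arguments behaves like the interactive session shown earlier.

```python
import os


def get_fname(read=input):
    'Return a file name that does not exist yet'
    while True:
        fname = read('filename: ')
        if not os.path.exists(fname):
            return fname
        print('File already exists, please try again')


def get_content(read=input):
    'Collect lines until the user types end and return them as a list'
    content = []
    print('Enter the content; type end to finish')
    while True:
        line = read('(end to quit)> ')
        if line == 'end':
            return content
        # input() strips the newline, so add it back for writing
        content.append(line + '\n')


def wfile(fname, content):
    'Write the list of lines to the file'
    with open(fname, 'w') as fobj:
        fobj.writelines(content)
```

These definitions plug into the same main body as the skeleton: call get_fname(), then get_content(), then wfile(fname, content) under the `if __name__ == '__main__':` guard.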

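Sequence objects

Built-in functions


Function	Meaning
list(iter)	Convert an iterable into a list
str(obj)	Convert the object obj into a string
tuple(iter)	Convert an iterable into a tuple

len(seq): return the length of seq
enumerate(iter): take an iterable and return an enumerate object that yields (index, item) pairs
reversed(seq): take a sequence and return an iterator that visits it in reverse order
sorted(iter): take an iterable and return a new sorted list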

# list converts to a list
>>> list('hello')
['h', 'e', 'l', 'l', 'o']
>>> list(range(1, 10))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list((10, 20, 30))
[10, 20, 30]

# tuple converts to a tuple
>>> tuple('hello')
('h', 'e', 'l', 'l', 'o')
>>> tuple(range(1, 10))
(1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> tuple(['bob', 'tom', 'jerry'])
('bob', 'tom', 'jerry')

# str converts to a string
>>> str(100)
'100'
>>> str([100, 200])
'[100, 200]'

Functions commonly used with sequence objects:

>>> from random import randint
>>> num_list = [randint(1, 100) for i in range(5)]
>>> num_list
[53, 95, 37, 50, 54]

# reversed iterates in reverse order
>>> list(reversed(num_list))
[54, 50, 37, 95, 53]
>>> for i in reversed(num_list):
...     print(i)

# sorted returns a new sorted list
>>> sorted(num_list)
[37, 50, 53, 54, 95]
>>> sorted(num_list, reverse=True)   # descending order
[95, 54, 53, 50, 37]

# enumerate returns index and element pairs
>>> list(enumerate(num_list))
[(0, 53), (1, 95), (2, 37), (3, 50), (4, 54)]
>>> for data in enumerate(num_list):
...     print(data)
>>> for ind, num in enumerate(num_list):
...     print(ind, num)
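As a small supplement to the session above, the following sketch shows two optional arguments that often come in handy: the start argument of enumerate and the key argument of sorted. The list of names is made up for illustration.

```python
names = ['tom', 'Jerry', 'bob']

# enumerate accepts a start value, useful for 1-based numbering
numbered = list(enumerate(names, start=1))

# Plain sorting compares code points, so uppercase letters come first
default_order = sorted(names)

# A key function changes the comparison; str.lower ignores case
by_name = sorted(names, key=str.lower)

# reversed returns an iterator, so join (or list) is needed to see the result
backwards = ''.join(reversed('abc'))

print(numbered)
print(default_order)
print(by_name)
print(backwards)
```

sorted always returns a new list, so names itself keeps its original order.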

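Strings in detail

Sequences


Sequence operator	Effect
seq[ind]	Get the element at index ind
seq[ind1:ind2]	Get the elements from index ind1 up to, but not including, ind2
seq * expr	Repeat the sequence expr times
seq1 + seq2	Concatenate the sequences seq1 and seq2
obj in seq	Test whether obj is contained in seq
obj not in seq	Test whether obj is not contained in seq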
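The operators in the table apply to strings, lists, and tuples alike. A brief sketch with made-up values:

```python
s = 'python'
nums = [10, 20, 30]

first = s[0]               # indexing starts at 0
last = s[-1]               # negative indexes count from the end
middle = s[2:4]            # the slice stops before index 4
line = '-' * 10            # repetition
joined = nums + [40]       # concatenation builds a new list
has_th = 'th' in s         # for strings, in tests for a substring
missing = 50 not in nums

print(first, last, middle, line, joined, has_th, missing)
```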

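Character encodings:

ASCII (American Standard Code for Information Interchange).

ISO-8859-1 (Latin1): a character encoding commonly used in Western Europe

Character encodings used in China: GB2312 / GBK / GB18030

Unicode is the universal character set, maintained by the Unicode Consortium and kept in sync with the ISO/IEC 10646 standard. UTF-8 is one of its encodings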
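In Python 3, a str holds Unicode characters and an encoding only comes into play when converting to or from bytes. The sketch below encodes the same character with different encodings; the sample character is chosen only for illustration.

```python
text = '中'

utf8_bytes = text.encode('utf-8')    # three bytes in UTF-8
gbk_bytes = text.encode('gbk')       # two bytes in GBK

# Decoding with the matching encoding restores the original string
roundtrip = gbk_bytes.decode('gbk')

# ASCII cannot represent the character at all
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(utf8_bytes, gbk_bytes, roundtrip, ascii_ok)
```

Decoding bytes with the wrong encoding either raises UnicodeDecodeError or produces garbled text, so the encoding used for writing a file needs to be known when reading it back.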

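String formatting


Format character	Conversion
%c	Convert to a single character (accepts an integer code point or a one-character string)
%s	Convert to a string using str()
%d / %i	Convert to a signed decimal integer
%o	Convert to an octal integer
%e / %E	Convert to scientific (exponential) notation
%f / %F	Convert to a decimal floating-point number

A format specifier can also carry auxiliary flags that adjust the output:


Flag	Effect
*	Take the width or precision from the argument tuple
-	Left-align
+	Show a plus sign before positive numbers
<sp>	Show a space before positive numbers
#	Prefix octal numbers with '0o' and hexadecimal numbers with '0x' or '0X'
0	Pad numbers with zeros instead of the default spaces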
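The flags in the table are easiest to understand side by side. A short sketch with arbitrary sample values:

```python
star_width = '%*d' % (5, 42)          # width 5 comes from the tuple
star_prec = '%.*f' % (2, 3.14159)     # precision 2 comes from the tuple
left = '%-6d|' % 42                   # left-aligned in 6 columns
plus = '%+d' % 42                     # explicit plus sign
space = '% d' % 42                    # space in place of the sign
zero = '%05d' % 42                    # zero padding
hex_prefix = '%#x' % 255              # alternate form adds the prefix
char = '%c' % 65                      # integer code point to character

for item in (star_width, star_prec, left, plus, space, zero, hex_prefix, char):
    print(repr(item))
```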

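The format method

Using positional arguments

'my name is {}, age {}'.format('hoho', 18)

Using keyword arguments

'my name is {name}, age is {age}'.format(name='bob', age=23)

A dict can be unpacked into keyword arguments with **:

'my name is {name}, age is {age}'.format(**{'name': 'bob', 'age': 23})

Fill and alignment

{:[fill character][alignment <^>][width]}

Using indexes

'name is {0[0]} age is {0[1]}'.format(['bob', 23])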

>>> '%s is %s years old' % ('tom', 20)
'tom is 20 years old'
>>> '%s is %d years old' % ('tom', 20)
'tom is 20 years old'
>>> '%10s%8s' % ('name', 'age')   # first column is 10 wide, second is 8
'      name     age'
>>> '%10s%8s' % ('tom', 20)
'       tom      20'
>>> '%-10s%-8s' % ('name', 'age')
'name      age     '
>>> '%-10s%-8s' % ('tom', 20)
'tom       20      '

>>> '%d' % (5 / 3)
'1'
>>> '%f' % (5 / 3)   # floating-point number
'1.666667'
>>> '%.2f' % (5 / 3)   # keep two decimal places
'1.67'
>>> '%6.2f' % (5 / 3)   # total width 6, two decimal places
'  1.67'
>>> '%#o' % 10   # octal
'0o12'
>>> '%#x' % 10    # hexadecimal
'0xa'
>>> '%e' % 128000   # scientific notation
'1.280000e+05'

Formatting with the string format method (for reference)

>>> '{} is {} years old'.format('tom', 20)
'tom is 20 years old'
>>> '{} is {} years old'.format(20, 'tom')
'20 is tom years old'
>>> '{1} is {0} years old'.format(20, 'tom')
'tom is 20 years old'
>>> '{0[1]} is {0[0]} years old'.format([20, 'tom'])
'tom is 20 years old'
>>> '{:<10}{:<8}'.format('tom', 20)  # left-aligned, widths 10 and 8
'tom       20      '
>>> '{:>10}{:>8}'.format('tom', 20)   # right-aligned
'       tom      20'
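The session above covers positional arguments, indexes, and alignment. The remaining pieces of the format spec (fill character, centering, precision) and keyword arguments can be sketched as follows, with made-up values:

```python
centered = '{:*^11}'.format('hello')      # fill with *, center in 11 columns
zero_padded = '{:0>5}'.format(42)         # fill with 0, right-align in 5 columns
two_places = '{:.2f}'.format(5 / 3)       # precision works as with %.2f
by_keyword = '{name} is {age} years old'.format(name='bob', age=23)

info = {'name': 'tom', 'age': 20}
from_dict = '{name} is {age} years old'.format(**info)   # unpack a dict

for item in (centered, zero_padded, two_places, by_keyword, from_dict):
    print(item)
```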

Raw strings

>>> win_path = 'c:\temp'
>>> print(win_path)   # \t is interpreted as a tab
c:	emp
>>> win_path = 'c:\\temp'   # \\ stands for a single literal \
>>> print(win_path)
c:\temp
>>> wpath = r'c:\temp\new'  # raw string: every character is taken literally
>>> print(wpath)
c:\temp\new
>>> wpath
'c:\\temp\\new'
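Raw strings matter beyond Windows paths. A common case, sketched here as an illustration rather than something taken from the session above, is regular expressions, where backslashes should reach the re module untouched:

```python
import re

plain = 'c:\temp'     # \t becomes one tab character
raw = r'c:\temp'      # backslash and t stay as two characters

plain_len = len(plain)
raw_len = len(raw)

# In a raw string, \d reaches the regex engine exactly as written
numbers = re.findall(r'\d+', 'tom 20 bob 23')

print(plain_len, raw_len, numbers)
```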

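String methods

Built-in methods

string.capitalize(): return a copy with the first character upper-cased and the rest lower-cased
string.center(width): return a new string of length width with the original centered, padded with spaces by default
string.count(str, beg=0, end=len(string)): return the number of times str occurs in string; if beg or end is given, count only within that range
string.endswith(obj, beg=0, end=len(string)): return True if string ends with obj, otherwise False; if beg or end is given, check only within that range
string.islower(): return True if string contains at least one cased character and all cased characters are lowercase, otherwise False
string.strip(): remove whitespace from both ends of string
string.upper(): convert the lowercase letters in string to uppercase
string.split(sep=None, maxsplit=-1): split string using sep as the delimiter (any run of whitespace when sep is omitted); if maxsplit is given, at most maxsplit splits are made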

# strip whitespace from both ends
>>> ' \thello world!\n'.strip()
'hello world!'
# strip whitespace from the left end
>>> ' \thello world!\n'.lstrip()
'hello world!\n'
# strip whitespace from the right end
>>> ' \thello world!\n'.rstrip()
' \thello world!'

>>> hi = 'hello world'
>>> hi.upper()   # convert lowercase letters to uppercase
'HELLO WORLD'
>>> 'HELLO WORLD'.lower()   # convert uppercase letters to lowercase
'hello world'
>>> hi.center(30)   # center
'         hello world          '
>>> hi.center(30, '*')
'*********hello world**********'
>>> hi.ljust(30)
'hello world                   '
>>> hi.ljust(30, '#')
'hello world###################'
>>> hi.rjust(30, '@')
'@@@@@@@@@@@@@@@@@@@hello world'
>>> hi.startswith('h')   # does the string start with h?
True
>>> hi.startswith('he')
True
>>> hi.endswith('o')   # does the string end with o?
False
>>> hi.replace('l', 'm')   # replace every l with m
'hemmo wormd'
>>> hi.replace('ll', 'm')
'hemo world'
>>> hi.split()   # splits on whitespace by default
['hello', 'world']
>>> 'hello.tar.gz'.split('.')    # split using a dot as the separator
['hello', 'tar', 'gz']
>>> str_list = ['hello', 'tar', 'gz']
>>> '.'.join(str_list)   # join the strings with a dot as the separator
'hello.tar.gz'
>>> '-'.join(str_list)
'hello-tar-gz'
>>> ''.join(str_list)
'hellotargz'
>>> hi.islower()   # are all the letters in the string lowercase?
True
>>> 'Hao123'.isupper()   # are all the letters in the string uppercase?
False
>>> 'hao123'.isdigit()   # are all the characters digits?
False
>>> '123'.isdigit()
True
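A few of the methods listed earlier (capitalize, count with a range, and split with a maximum number of splits) do not appear in the session above. A brief sketch with sample strings:

```python
hi = 'hello world'

capitalized = hi.capitalize()            # only the first character is upper-cased
total_l = hi.count('l')                  # occurrences in the whole string
l_in_hello = hi.count('l', 0, 5)         # occurrences within hi[0:5]

fname = 'hello.tar.gz'
split_once = fname.split('.', 1)         # at most one split, from the left
rsplit_once = fname.rsplit('.', 1)       # at most one split, from the right
is_gz = fname.endswith('.gz')

print(capitalized, total_l, l_in_hello, split_once, rsplit_once, is_gz)
```

rsplit is not in the list above but is a standard str method; it is handy for separating a file name from its last extension.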

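Write a program that creates a user
Prompt for a user name
Generate a random 8-character password
Create the user and set the password
Write the user's details to a given file

import random
import string
import subprocess

passStr = string.ascii_letters + string.digits


def get_ranpass(n=8):
    'Return a random password made of n distinct characters'
    # random.sample never picks the same position twice, so no character repeats
    return ''.join(random.sample(passStr, n))


def get_username():
    'Prompt until the entered user name does not exist yet'
    while True:
        username = input('Enter the user name to create: ')
        # id exits with a non-zero status when the user does not exist;
        # passing a list avoids running the input through a shell
        result = subprocess.run(['id', username],
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        if result.returncode != 0:
            break
        print('That user already exists, please try again!')
    return username


def create_user(name, passwd):
    'Create the user, set the password, and record both in /my/user'
    subprocess.run(['useradd', name], check=True)
    # passwd --stdin is available on RHEL/CentOS-style systems
    subprocess.run(['passwd', '--stdin', name],
                   input=(passwd + '\n').encode(),
                   stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL,
                   check=True)
    with open('/my/user', 'a') as f:
        f.write('u:%s\tp:%s\n' % (name, passwd))
    print('User created; the user name and password are saved in /my/user')


if __name__ == '__main__':
    name = get_username()
    passwd = get_ranpass()
    create_user(name, passwd)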
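The random module is not intended for security-sensitive values. As a possible variation, not part of the original exercise, the password could be drawn with the standard secrets module instead. The names ALPHABET and get_secure_pass below are hypothetical, and this sketch allows repeated characters, unlike get_ranpass.

```python
import secrets
import string

# Hypothetical name for the same character set used above
ALPHABET = string.ascii_letters + string.digits


def get_secure_pass(n=8):
    # Hypothetical alternative to get_ranpass: secrets.choice draws from
    # the operating system's cryptographically secure random source
    return ''.join(secrets.choice(ALPHABET) for _ in range(n))


sample = get_secure_pass()
print(len(sample))
```

Allowing repeats keeps every position independent, which gives a larger set of possible passwords than forcing all characters to differ.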