day5_Common Modules

Ali--

Article link: https://blog.csdn.net/liyanan8514/article/details/78630135

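Modules

Definition: a module is a way of logically organizing Python code (variables, functions, classes, and logic that together implement some functionality). In essence it is a Python file ending in .py (file name: test.py, corresponding module name: test).
Definition of a "package": a way of logically organizing modules. In essence it is a directory (a regular package must contain an __init__.py file).

Ways to import a module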

import module_name
import module1_name,module2_name
from module_alex import *
from module_alex import m1,m2,m3
from module_alex import logger as logger_alex

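What importing does
Importing a module essentially means running that Python file through the interpreter once.
Importing a package essentially means executing the __init__.py file in that package.
Module categories:
1. The standard library
2. Open-source (third-party) modules
3. Custom modules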

time module

Timestamp: generally, the offset in seconds since 00:00:00 on January 1, 1970 (UTC). Running type(time.time()) returns float.
Format string: a time rendered as formatted text.
struct_time: a tuple of nine elements: (year, month, day, hour, minute, second, day of the week, day of the year, DST flag).

time.time returns the current time as a timestamp (floating-point seconds since the 1970 epoch)

import time

print(time.time())
# Output
1507362268.5010645

time.localtime returns the local time as a struct_time object

x = time.localtime()
print(x)
# Output
time.struct_time(tm_year=2017, tm_mon=10, tm_mday=7, tm_hour=15, tm_min=49, tm_sec=4, tm_wday=5, tm_yday=280, tm_isdst=0)

time.gmtime returns the current UTC time (Greenwich Mean Time) as a struct_time object

x = time.gmtime()
print(x)
# Output
time.struct_time(tm_year=2017, tm_mon=10, tm_mday=7, tm_hour=7, tm_min=51, tm_sec=37, tm_wday=5, tm_yday=280, tm_isdst=0)

time.asctime returns the time as a readable string

print(time.asctime())
# Output
Sat Oct  7 15:54:05 2017

time.strptime parses a formatted time string into a struct_time object

print(time.strptime("2017-10-10 18:18","%Y-%m-%d %H:%M"))
# Output
time.struct_time(tm_year=2017, tm_mon=10, tm_mday=10, tm_hour=18, tm_min=18, tm_sec=0, tm_wday=1, tm_yday=283, tm_isdst=-1)

time.mktime converts a struct_time object (interpreted as local time) into a timestamp

x = time.strptime("2017-10-10 18:18","%Y-%m-%d %H:%M")

print(time.mktime(x))
# Output
1507630680.0

time.strftime formats a time object as a string (the current local time is used when no time object is passed)

print(time.strftime("%Y-%m-%d %H:%M.log"))
# Output
2017-10-07 16:03.log

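Converting between time formats

timestamp -> struct_time: time.localtime, time.gmtime
struct_time -> timestamp: time.mktime
struct_time -> format string: time.strftime
format string -> struct_time: time.strptime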
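The conversions between the three time representations can be sketched as one round trip. This is only a minimal illustration: the fixed timestamp is the value printed by the time.mktime example, and calendar.timegm (the UTC counterpart of time.mktime) is brought in here as an assumption so that the printed text does not depend on the local time zone.

```python
import calendar
import time

# Fixed timestamp so the example is reproducible
ts = 1507630680.0

# timestamp -> struct_time (UTC)
st = time.gmtime(ts)

# struct_time -> format string
text = time.strftime("%Y-%m-%d %H:%M", st)

# format string -> struct_time
parsed = time.strptime(text, "%Y-%m-%d %H:%M")

# struct_time (read as UTC) -> timestamp
back = calendar.timegm(parsed)

# The same round trip through local time uses localtime and mktime
local_roundtrip = time.mktime(time.localtime(ts))

print(text)                   # 2017-10-10 10:18
print(back == ts)             # True
print(local_roundtrip == ts)  # True
```

time.mktime always interprets the struct_time as local time, which is why the UTC leg of the round trip goes through calendar.timegm instead.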

random module

random.random() returns a random float in the range [0.0, 1.0)

>>> random.random()
0.44309536784055825
>>> random.random()
0.9474594527425027

random.randint(a, b) returns a random integer between a and b, including both a and b

>>> random.randint(1,5)
1
>>> random.randint(1,5)
2
>>> random.randint(1,5)
5
>>> random.randint(1,5)
5

random.randrange(1, 5) returns a random integer greater than or equal to 1 and less than 5

>>> random.randrange(1,5)
1
>>> random.randrange(1,5)
4

random.choice returns one randomly chosen element of a sequence; in this example that is 1, '23', or [4, 5]

>>> print(random.choice([1,'23',[4,5]]))
[4, 5]
>>> print(random.choice([1,'23',[4,5]]))
1
>>> print(random.choice([1,'23',[4,5]]))
23

random.sample(a, b) picks b distinct elements at random from a and returns them as a list

>>> random.sample(range(10),3)
[5, 3, 7]
>>> random.sample(range(10),3)
[7, 8, 6]
>>> random.sample(range(10),3)
[4, 7, 5]

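Generating a random verification code

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#Author:liyanan

import random

checkcode = ''
for i in range(4):
    current = random.randrange(0, 4)
    if current == i:
        # Use an uppercase letter (ASCII 65-90) when the draw equals the position
        tmp = chr(random.randint(65, 90))
    else:
        # Otherwise use a digit
        tmp = random.randint(0, 9)
    checkcode += str(tmp)
print(checkcode)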

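os module

The os module is an interface for interacting with the operating system.

os.getcwd()  get the current working directory, i.e. the directory the Python script is working in
os.chdir("dirname")  change the script's current working directory; equivalent to cd in a shell
os.curdir  the string for the current directory: ('.')
os.pardir  the string for the parent of the current directory: ('..')
os.makedirs('dirname1/dirname2')    create nested directories recursively
os.removedirs('dirname1')    remove the directory if it is empty, then move up to the parent and remove it too if it is empty, and so on
os.mkdir('dirname')    create a single directory; equivalent to mkdir dirname in a shell
os.rmdir('dirname')    remove a single empty directory; raises an error if the directory is not empty; equivalent to rmdir dirname in a shell
os.listdir('dirname')    return a list of all files and subdirectories in the given directory, including hidden files
os.remove()  delete a file
os.rename("oldname","newname")  rename a file or directory
os.stat('path/filename')  get information about a file or directory
os.sep    the path separator of the operating system: "\\" on Windows, "/" on Linux
os.linesep    the line terminator of the current platform: "\r\n" on Windows, "\n" on Linux
os.pathsep    the string that separates entries in a search path: ; on Windows, : on Linux
os.name    a string naming the current platform: Windows -> 'nt'; Linux -> 'posix'
os.system("bash command")  run a shell command; its output is shown directly
os.environ  the system environment variables
os.path.abspath(path)  return the normalized absolute version of path
os.path.split(path)  split path into a (directory, file name) tuple
os.path.dirname(path)  return the directory part of path, i.e. the first element of os.path.split(path)
os.path.basename(path)  return the final file name of path, i.e. the second element of os.path.split(path); if path ends with / or \, an empty string is returned
os.path.exists(path)  return True if path exists, False otherwise
os.path.isabs(path)  return True if path is an absolute path
os.path.isfile(path)  return True if path is an existing file, False otherwise
os.path.isdir(path)  return True if path is an existing directory, False otherwise
os.path.join(path1[, path2[, ...]])  join several path components; if a component is an absolute path, all components before it are discarded
os.path.getatime(path)  return the last access time of the file or directory that path points to
os.path.getmtime(path)  return the last modification time of the file or directory that path points to
os.path.getsize(path)  return the size of path in bytes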
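A few of these calls can be tried together in a throwaway directory. The sketch below is only an illustration: the names dirname1, dirname2, and demo.txt are made up for the example, and tempfile is used so that nothing is left behind on disk.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Create nested directories in one call
    nested = os.path.join(base, "dirname1", "dirname2")
    os.makedirs(nested)

    # Create a small file to inspect
    file_path = os.path.join(nested, "demo.txt")
    with open(file_path, "w", encoding="utf-8") as f:
        f.write("hello")

    directory, filename = os.path.split(file_path)
    info = {
        "dirname_matches": os.path.dirname(file_path) == directory,
        "basename": os.path.basename(file_path),
        "is_abs": os.path.isabs(file_path),
        "is_file": os.path.isfile(file_path),
        "is_dir": os.path.isdir(nested),
        "size": os.path.getsize(file_path),
        "listing": os.listdir(nested),
    }

    # rmdir only works on an empty directory, so delete the file first
    os.remove(file_path)
    os.rmdir(nested)
    nested_exists = os.path.exists(nested)

print(info)
print(nested_exists)  # False
```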

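sys module

sys.argv           list of command-line arguments; the first element is the path of the program itself
sys.exit(n)        exit the program; use exit(0) for a normal exit
sys.version        version information of the Python interpreter
sys.maxsize        the largest value a Py_ssize_t can hold, e.g. the maximum length of a list (Python 2's sys.maxint, the largest int value, was removed in Python 3)
sys.path           the module search path; initialized from the PYTHONPATH environment variable plus installation defaults
sys.platform       the name of the operating system platform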

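shutil module

A high-level module for working with files, directories, and archives

shutil.copyfileobj(fsrc, fdst)
Copy the contents of one file object to another

import shutil

with open("f_old",'r',encoding="utf-8") as f1,\
    open("f_new","w",encoding="utf-8") as f2:
    shutil.copyfileobj(f1,f2)

shutil.copyfile(src, dst)
Copy a file; the destination file does not need to exist

shutil.copyfile('f1.log', 'f2.log')

shutil.copymode(src, dst)
Copy the permission bits only. Contents, group, and owner are unchanged; the destination file must exist

shutil.copymode('f1.log', 'f2.log')

shutil.copystat(src, dst)
Copy the status information only, including mode bits, atime, mtime, and flags; the destination file must exist

shutil.copystat('f1.log', 'f2.log')

shutil.copy(src, dst)
Copy a file together with its permissions

shutil.copy('f1.log', 'f2.log')

shutil.copy2(src, dst)
Copy a file together with its status information

shutil.copy2('f1.log', 'f2.log')

shutil.copytree(src, dst)
Recursively copy a directory tree, like cp -r. By default the destination directory must not already exist, and you need write permission on the parent directory of folder2. The ignore argument excludes matching files

shutil.copytree('folder1', 'folder2', ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))

shutil.rmtree(path[, ignore_errors[, onerror]])
Recursively delete a directory tree, like rm -fr

shutil.rmtree('folder1')

shutil.move(src, dst)
Recursively move a file or directory, like the mv command; on the same file system this is simply a rename.

shutil.move('folder1', 'folder3')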
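The calls above assume that f1.log, folder1, and so on already exist. A self-contained sketch that first creates its own files in a temporary directory might look like this; the file names keep.log, skip.pyc, and copy.log are made up for the example.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Build a small source tree: folder1/keep.log and folder1/skip.pyc
    src_dir = os.path.join(base, "folder1")
    os.makedirs(src_dir)
    for name in ("keep.log", "skip.pyc"):
        with open(os.path.join(src_dir, name), "w", encoding="utf-8") as f:
            f.write(name)

    # copy2 copies the contents plus metadata such as the modification time
    shutil.copy2(os.path.join(src_dir, "keep.log"), os.path.join(base, "copy.log"))

    # copytree copies the whole directory, skipping files that match the pattern
    dst_dir = os.path.join(base, "folder2")
    shutil.copytree(src_dir, dst_dir, ignore=shutil.ignore_patterns("*.pyc"))
    copied_names = sorted(os.listdir(dst_dir))

    # move renames folder2 to folder3, then rmtree deletes folder1 entirely
    shutil.move(dst_dir, os.path.join(base, "folder3"))
    shutil.rmtree(src_dir)
    remaining = sorted(os.listdir(base))

print(copied_names)  # ['keep.log']
print(remaining)     # ['copy.log', 'folder3']
```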

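shelve module

In the earlier json and pickle examples, each file held a single dump that was read back with a single load. What if we want to store several objects persistently and read each one back on its own? The shelve module is simpler to use than pickle: it has a single open function that returns a dictionary-like object that can be read and written. Keys must be strings, while values can be any Python object that pickle can handle.

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#Author:liyanan

import shelve

# Assumes 'test', 'info', and 'func' were stored in this shelf by an earlier run
with shelve.open("info.txt") as f:
    print(f['test'])
    print(f['info'])
    print(f["func"]("li",24))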
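The snippet above only reads from the shelf. A minimal sketch of the write side, followed by reading the values back, could look like the following. The stored values are invented for the example, and the function entry is left out because pickle stores functions by reference, so they can only be loaded where the same function is importable.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "info")

    # Write: each key is stored independently, like entries in a dict
    with shelve.open(path) as db:
        db["test"] = [1, 2, 3]
        db["info"] = {"name": "li", "age": 24}

    # Read: reopen the shelf and fetch the values by key
    with shelve.open(path) as db:
        keys = sorted(db.keys())
        test_value = db["test"]
        info_value = db["info"]

print(keys)        # ['info', 'test']
print(test_value)  # [1, 2, 3]
print(info_value)  # {'name': 'li', 'age': 24}
```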

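XML processing module

XML is a format for exchanging data between different languages or programs, much like JSON, although JSON is simpler to use. Before JSON existed, XML was the usual choice.
The XML format: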

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

Working with XML in Python

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()
print(root.tag)

# Walk the whole XML document
for child in root:
    print(child.tag, child.attrib)
    for i in child:
        print(i.tag,i.text)
# Output
data
country {'name': 'Liechtenstein'}
rank 2
year 2008
gdppc 141100
neighbor None
neighbor None
country {'name': 'Singapore'}
rank 5
year 2011
gdppc 59900
neighbor None
country {'name': 'Panama'}
rank 69
year 2011
gdppc 13600
neighbor None
neighbor None

# Iterate over the year nodes only
for node in root.iter('year'):
    print(node.tag,node.text)
# Output
year 2008
year 2011
year 2011

Modifying and deleting XML content

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()

# Modify: add 1 to every year and mark the node as updated
for node in root.iter('year'):
    new_year = int(node.text) + 1
    node.text = str(new_year)
    node.set("updated","yes")

tree.write("xmltest.xml")


# Delete: remove every country whose rank is greater than 50
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)

tree.write('output.xml')
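To try the same operations without an xmltest.xml file on disk, the document can be parsed from a string. This is a sketch under that assumption: XML_TEXT is a shortened version of the sample document, and ET.fromstring and ET.tostring stand in for ET.parse and tree.write.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<data>
    <country name="Liechtenstein">
        <rank>2</rank>
        <year>2008</year>
    </country>
    <country name="Panama">
        <rank>69</rank>
        <year>2011</year>
    </country>
</data>"""

# fromstring returns the root element directly
root = ET.fromstring(XML_TEXT)

# Modify: bump every year by one
for year in root.iter("year"):
    year.text = str(int(year.text) + 1)

# Delete: drop countries ranked below 50th place
for country in root.findall("country"):
    if int(country.find("rank").text) > 50:
        root.remove(country)

names = [c.get("name") for c in root.findall("country")]
years = [y.text for y in root.iter("year")]

print(names)  # ['Liechtenstein']
print(years)  # ['2009']
print(ET.tostring(root, encoding="unicode"))
```

root.findall returns a list, so removing elements inside that loop does not disturb the iteration.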

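configparser module

We often need to modify configuration files, but how do you modify one such as the MySQL configuration file? configparser is the module for generating and modifying common configuration files.
Many configuration files, php.ini for example, use this format: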

[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes

[bitbucket.org]
User = hg

[topsecret.server.com]
Port = 50022
ForwardX11 = no

Generating this file with Python:

import configparser

config = configparser.ConfigParser()
config["DEFAULT"] = {'ServerAliveInterval': '45',
                     'Compression': 'yes',
                     'CompressionLevel': '9'}

config['bitbucket.org'] = {}
config['bitbucket.org']['User'] = 'hg'
config['topsecret.server.com'] = {}
topsecret = config['topsecret.server.com']
topsecret['Port'] = '50022'     # mutates the parser
topsecret['ForwardX11'] = 'no'  # same here
config['DEFAULT']['ForwardX11'] = 'yes'
with open('example.ini', 'w') as configfile:
    config.write(configfile)
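Reading a configuration back is the other half of the job. The sketch below parses the same content from a string with read_string so that it does not depend on example.ini existing; with a real file, config.read('example.ini') would be used instead.

```python
import configparser

INI_TEXT = """
[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes

[bitbucket.org]
User = hg

[topsecret.server.com]
Port = 50022
ForwardX11 = no
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

# DEFAULT is not listed as a section; it supplies fallback values
sections = config.sections()
user = config["bitbucket.org"]["User"]

# bitbucket.org has no ForwardX11 of its own, so the DEFAULT value is used
inherited = config["bitbucket.org"]["ForwardX11"]

# Values are stored as strings; getint and getboolean convert them
port = config["topsecret.server.com"].getint("Port")
forward = config["topsecret.server.com"].getboolean("ForwardX11")

print(sections)   # ['bitbucket.org', 'topsecret.server.com']
print(user)       # hg
print(inherited)  # yes
print(port)       # 50022
print(forward)    # False
```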

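hashlib module

hash: a family of algorithms. In Python 3.x, hashlib replaces the md5 and sha modules and mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512, and MD5 algorithms.
Three properties:
1. Identical content always produces the same hash, and even a small change in the content changes the hash.
2. The hash cannot be reversed to recover the content.
3. For a given algorithm, the hash has a fixed length no matter how much data is hashed.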

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#Author:liyanan

import hashlib

m = hashlib.md5()

m.update('hello'.encode('utf-8'))
print(m.hexdigest())
m.update('liyanan'.encode('utf-8'))
print(m.hexdigest())

m2=hashlib.md5()
m2.update('helloliyanan'.encode('utf-8'))
print(m2.hexdigest())
# Output
5d41402abc4b2a76b9719d911017c592
63ea30488d421ef4aa783ba96a3f3322
63ea30488d421ef4aa783ba96a3f3322

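As the code above shows, the MD5 value after the last update is the same as the MD5 value of the whole content hashed in one go. Why? Because successive update calls simply concatenate their input. Useful as these hash algorithms are, they have a weakness: a hash can be reversed by looking it up in a table of precomputed hashes of known values. It is therefore worth adding a custom key (a salt) before hashing.

# Named h256 rather than hash to avoid shadowing the built-in hash()
h256 = hashlib.sha256('898oaFs09f'.encode('utf-8'))
h256.update('liyanan'.encode('utf-8'))
print(h256.hexdigest())
# Output
0fe2abe645f87b442fa1a58508efd5368e8cc526653bfdb9ff7cc2bcaac06752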

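Python also has an hmac module, which internally processes the key and the content further before hashing:

import hmac

# digestmod is required since Python 3.8; 'md5' was the default in older versions
h = hmac.new('liyanan'.encode('utf-8'), digestmod='md5')  # key
h.update('hello'.encode('utf-8'))  # content
print(h.hexdigest())

# Output
03d7e06a679fafa86fc10d36b5fbfbd0

Note:
For the final hmac result to be the same, you must make sure that:
1. The initial key passed to hmac.new is the same.
2. No matter how many times update is called, the accumulated content is the same.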
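Both conditions can be checked directly. The following sketch uses SHA-256 as the digest, which is an assumption made for the example rather than something the text above prescribes, and compares a one-shot HMAC with one built from several update calls.

```python
import hashlib
import hmac

key = "liyanan".encode("utf-8")

# Whole message passed in one go
one_shot = hmac.new(key, "hello world".encode("utf-8"), digestmod=hashlib.sha256)

# Same key, same content, fed in two pieces
incremental = hmac.new(key, digestmod=hashlib.sha256)
incremental.update("hello ".encode("utf-8"))
incremental.update("world".encode("utf-8"))

# compare_digest is the recommended way to compare digests
same = hmac.compare_digest(one_shot.hexdigest(), incremental.hexdigest())

# A different key gives a different result for the same content
other_key = hmac.new(b"another-key", b"hello world", digestmod=hashlib.sha256)
differs = one_shot.hexdigest() != other_key.hexdigest()

print(same)     # True
print(differs)  # True
```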

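logging module

Many programs need to record logs, and the logs contain normal access records as well as errors, warnings, and other output. Python's logging module provides a standard logging interface through which you can store logs in various formats. Log messages fall into five levels: debug, info, warning, error, and critical.
Basic usage:
Of the five levels (debug, info, warning, error, critical), debug is the lowest and critical the highest. The lower the configured level, the more messages are emitted.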

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#Author:liyanan

import logging

logging.debug("logging debug")
logging.info("logging info")
logging.warning("logging warning")
logging.error("logging error")
logging.critical("logging critical")
# Output
WARNING:root:logging warning
ERROR:root:logging error
CRITICAL:root:logging critical

Note: as the output shows, the default log level (that of the root logger) is WARNING.

Writing logs to a file

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#Author:liyanan

import logging
logging.basicConfig(filename="info.txt",level=logging.INFO)  # file name and log level

logging.debug("logging debug")
logging.info("logging info")
logging.warning("logging warning")
# Output written to the file

INFO:root:logging info
WARNING:root:logging warning

Here level=logging.INFO sets the logging level to INFO, meaning that only messages at INFO level or higher are written to the file, so the debug message is not recorded. To record it as well, set the level to DEBUG, that is level=logging.DEBUG.

Adding a date format

import logging
logging.basicConfig(filename="info.txt",
                    level=logging.INFO,
                    format='%(asctime)s %(module)s:%(levelname)s %(message)s',
                    datefmt='%m/%d/%Y %I:%M:%S %p'  # %I is the 12-hour clock that goes with %p
                    )  # file name, log level, record format, and date format

logging.debug("logging debug")
logging.info("logging info")
logging.warning("logging warning")

# Output written to the file (the first two lines appear to be left over from the earlier run without a format;
# 日志写入文件 is the name of the script, which is what %(module)s prints)
INFO:root:logging info
WARNING:root:logging warning
10/08/2017 11:01:04 AM 日志写入文件:INFO logging info
10/08/2017 11:01:04 AM 日志写入文件:WARNING logging warning
10/08/2017 11:02:04 AM 日志写入文件:INFO logging info
10/08/2017 11:02:04 AM 日志写入文件:WARNING logging warning

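Fields available in format

%(name)s             name of the logger
%(levelno)s          log level as a number
%(levelname)s        log level as text
%(pathname)s         full path of the module that issued the logging call (may be unavailable)
%(filename)s         file name of the module that issued the logging call
%(module)s           name of the module that issued the logging call
%(funcName)s         name of the function that issued the logging call
%(lineno)d           line number of the statement that issued the logging call
%(created)f          time the record was created, as a UNIX-style floating-point timestamp
%(relativeCreated)d  time the record was created, in milliseconds since the logging module was loaded
%(asctime)s          time the record was created, as a string; the default format is "2003-07-08 16:49:45,896", where the part after the comma is milliseconds
%(thread)d           thread ID (may be unavailable)
%(threadName)s       thread name (may be unavailable)
%(process)d          process ID (may be unavailable)
%(message)s          the message supplied by the user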

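More complex log output
With the approaches so far, output goes either to the screen or to a log file. Can it go to both the screen and a log file at once? It certainly can, and that is what the more advanced logging setup is for.

Logging with the logging module involves four main classes:
logger: provides the interface that application code uses directly.
handler: sends the log records (created by a logger) to the appropriate destination.
filter: provides a finer-grained facility for deciding which log records to output.
formatter: determines the final output format of a log record.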
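Three of these pieces (logger, handler, formatter) are enough to send records to the screen and to a file at the same time. The sketch below is one possible setup, not the only one: the logger name demo_app, the file name access.log, and the per-handler levels are all choices made for this example.

```python
import logging
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "access.log")

    # The logger passes every record at DEBUG or above on to its handlers
    logger = logging.getLogger("demo_app")
    logger.setLevel(logging.DEBUG)

    # Screen handler: only WARNING and above
    console = logging.StreamHandler()
    console.setLevel(logging.WARNING)

    # File handler: INFO and above
    file_handler = logging.FileHandler(log_path, encoding="utf-8")
    file_handler.setLevel(logging.INFO)

    # One formatter shared by both handlers
    formatter = logging.Formatter("%(asctime)s %(name)s:%(levelname)s %(message)s")
    console.setFormatter(formatter)
    file_handler.setFormatter(formatter)

    logger.addHandler(console)
    logger.addHandler(file_handler)

    logger.debug("logging debug")      # dropped by both handlers
    logger.info("logging info")        # file only
    logger.warning("logging warning")  # screen and file

    # Detach and close the handlers before the temporary directory is removed
    logger.removeHandler(console)
    logger.removeHandler(file_handler)
    file_handler.close()

    with open(log_path, encoding="utf-8") as f:
        file_lines = f.read().splitlines()

print(len(file_lines))  # 2
```

Each handler filters by its own level, so the screen and the file can show different subsets of the same records.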

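re module

Regular expressions (REs) are essentially a tiny, highly specialized programming language embedded inside Python and made available through the re module. With it you specify the rules for the set of strings you want to match; that set might contain English sentences, e-mail addresses, TeX commands, or anything else you like. You can then ask questions such as "Does this string match the pattern?" or "Is there a match for the pattern anywhere in this string?". You can also use REs to modify a string or to split it apart in various ways.

Common regular expression symbols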

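'.'     by default matches any single character except \n; with the DOTALL flag it matches any character, including a newline
'^'     matches the start of the string; with the MULTILINE flag it also matches at the start of each line, so re.search(r"^a","\nabc\neee",flags=re.MULTILINE) finds a match
'$'     matches the end of the string; with MULTILINE, re.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group() also matches
'*'     matches the preceding character 0 or more times; re.findall("ab*","cabb3abcbbac") gives ['abb', 'ab', 'a']
'+'     matches the preceding character 1 or more times; re.findall("ab+","ab+cd+abb+bba") gives ['ab', 'abb']
'?'     matches the preceding character 0 or 1 times
'{m}'   matches the preceding character exactly m times
'{n,m}' matches the preceding character n to m times; re.findall("ab{1,3}","abb abc abbcbbb") gives ['abb', 'ab', 'abb']
'|'     matches the pattern on the left or on the right of |; re.search("abc|ABC","ABCBabcCD").group() gives 'ABC'
'(...)' grouping; re.search("(abc){2}a(123|456)c", "abcabca456c").group() gives 'abcabca456c'
'\'     escape character

[a-z]   matches a lowercase letter a-z
[A-Z]   matches an uppercase letter A-Z
[0-9]   matches a digit 0-9
'\A'    matches only at the start of the string; re.search("\Aabc","alexabc") finds no match
'\Z'    matches only at the end of the string; like $, except that $ also matches just before a trailing newline
'\d'    matches a digit 0-9
'\D'    matches a non-digit
'\w'    matches [A-Za-z0-9_]
'\W'    matches anything not in [A-Za-z0-9_]
'\s'    matches whitespace such as a space, \t, \n, \r; re.search("\s+","ab\tc1\n3").group() gives '\t'
'(?P<name>...)' named group; re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict()
gives {'province': '3714', 'city': '81', 'birthday': '1993'}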

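Common functions

re.match    match from the start of the string
re.search   find the first match anywhere in the string
re.findall  return all matches as the elements of a list
re.split    split the string, using the matches as separators
re.sub      find matches and replace them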
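The five functions can be compared side by side on one input. The sample string below is made up for the illustration.

```python
import re

text = "alex22jack23rain31jinxin50"

# match only succeeds if the pattern matches at the very start
m = re.match(r"[a-z]+", text)
no_match = re.match(r"\d+", text)

# search scans forward and returns the first match anywhere
s = re.search(r"\d+", text)

# findall collects every non-overlapping match into a list
numbers = re.findall(r"\d+", text)

# split cuts the string wherever the pattern matches
names = re.split(r"\d+", text)

# sub replaces matches; count limits how many are replaced
masked = re.sub(r"\d+", "#", text, count=2)

print(m.group())  # alex
print(no_match)   # None
print(s.group())  # 22
print(numbers)    # ['22', '23', '31', '50']
print(names)      # ['alex', 'jack', 'rain', 'jinxin', '']
print(masked)     # alex#jack#rain31jinxin50
```

The trailing empty string in the split result comes from the match at the very end of the input.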

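Backslashes
As in most programming languages, regular expressions use "\" as the escape character, which can lead to a plague of backslashes. If you need to match a literal "\" in the text, the regular expression written as an ordinary string literal needs four backslashes, "\\\\": the first pair and the second pair are each turned into a single backslash by the programming language, and the resulting two backslashes are then read by the regular expression engine as one literal backslash. Python's raw strings solve this problem nicely: the regular expression in this example can be written as r"\\". Likewise, "\\d", which matches a digit, can be written as r"\d". With raw strings you no longer need to worry about having missed a backslash, and the expressions are easier to read.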
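The equivalence of the two spellings is easy to confirm. This short sketch uses a made-up Windows-style path as the text to search.

```python
import re

# The string literal "C:\\temp" holds the 7 characters C:\temp
path = "C:\\temp"

# Ordinary literal: four backslashes in the source, two in the pattern
plain = re.search("\\\\", path)

# Raw literal: two backslashes in the source, two in the pattern
raw = re.search(r"\\", path)

# Both literals produce exactly the same pattern string
same_pattern = "\\\\" == r"\\"

# The same applies to character classes such as \d
digits = re.findall(r"\d", "a1b22")

print(len(path))                     # 7
print(plain.group() == raw.group())  # True
print(same_pattern)                  # True
print(digits)                        # ['1', '2', '2']
```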