sys模块
1 2 3 4 5 6 7 8 9 |
|
1 2 3 4 5 6 7 8 9 10 11 12 |
|
time与datetime模块
在python中,通常3种时间的表示
- 时间戳(timestamp):通常来说,时间戳表示的是从1970年1月1日00:00:00开始按秒计算的偏移量。我们运行“type(time.time())”,返回的是float类型。
- 格式化的时间字符串(Format String)
- 结构化的时间(struct_time):struct_time元组共有9个元素共九个元素:(年,月,日,时,分,秒,一年中第几周,一年中第几天,夏令时)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
|
1 2 3 4 5 6 |
|
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
random模块
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
|
os模块
os模块是与操作系统交互的一个接口
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 |
|
json & pickle序列化模块
eval内置方法可以将一个字符串转成python对象,不过,eval方法是有局限性的,对于普通的数据类型,json.loads和eval都能用,但遇到特殊类型的时候,eval就不管用了,所以eval的重点还是通常用来执行一个字符串表达式,并返回表达式的值。
一、什么是序列化
我们把对象(变量)从内存中变成可存储或传输的过程称之为序列化,在Python中叫pickling,在其他语言中也被称之为serialization,marshalling,flattening等等,都是一个意思。
二、为什么要序列化
- 持久保存状态
- 跨平台数据交互
三、什么是反序列化
把变量内容从序列化的对象重新读到内存里称之为反序列化
- 如果我们要在不同的编程语言之间传递对象,就必须把对象序列化为标准格式,比如XML,但更好的方法是序列化为JSON,因为JSON表示出来就是一个字符串,可以被所有语言读取,也可以方便地存储到磁盘或者通过网络传输。
- JSON不仅是标准格式,并且比XML更快,而且可以直接在Web页面中读取,非常方便。
- JSON表示的对象就是标准的JavaScript语言的对象,JSON和Python内置的数据类型对应如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
|
四、Pickle
Pickle的问题和所有其他编程语言特有的序列化问题一样,就是它只能用于Python,支持python所有的数据类型,有可能不同版本的Python彼此都不兼容,因此,只能用Pickle保存那些不重要的数据,不能成功地反序列化也没关系。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
|
shelve模块
shelve模块比pickle模块简单,只有一个open函数,返回类似字典的对象,可读可写;key必须为字符串,而值可以是python所支持的数据类型
1 2 3 4 5 6 7 8 |
|
xml模块
xml是实现不同语言或程序之间进行数据交换的协议,跟json差不多,但json使用起来更简单,不过,古时候,在json还没诞生的黑暗年代,大家只能选择用xml呀,至今很多传统公司如金融行业的很多系统的接口还主要是xml。
<?xml version="1.0"?>
<data>
<country name="Liechtenstein">
<rank updated="yes">2</rank>
<year>2008</year>
<gdppc>141100</gdppc>
<neighbor name="Austria" direction="E"/>
<neighbor name="Switzerland" direction="W"/>
</country>
<country name="Singapore">
<rank updated="yes">5</rank>
<year>2011</year>
<gdppc>59900</gdppc>
<neighbor name="Malaysia" direction="N"/>
</country>
<country name="Panama">
<rank updated="yes">69</rank>
<year>2011</year>
<gdppc>13600</gdppc>
<neighbor name="Costa Rica" direction="W"/>
<neighbor name="Colombia" direction="E"/>
</country>
</data>
xmltest.xml
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import xml.etree.ElementTree as ET
"""
############ 解析方式一 ############
str_xml = open('xmltest.xml', 'r').read()# 打开文件,读取XML内容
root = ET.XML(str_xml)# 将字符串解析成xml特殊对象,root代指xml文件的根节点
print(root.tag)#获取根节点的标签名
"""
############ 解析方式二 ############
tree = ET.parse("xmltest.xml")# 直接解析xml文件
root = tree.getroot()# 获取xml文件的根节点
print(root.tag)#获取根节点的标签名
# 遍历xml文档
for child in root:
print('========>', child.tag, child.attrib, child.attrib['name'])
for i in child:
print(i.tag, i.attrib, i.text) #标签 属性 文本
# 只遍历year 节点
for node in root.iter('year'):
print(node.tag, node.text)
# ---------------------------------------
import xml.etree.ElementTree as ET
tree = ET.parse("xmltest.xml")
root = tree.getroot()
# 修改
for node in root.iter('year'):
new_year = int(node.text) + 1
node.text = str(new_year)
node.set('updated', 'yes')
node.set('version', '1.0')
tree.write('test.xml')
# 删除node
for country in root.findall('country'):
rank = int(country.find('rank').text)
if rank > 50:
root.remove(country)
tree.write('output.xml')
"""
#在country内添加(append)节点year2
"""
import xml.etree.ElementTree as ET
tree = ET.parse("xmltest.xml")
root=tree.getroot()
for country in root.findall('country'):
for year in country.findall('year'):
if int(year.text) > 2000:
year2=ET.Element('year2')
year2.text='新年'
year2.attrib={'update':'yes'}
country.append(year2) #往country节点下添加子节点
tree.write('a.xml.swap')
"""
#自己创建xml文档
"""
import xml.etree.ElementTree as ET
new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml, "name", attrib={"enrolled": "yes"})
age = ET.SubElement(name, "age", attrib={"checked": "no"})
sex = ET.SubElement(name, "sex")
sex.text = '33'
name2 = ET.SubElement(new_xml, "name", attrib={"enrolled": "no"})
age = ET.SubElement(name2, "age")
age.text = '19'
et = ET.ElementTree(new_xml) # 生成文档对象
et.write("test.xml", encoding="utf-8", xml_declaration=True)
ET.dump(new_xml) # 打印生成的格式
xml操作
class Element:
"""An XML element.
This class is the reference implementation of the Element interface.
An element's length is its number of subelements. That means if you
want to check if an element is truly empty, you should check BOTH
its length AND its text attribute.
The element tag, attribute names, and attribute values can be either
bytes or strings.
*tag* is the element name. *attrib* is an optional dictionary containing
element attributes. *extra* are additional element attributes given as
keyword arguments.
Example form:
<tag attrib>text<child/>...</tag>tail
"""
当前节点的标签名
tag = None
"""The element's name."""
当前节点的属性
attrib = None
"""Dictionary of the element's attributes."""
当前节点的内容
text = None
"""
Text before first subelement. This is either a string or the value None.
Note that if there is no text, this attribute may be either
None or the empty string, depending on the parser.
"""
tail = None
"""
Text after this element's end tag, but before the next sibling element's
start tag. This is either a string or the value None. Note that if there
was no text, this attribute may be either None or an empty string,
depending on the parser.
"""
def __init__(self, tag, attrib={}, **extra):
if not isinstance(attrib, dict):
raise TypeError("attrib must be dict, not %s" % (
attrib.__class__.__name__,))
attrib = attrib.copy()
attrib.update(extra)
self.tag = tag
self.attrib = attrib
self._children = []
def __repr__(self):
return "<%s %r at %#x>" % (self.__class__.__name__, self.tag, id(self))
def makeelement(self, tag, attrib):
创建一个新节点
"""Create a new element with the same type.
*tag* is a string containing the element name.
*attrib* is a dictionary containing the element attributes.
Do not call this method, use the SubElement factory function instead.
"""
return self.__class__(tag, attrib)
def copy(self):
"""Return copy of current element.
This creates a shallow copy. Subelements will be shared with the
original tree.
"""
elem = self.makeelement(self.tag, self.attrib)
elem.text = self.text
elem.tail = self.tail
elem[:] = self
return elem
def __len__(self):
return len(self._children)
def __bool__(self):
warnings.warn(
"The behavior of this method will change in future versions. "
"Use specific 'len(elem)' or 'elem is not None' test instead.",
FutureWarning, stacklevel=2
)
return len(self._children) != 0 # emulate old behaviour, for now
def __getitem__(self, index):
return self._children[index]
def __setitem__(self, index, element):
# if isinstance(index, slice):
# for elt in element:
# assert iselement(elt)
# else:
# assert iselement(element)
self._children[index] = element
def __delitem__(self, index):
del self._children[index]
def append(self, subelement):
为当前节点追加一个子节点
"""Add *subelement* to the end of this element.
The new element will appear in document order after the last existing
subelement (or directly after the text, if it's the first subelement),
but before the end tag for this element.
"""
self._assert_is_element(subelement)
self._children.append(subelement)
def extend(self, elements):
为当前节点扩展 n 个子节点
"""Append subelements from a sequence.
*elements* is a sequence with zero or more elements.
"""
for element in elements:
self._assert_is_element(element)
self._children.extend(elements)
def insert(self, index, subelement):
在当前节点的子节点中插入某个节点,即:为当前节点创建子节点,然后插入指定位置
"""Insert *subelement* at position *index*."""
self._assert_is_element(subelement)
self._children.insert(index, subelement)
def _assert_is_element(self, e):
# Need to refer to the actual Python implementation, not the
# shadowing C implementation.
if not isinstance(e, _Element_Py):
raise TypeError('expected an Element, not %s' % type(e).__name__)
def remove(self, subelement):
在当前节点在子节点中删除某个节点
"""Remove matching subelement.
Unlike the find methods, this method compares elements based on
identity, NOT ON tag value or contents. To remove subelements by
other means, the easiest way is to use a list comprehension to
select what elements to keep, and then use slice assignment to update
the parent element.
ValueError is raised if a matching element could not be found.
"""
# assert iselement(element)
self._children.remove(subelement)
def getchildren(self):
获取所有的子节点(废弃)
"""(Deprecated) Return all subelements.
Elements are returned in document order.
"""
warnings.warn(
"This method will be removed in future versions. "
"Use 'list(elem)' or iteration over elem instead.",
DeprecationWarning, stacklevel=2
)
return self._children
def find(self, path, namespaces=None):
获取第一个寻找到的子节点
"""Find first matching element by tag name or path.
*path* is a string having either an element tag or an XPath,
*namespaces* is an optional mapping from namespace prefix to full name.
Return the first matching element, or None if no element was found.
"""
return ElementPath.find(self, path, namespaces)
def findtext(self, path, default=None, namespaces=None):
获取第一个寻找到的子节点的内容
"""Find text for first matching element by tag name or path.
*path* is a string having either an element tag or an XPath,
*default* is the value to return if the element was not found,
*namespaces* is an optional mapping from namespace prefix to full name.
Return text content of first matching element, or default value if
none was found. Note that if an element is found having no text
content, the empty string is returned.
"""
return ElementPath.findtext(self, path, default, namespaces)
def findall(self, path, namespaces=None):
获取所有的子节点
"""Find all matching subelements by tag name or path.
*path* is a string having either an element tag or an XPath,
*namespaces* is an optional mapping from namespace prefix to full name.
Returns list containing all matching elements in document order.
"""
return ElementPath.findall(self, path, namespaces)
def iterfind(self, path, namespaces=None):
获取所有指定的节点,并创建一个迭代器(可以被for循环)
"""Find all matching subelements by tag name or path.
*path* is a string having either an element tag or an XPath,
*namespaces* is an optional mapping from namespace prefix to full name.
Return an iterable yielding all matching elements in document order.
"""
return ElementPath.iterfind(self, path, namespaces)
def clear(self):
清空节点
"""Reset element.
This function removes all subelements, clears all attributes, and sets
the text and tail attributes to None.
"""
self.attrib.clear()
self._children = []
self.text = self.tail = None
def get(self, key, default=None):
获取当前节点的属性值
"""Get element attribute.
Equivalent to attrib.get, but some implementations may handle this a
bit more efficiently. *key* is what attribute to look for, and
*default* is what to return if the attribute was not found.
Returns a string containing the attribute value, or the default if
attribute was not found.
"""
return self.attrib.get(key, default)
def set(self, key, value):
为当前节点设置属性值
"""Set element attribute.
Equivalent to attrib[key] = value, but some implementations may handle
this a bit more efficiently. *key* is what attribute to set, and
*value* is the attribute value to set it to.
"""
self.attrib[key] = value
def keys(self):
获取当前节点的所有属性的 key
"""Get list of attribute names.
Names are returned in an arbitrary order, just like an ordinary
Python dict. Equivalent to attrib.keys()
"""
return self.attrib.keys()
def items(self):
获取当前节点的所有属性值,每个属性都是一个键值对
"""Get element attributes as a sequence.
The attributes are returned in arbitrary order. Equivalent to
attrib.items().
Return a list of (name, value) tuples.
"""
return self.attrib.items()
def iter(self, tag=None):
在当前节点的子孙中根据节点名称寻找所有指定的节点,并返回一个迭代器(可以被for循环)。
"""Create tree iterator.
The iterator loops over the element and all subelements in document
order, returning all elements with a matching tag.
If the tree structure is modified during iteration, new or removed
elements may or may not be included. To get a stable set, use the
list() function on the iterator, and loop over the resulting list.
*tag* is what tags to look for (default is to return all elements)
Return an iterator containing all the matching elements.
"""
if tag == "*":
tag = None
if tag is None or self.tag == tag:
yield self
for e in self._children:
yield from e.iter(tag)
# compatibility
def getiterator(self, tag=None):
# Change for a DeprecationWarning in 1.4
warnings.warn(
"This method will be removed in future versions. "
"Use 'elem.iter()' or 'list(elem.iter())' instead.",
PendingDeprecationWarning, stacklevel=2
)
return list(self.iter(tag))
def itertext(self):
在当前节点的子孙中根据节点名称寻找所有指定的节点的内容,并返回一个迭代器(可以被for循环)。
"""Create text iterator.
The iterator loops over the element and all subelements in document
order, returning all inner text.
"""
tag = self.tag
if not isinstance(tag, str) and tag is not None:
return
if self.text:
yield self.text
for e in self:
yield from e.itertext()
if e.tail:
yield e.tail
xml的语法功能
xml教程的:点击
configparser模块
configparser用于处理特定格式的文件,本质上是利用open来操作文件,主要用于配置文件分析用的
配置文件如下
# 注释1
; 注释2
[section1]
k1 = v1
k2:v2
user=egon
age=18
is_admin=true
salary=31
[section2]
k1 = v1
配置文件如下
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
|
hashlib模块
hash是一种算法(3.x里代替了md5模块和sha模块,主要提供 SHA1, SHA224, SHA256, SHA384, SHA512 ,MD5 算法),该算法接受传入的内容,经过运算得到一串hash值
一、hash值的特点是
- 只要传入的内容一样,得到的hash值必然一样
- 不能由hash值返解成内容,不应该在网络传输明文密码
- 只要使用的hash算法不变,无论校验的内容有多大,得到的hash值长度是固定的
1 2 3 4 5 6 7 8 9 10 11 12 |
|
二、添加自定义key(加盐)
以上加密算法虽然依然非常厉害,但时候存在缺陷,即:通过撞库可以反解。所以,有必要对加密算法中添加自定义key再来做加密。
1 2 3 4 5 6 7 |
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
|
三、hmac模块
python 还有一个 hmac 模块,它内部对我们创建 key 和 内容 进行进一步的处理然后再加密:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
|
shutil模块
高级的 文件、文件夹、压缩包 处理模块
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 |
|
suprocess模块
suprocess模块的:官方文档
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
|
logging模块
logging模块的:官方文档
Python的logging模块提供了通用的日志系统,可以方便第三方模块或者是应用使用。这个模块提供不同的日志级别,并可以采用不同的方式记录日志,比如文件,HTTP GET/POST,SMTP,Socket等,甚至可以自己实现具体的日志记录方式。
一、日志级别
1 2 3 4 5 6 |
|
二、默认级别为warning,默认打印到终端
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
三、为logging模块指定全局配置,针对所有logger有效,控制打印到文件中
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 |
|
四、logging模块的Formatter,Handler,Logger,Filter对象
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 |
|
五、Logger与Handler的级别(logger是第一级过滤,然后才能到handler)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
|
六、实例代码
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 |
|
collections模块
collections模块在内置数据类型(dict、list、set、tuple)的基础上,还提供了几个额外的数据类型:ChainMap、Counter、deque、defaultdict、namedtuple和OrderedDict等。
- namedtuple: 生成可以使用名字来访问元素内容的tuple子类
- deque: 双端队列,可以快速的从另外一侧追加和推出对象
- Counter: 计数器,主要用来计数
- OrderedDict: 有序字典
- defaultdict: 带有默认值的字典
一、namedtuple
- namedtuple是一个函数,它用来创建一个自定义的tuple对象,并且规定了tuple元素的个数,并可以用属性而不是索引来引用tuple的某个元素。
- 它具备tuple的不变性,又可以根据属性来引用,使用十分方便。
1 2 3 4 5 6 |
|
二、deque
- 使用list存储数据时,按索引访问元素很快,但是插入和删除元素就很慢了,因为list是线性存储,数据量大的时候,插入和删除效率很低。
- deque是为了高效实现插入和删除操作的双向列表,适合用于队列和栈
- deque除了实现list的append()和pop()外,还支持appendleft()和popleft(),这样就可以非常高效地往头部添加或删除元素。
1 2 3 4 5 |
|
三、defaultdict
- 使用dict时,如果引用的Key不存在,就会抛出KeyError。如果希望key不存在时,返回一个默认值,就可以用defaultdict
- 注意默认值是调用函数返回的,而函数在创建defaultdict对象时传入。
- 除了在Key不存在时返回默认值,defaultdict的其他行为跟dict是完全一样的。
1 2 3 4 5 |
|
四、OrderedDict
- 使用dict时,Key是无序的。在对dict做迭代时,我们无法确定Key的顺序。如果要保持Key的顺序,可以用OrderedDict
- 注意,OrderedDict的Key会按照插入的顺序排列,不是Key本身排序
- OrderedDict可以实现一个FIFO(先进先出)的dict,当容量超出限制时,先删除最早添加的Key
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
五、Counter
- Counter是一个简单的计数器,例如,统计字符出现的个数
- Counter实际上也是dict的一个子类,上面的结果可以看出,字符'g'、'm'、'r'各出现了两次,其他字符各出现了一次。
1 2 3 4 5 |
|