Chapter 11: Brief Tour of the Standard Library, Part II (Python Tutorial Translation, Part 12)

weixin_41670255


11. Brief Tour of the Standard Library, Part II

This second tour covers more advanced modules that support professional programming needs. These modules rarely occur in small scripts.

11.1. Output Formatting

The reprlib module provides a version of repr() customized for abbreviated displays of large or deeply nested containers:

>>> import reprlib
>>> reprlib.repr(set('supercalifragilisticexpialidocious'))
"{'a', 'c', 'd', 'e', 'f', 'g', ...}"

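The pprint module offers more sophisticated control over printing both built-in and user defined objects in a way that is readable by the interpreter. When the result is longer than one line, the "pretty printer" adds line breaks and indentation to reveal the data structure more clearly: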

>>> import pprint
>>> t = [[[['black', 'cyan'], 'white', ['green', 'red']], [['magenta',
...     'yellow'], 'blue']]]
...
>>> pprint.pprint(t, width=30)
[[[['black', 'cyan'],
   'white',
   ['green', 'red']],
  [['magenta', 'yellow'],
   'blue']]]
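
When the formatted text is needed as a value rather than printed, pprint.pformat() returns the same layout as a string. The sketch below reuses the nested list from the example above; the depth value of 2 is an arbitrary choice for illustration.

```python
import pprint

t = [[[['black', 'cyan'], 'white', ['green', 'red']],
      [['magenta', 'yellow'], 'blue']]]

# pformat() builds the same layout as pprint.pprint() but returns it as a str.
formatted = pprint.pformat(t, width=30)
lines = formatted.splitlines()

# depth limits how many nesting levels are shown; deeper levels become "...".
shallow = pprint.pformat(t, depth=2)

print(formatted)
print(shallow)
```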

The textwrap module formats paragraphs of text to fit a given screen width:

>>> import textwrap
>>> doc = """The wrap() method is just like fill() except that it returns
... a list of strings instead of one big string with newlines to separate
... the wrapped lines."""
...
>>> print(textwrap.fill(doc, width=40))
The wrap() method is just like fill()
except that it returns a list of strings
instead of one big string with newlines
to separate the wrapped lines.
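
The sample text itself describes the difference between wrap() and fill(). A minimal sketch of that relationship, using the same text and width as above:

```python
import textwrap

doc = ("The wrap() method is just like fill() except that it returns "
       "a list of strings instead of one big string with newlines to separate "
       "the wrapped lines.")

# wrap() returns the wrapped lines as a list of strings.
lines = textwrap.wrap(doc, width=40)
for line in lines:
    print(line)

# Joining those lines with newlines gives the same text that fill() returns.
joined = "\n".join(lines)
```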

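The locale module accesses a database of culture specific data formats. The grouping attribute of locale's formatting function provides a direct way of formatting numbers with group separators. The locale name below is the Windows spelling; other platforms typically use a name such as 'en_US.UTF-8'. Older versions of this example called locale.format(), which has been removed from recent Python releases, so locale.format_string() is used for both calls:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'English_United States.1252')
'English_United States.1252'
>>> conv = locale.localeconv()          # get a mapping of conventions
>>> x = 1234567.8
>>> locale.format_string("%d", x, grouping=True)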
'1,234,567'
>>> locale.format_string("%s%.*f", (conv['currency_symbol'],
...                      conv['frac_digits'], x), grouping=True)
'$1,234,567.80'
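
Because available locale names differ between platforms, the transcript above may not run unchanged everywhere. As a rough, locale-independent sketch (this is not the locale module; it always uses a comma as the separator and a hard-coded dollar sign), the "," option of the format specification gives similar grouping:

```python
x = 1234567.8

# The "," option inserts a comma as the thousands separator.
# It ignores the user's locale, which is a deliberate simplification here.
grouped_int = format(int(x), ',d')
grouped_money = '${:,.2f}'.format(x)

print(grouped_int)
print(grouped_money)
```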

11.2. Templating

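The string module includes a versatile Template class with a simplified syntax suitable for editing by end users. This allows users to customize their applications without having to alter the application itself.
The format uses placeholder names formed by $ followed by a valid Python identifier (alphanumeric characters and underscores). Surrounding the placeholder with braces allows it to be followed by more alphanumeric letters with no intervening spaces. Writing $$ creates a single escaped $: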

>>> from string import Template
>>> t = Template('${village}folk send $$10 to $cause.')
>>> t.substitute(village='Nottingham', cause='the ditch fund')
'Nottinghamfolk send $10 to the ditch fund.'

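The substitute() method raises a KeyError when a placeholder is not supplied in a dictionary or a keyword argument. For mail-merge style applications, user supplied data may be incomplete, and the safe_substitute() method may be more appropriate: it leaves placeholders unchanged if data is missing: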

>>> t = Template('Return the $item to $owner.')
>>> d = dict(item='unladen swallow')
>>> t.substitute(d)
Traceback (most recent call last):
  ...
KeyError: 'owner'
>>> t.safe_substitute(d)
'Return the unladen swallow to $owner.'
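
A mail-merge style use might look like the following sketch. The template text, the record fields, and the render() helper are all made up for illustration; only Template, substitute() and safe_substitute() come from the string module.

```python
from string import Template

# Hypothetical letter template and records.
letter = Template('Dear $name, your order $order_id ships on $date.')
records = [
    {'name': 'Alice', 'order_id': 'A-17', 'date': 'Monday'},
    {'name': 'Bob', 'order_id': 'B-42'},   # 'date' is missing on purpose
]

def render(record):
    """Use strict substitution, falling back to the tolerant variant."""
    try:
        return letter.substitute(record)
    except KeyError:
        # Leave the unknown placeholder in the text so the gap stays visible.
        return letter.safe_substitute(record)

messages = [render(r) for r in records]
for message in messages:
    print(message)
```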

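Template subclasses can specify a custom delimiter. For example, a batch renaming utility for a photo browser may elect to use percent signs for placeholders such as the current date, image sequence number, or file format: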

>>> import time, os.path
>>> photofiles = ['img_1074.jpg', 'img_1076.jpg', 'img_1077.jpg']
>>> class BatchRename(Template):
...     delimiter = '%'
...
>>> fmt = input('Enter rename style (%d-date %n-seqnum %f-format):  ')
Enter rename style (%d-date %n-seqnum %f-format):  Ashley_%n%f

>>> t = BatchRename(fmt)
>>> date = time.strftime('%d%b%y')
>>> for i, filename in enumerate(photofiles):
...     base, ext = os.path.splitext(filename)
...     newname = t.substitute(d=date, n=i, f=ext)
...     print('{0} --> {1}'.format(filename, newname))
...

img_1074.jpg --> Ashley_0.jpg
img_1076.jpg --> Ashley_1.jpg
img_1077.jpg --> Ashley_2.jpg

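Another application for templating is separating program logic from the details of multiple output formats. This makes it possible to substitute custom templates for XML files, plain text reports, and HTML web reports.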
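
As a small sketch of that separation, the same function below fills either a plain text or an HTML template. The template strings and the build_report() helper are hypothetical names chosen for this example.

```python
from string import Template

# Two output formats for the same data; the program logic never changes.
TEXT_REPORT = Template('$title: $count items')
HTML_REPORT = Template('<h1>$title</h1><p>$count items</p>')

def build_report(template, title, count):
    """Fill whichever template the caller selected."""
    return template.substitute(title=title, count=count)

text_version = build_report(TEXT_REPORT, 'Inventory', 3)
html_version = build_report(HTML_REPORT, 'Inventory', 3)
print(text_version)
print(html_version)
```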

11.3. Working with Binary Data Record Layouts

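The struct module provides pack() and unpack() functions for working with variable length binary record formats. The following example shows how to loop through header information in a ZIP file without using the zipfile module. Pack codes "H" and "I" represent two and four byte unsigned numbers respectively. The "<" indicates that they are standard size and in little-endian byte order: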

import struct

with open('myfile.zip', 'rb') as f:
    data = f.read()

start = 0
for i in range(3):                      # show the first 3 file headers
    start += 14
    fields = struct.unpack('<IIIHH', data[start:start+16])
    crc32, comp_size, uncomp_size, filenamesize, extra_size = fields

    start += 16
    filename = data[start:start+filenamesize]
    start += filenamesize
    extra = data[start:start+extra_size]
    print(filename, hex(crc32), comp_size, uncomp_size)

    start += extra_size + comp_size     # skip to the next header
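
The format string can be checked without a real ZIP file. The sketch below packs made-up field values with the same '<IIIHH' layout and unpacks them again; the numbers are arbitrary.

```python
import struct

HEADER_FORMAT = '<IIIHH'   # three 4-byte and two 2-byte unsigned fields

# calcsize() reports how many bytes the layout occupies.
size = struct.calcsize(HEADER_FORMAT)

# Pack arbitrary sample values, then read them back.
packed = struct.pack(HEADER_FORMAT, 0xDEADBEEF, 120, 300, 8, 0)
crc32, comp_size, uncomp_size, filenamesize, extra_size = struct.unpack(
    HEADER_FORMAT, packed)

print(size, hex(crc32), comp_size, uncomp_size, filenamesize, extra_size)
print(packed[:4])   # little-endian: least significant byte comes first
```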

11.4. Multi-threading

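Threading is a technique for decoupling tasks which are not sequentially dependent. Threads can be used to improve the responsiveness of applications that accept user input while other tasks run in the background. A related use case is running I/O in parallel with computations in another thread.
The following code shows how the high level threading module can run tasks in the background while the main program continues to run: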

import threading, zipfile

class AsyncZip(threading.Thread):
    def __init__(self, infile, outfile):
        threading.Thread.__init__(self)
        self.infile = infile
        self.outfile = outfile

    def run(self):
        f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED)
        f.write(self.infile)
        f.close()
        print('Finished background zip of:', self.infile)

background = AsyncZip('mydata.txt', 'myarchive.zip')
background.start()
print('The main program continues to run in foreground.')

background.join()    # Wait for the background task to finish
print('Main program waited until background was done.')

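The principal challenge of multi-threaded applications is coordinating threads that share data or other resources. To that end, the threading module provides a number of synchronization primitives including locks, events, condition variables, and semaphores.
While those tools are powerful, minor design errors can result in problems that are difficult to reproduce. So, the preferred approach to task coordination is to concentrate all access to a resource in a single thread and then use the queue module to feed that thread with requests from other threads. Applications using Queue objects for inter-thread communication and coordination are easier to design, more readable, and more reliable.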
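
A minimal sketch of that pattern, under the assumption that the shared resource is a plain list: only one "owner" thread ever touches it, and the other threads send requests through a queue.Queue. The names ledger, owner, producer, and the None sentinel are choices made for this example.

```python
import queue
import threading

requests = queue.Queue()
ledger = []   # the shared resource; only the owner thread modifies it

def owner():
    """Apply requests one at a time until the None sentinel arrives."""
    while True:
        item = requests.get()
        if item is None:
            break
        ledger.append(item)

def producer(name, count):
    """Send requests instead of touching the ledger directly."""
    for i in range(count):
        requests.put((name, i))

owner_thread = threading.Thread(target=owner)
owner_thread.start()

producers = [threading.Thread(target=producer, args=(f'p{k}', 5))
             for k in range(3)]
for p in producers:
    p.start()
for p in producers:
    p.join()

requests.put(None)    # tell the owner that no more requests will come
owner_thread.join()
print(len(ledger))
```

No lock is needed around the list because every modification happens in the owner thread.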

11.5. Logging

The logging module offers a full featured and flexible logging system. At its simplest, log messages are sent to a file or to sys.stderr:

import logging
logging.debug('Debugging information')
logging.info('Informational message')
logging.warning('Warning:config file %s not found', 'server.conf')
logging.error('Error occurred')
logging.critical('Critical error -- shutting down')

This produces the following output:

WARNING:root:Warning:config file server.conf not found
ERROR:root:Error occurred
CRITICAL:root:Critical error -- shutting down

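By default, informational and debugging messages are suppressed and the output is sent to standard error. Other output options include routing messages through email, datagrams, sockets, or to an HTTP server. New filters can select different routing based on message priority: DEBUG, INFO, WARNING, ERROR, and CRITICAL.
The logging system can be configured directly from Python or can be loaded from a user editable configuration file for customized logging without altering the application.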
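
Configuring the system directly from Python could look like this sketch. The logger name 'demo' and the in-memory stream (standing in for a file or sys.stderr) are assumptions made so the example is self-contained.

```python
import io
import logging

stream = io.StringIO()   # stands in for a log file or sys.stderr
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s:%(name)s:%(message)s'))

logger = logging.getLogger('demo')   # arbitrary logger name
logger.setLevel(logging.INFO)        # let INFO through, still drop DEBUG
logger.addHandler(handler)
logger.propagate = False             # keep messages out of the root logger

logger.debug('Debugging information')    # suppressed: below INFO
logger.info('Informational message')
logger.warning('config file %s not found', 'server.conf')

output = stream.getvalue()
print(output, end='')
```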

11.6. Weak References

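Python does automatic memory management (reference counting for most objects and garbage collection to eliminate cycles). The memory is freed shortly after the last reference to it has been eliminated.
This approach works fine for most applications, but occasionally there is a need to track objects only as long as they are being used by something else. Unfortunately, just tracking them creates a reference that makes them permanent. The weakref module provides tools for tracking objects without creating a reference. When the object is no longer needed, it is automatically removed from a weakref table and a callback is triggered for weakref objects. Typical applications include caching objects that are expensive to create: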

>>> import weakref, gc
>>> class A:
...     def __init__(self, value):
...         self.value = value
...     def __repr__(self):
...         return str(self.value)
...
>>> a = A(10)                   # create a reference
>>> d = weakref.WeakValueDictionary()
>>> d['primary'] = a            # does not create a reference
>>> d['primary']                # fetch the object if it is still alive
10
>>> del a                       # remove the one reference
>>> gc.collect()                # run garbage collection right away
0
>>> d['primary']                # entry was automatically removed
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    d['primary']                # entry was automatically removed
  File "C:/python39/lib/weakref.py", line 46, in __getitem__
    o = self.data[key]()
KeyError: 'primary'
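
The caching use case mentioned above might be sketched as follows. The Image class and load_image() function are hypothetical; the timing of removal shown here reflects CPython's reference counting and may differ on other implementations.

```python
import gc
import weakref

class Image:
    """Stand-in for an object that is expensive to create."""
    def __init__(self, name):
        self.name = name

_cache = weakref.WeakValueDictionary()

def load_image(name):
    """Return the cached object while someone still uses it, else rebuild."""
    image = _cache.get(name)
    if image is None:
        image = Image(name)      # imagine a slow decode step here
        _cache[name] = image     # the cache entry does not keep it alive
    return image

first = load_image('logo')
second = load_image('logo')
same_object = first is second    # both callers share one instance

del first, second                # drop the last strong references
gc.collect()
still_cached = 'logo' in _cache
print(same_object, still_cached)
```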

11.7. Tools for Working with Lists

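Many data structure needs can be met with the built-in list type. However, sometimes there is a need for alternative implementations with different performance trade-offs.
The array module provides an array object that is like a list that stores only homogeneous data and stores it more compactly. The following example shows an array of numbers stored as two byte unsigned binary numbers (typecode "H") rather than the usual 16 bytes per entry for regular lists of Python int objects: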

>>> from array import array
>>> a = array('H', [4000, 10, 700, 22222])
>>> sum(a)
26932
>>> a[1:3]
array('H', [10, 700])
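
The compact storage and the homogeneity constraint can both be observed directly. This is a small sketch; the two-byte item size for "H" holds on common platforms but is formally a minimum.

```python
from array import array

values = [4000, 10, 700, 22222]
a = array('H', values)

# itemsize is the number of bytes used per element (2 for 'H' on common platforms).
payload_bytes = a.itemsize * len(a)

# Values that do not fit the typecode are rejected instead of being stored.
try:
    a.append(70000)              # too large for a two byte unsigned number
    overflowed = False
except OverflowError:
    overflowed = True

print(a.itemsize, payload_bytes, overflowed)
```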

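The collections module provides a deque object that is like a list with faster appends and pops from the left side but slower lookups in the middle. These objects are well suited for implementing queues and breadth first tree searches: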

>>> from collections import deque
>>> d = deque(["task1", "task2", "task3"])
>>> d.append("task4")
>>> print("Handling", d.popleft())
Handling task1

# Sketch only: starting_node, gen_moves() and is_goal() are placeholders defined elsewhere.
unsearched = deque([starting_node])
def breadth_first_search(unsearched):
    node = unsearched.popleft()
    for m in gen_moves(node):
        if is_goal(m):
            return m
        unsearched.append(m)
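
The fragment above relies on names that are not defined in the text. A self-contained variant, assuming a small hand-written graph and a visited set (both additions for this example), could look like this:

```python
from collections import deque

# Hypothetical graph: each node maps to the nodes reachable from it.
GRAPH = {
    'A': ['B', 'C'],
    'B': ['D'],
    'C': ['D', 'E'],
    'D': ['F'],
    'E': ['F'],
    'F': [],
}

def breadth_first_search(start, goal):
    """Return the nodes in the order visited, or None if goal is unreachable."""
    unsearched = deque([start])
    seen = {start}
    order = []
    while unsearched:
        node = unsearched.popleft()      # fast pop from the left end
        order.append(node)
        if node == goal:
            return order
        for m in GRAPH[node]:
            if m not in seen:
                seen.add(m)
                unsearched.append(m)     # fast append on the right end
    return None

visit_order = breadth_first_search('A', 'F')
print(visit_order)
```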

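In addition to alternative list implementations, the library also offers other tools such as the bisect module with functions for manipulating sorted lists: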

>>> import bisect
>>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')]
>>> bisect.insort(scores, (300, 'ruby'))
>>> scores
[(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'lua'), (500, 'python')]
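
Besides insertion, bisect can be used for table lookups on a sorted list. The grade boundaries and letters below are made up for illustration.

```python
import bisect

BREAKPOINTS = [60, 70, 80, 90]   # hypothetical grade boundaries, kept sorted
GRADES = 'FDCBA'                 # one more letter than there are boundaries

def grade(score):
    """Find how many boundaries the score has reached and pick that letter."""
    return GRADES[bisect.bisect(BREAKPOINTS, score)]

results = [grade(s) for s in [33, 99, 77, 70, 89, 90, 100]]
print(results)
```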

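The heapq module provides functions for implementing heaps based on regular lists. The lowest valued entry is always kept at position zero. This is useful for applications that repeatedly access the smallest element but do not want to run a full list sort: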

>>> from heapq import heapify, heappop, heappush
>>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
>>> heapify(data)                      # rearrange the list into heap order
>>> heappush(data, -5)                 # add a new entry
>>> [heappop(data) for i in range(3)]  # fetch the three smallest entries
[-5, 0, 1]
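
One common use of this property is a simple priority queue. In the sketch below the (priority, task) tuples and the task names are invented; nsmallest() is shown as a shortcut when only a few of the smallest items are needed.

```python
import heapq

# Tuples compare by their first element, so the lowest priority number wins.
tasks = []
heapq.heappush(tasks, (2, 'write report'))
heapq.heappush(tasks, (1, 'fix bug'))
heapq.heappush(tasks, (3, 'refactor'))

order = [heapq.heappop(tasks)[1] for _ in range(3)]

# nsmallest() returns the n smallest items without sorting the whole list.
smallest = heapq.nsmallest(3, [1, 3, 5, 7, 9, 2, 4, 6, 8, 0])

print(order)
print(smallest)
```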

11.8. Decimal Floating Point Arithmetic

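The decimal module offers a Decimal datatype for decimal floating point arithmetic. Compared to the built-in float implementation of binary floating point, the class is especially helpful for:
financial applications and other uses which require exact decimal representation,
control over precision,
control over rounding to meet legal or regulatory requirements,
tracking of significant decimal places, or
applications where the user expects the results to match calculations done by hand.
For example, calculating a 5% tax on a 70 cent phone charge gives different results in decimal floating point and binary floating point. The difference becomes significant if the results are rounded to the nearest cent: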

>>> from decimal import *
>>> round(Decimal('0.70') * Decimal('1.05'), 2)
Decimal('0.74')
>>> round(.70 * 1.05, 2)
0.73

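Before rounding, the Decimal product is Decimal('0.7350'): it keeps a trailing zero, automatically inferring four place significance from multiplicands with two place significance. Decimal reproduces mathematics as done by hand and avoids issues that can arise when binary floating point cannot exactly represent decimal quantities.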
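
The trailing zero is only visible in the unrounded product. The sketch below shows it, and also rounds to cents with an explicit rounding mode; choosing ROUND_HALF_UP here is an assumption about the billing rule, not something the example above prescribes.

```python
from decimal import Decimal, ROUND_HALF_UP

# Two places times two places gives a result with four places.
product = Decimal('0.70') * Decimal('1.05')

# quantize() rounds to the exponent of its argument, here whole cents.
cents = product.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)

print(product)
print(cents)
```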
Exact representation enables the Decimal class to perform modulo calculations and equality tests that are unsuitable for binary floating point:

>>> Decimal('1.00') % Decimal('.10')
Decimal('0.00')
>>> 1.00 % 0.10
0.09999999999999995

>>> sum([Decimal('0.1')]*10) == Decimal('1.0')
True
>>> sum([0.1]*10) == 1.0
False

The decimal module provides arithmetic with as much precision as needed:

>>> getcontext().prec = 36
>>> Decimal(1) / Decimal(7)
Decimal('0.142857142857142857142857142857142857')
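
Setting getcontext().prec changes the precision for everything that follows in the current thread. When only one calculation needs a different precision, a local context is one way to keep the change contained; the precision values below are arbitrary.

```python
from decimal import Decimal, localcontext

# The precision applies only inside each with block.
with localcontext() as ctx:
    ctx.prec = 36
    high = Decimal(1) / Decimal(7)

with localcontext() as ctx:
    ctx.prec = 6
    low = Decimal(1) / Decimal(7)

print(high)
print(low)
```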
