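Background
I recently did a practice project that parses some files and produces a few reports at the end, like the one below (columns: URL, article title, visit count, number of visiting IPs):
#### 1. Article report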
| URL | 文章标题 | 访问人次 | 访问IP数 |
|------|----------|------------|----------|
|www.w3.org/1999/xhtml | 百度一下,你就知道 | 20 | 4 |
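This tidy format shows up in quite a few CLI tools. Reading the source of some CLI commands, I found `pt = prettytable.PrettyTable()` being used to print tables, which is how I came across the prettytable library.
Installation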
pip install PrettyTable
Official documentation
Usage
Borrowing part of the official documentation (this excerpt uses the `set_field_names` method, whereas the source quoted later in this post assigns `field_names` directly, so the excerpt appears to come from a different version):

== Row by row ==

You can add data one row at a time. To do this you need to set the field names first using the `set_field_names` method, and then add the rows one at a time using the `add_row` method:

    x.set_field_names(["City name", "Area", "Population", "Annual Rainfall"])
    x.add_row(["Adelaide",1295, 1158259, 600.5])
    x.add_row(["Brisbane",5905, 1857594, 1146.4])
    x.add_row(["Darwin", 112, 120900, 1714.7])
    x.add_row(["Hobart", 1357, 205556, 619.5])
    x.add_row(["Sydney", 2058, 4336374, 1214.8])
    x.add_row(["Melbourne", 1566, 3806092, 646.9])
    x.add_row(["Perth", 5386, 1554769, 869.4])

== Column by column ==

You can add data one column at a time as well. To do this you use the `add_column` method, which takes two arguments - a string which is the name for the field the column you are adding corresponds to, and a list or tuple which contains the column data:

    x.add_column("City name", ["Adelaide","Brisbane","Darwin","Hobart","Sydney","Melbourne","Perth"])
    x.add_column("Area", [1295, 5905, 112, 1357, 2058, 1566, 5386])
    x.add_column("Population", [1158259, 1857594, 120900, 205556, 4336374, 3806092, 1554769])
    x.add_column("Annual Rainfall",[600.5, 1146.4, 1714.7, 619.5, 1214.8, 646.9, 869.4])
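As you can see, there are plenty of flexible methods for adding rows, adding columns, and so on. For the report above:
>>> from prettytable import PrettyTable
>>> tb = PrettyTable(['URL', '文章标题', '访问人次', '访问IP数'])
>>> tb.add_row(['www.w3.org/1999/xhtml', '百度一下,你就知道', 20, 4])
>>> print(tb)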
+-----------------------+--------------------+----------+----------+
| URL | 文章标题 | 访问人次 | 访问IP数 |
+-----------------------+--------------------+----------+----------+
| www.w3.org/1999/xhtml | 百度一下,你就知道 | 20 | 4 |
+-----------------------+--------------------+----------+----------+
It looks neat in a Linux terminal but turns ugly once it is copied and pasted elsewhere; try it yourself.
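One plausible reason for the difference (my assumption, not something I verified in the library): a terminal draws each CJK character two columns wide, so alignment depends on display width rather than on character count. A minimal sketch of that idea using only the standard library; the helper names `display_width` and `pad_to_width` are made up for illustration:

```python
import unicodedata


def display_width(text):
    """Approximate terminal width: wide and fullwidth characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def pad_to_width(text, width):
    """Left-align text inside a cell that is `width` terminal columns wide."""
    return text + " " * max(0, width - display_width(text))


header = "访问IP数"
# 5 characters, but 8 terminal columns (three wide characters plus "IP").
print(len(header), display_width(header))
print("|" + pad_to_width(header, 10) + "|")
```

Padding by `len()` alone would make the CJK cells too wide in a terminal, which is why a table renderer needs some notion of display width.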
It can also produce an HTML document:
>>> html = tb.get_html_string()
>>> print(html)
<table>
    <tr>
        <th>URL</th>
        <th>文章标题</th>
        <th>访问人次</th>
        <th>访问IP数</th>
    </tr>
    <tr>
        <td>www.w3.org/1999/xhtml</td>
        <td>百度一下,你就知道</td>
        <td>20</td>
        <td>4</td>
    </tr>
</table>
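To get a feel for how little is needed to emit that kind of markup, here is a rough sketch that builds a similar string from a list of field names and a list of rows. It is not prettytable's code: the function name `to_html` is hypothetical, and escaping the cells with `html.escape` is my own choice.

```python
from html import escape


def to_html(field_names, rows):
    """Build a minimal <table>: one <th> per field, one <td> per cell."""
    lines = ["<table>", "    <tr>"]
    lines += ["        <th>%s</th>" % escape(str(name)) for name in field_names]
    lines.append("    </tr>")
    for row in rows:
        lines.append("    <tr>")
        lines += ["        <td>%s</td>" % escape(str(cell)) for cell in row]
        lines.append("    </tr>")
    lines.append("</table>")
    return "\n".join(lines)


print(to_html(["URL", "Visits"], [["www.w3.org/1999/xhtml", 20]]))
```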
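It is quite powerful; to learn more, I recommend reading the official documentation and the source.
Source code analysis
The main point of this post is to look at how the library works. At first I wanted to write something like it myself, but I figured Python's rich library ecosystem would surely have it covered already, hahaha~
Let's step into the source and look at the attributes this class defines: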
"""Return a new PrettyTable instance
Arguments:
encoding - Unicode encoding scheme used to decode any encoded input
field_names - list or tuple of field names
fields - list or tuple of field names to include in displays
start - index of first data row to include in output
end - index of last data row to include in output PLUS ONE (list slice style)
header - print a header showing field names (True or False)
header_style - stylisation to apply to field names in header ("cap", "title", "upper", "lower" or None)
border - print a border around the table (True or False)
hrules - controls printing of horizontal rules after rows. Allowed values: FRAME, HEADER, ALL, NONE
vrules - controls printing of vertical rules between columns. Allowed values: FRAME, ALL, NONE
int_format - controls formatting of integer data
float_format - controls formatting of floating point data
padding_width - number of spaces on either side of column data (only used if left and right paddings are None)
left_padding_width - number of spaces on left hand side of column data
right_padding_width - number of spaces on right hand side of column data
vertical_char - single character string used to draw vertical lines
horizontal_char - single character string used to draw horizontal lines
junction_char - single character string used to draw line junctions
sortby - name of field to sort rows by
sort_key - sorting key function, applied to data points before sorting
valign - default valign for each row (None, "t", "m" or "b")
reversesort - True or False to sort in descending or ascending order"""
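Several of these options (`start`, `end`, `sortby`, `reversesort`, `fields`) only decide which rows and columns get displayed. Here is a rough sketch of how such options could be applied to plain lists. The function `select_rows` is hypothetical, and the order used here (slice first, then sort, then filter columns) is a choice made for this sketch, not a claim about the library:

```python
def select_rows(field_names, rows, start=0, end=None,
                sortby=None, reversesort=False, fields=None):
    """Return (names, rows) after slicing, sorting and column filtering."""
    # List-slice style: `end` is exclusive, None means "to the last row".
    selected = [list(r) for r in rows[start:end]]
    if sortby is not None:
        key_index = field_names.index(sortby)
        selected.sort(key=lambda r: r[key_index], reverse=reversesort)
    if fields is not None:
        keep = [i for i, name in enumerate(field_names) if name in fields]
        field_names = [field_names[i] for i in keep]
        selected = [[r[i] for i in keep] for r in selected]
    return list(field_names), selected


names = ["City name", "Area", "Population"]
data = [["Adelaide", 1295, 1158259],
        ["Brisbane", 5905, 1857594],
        ["Darwin", 112, 120900]]
print(select_rows(names, data, sortby="Area", reversesort=True,
                  fields=["City name", "Area"]))
```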
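encoding - handles character encoding
self.encoding = kwargs.get("encoding", "UTF-8")
As you can see, the default is UTF-8.
field_names - the column names
self._field_names = []
This is defined as a list; when printing, the code simply iterates over it:
for field in self._field_names:
The field names can be set at construction time in `__init__`:

    def __init__(self, field_names=None, **kwargs):
        if field_names:
            self.field_names = field_names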
If no names were given, `add_row` generates default ones ("Field 1", "Field 2", ...) from the first row:

    def add_row(self, row):
        """Add a row to the table

        Arguments:

        row - row of data, should be a list with as many elements as the table
        has fields"""

        if self._field_names and len(row) != len(self._field_names):
            raise Exception("Row has incorrect number of values, (actual) %d!=%d (expected)" %(len(row),len(self._field_names)))
        if not self._field_names:
            self.field_names = [("Field %d" % (n+1)) for n in range(0,len(row))]
        self._rows.append(list(row))

From `add_row` you can see that `_field_names` is a single list holding one column name per element.
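To check my understanding of that bookkeeping, here is a stripped-down re-creation. The class `TinyTable` is hypothetical and keeps only the two lists; raising `ValueError` instead of a bare `Exception` is my own substitution:

```python
class TinyTable:
    """Hypothetical stand-in that keeps only field names and rows."""

    def __init__(self, field_names=None):
        self.field_names = list(field_names) if field_names else []
        self.rows = []

    def add_row(self, row):
        # Reject rows whose length does not match the declared columns.
        if self.field_names and len(row) != len(self.field_names):
            raise ValueError("Row has %d values, expected %d"
                             % (len(row), len(self.field_names)))
        # No names yet: derive "Field 1", "Field 2", ... from this row.
        if not self.field_names:
            self.field_names = ["Field %d" % (n + 1) for n in range(len(row))]
        # Store a copy so later changes to the caller's list do not leak in.
        self.rows.append(list(row))


t = TinyTable()
t.add_row(["www.w3.org/1999/xhtml", 20, 4])
print(t.field_names)
print(t.rows)
```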
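Reading on, the data for each row is stored in `_rows`. Let's check in the interpreter (this session was recorded under Python 2, which is why the Chinese strings in the output show up as escaped UTF-8 bytes):
>>> from prettytable import PrettyTable
>>> tb = PrettyTable(['URL', '文章标题', '访问人次', '访问IP数'])
>>> tb.add_row(['www.w3.org/1999/xhtml', '百度一下,你就知道', 20, 4])
>>> tb.add_row(['www.qq.org/1999/xhtml', '腾讯网', 21, 4])
>>> print(tb._rows)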
[['www.w3.org/1999/xhtml', '\xe7\x99\xbe\xe5\xba\xa6\xe4\xb8\x80\xe4\xb8\x8b\xef\xbc\x8c\xe4\xbd\xa0\xe5\xb0\xb1\xe7\x9f\xa5\xe9\x81\x93', 20, 4], ['www.qq.org/1999/xhtml', '\xe8\x85\xbe\xe8\xae\xaf\xe7\xbd\x91', 21, 4]]
As the `add_row` code above shows, all it does is append one row to `_rows`:
self._rows.append(list(row))
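That is pretty much the whole analysis: the library's two core data structures, `_field_names` and `_rows`, are right here. There are also a few drawing parameters:
vertical_char - single-character string used to draw vertical lines
horizontal_char - single-character string used to draw horizontal lines
junction_char - single-character string used to draw line junctions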
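With the field names, the rows and these three characters, drawing a bordered table takes only a few lines. The sketch below is my own simplified version, not the library's implementation: `render` is a hypothetical function, and it measures width with `len()`, so it only lines up for narrow (ASCII) text.

```python
def render(field_names, rows, vertical_char="|", horizontal_char="-",
           junction_char="+", padding_width=1):
    """Draw a bordered text table from a header list and a list of rows."""
    cells = [[str(c) for c in row] for row in [list(field_names)] + rows]
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(row[i]) for row in cells)
              for i in range(len(field_names))]
    pad = " " * padding_width
    rule = junction_char + junction_char.join(
        horizontal_char * (w + 2 * padding_width) for w in widths
    ) + junction_char

    def line(row):
        body = vertical_char.join(
            pad + cell.center(w) + pad for cell, w in zip(row, widths)
        )
        return vertical_char + body + vertical_char

    out = [rule, line(cells[0]), rule]
    out += [line(r) for r in cells[1:]]
    out.append(rule)
    return "\n".join(out)


print(render(["City name", "Area"], [["Adelaide", 1295], ["Darwin", 112]]))
```

Swapping the three characters (for example `junction_char="*"`) changes the look of the border without touching the data.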
When the table is rendered, it can be printed in different styles:

    def set_style(self, style):

        if style == DEFAULT:
            self._set_default_style()
        elif style == MSWORD_FRIENDLY:
            self._set_msword_style()
        elif style == PLAIN_COLUMNS:
            self._set_columns_style()
        elif style == RANDOM:
            self._set_random_style()
        else:
            raise Exception("Invalid pre-set style!")

The library provides four styles; feel free to dig further if you are interested.
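My reading is that a style is just a bundle of option values applied in one go. If I were building something similar, I might use a lookup table instead of an if/elif chain. This is a sketch of that alternative, not what the library does: the preset names and values in `PRESETS` are invented for illustration.

```python
# Hypothetical presets; names and values are chosen for illustration only.
PRESETS = {
    "boxed": {"border": True, "vertical_char": "|",
              "horizontal_char": "-", "junction_char": "+"},
    "plain": {"border": False, "padding_width": 2},
}


class Options:
    """Holds rendering settings and lets a named preset overwrite them."""

    def __init__(self):
        self.settings = {"border": True, "padding_width": 1,
                         "vertical_char": "|", "horizontal_char": "-",
                         "junction_char": "+"}

    def set_style(self, style):
        if style not in PRESETS:
            raise ValueError("Invalid pre-set style: %r" % (style,))
        # A preset only overrides the keys it mentions.
        self.settings.update(PRESETS[style])


opts = Options()
opts.set_style("plain")
print(opts.settings)
```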
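When you use a library, it is worth taking a moment to look at how it is implemented; after all, Python libraries like this one ship with their source. The analysis shows there is nothing especially deep going on here, so if you ever need similar functionality in another language, you can comfortably build a handy wheel of your own.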