[python] Python Cookbook study notes


l_xm


Article link: https://blog.csdn.net/sinat_16968575/article/details/44998589


Strings

Testing whether an object is string-like

isinstance(obj, basestring)
type(obj) == type('')

def is_string_like(obj):
    try:
        obj + ''  # only string-like objects can be concatenated with ''
    except Exception:
        return False
    else:
        return True
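
The checks above are written for Python 2. As a rough Python 3 sketch (the function name `is_string_like_py3` is made up for illustration), the same idea could look like this; note that in Python 3 `bytes` cannot be concatenated with `''`, so it is reported as not string-like:

```python
def is_string_like_py3(obj):
    """Return True if obj is a str or can be concatenated with '' like one."""
    if isinstance(obj, str):
        return True
    try:
        obj + ''  # duck-typing check, as in the notes above
    except Exception:
        return False
    return True

print(is_string_like_py3('abc'))   # True
print(is_string_like_py3(42))      # False
print(is_string_like_py3(b'abc'))  # False
```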

String alignment
- string.ljust(width,fill)
- string.rjust(width,fill)
- string.center(width,fill)
Stripping whitespace from the ends of a string
- string.lstrip()
- string.rstrip()
- string.strip()

Reversing a string

rewords = words[::-1]

rewords = words.split()
rewords.reverse()
rewords = ' '.join(rewords)

re.split(pattern,astring)

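Filtering characters out of a string with string.maketrans and string.translate
keep = 'abcd'
all_chars = string.maketrans('', '')  # build the identity translation table
del_chars = all_chars.translate(all_chars, keep)  # every character that is not in keep
s.translate(all_chars, del_chars)  # delete those characters, leaving only the ones in keep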
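
`string.maketrans` and the two-argument `translate` exist only in Python 2. A possible Python 3 equivalent, assuming we only need to keep a given set of characters (the helper name `keep_only` is hypothetical), uses a dict-based translation table:

```python
def keep_only(s, keep):
    """Return s with every character that is not in keep removed."""
    # Mapping a code point to None tells str.translate to delete it.
    table = {ord(ch): None for ch in set(s) - set(keep)}
    return s.translate(table)

print(keep_only('abracadabra', 'abcd'))  # abacadaba
```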

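Deciding whether a string is binary or text

from __future__ import division  # must come before any other import
import string
text_chars = ''.join(map(chr, range(32, 127)))
null_trans = string.maketrans('', '')
def isText(s, text_chars=text_chars, threshold=0.3):
    if '\0' in s:  # a NUL character means binary data
        return False
    if not s:  # an empty string counts as text
        return True
    not_text_chars = s.translate(null_trans, text_chars)  # delete all text characters
    return len(not_text_chars) / len(s) <= threshold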
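
In Python 3 the same heuristic applies to `bytes` rather than `str`. The following is only a sketch under that assumption; treating newline, carriage return, tab and backspace as text bytes is an addition of this example, not part of the notes above:

```python
# Printable ASCII plus a few common control bytes (an assumption of this sketch).
TEXT_BYTES = bytes(range(32, 127)) + b'\n\r\t\b'

def is_text(data, threshold=0.3):
    """Guess whether a bytes object holds text rather than binary data."""
    if b'\0' in data:
        return False
    if not data:
        return True
    # bytes.translate(None, delete) removes every byte listed in delete.
    non_text = data.translate(None, TEXT_BYTES)
    return len(non_text) / len(data) <= threshold

print(is_text(b'hello world\n'))   # True
print(is_text(b'\x00\x01\x02'))    # False
print(is_text(bytes([200] * 10)))  # False
```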

String methods: upper() lower() title() capitalize() isupper() islower() istitle()

Accessing substrings (splitting a string into fixed-width fields)

struct.unpack('3s 5s 4s','asdqwertzxcv') #return ('asd', 'qwert', 'zxcv')

def cut(L,cuts):
    start = 0
    end = 0
    for x in cuts:
        end = start + x
        if end <= len(L):
            print L[start:end]
            start = end
        elif start < len(L):
            print L[start:]
            break
        else:
            break
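
The cut function above prints each piece. A variant that returns the pieces instead is often easier to reuse; here is a minimal Python 3 sketch of that idea (the name `cut_pieces` is made up for illustration):

```python
def cut_pieces(seq, widths):
    """Split seq into consecutive pieces with the given widths."""
    pieces = []
    start = 0
    for width in widths:
        if start >= len(seq):
            break  # nothing left to cut
        # Slicing clips at the end, so a too-large width yields the remainder.
        pieces.append(seq[start:start + width])
        start += width
    return pieces

print(cut_pieces('asdqwertzxcv', [3, 5, 4]))   # ['asd', 'qwert', 'zxcv']
print(cut_pieces('asdqwertzxcv', [3, 5, 10]))  # ['asd', 'qwert', 'zxcv']
```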

Re-indenting the lines of a string

def reindent(s,numSpace):
    leading_space = numSpace * ' '
    lines = [leading_space + line.strip() for line in s.splitlines()]
    return '\n'.join(lines)

Substituting placeholders in a string
'%(name)s aegfwg' % {'name': 'replace'}

t = string.Template('$name agg')
t.substitute({'name': 'replace'})
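
As a small runnable illustration in Python 3 syntax, `substitute` raises `KeyError` when a placeholder has no value, while `safe_substitute` leaves the placeholder untouched. The second template string here is invented for the example:

```python
from string import Template

t = Template('$name agg')
print(t.substitute(name='replace'))  # replace agg

# safe_substitute leaves unknown placeholders in place instead of raising.
partial = Template('$name and $other').safe_substitute(name='replace')
print(partial)  # replace and $other
```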

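Files

Getting a specific line from a file
for line_num, line in enumerate(open(thefilepath, 'r'), 1):
    if line_num == the_desire_line_num:
        res = line
        break  # no need to read the rest of the file

Counting the lines in a text file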

len(open(thefilepath,'r').readlines())

count = -1
for count,line in enumerate(open(thefilepath,'r')):
    pass
count += 1
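
The readlines() approach loads the whole file into memory, while the enumerate loop does not. A compact Python 3 sketch of the second approach, using a throwaway temporary file whose name and contents are invented for the demonstration:

```python
import os
import tempfile

def count_lines(path):
    """Count lines by iterating, so the whole file is never held in memory."""
    with open(path, 'r') as f:
        return sum(1 for _ in f)

# Demonstration on a temporary file (contents are made up).
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('first\nsecond\nthird\n')
print(count_lines(demo_path))  # 3
os.remove(demo_path)
```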

Using tempfile and zipfile: importing a module from a zip file

import os,sys,tempfile,zipfile
fd,filename = tempfile.mkstemp(suffix='.zip')
os.close(fd)
zf = zipfile.ZipFile(filename,'w')
zf.writestr('hello.py','def func():return "Hello world from " + __file__\n')
zf.close()
sys.path.insert(0,filename)
import hello
print hello.func()

Writing binary data to standard output on Windows
Python normally opens sys.stdout in text mode; to write binary data, switch it to binary mode with the msvcrt module
```
import sys
if sys.platform == 'win32':
    import msvcrt, os
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
```

Using C++-style iostream syntax

class IOStream(object):
    def __init__(self,output=None):
        if output is None:
            import sys
            output = sys.stdout
        self.output = output
        self.format = '%s'

    def __lshift__(self,thing):
        if isinstance(thing,IOManipulator):
            thing.do(self)
        else:
            self.output.write(self.format % thing)
            self.format = '%s'
        return self

class IOManipulator(object):
    def __init__(self,callback_func):
        self.func = callback_func

    def do(self,stream):
        self.func(stream)


# handle end of line

def do_endl(stream):
    stream.output.write('\n')
    stream.output.flush()
endl = IOManipulator(do_endl)


# switch the format of the next item to hexadecimal (%x)

def format_hex(stream):
    stream.format = '%x'
fhex = IOManipulator(format_hex)


# ----test----------

def test():
    cout = IOStream()
    cout << 'like C++\'s class IOStream' << endl

if __name__ == '__main__':
    test()

Given two directories, compute the relative path that leads from the first to the second

For example: given /a/b/c/d and /a/b/e/f/g, return ../../e/f/g

import os

def all_equal(elements):
    return len(set(elements)) == 1

def common_prefix(*sequences):
    common = []
    if not sequences:
        return [],[]
    for elements in zip(*sequences):
        if not all_equal(elements):
            break
        common.append(elements[0])
    return common,[sequence[len(common):] for sequence in sequences]

def relpath(path1,path2,sep=os.path.sep,pardir=os.path.pardir):
    common,[u1,u2] = common_prefix(path1.split(sep),path2.split(sep))
    if not common:
        return path2
    return sep.join([pardir] * len(u1) + u2)

def test(path1,path2,sep=os.path.sep):
    print 'from','<',path1,'>','to','<',path2,'>','==>',relpath(path1,path2,sep)

if __name__ == '__main__':
    test('/root/etc/python27/read.txt','/root/etc/mysql/config.cfg','/')
    test('/home/lxy/a/b/c.txt','/home/lxy/r/n/g.txt','/')
    test(r'',r'C:\MinGW\include\_mingw.h')
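
For comparison, the standard library offers `os.path.relpath` (Python 2.6 and later), which takes the target first and the starting directory second. The sketch below uses `posixpath` so that the result does not depend on the platform it runs on; the second pair of paths is adapted from the test above with the starting file name dropped:

```python
import posixpath

# relpath(target, start): the target comes first, the starting directory second.
rel1 = posixpath.relpath('/a/b/e/f/g', start='/a/b/c/d')
print(rel1)  # ../../e/f/g

rel2 = posixpath.relpath('/root/etc/mysql/config.cfg', start='/root/etc/python27')
print(rel2)  # ../mysql/config.cfg
```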

File versioning: make a backup copy of a file before editing it

def versionFile(file_spec,vtype='copy'):
    import os,shutil
    if os.path.isfile(file_spec):
        if vtype not in ('copy','rename'):
            raise ValueError,'Unknown vtype %r' % vtype
        root,ext = os.path.splitext(file_spec)
        if len(ext) == 4 and ext[1:].isdigit():
            version_num = int(ext[1:]) + 1
        else:
            version_num = 0
        for i in xrange(version_num,100):
            new_file = '%s.%03d' % (root,i)
            if not os.path.exists(new_file):
                if vtype == 'copy':
                    shutil.copy(file_spec,new_file)
                else:
                    os.rename(file_spec,new_file)
                print '%s successful' % vtype
                return True
        raise RuntimeError,'Can\'t %s %r,all names taken' % (vtype,file_spec)

    else:
        print '%s is not a file' % file_spec
        return False


if __name__ == '__main__':

    versionFile('test','rename')

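Time

Introduction to the time module

GMT is Greenwich Mean Time, which for practical purposes is the same as UTC (Coordinated Universal Time).

time.gmtime([sec]) -> sec is the number of seconds since the epoch (1970/1/1 0:0:0) and defaults to the current time; returns the UTC time tuple (tm_year, tm_mon, tm_mday, tm_hour, tm_min, tm_sec, tm_wday, tm_yday, tm_isdst).
time.localtime([sec]) -> like gmtime, but the time tuple is expressed in the local time zone.
time.asctime([tuple]) -> tuple is a time tuple; returns the time as a string. If tuple is omitted, the time tuple returned by time.localtime() is used.

time.strftime(format[, tuple]) -> string
Converts a time tuple to a formatted string; tuple defaults to the time tuple returned by localtime().
The format directives are: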

Directive	Meaning	Notes
%a	Locale’s abbreviated weekday name.
%A	Locale’s full weekday name.
%b	Locale’s abbreviated month name.
%B	Locale’s full month name.
%c	Locale’s appropriate date and time representation.
%d	Day of the month as a decimal number [01,31].
%H	Hour (24-hour clock) as a decimal number [00,23].
%I	Hour (12-hour clock) as a decimal number [01,12].
%j	Day of the year as a decimal number [001,366].
%m	Month as a decimal number [01,12].
%M	Minute as a decimal number [00,59].
%p	Locale’s equivalent of either AM or PM.	(1)
%S	Second as a decimal number [00,61].	(2)
%U	Week number of the year (Sunday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Sunday are considered to be in week 0.	(3)
%w	Weekday as a decimal number [0(Sunday),6].
%W	Week number of the year (Monday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Monday are considered to be in week 0.	(3)
%x	Locale’s appropriate date representation.
%X	Locale’s appropriate time representation.
%y	Year without century as a decimal number [00,99].
%Y	Year with century as a decimal number.
%Z	Time zone name (no characters if no time zone exists).
%%	A literal ‘%’ character.

time.strptime(string[, format])
Parses a string that was formatted according to format and returns the corresponding time tuple
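
strftime and strptime are inverses of each other for a given format. A short round trip in Python 3 syntax shows this; the timestamp is borrowed from the datetime session in these notes, and the format string is just one possible choice:

```python
import time

fmt = '%Y-%m-%d %H:%M:%S'
parsed = time.strptime('2015-04-18 12:57:40', fmt)  # string -> time tuple
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)  # 2015 4 18
print(parsed.tm_wday)  # 5, i.e. Saturday (Monday is 0)

formatted = time.strftime(fmt, parsed)  # time tuple -> string
print(formatted)  # 2015-04-18 12:57:40
```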

The datetime module

Some simple examples

>>> import datetime
>>> today = datetime.date.today()
>>> today
datetime.date(2015, 4, 18)
>>> today + datetime.timedelta(days=1)
datetime.date(2015, 4, 19)
>>> print today + datetime.timedelta(days=1)
2015-04-19
>>> today = datetime.datetime.today()
>>> today
datetime.datetime(2015, 4, 18, 12, 57, 40, 115000)

Finding the most recent Friday

>>> import datetime,calendar
>>> lastFriday = datetime.date.today()
>>> oneday = datetime.timedelta(days=1)
>>> while lastFriday.weekday() != calendar.FRIDAY:
...     lastFriday -= oneday
...
>>> print lastFriday
2015-04-17
>>> print lastFriday.strftime('%A-%Y/%m/%d')
Friday-2015/04/17
----------------- Second method --------------------------
>>> today = datetime.date(2014,3,5)
>>> this_weekday = today.weekday()
>>> this_weekday
2
>>> delta_weekday = (this_weekday - calendar.FRIDAY) % 7
>>> last_friday = today - datetime.timedelta(days=delta_weekday)
>>> print last_friday
2014-02-28
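
The second method avoids the loop by using modular arithmetic on the weekday number. Wrapped in a function (the name `last_friday` is chosen here for illustration) and written in Python 3 syntax, it might look like this:

```python
import calendar
import datetime

def last_friday(day):
    """Return the most recent Friday on or before day."""
    # weekday() numbers Monday as 0, so Friday is 4; % 7 keeps the offset in 0..6.
    delta = (day.weekday() - calendar.FRIDAY) % 7
    return day - datetime.timedelta(days=delta)

print(last_friday(datetime.date(2014, 3, 5)))   # 2014-02-28
print(last_friday(datetime.date(2015, 4, 17)))  # 2015-04-17 (a Friday maps to itself)
```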

Computing the total playing time of a list of songs

def totalTimes(times):
    td = datetime.timedelta(0)
    duration = sum((datetime.timedelta(minutes=m,seconds=s) for m,s in times),td)
    return duration
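
The key point is that sum() needs a timedelta as its start value, because the default start of 0 cannot be added to a timedelta. A usage sketch in Python 3 syntax, with made-up (minutes, seconds) pairs:

```python
import datetime

def total_time(times):
    """Sum (minutes, seconds) pairs into a single timedelta."""
    zero = datetime.timedelta(0)  # start value, so sum() never adds int + timedelta
    return sum((datetime.timedelta(minutes=m, seconds=s) for m, s in times), zero)

print(total_time([(3, 30), (4, 45), (2, 50)]))  # 0:11:05
```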

The decimal module

decimal.Decimal(string or int) -> returns a Decimal object

Some examples:

>>> from decimal import *
>>> setcontext(ExtendedContext)
>>> Decimal(0)
Decimal('0')
>>> Decimal('1')
Decimal('1')
>>> Decimal('-.0123')
Decimal('-0.0123')
>>> Decimal(123456)
Decimal('123456')
>>> Decimal('123.45e12345678901234567890')
Decimal('1.2345E+12345678901234567892')
>>> Decimal('1.33') + Decimal('1.27')
Decimal('2.60')
>>> Decimal('12.34') + Decimal('3.87') - Decimal('18.41')
Decimal('-2.20')
>>> dig = Decimal(1)
>>> print dig / Decimal(3)
0.333333333
>>> getcontext().prec = 18
>>> print dig / Decimal(3)
0.333333333333333333
>>> print dig.sqrt()
1
>>> print Decimal(3).sqrt()
1.73205080756887729
>>> print Decimal(3) ** 123
4.85192780976896427E+58
>>> inf = Decimal(1) / Decimal(0)
>>> print inf
Infinity
>>> neginf = Decimal(-1) / Decimal(0)
>>> print neginf
-Infinity
>>> print neginf + inf
NaN
>>> print neginf * inf
-Infinity
>>> print dig / 0
Infinity
>>> getcontext().traps[DivisionByZero] = 1
>>> print dig / 0
Traceback (most recent call last):
  ...
  ...
  ...
DivisionByZero: x / 0
>>> c = Context()
>>> c.traps[InvalidOperation] = 0
>>> print c.flags[InvalidOperation]
0
>>> c.divide(Decimal(0), Decimal(0))
Decimal('NaN')
>>> c.traps[InvalidOperation] = 1
>>> print c.flags[InvalidOperation]
1
>>> c.flags[InvalidOperation] = 0
>>> print c.flags[InvalidOperation]
0
>>> print c.divide(Decimal(0), Decimal(0))
Traceback (most recent call last):
  ...
  ...
  ...
InvalidOperation: 0 / 0
>>> print c.flags[InvalidOperation]
1
>>> c.flags[InvalidOperation] = 0
>>> c.traps[InvalidOperation] = 0
>>> print c.divide(Decimal(0), Decimal(0))
NaN
>>> print c.flags[InvalidOperation]
1

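Python tricks

1. Copying objects

In Python, assignment binds a name to a value (an object); it does not copy it. If the object is modified through one variable, every other variable that refers to the same object sees the change. For example: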

>>> a = [1, 2, 3, 4]
>>> b = a
>>> b[2] = 4
>>> a,b
([1, 2, 4, 4], [1, 2, 4, 4])

a and b refer to the same list, so when the list is modified through b, a is affected too.
To copy a value (an object), consider the copy module.

copy.copy() shallow copy
copy.deepcopy() deep copy

Example:

>>> a = [[1,2,3],[12,32],[21,32,45]]
>>> import copy
>>> b = copy.copy(a)
>>> b
[[1, 2, 3], [12, 32], [21, 32, 45]]
>>> b[0] = [2]
>>> a,b
([[1, 2, 3], [12, 32], [21, 32, 45]], [[2], [12, 32], [21, 32, 45]])
>>> b[1][0] = 23
>>> a,b
([[1, 2, 3], [23, 32], [21, 32, 45]], [[2], [23, 32], [21, 32, 45]])

Note that after b[1][0] = 23, a changed as well. copy.copy() makes only a shallow copy: the nested values are still shared references. To copy nested values (objects) recursively, use copy.deepcopy().
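
For contrast, here is the same nested list copied with copy.deepcopy; a change to a nested list in the copy no longer reaches the original:

```python
import copy

a = [[1, 2, 3], [12, 32], [21, 32, 45]]
b = copy.deepcopy(a)  # nested lists are copied too
b[1][0] = 23
print(a[1])  # [12, 32]
print(b[1])  # [23, 32]
```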

Other ways to make a (shallow) copy:

list

>>> a = [1,2,3,4]       
>>> b = list(a)
>>> a,b
([1, 2, 3, 4], [1, 2, 3, 4])
>>> a[0] = 2
>>> a,b
([2, 2, 3, 4], [1, 2, 3, 4])

dict(d)
set(s)
L = LL[:]

Note:

>>> l = [[]] * 10
>>> l
[[], [], [], [], [], [], [], [], [], []]
>>> l[0].append(2)
>>> l
[[2], [2], [2], [2], [2], [2], [2], [2], [2], [2]]

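After l[0].append(2), every nested list contains 2 because [[]] * 10 does not copy the inner list: it repeats a reference to the same list ten times, so modifying any one of them modifies them all.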
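
A common way around this is to build the outer list with a list comprehension, which evaluates `[]` once per element and therefore creates independent inner lists:

```python
rows = [[] for _ in range(10)]  # ten distinct empty lists
rows[0].append(2)
print(rows[:3])  # [[2], [], []]
```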

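2. Make good use of list comprehensions ([x for x in L]) and generator expressions ((x for x in L))

A list comprehension builds all of its values at once, whereas a generator expression produces one value at a time, which saves memory.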
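
A small illustration of the difference in Python 3 syntax: the list exists in full immediately, while the generator hands out values on demand and can be consumed only once.

```python
squares_list = [x * x for x in range(5)]  # all values computed immediately
squares_gen = (x * x for x in range(5))   # nothing computed yet

print(squares_list)      # [0, 1, 4, 9, 16]
first = next(squares_gen)
print(first)             # 0
rest = sum(squares_gen)  # consumes the remaining values 1, 4, 9, 16
print(rest)              # 30
```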

3. Return L[index], or a given default value if index is out of range.

def list_get(L, index, v=None):
    if -len(L) <= index < len(L):
        return L[index]
    else:
        return v
# or

def list_get(L, index, v=None):
    try:
        return L[index]
    except IndexError:
        return v

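The first function is usually faster than the second: raising and catching IndexError is expensive, so the try/except version only pays off when nearly every index passed in is valid.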

4. Looping over the items of a sequence together with their indexes

# recommended
for index, item in enumerate(seq):
    process(index)
    process(item)
# rather than
for i in range(len(seq)):
    process(seq[i])

5. Flattening a nested sequence

# recursive version
def list_or_tuple(seq):
    return True if isinstance(seq,(list,tuple)) else False

def nonstring_iterable(seq):
    try:
        iter(seq)
    except TypeError:
        return False
    else:
        return not isinstance(seq,basestring)

def flatten(seq, to_expand=list_or_tuple):
    for item in seq:
        if to_expand(item):
            for subitem in flatten(item, to_expand):  # pass the caller's predicate down
                yield subitem
        else:
            yield item

# non-recursive version
def flatten(seq, to_expand=list_or_tuple):
    iterators = [iter(seq)]  # a stack of iterators; each one remembers its own position
    while iterators:
        for item in iterators[-1]:
            if to_expand(item):
                iterators.append(iter(item))
                break
            else:
                yield item
        else:
            iterators.pop()
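
In Python 3 the recursive version can be shortened with `yield from`. This is only a sketch of the same idea (the name `flatten_py3` is made up); strings are left intact because the default predicate expands only lists and tuples:

```python
def flatten_py3(seq, to_expand=lambda x: isinstance(x, (list, tuple))):
    """Yield the leaves of a nested sequence, depth first."""
    for item in seq:
        if to_expand(item):
            yield from flatten_py3(item, to_expand)  # delegate to the sub-generator
        else:
            yield item

print(list(flatten_py3([1, [2, (3, 4)], [[5]], 'ab'])))  # [1, 2, 3, 4, 5, 'ab']
```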

6. Transposing a two-dimensional list (columns become rows)

# list comprehension
[[row[col] for row in matrix] for col in range(len(matrix[0]))]
# map and zip
map(list, zip(*matrix))
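
Both forms give the same result. In Python 3, map returns an iterator, so the zip form needs an explicit list; a small check with a made-up 2x3 matrix:

```python
matrix = [[1, 2, 3], [4, 5, 6]]

by_comprehension = [[row[col] for row in matrix] for col in range(len(matrix[0]))]
by_zip = [list(column) for column in zip(*matrix)]  # zip(*matrix) yields the columns

print(by_comprehension)  # [[1, 4], [2, 5], [3, 6]]
print(by_zip)            # [[1, 4], [2, 5], [3, 6]]
```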

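7. Dictionary methods

get(key, val=None)
Returns D[key] if key is present; otherwise returns val.
setdefault(key, val)
Returns D[key] if key is present; otherwise sets D[key] = val and then returns D[key].
Creating a new dictionary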
- dict(**kwargs)
- dict(zip(key_seq, val_seq))
- dict.fromkeys(S[,v]) -> New dict with keys from S and values equal to v. v defaults to None.
update(…)
D.update([E, ]**F) -> None. Update D from dict/iterable E and F.
If E present and has a .keys() method, does: for k in E: D[k] = E[k]
If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]
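
A quick demonstration of the difference between get and setdefault, plus fromkeys, in Python 3 syntax (the keys and values are arbitrary):

```python
d = {'a': 1}

missing = d.get('b', 0)  # get never modifies the dictionary
print(missing)           # 0
print(d)                 # {'a': 1}

d.setdefault('b', []).append(2)  # setdefault stores the default, then returns it
print(d)                         # {'a': 1, 'b': [2]}

zeros = dict.fromkeys('xy', 0)
print(zeros)  # {'x': 0, 'y': 0}
```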

8. Building a dictionary from a list of alternating keys and values

# first method
dict(zip(keys_vals[::2], keys_vals[1::2]))
# second method
def pairwise(keys_vals):
    next = iter(keys_vals).next
    while True:
        yield next(), next()
def dictFromSeq(seq):
    return dict(pairwise(seq))
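
The generator above relies on Python 2 behaviour: since Python 3.7, a StopIteration that escapes inside a generator is turned into a RuntimeError, and iterators no longer have a .next method. One possible Python 3 rewrite zips an iterator with itself (the name `pairwise_py3` is made up):

```python
def pairwise_py3(iterable):
    """Yield consecutive non-overlapping pairs: (s0, s1), (s2, s3), ..."""
    it = iter(iterable)
    return zip(it, it)  # both arguments advance the same iterator

pairs = dict(pairwise_py3(['a', 1, 'b', 2, 'c', 3]))
print(pairs)  # {'a': 1, 'b': 2, 'c': 3}
```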

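Searching and sorting

A common sorting idiom in Python is DSU (decorate-sort-undecorate)

1. Sorting a dictionary

key_vals = [(k, v) for k, v in adict.items()]  # decorate
key_vals.sort()  # sort

2. The cmp built-in function

cmp can compare two single values (int, float) as well as sequences such as lists and tuples. For sequences, the comparison works roughly like this:

i = 0
while i < len(a) and i < len(b):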
    res = cmp(a[i], b[i])
    if res:
        return res
    i += 1
return cmp(len(a), len(b))
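
cmp was removed in Python 3, but the ordinary comparison operators still compare sequences element by element in the same way. A common stand-in, shown here as a sketch, derives the -1/0/1 result from those operators:

```python
def cmp(a, b):
    """Python 3 stand-in for the removed built-in: returns -1, 0 or 1."""
    return (a > b) - (a < b)

print(cmp(1, 2))               # -1
print(cmp([1, 2, 3], [1, 2]))  # 1  (equal prefix, so the longer list is greater)
print(cmp((1, 5), (2, 0)))     # -1 (decided by the first differing element)
print(cmp('a', 'a'))           # 0
```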

3. The heapq module

Priority queue
heapq.heappop()
heapq.heappush()
Queue.PriorityQueue can be used instead

4. Getting the n smallest items of a list

The simple, brute-force way
```
alist.sort()
alist[:n]
```

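Using the heapq module

If n is known in advance, use nsmallest directly
heapq.nsmallest(n, alist)
If it is not known in advance:

def isorted(data):
    data = list(data)  # converts data to a list and also makes a copy, so the caller's data is untouched
    heapq.heapify(data)  # turn data into a heap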
    while data:
        yield heapq.heappop(data)
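
Because isorted is a generator, the caller can stop as soon as enough items have been seen, without sorting the rest. A usage sketch with made-up data, using itertools.takewhile to stop at a threshold chosen only for illustration:

```python
import heapq
import itertools

def isorted(data):
    """Yield the items of data in ascending order, lazily."""
    heap = list(data)    # copy, so the caller's data is untouched
    heapq.heapify(heap)
    while heap:
        yield heapq.heappop(heap)

values = [5, 1, 4, 2, 3]
print(heapq.nsmallest(2, values))  # [1, 2]

# Take items in order until the first one that is not below 4.
small = list(itertools.takewhile(lambda x: x < 4, isorted(values)))
print(small)   # [1, 2, 3]
print(values)  # [5, 1, 4, 2, 3]
```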

5. Binary search

The bisect module
bisect.bisect(a, x[, lo[, hi]]) -> index

Return the index where to insert item x in list a, assuming a is sorted.
The return value i is such that all e in a[:i] have e <= x, and all e in a[i:] have e > x. So if x already appears in the list, i points just beyond the rightmost x already there.
See the Python documentation for details.
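
A short example of the behaviour described above, using a small sorted list invented for the purpose; bisect_left is the counterpart that points at the leftmost matching position:

```python
import bisect

a = [1, 2, 4, 4, 7]
right = bisect.bisect(a, 4)      # just past the rightmost 4
left = bisect.bisect_left(a, 4)  # position of the leftmost 4
print(right)  # 4
print(left)   # 2

bisect.insort(a, 5)  # insert while keeping the list sorted
print(a)  # [1, 2, 4, 4, 5, 7]
```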

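6. Getting the n-th smallest item of a list

When the list is short and its items are cheap to compare (e.g. int, float, string), sorting first and then indexing works well

data.sort()
data[n]

When the list is very long and comparing items is expensive, sorting first and then indexing is inefficient.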

import random

def select(data, n):
    data = list(data)
    while True:
        pivot = random.choice(data)
        pivot_count = 0
        under = []  # reset on every pass, so items from earlier passes do not leak in
        over = []
        uappend = under.append  # optimization: look up the bound methods once per pass
        oappend = over.append
        for item in data:
            if item < pivot:
                uappend(item)
            elif item > pivot:
                oappend(item)
            else:
                pivot_count += 1
        if n < len(under):
            data = under
        elif n < len(under) + pivot_count:
            return pivot
        else:
            data = over
            n -= len(under) + pivot_count

7. Finding every occurrence of a substring

def finditer(text, pattern):
    pos = -1
    while True:
        pos = text.find(pattern, pos + 1)
        if pos < 0: break
        yield pos
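
Because the search restarts at pos + 1 rather than after the end of the match, overlapping occurrences are reported too. A runnable copy with a small made-up input:

```python
def finditer(text, pattern):
    """Yield the start index of every (possibly overlapping) match."""
    pos = -1
    while True:
        pos = text.find(pattern, pos + 1)  # resume one character after the last hit
        if pos < 0:
            break
        yield pos

print(list(finditer('abababa', 'aba')))  # [0, 2, 4]
```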
