Some Common Python Tricks

天青如水

Category: python. Tags: python, tricks

Article link: https://blog.csdn.net/qq_16829085/article/details/101569526

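Checking for duplicate elements
Convert the list to a set with set(), which drops duplicates, and compare the lengths. Equal lengths mean every element is unique.

def all_unique(lt):
    return len(lt) == len(set(lt))

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 7, 6, 5]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9]
all_unique(a)  # False
all_unique(b)  # True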
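
When the duplicates need to be removed rather than just detected, a plain set loses the original order. The following is a minimal sketch, assuming the elements are hashable; `unique_in_order` is a hypothetical helper name that does not appear in the original article.

```python
def unique_in_order(items):
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(items))

print(unique_in_order([1, 2, 3, 2, 1, 4]))  # [1, 2, 3, 4]
```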

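Checking character composition
Check whether two strings are made up of the same characters with the same counts, that is, whether they are anagrams.

from collections import Counter

def anagram(first, second):
    return Counter(first) == Counter(second)

anagram("abc3def", "def3acb")  # True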

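Memory usage
Check how much memory, in bytes, the object bound to variable occupies.

import sys

variable = 30
print(sys.getsizeof(variable))  # 28 on 64-bit CPython 3; the value depends on the interpreter build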
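
Note that sys.getsizeof reports only the object itself, not the objects it refers to. The sketch below shows the difference for a list; the names `numbers`, `shallow` and `deep` are illustrative, and the "deep" figure is only a rough estimate for a flat list of integers.

```python
import sys

numbers = list(range(1000))
shallow = sys.getsizeof(numbers)  # the list object only
deep = shallow + sum(sys.getsizeof(n) for n in numbers)  # plus each int it references
print(shallow, deep)
```

The exact numbers vary between Python versions and platforms, so no expected output is given.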

Byte size
Check how many bytes a string occupies when encoded as UTF-8.

def byte_size(string):
    return len(string.encode('utf-8'))

byte_size('hello world')  # 11
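
The byte count only differs from len() once the string contains non-ASCII characters. A short sketch, repeating the function so the block runs on its own:

```python
def byte_size(string):
    return len(string.encode('utf-8'))

print(len('你好'), byte_size('你好'))  # 2 6
print(len('😀'), byte_size('😀'))      # 1 4
```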

Printing a string N times
```
n = 2
s = "Programming"
print(s * n)  # ProgrammingProgramming
```

Capitalizing the first letter of each word

s = "hello world"
print(s.title())  # Hello World
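
title() capitalizes every word and treats an apostrophe as a word boundary, which is not always what is wanted. The sketch below compares it with two standard-library alternatives, str.capitalize() and string.capwords():

```python
import string

print("hello world".capitalize())         # Hello world
print("hello world".title())              # Hello World
print("they're here".title())             # They'Re Here
print(string.capwords("they're here"))    # They're Here
```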

Chunking
Define a function that splits a list into chunks of a given size.

from math import ceil

def chunk(lst, size):
    return list(
        map(lambda x: lst[x * size:x * size + size],
            range(ceil(len(lst) / size))))

chunk([1, 2, 3, 4, 5], 2)  # [[1, 2], [3, 4], [5]]
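
The same splitting can be written without math.ceil by stepping through the start indices. This is an alternative sketch, not the article's version; `chunk_by_step` is a hypothetical name.

```python
def chunk_by_step(lst, size):
    # Start indices are 0, size, 2*size, ...; the last slice may be shorter
    return [lst[i:i + size] for i in range(0, len(lst), size)]

print(chunk_by_step([1, 2, 3, 4, 5], 2))  # [[1, 2], [3, 4], [5]]
```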

Compacting
Remove falsy values (False, None, 0 and the empty string '') from a list.

def compact(lst):
    return list(filter(bool, lst))

compact([0, 1, False, 2, '', 3, 'a', 's', 34])
# [1, 2, 3, 'a', 's', 34]

Unzipping
Transpose a list of pairs into two separate tuples.

array = [['a', 'b'], ['c', 'd'], ['e', 'f']]
transposed = list(zip(*array))  # zip() returns a lazy iterator in Python 3
print(transposed)
# [('a', 'c', 'e'), ('b', 'd', 'f')]

Chained comparison
Compare several operands with different operators in a single expression.
```
a = 3
print(2 < a < 8)  # True
print(1 == a < 2)  # False
```
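
Joining with commas
Join the elements of a list into a single string, separated by commas.

```python
hobbies = ["basketball", "football", "swimming"]
print("My hobbies are: " + ", ".join(hobbies))
# My hobbies are: basketball, football, swimming
```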
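
str.join() accepts only strings, so a list with other types has to be converted first. A minimal sketch with made-up sample data:

```python
values = [1, 2.5, "three", None]
# Convert each element to str before joining; join() raises TypeError otherwise
joined = ", ".join(str(v) for v in values)
print(joined)  # 1, 2.5, three, None
```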

Counting vowels
Count the vowels ('a', 'e', 'i', 'o', 'u') in a string using a regular expression.

```python
import re

def count_vowels(text):
    return len(re.findall(r'[aeiou]', text, re.IGNORECASE))

count_vowels('foobar')  # 3
count_vowels('gym')  # 0
```

Decapitalizing
Convert the first character of a string to lowercase.

def decapitalize(string):
    return string[:1].lower() + string[1:]

decapitalize('FooBar')  # 'fooBar'

Flattening a list
Recursively flatten a nested list into a single flat list.

def spread(arg):
    # Flatten one level of nesting
    ret = []
    for i in arg:
        if isinstance(i, list):
            ret.extend(i)
        else:
            ret.append(i)
    return ret

def deep_flatten(lst):
    # Flatten each nested list first, then merge one level with spread()
    return spread([deep_flatten(x) if isinstance(x, list) else x for x in lst])

deep_flatten([1, [2], [[3], 4], 5])  # [1, 2, 3, 4, 5]

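List difference
Return the elements of the first list that are not in the second. To also get the elements unique to the second list, add set_b.difference(set_a). Because the result passes through a set, duplicates are dropped and the order is not guaranteed.

def difference(a, b):
    set_a = set(a)
    set_b = set(b)
    comparison = set_a.difference(set_b)
    return list(comparison)

difference([1, 2, 3], [1, 2, 4])  # [3]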
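
The two-way variant mentioned in the description can be sketched as follows. `difference_both` is a hypothetical name, and the results are sorted only to make the output deterministic.

```python
def difference_both(a, b):
    set_a, set_b = set(a), set(b)
    only_a = set_a.difference(set_b)  # elements unique to the first list
    only_b = set_b.difference(set_a)  # elements unique to the second list
    return sorted(only_a), sorted(only_b)

print(difference_both([1, 2, 3], [1, 2, 4]))      # ([3], [4])
# The ^ operator gives the symmetric difference in one step
print(sorted(set([1, 2, 3]) ^ set([1, 2, 4])))    # [3, 4]
```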

Difference by function
Apply the given function to every element of both lists, then return the elements of the first list whose result does not appear among the results for the second.

def difference_by(a, b, fn):
    mapped_b = set(map(fn, b))
    return [item for item in a if fn(item) not in mapped_b]

from math import floor
difference_by([2.1, 1.2], [2.3, 3.4], floor)  # [1.2]
difference_by([{'x': 2}, {'x': 1}], [{'x': 1}], lambda v: v['x'])
# [{'x': 2}]

Chained function calls
Choose between several functions and call the chosen one in a single line.

def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

a, b = 4, 5
print((subtract if a > b else add)(a, b))  # 9

Merging dictionaries

def merge_dictionaries(a, b):
    return {**a, **b}

a = {'x': 1, 'y': 2}
b = {'y': 3, 'z': 4}
print(merge_dictionaries(a, b))
# {'x': 1, 'y': 3, 'z': 4}
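
On Python 3.9 and later, the | operator performs the same merge. The sketch below assumes such an interpreter; on older versions it raises a TypeError.

```python
a = {'x': 1, 'y': 2}
b = {'y': 3, 'z': 4}
merged = a | b  # requires Python 3.9+; values from b win on duplicate keys
print(merged)   # {'x': 1, 'y': 3, 'z': 4}
print(a)        # {'x': 1, 'y': 2}  (the inputs are left unchanged)
```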

Converting two lists into a dictionary

def to_dictionary(keys, values):
    return dict(zip(keys, values))

keys = ["a", "b", "c"]
values = [2, 3, 4]
print(to_dictionary(keys, values))
# {'a': 2, 'b': 3, 'c': 4}

Execution time

import time

start_time = time.time()
a = 1
b = 2
c = a + b
print(c)
end_time = time.time()
total_time = end_time - start_time
print("Time: ", total_time)
# Time:  0.00049591064453125 (the value varies between runs)
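
For short intervals, time.perf_counter() is usually a better clock than time.time() because it is monotonic and has higher resolution. The helper below is a minimal sketch; `timed` is a hypothetical name, not something from the original article.

```python
import time

def timed(fn, *args):
    # Return the function's result together with the elapsed seconds
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

result, elapsed = timed(sum, range(1_000_000))
print(result, f"{elapsed:.6f} s")
```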

Most frequent element

def most_frequent(lst):
    return max(set(lst), key=lst.count)

numbers = [1, 2, 1, 2, 3, 2, 1, 4, 2]
most_frequent(numbers)  # 2
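
Calling count() once per distinct element rescans the list each time. collections.Counter counts everything in a single pass; the sketch below is an alternative, with `most_frequent_counter` as a hypothetical name.

```python
from collections import Counter

def most_frequent_counter(items):
    # most_common(1) returns a list holding one (element, count) pair
    return Counter(items).most_common(1)[0][0]

print(most_frequent_counter([1, 2, 1, 2, 3, 2, 1, 4, 2]))  # 2
```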

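Palindrome
Convert the string to lowercase and strip every non-alphanumeric character, then compare the result with its reverse. If the two are equal, the string is a palindrome.
```
from re import sub

def palindrome(string):
    s = sub(r'[\W_]', '', string.lower())
    return s == s[::-1]

palindrome('tacocat')  # True
```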

A calculator without if-else
Store the operator functions in a dictionary and look them up by symbol.

import operator

action = {
    "+": operator.add,
    "-": operator.sub,
    "/": operator.truediv,
    "*": operator.mul,
    "**": pow
}
print(action['-'](50, 25))  # 25

Shuffle
Shuffle the elements of a list. The function applies the Fisher-Yates algorithm to a copy, so the original list is left unchanged.

from copy import deepcopy
from random import randint

def shuffle(lst):
    temp_lst = deepcopy(lst)
    m = len(temp_lst)
    while m:
        m -= 1
        i = randint(0, m)  # randint includes both endpoints
        temp_lst[m], temp_lst[i] = temp_lst[i], temp_lst[m]
    return temp_lst

foo = [1, 2, 3]
shuffle(foo)  # e.g. [2, 3, 1]; foo is still [1, 2, 3]
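
The standard library already covers both cases: random.sample() returns a shuffled copy and random.shuffle() works in place. A short sketch for comparison with the hand-written version (outputs are random, so none are shown):

```python
import random

foo = [1, 2, 3]
shuffled = random.sample(foo, k=len(foo))  # new list, foo is untouched
print(shuffled, foo)

random.shuffle(foo)  # shuffles foo in place and returns None
print(foo)
```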

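Dictionary default value
When looking up a value by key, pass a second argument to get() to supply a default for missing keys. If get() is called without a default and the key does not exist, it returns None.
```
d = {'a': 1, 'b': 2}
print(d.get('c', 3))  # 3
```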

Lazy evaluation
In some cases Python evaluates an expression only when its value is needed, for example the right-hand operand in False and xx or True or xx (short-circuit evaluation). This avoids unnecessary work and can improve performance.

from time import time

start = time()
abbreviations = ['cf.', 'e.g.', 'ex.', 'etc.', 'fig.', 'i.e.', 'Mr.', 'vs.']
for i in range(1000000):
    for w in ('Mr.', 'Hat', 'is', 'chasing', 'the', 'black', 'cat', '.'):
        # if w in abbreviations:  # 1.057518720626831 s
        if w[-1] == '.' and w in abbreviations:  # 0.9202585220336914 s
            pass
print("Elapsed time:", time() - start)
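
To see the short-circuit directly rather than through timing, the sketch below records every call to the expensive check. `expensive_check` and `calls` are illustrative names, not part of the original benchmark.

```python
calls = []

def expensive_check(word):
    calls.append(word)  # record every time the check actually runs
    return word in ('Mr.', 'e.g.', 'etc.')

words = ['Mr.', 'Hat', 'is', 'chasing', 'cat', '.']
# The cheap test on the left runs first; the right side runs only if it passes
matches = [w for w in words if w[-1] == '.' and expensive_check(w)]
print(matches)  # ['Mr.']
print(calls)    # ['Mr.', '.']
```

Only the two words ending in a period ever reach the second operand.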

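Precision issues
Because of how floating-point numbers are stored, not every decimal value can be represented exactly; some are only close approximations. When this matters, use the decimal module, or convert the values to integers, do the arithmetic, and convert back to floats.

# This loop would never end without the count guard
i = 1
count = 0
while i != 1.5:
    i += 0.1
    print(i)
    count += 1
    if count > 10:
        break

# Output
1.1
1.2000000000000002
1.3000000000000003
1.4000000000000004
1.5000000000000004
1.6000000000000005
1.7000000000000006
1.8000000000000007
1.9000000000000008
2.000000000000001
2.100000000000001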
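
The integer-conversion approach mentioned above can be sketched like this: count in tenths so each step is exact, and divide only when a float is needed. The second half shows math.isclose() as a tolerance-based comparison, which is an addition not covered in the original text.

```python
import math

# Work in tenths so every step is an exact integer
tenths = 10           # represents 1.0
steps = 0
while tenths != 15:   # represents 1.5
    tenths += 1
    steps += 1
print(steps, tenths / 10)  # 5 1.5

# When floats are unavoidable, compare with a tolerance instead of ==
total = 1.0
for _ in range(5):
    total += 0.1
print(total == 1.5, math.isclose(total, 1.5))  # False True
```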

Using the decimal module

from decimal import Decimal, getcontext

# Set the precision (number of significant digits)
d_context = getcontext()
d_context.prec = 6
print(d_context)

# Build Decimals from strings; Decimal(0.1) would carry over the float's binary error
i = Decimal('1.0')
while i != Decimal('1.5'):
    i += Decimal('0.1')
    print(i)

# Output
Context(prec=6, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[], traps=[InvalidOperation, DivisionByZero, Overflow])
1.1
1.2
1.3
1.4
1.5

Joining strings with join
When concatenating a large number of strings, join is more efficient. Building the result with + can allocate a new string on every iteration.

import timeit

# Build the list of strings used in the test
strlist = ["it is a long value string will not keep in memory" for n in range(100000)]
# 100000 is the number of strings to join; change it for each test run below

def join_test():
    return ''.join(strlist)

def plus_test():
    result = ''
    for v in strlist:
        result = result + v
    return result

if __name__ == '__main__':
    jointimer = timeit.Timer("join_test()", "from __main__ import join_test")
    print(jointimer.timeit(number=1000))  # 3.691069600004994

    plustimer = timeit.Timer("plus_test()", "from __main__ import plus_test")
    print(plustimer.timeit(number=1000))  # 848.6195682999969

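Do not use mutable objects as default arguments
In Python, a default argument value is evaluated only once, when the function is defined. Every call that omits the argument then reuses that same object.

def add(value, sequence=[]):
    sequence.append(value)
    return sequence

sequence1 = add(5)
print(sequence1)
sequence2 = add(6)
print(sequence2)

print(id(sequence1), id(sequence2))

# Output
[5]
[5, 6]
2101029152648 2101029152648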
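
The usual way around this is to default to None and create the list inside the function. A minimal sketch of that idiom, reusing the article's function name:

```python
def add(value, sequence=None):
    # Create a fresh list on each call when the caller passes none
    if sequence is None:
        sequence = []
    sequence.append(value)
    return sequence

sequence1 = add(5)
sequence2 = add(6)
print(sequence1, sequence2)     # [5] [6]
print(sequence1 is sequence2)   # False
```

Passing an existing list still works as before, for example add(7, sequence1) appends to sequence1.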