Python Craftsman, Part 1

Latest recommended article published on 2022-03-23 11:25:00

苟修今

Article link: https://blog.csdn.net/deskyaki/article/details/117308185

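Preface: I'm getting more comfortable at work these days. The early panic of not knowing what to do is gone, replaced by a baffled feeling about how to do it.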

I stumbled on the GitHub project one-python-craftsman (Python 工匠), and some of its points really opened my eyes. Awesome stuff.

Original: https://github.com/piglei/one-python-craftsman

Watch for duplicated code across branches

before

# For a new user, create a new profile; otherwise update the existing one
if user.no_profile_exists:
    create_user_profile(
        username=user.username,
        email=user.email,
        age=user.age,
        address=user.address,
        # For a newly created user, set the points to 0
        points=0,
        created=now(),
    )
else:
    update_user_profile(
        username=user.username,
        email=user.email,
        age=user.age,
        address=user.address,
        updated=now(),
    )

after

if user.no_profile_exists:
    profile_func = create_user_profile
    extra_args = {'points': 0, 'created': now()}
else:
    profile_func = update_user_profile
    extra_args = {'updated': now()}

profile_func(
    username=user.username,
    email=user.email,
    age=user.age,
    address=user.address,
    **extra_args
)

Use De Morgan's laws

before

# If the user is not logged in or is not using chrome, refuse service
if not user.has_logged_in or not user.is_from_chrome:
    return "our service is only available for chrome logged in user"

This is where De Morgan's laws come in. Put simply, not A or not B is equivalent to not (A and B). With this transformation, the code above can be rewritten as follows:

after

if not (user.has_logged_in and user.is_from_chrome):
    return "our service is only available for chrome logged in user"
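The equivalence is easy to check by brute force over every combination of truth values. The sketch below does that for both of De Morgan's laws; is_rejected is a hypothetical helper that mirrors the condition above and is not part of the original project.

```python
from itertools import product

# Check both of De Morgan's laws for every combination of truth values.
for a, b in product([True, False], repeat=2):
    assert (not a or not b) == (not (a and b))
    assert (not a and not b) == (not (a or b))


def is_rejected(has_logged_in, is_from_chrome):
    """Hypothetical helper mirroring the condition in the example above."""
    return not (has_logged_in and is_from_chrome)
```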

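Use all() / any() in conditionals

all() and any() are well suited to conditionals. Each takes an iterable and returns a boolean:

all(seq): returns True only if every item in seq is truthy, otherwise False
any(seq): returns True as soon as any item in seq is truthy, otherwise False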
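A small sketch of both functions on made-up data. Note the empty-iterable edge case: all() on an empty iterable is True, which is why the rewritten function further down also checks bool(numbers).

```python
numbers = [12, 30, 7]

all_big = all(n > 10 for n in numbers)   # False: 7 is not greater than 10
any_big = any(n > 10 for n in numbers)   # True: 12 is greater than 10

# Edge case: with nothing to check, all() is True and any() is False.
all_on_empty = all([])
any_on_empty = any([])
```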

before

def all_numbers_gt_10(numbers):
    """Return True only if every number in the sequence is greater than 10
    """
    if not numbers:
        return False

    for n in numbers:
        if n <= 10:
            return False
    return True

after

I had never used bool() directly like this before...

def all_numbers_gt_10_2(numbers):
    return bool(numbers) and all(n > 10 for n in numbers)

Use the else branch of try/while/for

before

def do_stuff():
    first_thing_successed = False
    try:
        do_the_first_thing()
        first_thing_successed = True
    except Exception as e:
        print("Error while calling do_some_thing")
        return

    # Do the second thing only if first_thing completed successfully
    if first_thing_successed:
        return do_the_second_thing()

after

def do_stuff():
    try:
        do_the_first_thing()
    except Exception as e:
        print("Error while calling do_some_thing")
        return
    else:
        return do_the_second_thing()

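Learned something new!!

Once an else branch is added to the end of a try statement, do_the_second_thing() under it runs only after every statement under try has completed normally (that is, no exception, and no return, break, or similar).

Similarly, for/while loops in Python also support an else branch: the code under else runs only when the iterable is exhausted normally, or when the while condition becomes False, that is, when the loop was not exited by break.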
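The loop variant can be sketched as follows. find_first_negative is a hypothetical helper written for illustration and is not taken from the original project.

```python
def find_first_negative(numbers):
    """Hypothetical helper: return the first negative number, or None."""
    for n in numbers:
        if n < 0:
            found = n
            break
    else:
        # Runs only when the loop finished without hitting `break`.
        found = None
    return found
```

An empty list also reaches the else branch, because the loop ends without a break.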

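Write fewer numeric literals

In short, define the numbers as named members of a class (an enum).

Printing a member may show its name rather than the bare number, depending on the Python version, but comparisons against plain integers still work.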

# -*- coding: utf-8 -*-
from enum import IntEnum

class TripSource(IntEnum):
    FROM_WEBSITE = 11
    FROM_IOS_CLIENT = 12


def mark_trip_as_featured(trip):
    if trip.source == TripSource.FROM_WEBSITE:
        do_some_thing(trip)
    elif trip.source == TripSource.FROM_IOS_CLIENT:
        do_some_other_thing(trip)
    # ... ...
    return
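A short sketch of how IntEnum members behave next to plain integers. The raw value 11 stands in for something read from storage and is an assumption made for illustration. The text printed for a member differs between Python versions, so the sketch relies only on comparison and lookup.

```python
from enum import IntEnum


class TripSource(IntEnum):
    FROM_WEBSITE = 11
    FROM_IOS_CLIENT = 12


# A raw value, e.g. read from a database column, still compares equal.
raw_source = 11
is_website = raw_source == TripSource.FROM_WEBSITE

# Converting a raw value back gives the named member.
member = TripSource(12)
member_name = member.name
```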

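Use "infinity": float("inf")

float("inf") and float("-inf") correspond to positive and negative infinity in mathematics. Compared with any finite number, they satisfy: float("-inf") < any number < float("inf").

Because of this property, they come in handy in some situations: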

# A. Sort by age in ascending order, putting users without an age last
>>> users = {"tom": 19, "jenny": 13, "jack": None, "andrew": 43}
>>> sorted(users.keys(), key=lambda user: users.get(user) or float('inf'))
['jenny', 'tom', 'andrew', 'jack']

# B. Use as the initial value of a loop to simplify the first comparison
>>> max_num = float('-inf')
>>> # Find the largest number in the list
>>> for i in [23, 71, 3, 21, 8]:
...     if i > max_num:
...         max_num = i
...
>>> max_num
71
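The same trick works for a running minimum. A minimal sketch, assuming a hypothetical closest_to helper that is not part of the original article:

```python
def closest_to(target, candidates):
    """Hypothetical helper: return the candidate nearest to target, or None."""
    best, best_distance = None, float("inf")
    for value in candidates:
        distance = abs(value - target)
        # The first candidate always beats the infinite starting distance.
        if distance < best_distance:
            best, best_distance = value, distance
    return best
```

Starting from float("inf") removes the need for a special case on the first iteration.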

Use tuples to improve branching code

before

import time


def from_now(ts):
    """Take a past timestamp and return a text description of how long ago it was
    """
    now = time.time()
    seconds_delta = int(now - ts)
    if seconds_delta < 1:
        return "less than 1 second ago"
    elif seconds_delta < 60:
        return "{} seconds ago".format(seconds_delta)
    elif seconds_delta < 3600:
        return "{} minutes ago".format(seconds_delta // 60)
    elif seconds_delta < 3600 * 24:
        return "{} hours ago".format(seconds_delta // 3600)
    else:
        return "{} days ago".format(seconds_delta // (3600 * 24))


now = time.time()
print(from_now(now))
print(from_now(now - 24))
print(from_now(now - 600))
print(from_now(now - 7500))
print(from_now(now - 87500))
# OUTPUT:
# less than 1 second ago
# 24 seconds ago
# 10 minutes ago
# 2 hours ago
# 1 days ago

after

import bisect
import time


# BREAKPOINTS must already be sorted, otherwise binary search will not work
BREAKPOINTS = (1, 60, 3600, 3600 * 24)
TMPLS = (
    # unit, template
    (1, "less than 1 second ago"),
    (1, "{units} seconds ago"),
    (60, "{units} minutes ago"),
    (3600, "{units} hours ago"),
    (3600 * 24, "{units} days ago"),
)


def from_now(ts):
    """Take a past timestamp and return a text description of how long ago it was
    """
    seconds_delta = int(time.time() - ts)
    unit, tmpl = TMPLS[bisect.bisect(BREAKPOINTS, seconds_delta)]
    return tmpl.format(units=seconds_delta // unit)

Use partial to build new functions

In this example, for instance, the double function does its calculation entirely through multiply:

def multiply(x, y):
    return x * y


def double(value):
    # Return the result of calling another function
    return multiply(2, value)

import functools

double = functools.partial(multiply, 2)
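A quick sketch of what the resulting partial object looks like in use. The variable names result, frozen_args, and wrapped are introduced here for illustration.

```python
import functools


def multiply(x, y):
    return x * y


double = functools.partial(multiply, 2)

# The partial object remembers the wrapped function and the frozen arguments.
result = double(21)
frozen_args = double.args
wrapped = double.func
```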

Use functions that wrap the iterable to optimize loops

before

def find_twelve(num_list1, num_list2, num_list3):
    """From 3 lists of numbers, find 3 numbers whose sum is 12, if any exist
    """
    for num1 in num_list1:
        for num2 in num_list2:
            for num3 in num_list3:
                if num1 + num2 + num3 == 12:
                    return num1, num2, num3

after

from itertools import product


def find_twelve_v2(num_list1, num_list2, num_list3):
    for num1, num2, num3 in product(num_list1, num_list2, num_list3):
        if num1 + num2 + num3 == 12:
            return num1, num2, num3

Use islice to process every other line in a loop

Suppose a data file of Reddit post titles looks like this:

python-guide: Python best practices guidebook, written for humans.

---

Python 2 Death Clock --- Run any Python Script with an Alexa Voice Command

---

<... ...>

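Perhaps for looks, the file has a "---" separator between every two titles. We now need to collect all the titles in the file, so while iterating over its contents we must skip these meaningless separators.

Building on what we know about enumerate(), we can do this by adding an if check on the current loop index inside the loop: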

def parse_titles(filename):
    """Read reddit post titles from a data file with separators on alternating lines
    """
    with open(filename, 'r') as fp:
        for i, line in enumerate(fp):
            # Skip the meaningless '---' separators
            if i % 2 == 0:
                yield line.strip()

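But for this kind of every-other-line processing, wrapping the iterated object with islice() from itertools makes the loop body simpler and more direct.

islice(seq, start, stop, step) takes almost exactly the same parameters as list slicing (list[start:stop:step]). To process every other line inside a loop, just set the step parameter to 2 (the default is 1).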

from itertools import islice

def parse_titles_v2(filename):
    with open(filename, 'r') as fp:
        # Set step=2 to skip the meaningless '---' separators
        for line in islice(fp, 0, None, 2):
            yield line.strip()
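The same call can be tried without a real file. The sketch below uses io.StringIO as an in-memory stand-in, and the titles in it are made up for illustration.

```python
import io
from itertools import islice

# An in-memory stand-in for the data file (hypothetical content).
fake_file = io.StringIO("title one\n---\ntitle two\n---\ntitle three\n")

# start=0, stop=None (no upper bound), step=2: keep lines 0, 2, 4, ...
titles = [line.strip() for line in islice(fake_file, 0, None, 2)]
```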

Use takewhile instead of break

before

for user in users:
    # Once the first unqualified user appears, stop processing the rest
    if not is_qualified(user):
        break

    # Do the processing ... ...

after

from itertools import takewhile

for user in takewhile(is_qualified, users):
    # Do the processing ... ...
    pass
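A runnable sketch of the same pattern on made-up data. Here is_qualified is a hypothetical rule over scores, standing in for the user check above.

```python
from itertools import takewhile


def is_qualified(score):
    """Hypothetical rule: a score qualifies when it is at least 60."""
    return score >= 60


scores = [90, 75, 40, 88]

# Iteration stops at 40; the later 88 is never yielded.
processed = list(takewhile(is_qualified, scores))
```

Like the break version, takewhile stops at the first failure and does not filter the rest of the sequence.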

Try implementing decorators with classes

A decorator must be a "callable" object.

import time
import functools


class DelayFunc:
    def __init__(self,  duration, func):
        self.duration = duration
        self.func = func

    def __call__(self, *args, **kwargs):
        print(f'Wait for {self.duration} seconds...')
        time.sleep(self.duration)
        return self.func(*args, **kwargs)

    def eager_call(self, *args, **kwargs):
        print('Call without delay')
        return self.func(*args, **kwargs)


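def delay(duration):
    """Decorator: delay a function's execution. Also provides an .eager_call method to run it immediately
    """
    # To avoid defining an extra function, use functools.partial to help build
    # the DelayFunc instance
    return functools.partial(DelayFunc, duration)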

@delay(duration=2)
def add(a, b):
    return a + b


# This call is delayed by 2 seconds
add(1, 2)
# This call runs immediately
add.eager_call(1, 2)
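Since any object with a __call__ method is callable, a class can also be used as a decorator without arguments. A minimal sketch, assuming a hypothetical CountCalls class that is not part of the original project:

```python
import functools


class CountCalls:
    """Hypothetical class-based decorator that counts calls to a function."""

    def __init__(self, func):
        # Copy __name__, __doc__, etc. from the wrapped function.
        functools.update_wrapper(self, func)
        self.func = func
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.func(*args, **kwargs)


@CountCalls
def add(a, b):
    return a + b


add(1, 2)
total = add(3, 4)
```

Keeping state such as calls on the instance is one reason to prefer a class over nested functions here.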

Use the pathlib module

before

import os
import os.path


def unify_ext_with_os_path(path):
    """Rename every .txt file in the directory to use the .csv suffix
    """
    for filename in os.listdir(path):
        basename, ext = os.path.splitext(filename)
        if ext == '.txt':
            abs_filepath = os.path.join(path, filename)
            os.rename(abs_filepath, os.path.join(path, f'{basename}.csv'))

after

from pathlib import Path

def unify_ext_with_pathlib(path):
    for fpath in Path(path).glob('*.txt'):
        fpath.rename(fpath.with_suffix('.csv'))

Reading files

def chunked_file_reader(fp, block_size=1024 * 8):
    """Generator function: read file content in chunks
    """
    while True:
        chunk = fp.read(block_size)
        # When the file has no more content, read returns the empty string ''
        if not chunk:
            break
        yield chunk


def count_nine_v3(fname):
    count = 0
    with open(fname) as fp:
        for chunk in chunked_file_reader(fp):
            count += chunk.count('9')
    return count

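from functools import partial


def chunked_file_reader(file, block_size=1024 * 8):
    """Generator function: read file content in chunks, using the iter function
    """
    # First use partial(file.read, block_size) to build a new zero-argument function
    # The loop keeps yielding results of file.read(block_size) until one is '', then stops
    for chunk in iter(partial(file.read, block_size), ''):
        yield chunk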
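The two-argument form iter(callable, sentinel) can be tried on an in-memory file. The sketch below assumes text mode, where read() returns '' at the end; a file opened in binary mode would need b'' as the sentinel. The sample content and the block size of 4 are made up for illustration.

```python
import io
from functools import partial

# An in-memory text file stands in for a real file object.
fake_file = io.StringIO("1929394959")

# iter(callable, sentinel) calls the callable until it returns the sentinel.
chunks = list(iter(partial(fake_file.read, 4), ""))
count = sum(chunk.count("9") for chunk in chunks)
```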