Python Notes Collection

hellowworld-404-qwq (pinned)

Last modified 2023-03-21 15:29:40

Category: Knowledge in Pieces. Tags: python, programming languages

First published 2023-02-08 19:37:18

Article link: https://blog.csdn.net/qq_41719597/article/details/128942169

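Python's learning curve is shaped like "__|": flat at first, then steep. I think the key to the second half lies in understanding Python itself (see "Fluent Python"), knowing the standard library, and building up experience across many kinds of scenarios. I have already written quite a few notes on Python, so I am collecting them here.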

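Study notes on the NumPy package
Machine learning, TensorFlow and Keras
Python's object system: a summary of the object and type built-ins
- Conceptual understanding
- Looking it up in the manual
The "standard" way to write alternative constructors in Python
Comparing the speed of code snippets with timeit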

Study notes on the NumPy package

Working through the NumPy tutorial on the official site, to get a firmer grasp of the library.

References:

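Ways to build an np.array

From a list or another np.array
By specifying a shape:
- zeros, ones, empty
- zeros_like, ones_like, empty_like
- full, filled with a given value
- random numbers, e.g. np.random.random
By sampling from a given range:
- np.arange builds an evenly stepped array
- linspace
- logspace
fromfunction, which computes each value from its indices
fromfile
Special arrays
- np.identity, the identity matrix
- np.eye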
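The constructors listed above can be tried side by side. This is a minimal sketch, assuming NumPy is imported as `np`; the variable names and sample values are only illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])      # from a nested list
z = np.zeros((2, 3))                       # given shape, filled with 0.0
o = np.ones_like(a)                        # same shape and dtype as `a`
f = np.full((2, 2), 7)                     # given shape, filled with 7
r = np.arange(0, 10, 2)                    # start, stop (exclusive), step
lin = np.linspace(0.0, 1.0, 5)             # 5 evenly spaced points, endpoints included
log = np.logspace(0, 2, 3)                 # 10**0, 10**1, 10**2
ff = np.fromfunction(lambda i, j: i + j, (2, 3))  # value computed from the indices
eye = np.eye(3)                            # 3x3 identity, same as np.identity(3)
```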

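Indexing arrays

Slice indexing: X[start:stop:step, next_slice, ...]
Ellipsis indexing: X[i, ...] takes all elements along the remaining axes
Integer-array indexing: X[[idx1, idx2], ...] takes all elements whose axis-0 index is idx1 or idx2
Boolean indexing: X[X > 0]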
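The four indexing styles can be compared on one small array. A sketch, with an arbitrary 2x3x4 array chosen only for illustration:

```python
import numpy as np

X = np.arange(24).reshape(2, 3, 4)

s = X[0, 0:3:2, :]     # slice: rows 0 and 2 of the first block
e = X[1, ...]          # Ellipsis: everything along the remaining axes
i = X[[0, 1], ...]     # integer array: blocks 0 and 1 along axis 0
b = X[X > 20]          # boolean mask: always returns a 1-D array
```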

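Array operations on np.array

Use `np.newaxis` to add a dimension to an array

`reshape` changes the shape of an array

`arr.view()` makes a shallow copy (a new array object that shares the same data); `np.copy` makes a deep copy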
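The difference between a view and a copy shows up as soon as the original array is modified. A small sketch with illustrative names:

```python
import numpy as np

a = np.arange(6)
col = a[:, np.newaxis]   # shape (6,) becomes (6, 1)
m = a.reshape(2, 3)      # same data, new shape
v = a.view()             # new array object, shared data
c = a.copy()             # independent data

a[0] = 100               # visible through the view, not through the copy
```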

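Joining arrays

hstack and vstack stack horizontally (column-wise) and vertically (row-wise)
column_stack stacks 1-D arrays as the columns of a 2-D array
stack differs from the above: it stacks along a new axis
concatenate joins along an existing axis and requires all other dimensions to match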
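The resulting shapes are the quickest way to tell these functions apart. A sketch using two small illustrative arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
A = np.zeros((2, 3))
B = np.ones((1, 3))

h = np.hstack((a, b))              # shape (6,)
v = np.vstack((a, b))              # shape (2, 3)
cs = np.column_stack((a, b))       # shape (3, 2): each input becomes a column
st = np.stack((A, A))              # shape (2, 2, 3): a new leading axis
cat = np.concatenate((A, B), axis=0)  # shape (3, 3): axis 1 must match
```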

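Splitting arrays

Indexing with a list of indices extracts a sub-array, which is the same thing take does
np.choose https://numpy.org/doc/stable/reference/generated/numpy.choose.html#numpy.choose
np.compress extracts slices according to a condition
The split family https://numpy.org/doc/stable/user/quickstart.html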
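A short sketch of the extraction functions named above, on illustrative data. `np.array_split` is shown next to `np.split` because it also accepts sizes that do not divide evenly.

```python
import numpy as np

a = np.arange(10, 16)                       # [10 11 12 13 14 15]

t = np.take(a, [0, 2])                      # same result as a[[0, 2]]
ch = np.choose([0, 1, 0], [[1, 2, 3], [10, 20, 30]])  # pick per position from the choices
cp = np.compress(a % 2 == 0, a)             # keep elements where the condition holds
parts = np.split(a, 3)                      # three equal parts
uneven = np.array_split(a, 4)               # parts may differ in size
```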

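`np.transpose` permutes the axes

For a 2-D array this is the matrix transpose
For a higher-dimensional array, think of it as exchanging index positions: with the first two axes swapped, the element at X[a, b, ...] ends up at [b, a, ...]

`np.swapaxes` is a simplified version of the above that exchanges just two axes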
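The index exchange can be checked directly. A sketch on an arbitrary 2x3x4 array:

```python
import numpy as np

X = np.arange(24).reshape(2, 3, 4)

T = X.T                          # default: reverse all axes, shape (4, 3, 2)
Y = np.transpose(X, (1, 0, 2))   # swap the first two axes, shape (3, 2, 4)
S = np.swapaxes(X, 0, 1)         # same result as Y

# The element at X[a, b, c] is found at Y[b, a, c].
a, b, c = 1, 2, 3
same = X[a, b, c] == Y[b, a, c]
```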

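Arithmetic and statistical functions

Basic arithmetic operations

See the Broadcasting rules: after broadcasting, an operation is equivalent to the same operation between arrays of identical shape.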

Unary functions

The axis rules allow array contents to be reduced flexibly.

all, any: logical functions
max, min, mean, sum, ptp, std, var: statistical functions
prod, arr.T
argmax, argmin, argsort
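A few of these reductions on a small illustrative matrix, with and without an axis argument:

```python
import numpy as np

A = np.array([[1, 5, 3],
              [4, 2, 6]])

col_sum = A.sum(axis=0)         # reduce over rows: one value per column
row_max = A.max(axis=1)         # reduce over columns: one value per row
row_arg = A.argmax(axis=1)      # position of each row's maximum
row_ptp = np.ptp(A, axis=1)     # max minus min per row
total = A.prod()                # no axis: reduce over everything
order = np.argsort(A[0])        # indices that would sort the first row
all_gt1 = (A > 1).all()
any_gt5 = (A > 5).any()
```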

Binary functions

dot, inner

Procedures

sort, flip, flatten

`np.nonzero` for querying an array
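A compact sketch of the binary functions, the procedures, and `np.nonzero`; the inputs are illustrative.

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
M = np.array([[3, 0],
              [0, 1]])

d = np.dot(u, v)                 # 1*4 + 2*5 + 3*6
inn = np.inner(u, v)             # same as dot for 1-D inputs
srt = np.sort(np.array([3, 1, 2]))
flp = np.flip(np.array([3, 1, 2]))
flat = M.flatten()               # always returns a copy
rows, cols = np.nonzero(M)       # one index array per axis
```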

Broadcasting rules

1. If all input arrays do not have the same number of dimensions, a "1" will be repeatedly prepended to the shapes of the smaller arrays until all the arrays have the same number of dimensions.
2. Arrays with a size of 1 along a particular dimension act as if they had the size of the array with the largest shape along that dimension. The value of the array element is assumed to be the same along that dimension for the "broadcast" array.
3. After applying rules 1 and 2, all arrays must have the same shape.
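The rules can be followed on a concrete pair of shapes. In this sketch a (3, 1) array meets a (4,) array: the second is first treated as (1, 4), then both are stretched to (3, 4). Shapes that still disagree after that raise an error.

```python
import numpy as np

a = np.arange(3).reshape(3, 1)   # shape (3, 1)
b = np.arange(4)                 # shape (4,), treated as (1, 4)

c = a + b                        # shape (3, 4), c[i, j] == i + j
a_b, b_b = np.broadcast_arrays(a, b)   # the stretched operands

try:
    np.ones(3) + np.ones(4)      # (3,) vs (4,): no size-1 dimension to stretch
    failed = False
except ValueError:
    failed = True
```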

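The axis parameter in NumPy functions

An axis is one of the mutually independent dimensions of an array's index space.

The axis parameter of a function, for example np.sum(A, axis=0), can be read in two ways:

import itertools
import numpy as np

A = np.arange(24).reshape(2, 3, 4)

# 1. Take the slices along axis 0 and add them up
s1 = sum(A[i, ...] for i in range(A.shape[0]))
# 2. Treat A as (n-1)-dimensional data whose elements are the vectors along axis 0, and reduce each one
s2 = np.array([
    sum(A[(slice(None), *index)])
    for index in itertools.product(*(range(len_i) for len_i in A.shape[1:]))
]).reshape(A.shape[1:])

assert np.array_equal(s1, s2) and np.array_equal(s1, np.sum(A, axis=0))

The first reading is the simpler one, though.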
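A practical consequence of either reading is that the reduced axis disappears from the result's shape. A short sketch, with an arbitrary 2x3x4 array:

```python
import numpy as np

A = np.arange(24).reshape(2, 3, 4)

s0 = A.sum(axis=0)                    # axis 0 is removed: shape (3, 4)
s1 = A.sum(axis=1)                    # axis 1 is removed: shape (2, 4)
s02 = A.sum(axis=(0, 2))              # several axes at once: shape (3,)
kept = A.sum(axis=0, keepdims=True)   # reduced axis kept with size 1: shape (1, 3, 4)
```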

Machine learning, TensorFlow and Keras

See "Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow".

Python's object system: a summary of the `object` and `type` built-ins

Objects are Python’s abstraction for data.

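Conceptual understanding

Every instance in Python has a type, and every instance is an instance of object.

# `whatever` stands for any object; a few samples are tried here
samples = [1, "abc", None, [1, 2], len, int, object()]

# 1. Every instance has a type: type() never fails on an existing object
for whatever in samples:
    type(whatever)

# 2. Every instance is an instance of object
for whatever in samples:
    assert isinstance(whatever, object)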

type itself also has a type, <class 'type'>, and type is an instance of object.

assert type(type) == type
assert isinstance(type, object)

This essentially says that <class 'type'> sits at the top of the type system, i.e.

assert isinstance(type(whatever), type)
assert isinstance(whatever.__class__, type)

For <class 'type'> itself this holds trivially. In addition, one can verify

assert type(object) == type
assert isinstance(object, type)

That is essentially the relationship between `type` and `object`.

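<class 'type'> is a metaclass. In code, "meta" can be read as code that produces code, so a metaclass is a class that produces classes. For example:
```
assert type("abc") == str   # an instance is produced by its class
assert type(str) == type    # the class itself is produced by the metaclass
```
I may add more here after studying Python metaprogramming.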

A class defined with class X has type <class 'type'> by default; a function defined with def foo has type <class 'function'>.
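These statements can be checked directly, and the three-argument form of `type` shows a metaclass producing a class at runtime. A sketch; `X`, `foo`, `Y` and `greet` are illustrative names.

```python
import types

class X:
    pass

def foo():
    pass

# type(name, bases, namespace) builds a class, just as a class statement does.
Y = type("Y", (X,), {"greet": lambda self: "hi"})

x_type = type(X)          # <class 'type'>
foo_type = type(foo)      # <class 'function'>
greeting = Y().greet()
```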

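Looking it up in the manual

class type(object): a constructor, essentially equivalent to returning object.__class__
isinstance(object, classinfo): checks how object.__class__ relates to classinfo. Note that it does **not** check against classinfo.__class__.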
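The distinction is easiest to see with a small class hierarchy. A sketch with hypothetical classes `Base` and `Child`:

```python
class Base:
    pass

class Child(Base):
    pass

c = Child()

exact = type(c) is c.__class__           # type() returns the object's class
is_child = type(c) is Child
via_base = isinstance(c, Base)           # isinstance follows inheritance
exact_base = type(c) is Base             # an exact type check does not
class_vs_base = isinstance(Child, Base)  # Child is a class, not a Base instance
class_vs_type = isinstance(Child, type)  # but it is an instance of type
any_of = isinstance(c, (int, Base))      # classinfo may be a tuple of classes
```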

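The "standard" way to write alternative constructors in Python

Use the @classmethod decorator, as follows:

import inspect

class X:
    @classmethod
    def from_dict(cls, dct):
        # Required parameters are read with dct[key]; optional ones fall back to their default
        return cls(
            **{
                key: (dct[key] if val.default is val.empty else dct.get(key, val.default))
                    for key, val in inspect.signature(cls).parameters.items()
            }
        )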
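To see the pattern in use, here is a sketch with a hypothetical `Point` class whose `__init__` has one required and one optional parameter. Keys that do not match a parameter are ignored, and a missing required key raises `KeyError`.

```python
import inspect

class Point:
    def __init__(self, x, y=0):
        self.x = x
        self.y = y

    @classmethod
    def from_dict(cls, dct):
        # inspect.signature(cls) reflects __init__ without `self`
        params = inspect.signature(cls).parameters
        return cls(**{
            key: (dct[key] if p.default is p.empty else dct.get(key, p.default))
            for key, p in params.items()
        })

p = Point.from_dict({"x": 1, "extra": "ignored"})

try:
    Point.from_dict({"y": 2})    # required key "x" is missing
    missing_raised = False
except KeyError:
    missing_raised = True
```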

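Comparing the speed of code snippets with timeit

See the timeit documentation and [a comparison of the various ways to flatten a list](https://stackoverflow.com/questions/952914/how-do-i-make-a-flat-list-out-of-a-list-of-lists). The basic approach is to run the code repeatedly and report the average and best timings.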

$ python -m timeit -s'import random; random.seed(42); l=[ [random.randint(1, 100) for _ in range(10)] for _ in range(100)]' '[item for sublist in l for item in sublist]'
10000 loops, best of 5: 21.7 usec per loop
$ python -m timeit -s'import random; random.seed(42); l=[ [random.randint(1, 100) for _ in range(10)] for _ in range(100)]' 'sum(l, [])'                                 
5000 loops, best of 5: 78.5 usec per loop
$ python -m timeit -s'import random; random.seed(42); l=[ [random.randint(1, 100) for _ in range(10)] for _ in range(100)]; from functools import reduce' 'reduce(lambda x,y: x+y, l)'
5000 loops, best of 5: 88.3 usec per loop
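The same comparison can be run from inside Python with the `timeit` module instead of the command line. A sketch under the same setup as the commands above; the `candidates` dict and the loop counts are illustrative choices, and the timings will vary by machine.

```python
import random
import timeit
from functools import reduce

random.seed(42)
l = [[random.randint(1, 100) for _ in range(10)] for _ in range(100)]

candidates = {
    "comprehension": lambda: [item for sublist in l for item in sublist],
    "sum": lambda: sum(l, []),
    "reduce": lambda: reduce(lambda x, y: x + y, l),
}

number = 200   # loops per measurement
results = {}
for name, fn in candidates.items():
    # repeat() returns the total seconds for `number` loops, once per repetition
    times = timeit.repeat(fn, number=number, repeat=5)
    results[name] = min(times) / number
    print(f"{name}: best of 5: {results[name] * 1e6:.1f} usec per loop")
```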