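I have recently been working through pandas systematically and ran into the broadcasting mechanism. The Chinese edition of Python for Data Analysis (《利用Python进行数据分析》) renders the term literally as "广播" (broadcast). I looked up some material, and this post collects my own understanding of it.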
Reference: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
The term broadcasting describes how numpy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes. Broadcasting provides a means of vectorizing array operations so that looping occurs in C instead of Python. It does this without making needless copies of data and usually leads to efficient algorithm implementations. There are, however, cases where broadcasting is a bad idea because it leads to inefficient use of memory that slows computation.
1. Two arrays of the same shape
>>> import numpy as np
>>> a = np.array([1.0, 2.0, 3.0])
>>> b = np.array([2.0, 2.0, 2.0])
>>> a * b
array([ 2., 4., 6.])
2. An array and a scalar
>>> a = np.array([1.0, 2.0, 3.0])
>>> b = 2.0
>>> a * b
array([ 2., 4., 6.])
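Both cases give the same result. We can think of b as being stretched during the computation until it has the same shape as a. In reality, NumPy is smart enough to use the original scalar value without actually making copies, so the broadcast version moves less memory around and is the more efficient of the two.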
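To make the "stretching without copying" idea concrete, here is a small sketch using `np.broadcast_to`. It only illustrates the concept and is not a claim about what NumPy does internally when it multiplies an array by a scalar. The variable names `b_stretched` and `same` are my own.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = 2.0

# Read-only view of the scalar stretched to a's shape; no data is copied.
b_stretched = np.broadcast_to(b, a.shape)

print(b_stretched)          # the scalar shows up once per element of a
print(b_stretched.strides)  # a stride of 0 means every element reads the same memory

# Multiplying by the scalar and by the stretched view gives the same values.
same = np.array_equal(a * b, a * b_stretched)
print(same)
```

The zero stride is the key point: the stretched array looks like three values, but all of them point to the single original value.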
Broadcasting rules
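Broadcasting applies to element-wise operations. NumPy compares the two shapes starting from the trailing (rightmost) dimension and working backwards. Two dimensions are compatible when they are equal or when one of them is 1, and a dimension that is missing from the shorter shape is treated as 1. If every pair of dimensions is compatible the operation goes ahead; otherwise NumPy raises a ValueError.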
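The rule can be checked by hand with a few lines of code. The sketch below is my own illustration: `broadcast_shape` is a hypothetical helper, not a NumPy function, and it handles only two shapes at a time.

```python
import numpy as np
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the trailing-dimension rule by hand."""
    result = []
    # Walk both shapes from the last dimension backwards;
    # a dimension missing from the shorter shape counts as 1.
    for da, db in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if da == db or da == 1 or db == 1:
            # A size of 1 stretches to match the other side.
            result.append(db if da == 1 else da)
        else:
            raise ValueError(f"incompatible dimensions: {da} vs {db}")
    return tuple(reversed(result))


x = np.ones((4, 3))
y = np.array([1.0, 2.0, 3.0])                # shape (3,)
print(broadcast_shape(x.shape, y.shape))     # trailing 3 matches 3
print((x + y).shape)                         # NumPy's own result, for comparison

col = np.ones((4, 1))
print(broadcast_shape(col.shape, y.shape))   # the 1 stretches to 3

try:
    broadcast_shape((4, 3), (4,))            # trailing 3 vs 4: not compatible
except ValueError as exc:
    print("rejected:", exc)
```

NumPy rejects the last pair of shapes as well: adding an array of shape (4, 3) to one of shape (4,) raises a ValueError.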