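Broadcasting addresses the problem of performing operations between arrays (matrices or vectors) of different shapes.
In ordinary algebra, matrices or vectors of different shapes cannot be combined with element-wise operations. In NumPy, however, such operations are allowed as long as the shapes satisfy a few general rules.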
In [ ]:
import numpy as np
In [ ]:
a = np.array([1, 2, 3])
b = 2
a + b
In [ ]:
A = np.array([[1, 2, 3],
              [1, 2, 3]])
b = 2
C = A + b
C
In [ ]:
A = np.array([[1, 2, 3],
              [1, 2, 3]])
b = np.array([1, 2, 3])
C = A + b
C
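Basic rules of broadcasting
When two arrays of different shapes are combined, NumPy compares their dimensions starting from the last (trailing) one and working backwards; missing leading dimensions are treated as 1. Two dimensions are compatible if they are equal or if one of them is 1. When every pair is compatible, the "smaller" array is conceptually copied and stretched to the shape of the "larger" one, and the operation is then applied element by element.
For example: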
A.shape = (2 x 3) -> A.shape = (2 x 3)
b.shape = (3) -> b.shape = (1 x 3)
A.shape = (2 x 3) -> A.shape = (2 x 3)
b.shape = (1) -> b.shape = (1 x 1)
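The "copy and stretch" step can be made visible with `np.broadcast_to`. The following is a minimal sketch, reusing the `A` and `b` from the cells above; the names `b_expanded`, `explicit`, and `implicit` are introduced here only for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [1, 2, 3]])   # shape (2, 3)
b = np.array([1, 2, 3])     # shape (3,)

# Explicitly stretch b to A's shape. broadcast_to returns a read-only view,
# so the rows are repeated logically without copying the data.
b_expanded = np.broadcast_to(b, A.shape)

explicit = A + b_expanded   # both operands now have shape (2, 3)
implicit = A + b            # NumPy broadcasts b automatically

print(b_expanded.shape)                     # (2, 3)
print(np.array_equal(explicit, implicit))   # True
```

Both additions give the same result, which is why the explicit expansion is rarely written out in practice.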
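However, in the following example b cannot be broadcast against A, because the trailing dimensions (3 and 2) differ and neither of them is 1:
A.shape = (2 x 3) -> A.shape = (2 x 3)
b.shape = (2) -> b.shape = (1 x 2)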
In [ ]:
A = np.array([[1, 2, 3],
              [1, 2, 3]])
b = np.array([1, 2])
# Raises ValueError: shapes (2, 3) and (2,) cannot be broadcast together
C = A + b
C
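The compatibility check can also be written out by hand. The sketch below is an illustration of the rule, not NumPy's actual implementation; `can_broadcast` is a hypothetical helper that walks both shapes from the trailing dimension and treats missing leading dimensions as 1.

```python
from itertools import zip_longest

def can_broadcast(shape_a, shape_b):
    """Hypothetical helper: True if two shapes are broadcast-compatible."""
    # Compare dimensions from the last one backwards; pad the shorter shape with 1s.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        # A pair is compatible if the sizes match or one of them is 1.
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True

print(can_broadcast((2, 3), (3,)))    # True
print(can_broadcast((2, 3), (1,)))    # True
print(can_broadcast((2, 3), (2,)))    # False
print(can_broadcast((2, 3), (2, 1)))  # True
```

The last call suggests one way to make the failing cell work: a `b` of shape (2, 1), i.e. a column, is compatible with `A` even though a `b` of shape (2,) is not.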
In [ ]:
name = ['Alice', 'Bob', 'Cathy', 'Doug']
age = [25, 45, 37, 19]
weight = [55.0, 85.5, 68.0, 61.5]
Create unstructured data
In [ ]:
import numpy as np
x = np.zeros(4, dtype=int)
x
Create structured data. Here dtype is a dictionary with the keys names and formats.
In [ ]:
import numpy as np
data = np.zeros(4, dtype={'names': ('name', 'age', 'weight'),
                          'formats': ('U10', 'i4', 'f8')})
print(data.dtype)
In [ ]:
data['name'] = name
data['age'] = age
data['weight'] = weight
print(data)
In [ ]:
data['name']
In [ ]:
data['age'].mean()
In [ ]:
data['weight'].max()
In [ ]:
data[data['age'] < 30]['name']
In [ ]:
data2 = np.zeros(4, dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f8')])
print(data2.dtype)
In [ ]:
data2['name'] = name
data2['age'] = age
data2['weight'] = weight
print(data2)
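The dictionary form and the list-of-tuples form used above describe the same record layout. A small sketch to confirm this; the variable names `dict_dtype` and `list_dtype` are introduced here only for illustration.

```python
import numpy as np

# Same fields, written in the two styles used in the cells above.
dict_dtype = np.dtype({'names': ('name', 'age', 'weight'),
                       'formats': ('U10', 'i4', 'f8')})
list_dtype = np.dtype([('name', 'U10'), ('age', 'i4'), ('weight', 'f8')])

print(dict_dtype == list_dtype)   # True
print(list_dtype.names)           # ('name', 'age', 'weight')
# 10 Unicode characters (4 bytes each) + 4-byte int + 8-byte float
print(list_dtype.itemsize)        # 52
```

Which form to use is mostly a matter of taste; the list of tuples keeps each field name next to its format.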
A record array behaves like a structured array, but its fields (columns) can also be accessed directly as attributes, using .field_name.
In [ ]:
data_rec = data.view(np.recarray)
data_rec.age
This is convenient, but slightly slower.
In [ ]:
%timeit data['age']
%timeit data_rec['age']
%timeit data_rec.age
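Apart from the timing difference, the record array is only another view of the same data. The sketch below uses a smaller two-row array (an assumption made to keep the example short) to show that attribute access and field access reach the same memory.

```python
import numpy as np

data = np.zeros(2, dtype=[('name', 'U10'), ('age', 'i4')])
data['name'] = ['Alice', 'Bob']
data['age'] = [25, 45]

# view() reinterprets the existing buffer; nothing is copied.
data_rec = data.view(np.recarray)

print(np.array_equal(data_rec.age, data['age']))   # True
print(np.shares_memory(data, data_rec))            # True

# Writing through the attribute changes the original structured array too.
data_rec.age[0] = 26
print(data['age'][0])                              # 26
```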
In [ ]:
from sklearn import datasets
datasets
Datasets available in sklearn: http://scikit-learn.org/stable/datasets/index.html
In [ ]:
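from sklearn.datasets import fetch_openml
# Download MNIST (70,000 images of 784 pixels each) from OpenML as NumPy arrays
mnist = fetch_openml('mnist_784', version=1, as_frame=False)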
In [ ]:
mnist.data.shape
In [ ]:
mnist.target.shape
In [ ]:
mnist.data[:10]
In [ ]:
mnist.data[0]
In [ ]:
import matplotlib
import matplotlib.pyplot as plt
digit = mnist.data[0]
digit_image = digit.reshape(28, 28)
plt.imshow(digit_image, cmap=matplotlib.cm.binary)
plt.axis("off")
plt.show()
In [ ]:
mnist.target[0]
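The plotting cell relies on `reshape(28, 28)` to turn one 784-value row into an image. The sketch below shows how that mapping works without downloading anything; `flat` is a made-up stand-in for a row of `mnist.data`, not real pixel data.

```python
import numpy as np

# Stand-in for one row of mnist.data: 784 values, here simply 0..783.
flat = np.arange(28 * 28)
image = flat.reshape(28, 28)

# reshape fills the image row by row, so pixel (row, col) comes from
# position row * 28 + col of the flat vector.
row, col = 5, 7
print(image.shape)                                # (28, 28)
print(image[row, col] == flat[row * 28 + col])    # True
```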