Pitfalls of NumPy Matrix Operations

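Pitfall 1: Confusing data structures

Both array and matrix can be used to represent a matrix. An array can have any number of dimensions, while a matrix is always two-dimensional.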

In [98]: a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

In [99]: a
Out[99]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [100]: A = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

In [101]: A
Out[101]:
matrix([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [102]: a.shape
Out[102]: (3, 3)

In [103]: A.shape
Out[103]: (3, 3)
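
The two objects report the same shape, but they are different types. As a minimal sketch (the names other than a and A are illustrative and not from the original session), the snippet below shows that matrix is a subclass of ndarray that stays two-dimensional when indexed, and that np.asarray converts it back to a plain array. Note that the NumPy documentation no longer recommends np.matrix for new code.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
A = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# matrix is a subclass of ndarray, so both pass this check
is_ndarray_subclass = isinstance(A, np.ndarray)

# Taking one row: the array drops a dimension, the matrix does not
row_from_array = a[0]     # 1-D
row_from_matrix = A[0]    # still 2-D

# np.asarray strips the matrix subclass and returns a plain ndarray
plain = np.asarray(A)

print(row_from_array.shape, row_from_matrix.shape, type(plain).__name__)
```

Converting to a plain ndarray at the boundary of your code is one way to avoid mixing the two sets of rules.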

In [99]: a
Out[99]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [100]: y
Out[100]:
matrix([[1],
        [0],
        [1]])

In [101]: a[:, 0]
Out[101]: array([1, 4, 7])

In [102]: a[:, 0].shape
Out[102]: (3,)

In [110]: a[:, 0][y == 1]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-110-f32ed63aa2a8> in <module>()
----> 1 a[:, 0][y == 1]

IndexError: too many indices for array

In [111]: a[:, 0].reshape(3, 1)[y == 1]
Out[111]: array([1, 7])

In [101]: A
Out[101]:
matrix([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [112]: y
Out[112]:
matrix([[1],
        [0],
        [1]])

In [113]: A[:,0]
Out[113]:
matrix([[1],
        [4],
        [7]])

In [102]: A[:, 0].shape
Out[102]: (3, 1)

In [114]: A[:,0][y == 1]
Out[114]: matrix([[1, 7]])

In [114]: A[:,0][y == 1].shape
Out[114]: (1, 2)
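
The IndexError in In [110] comes from a shape mismatch: a[:, 0] is 1-D with shape (3,), while the mask y == 1 is 2-D with shape (3, 1). A minimal sketch of two ways to keep the shapes aligned using plain arrays only is shown below. Here y is stored as an ndarray rather than a matrix, and the names col_1d, col_2d and mask are illustrative assumptions.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
y = np.array([[1], [0], [1]])   # column vector as a 2-D ndarray

col_1d = a[:, 0]      # integer index drops the axis: shape (3,)
col_2d = a[:, [0]]    # list index keeps the axis: shape (3, 1)

mask = (y == 1)       # boolean mask of shape (3, 1)

# Option 1: index a 2-D column with the 2-D mask
picked_from_2d = col_2d[mask]

# Option 2: flatten the mask to match the 1-D column
picked_from_1d = col_1d[mask.ravel()]

print(picked_from_2d, picked_from_1d)
```

Either way the result is a 1-D array, so no reshape call is needed after the fact.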

Pitfall 2: Weak data-handling capabilities, inefficient language

In [79]: X
Out[79]:
matrix([[ 34.62365962,  78.02469282],
        [ 30.28671077,  43.89499752],
        [ 35.84740877,  72.90219803],
        [ 60.18259939,  86.3085521 ],
        [ 79.03273605,  75.34437644]])

In [80]: Y
Out[80]:
matrix([[ True],
        [False],
        [ True],
        [ True],
        [False]], dtype=bool)

In [81]: X[Y == True]
Out[81]: matrix([[ 34.62365962,  35.84740877,  60.18259939]])

In [85]: X[Y == True, :]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-85-2aeabbc2bcc5> in <module>()
----> 1 X[Y == True, :]

C:\Python27\lib\site-packages\numpy\matrixlib\defmatrix.pyc in __getitem__(self, index)
314
315         try:
--> 316             out = N.ndarray.__getitem__(self, index)
317         finally:
318             self._getitem = False

IndexError: too many indices for array

In [86]: X[:, 0][Y == True]
Out[86]: matrix([[ 34.62365962,  35.84740877,  60.18259939]])

In [87]: X[:, 1][Y == True]
Out[87]: matrix([[ 78.02469282,  72.90219803,  86.3085521 ]])

In [88]: np.column_stack((X[:, 0][Y == True].reshape(3,1), X[:, 1][Y == True].reshape(3,1)))
Out[88]:
matrix([[ 34.62365962,  78.02469282],
        [ 35.84740877,  72.90219803],
        [ 60.18259939,  86.3085521 ]])
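
Selecting whole rows does not require slicing each column and stacking the pieces back together. A minimal sketch, assuming X and Y are held as plain ndarrays with the same values as above (the names mask and rows are illustrative): flatten the column mask to 1-D and use it to index the rows directly.

```python
import numpy as np

X = np.array([[34.62365962, 78.02469282],
              [30.28671077, 43.89499752],
              [35.84740877, 72.90219803],
              [60.18259939, 86.3085521],
              [79.03273605, 75.34437644]])
Y = np.array([[True], [False], [True], [True], [False]])

# A 1-D boolean mask with one entry per row of X.
# np.asarray also strips the matrix subclass if Y happens to be an np.matrix.
mask = np.asarray(Y).ravel()

# Boolean indexing on the first axis keeps every column of the selected rows
rows = X[mask]

print(rows)
```

The key point is that the mask must be 1-D with one entry per row; a (5, 1) mask is treated as a mask over individual elements instead.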

Pitfall 3: Confusing syntax for numerical operations

In [22]: x
Out[22]:
matrix([[  1.        ,  34.62365962,  78.02469282],
        [  1.        ,  30.28671077,  43.89499752],
        [  1.        ,  35.84740877,  72.90219803],
        [  1.        ,  60.18259939,  86.3085521 ],
        [  1.        ,  79.03273605,  75.34437644]])

In [23]: y
Out[23]:
matrix([[1],
        [2],
        [3],
        [2],
        [2]])

In [24]: theta
Out[24]:
matrix([[2],
        [2],
        [2]])

In [37]: x * y
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-37-ae1a0a4af750> in <module>()
----> 1 x * y

/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/matrixlib/defmatrix.pyc in __mul__(self, other)
339         if isinstance(other, (N.ndarray, list, tuple)) :
340             # This promotes 1-D vectors to row vectors
--> 341             return N.dot(self, asmatrix(other))
342         if isscalar(other) or not hasattr(other, '__rmul__') :
343             return N.dot(self, other)

ValueError: matrices are not aligned

In [38]: x[:, 0] * y
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
----> 1 x[:, 0] * y

/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/matrixlib/defmatrix.pyc in __mul__(self, other)
339         if isinstance(other, (N.ndarray, list, tuple)) :
340             # This promotes 1-D vectors to row vectors
--> 341             return N.dot(self, asmatrix(other))
342         if isscalar(other) or not hasattr(other, '__rmul__') :
343             return N.dot(self, other)

ValueError: matrices are not aligned

In [39]: sp.array(x[:,0]) * sp.array(y)
Out[39]:
array([[ 1.],
       [ 2.],
       [ 3.],
       [ 2.],
       [ 2.]])

In [42]: xy = sp.column_stack(((sp.array(x[:,0]) * sp.array(y)), (sp.array(x[:,1]) * sp.array(y)), (sp.array(x[:,2]) * sp.array(y))))

In [43]: xy
Out[43]:
array([[   1.        ,   34.62365962,   78.02469282],
       [   2.        ,   60.57342154,   87.78999504],
       [   3.        ,  107.54222631,  218.70659409],
       [   2.        ,  120.36519878,  172.6171042 ],
       [   2.        ,  158.0654721 ,  150.68875288]])

In [44]: xy * theta
Out[44]:
matrix([[ 227.29670488],
        [ 300.72683316],
        [ 658.4976408 ],
        [ 589.96460596],
        [ 621.50844996]])

In [45]: xy * sp.array(theta)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-45-5ea2f7324fbe> in <module>()
----> 1 xy * sp.array(theta)

ValueError: operands could not be broadcast together with shapes (5,3) (3,1)

In [46]: sp.dot(xy, sp.array(theta))
Out[46]:
array([[ 227.29670488],
       [ 300.72683316],
       [ 658.4976408 ],
       [ 589.96460596],
       [ 621.50844996]])

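In [45] raises an error because the * operator means element-wise multiplication for an array but matrix multiplication for a matrix. To compute a matrix product of two arrays, you have to call dot instead. This looks like flexibility, but in practice it only adds to the user's mental load. In MATLAB/Octave the same computation is written simply as x .* y * theta (relying on y being broadcast across the columns of x), which is direct and elegant.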
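
For comparison, the same computation can be written compactly in NumPy when everything is kept as a plain ndarray. This is a sketch under two assumptions: the data is stored as arrays rather than matrices, and Python 3.5 or later is available for the @ operator (np.dot(x * y, theta) is the equivalent on older versions). The name result is illustrative.

```python
import numpy as np

x = np.array([[1.0, 34.62365962, 78.02469282],
              [1.0, 30.28671077, 43.89499752],
              [1.0, 35.84740877, 72.90219803],
              [1.0, 60.18259939, 86.3085521],
              [1.0, 79.03273605, 75.34437644]])
y = np.array([[1], [2], [3], [2], [2]])
theta = np.array([[2], [2], [2]])

# x * y     : element-wise; y (5, 1) is broadcast across the 3 columns of x
# ... @ theta : matrix product of the (5, 3) result with the (3, 1) vector
result = (x * y) @ theta

print(result)
```

With arrays, * is always element-wise and @ is always the matrix product, so the meaning of an expression no longer depends on which type the operands happen to be.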

Pitfall 4: Complex, unnatural syntax

In [11]: x
Out[11]:
matrix([[ 34.62365962,  78.02469282],
        [ 30.28671077,  43.89499752],
        [ 35.84740877,  72.90219803],
        [ 60.18259939,  86.3085521 ],
        [ 79.03273605,  75.34437644]])

In [18]: sp.column_stack(((sp.ones((5,1)), x)))
Out[18]:
matrix([[  1.        ,  34.62365962,  78.02469282],
        [  1.        ,  30.28671077,  43.89499752],
        [  1.        ,  35.84740877,  72.90219803],
        [  1.        ,  60.18259939,  86.3085521 ],
        [  1.        ,  79.03273605,  75.34437644]])
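
The call above prepends a column of ones (an intercept term) to x. A slightly lighter variant is sketched below, assuming x is a plain ndarray and calling NumPy directly rather than through the sp alias. np.column_stack accepts a 1-D array as a single column, so the ones vector does not need an explicit (5, 1) shape. The name x_with_ones is illustrative.

```python
import numpy as np

x = np.array([[34.62365962, 78.02469282],
              [30.28671077, 43.89499752],
              [35.84740877, 72.90219803],
              [60.18259939, 86.3085521],
              [79.03273605, 75.34437644]])

# One 1 per row of x; column_stack treats the 1-D array as a column
ones = np.ones(x.shape[0])
x_with_ones = np.column_stack((ones, x))

print(x_with_ones.shape)
```

Deriving the length from x.shape[0] avoids hard-coding the number of rows.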

Conclusion
