NumPy Quick Tutorial

Latest recommended article published on 2024-09-21 15:57:32

Zore808

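This post is a quick introductory tutorial to NumPy. It covers the basics, including array creation, printing, basic operations, universal functions, and indexing, slicing, and iterating. It also explains shape manipulation in detail, such as changing the shape of an array, stacking different arrays, and splitting arrays. In addition, it introduces the concepts of copies and views, as well as fancy indexing and basic applications of linear algebra. It is suitable for both NumPy beginners and more advanced readers.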


Quickstart Tutorial: NumPy

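The Basics

NumPy's main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NumPy, dimensions are called axes.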

In [1]:

import numpy as np

a = np.arange(15).reshape(3, 5)

a

Out[1]:

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [2]:

a.shape

Out[2]:

(3, 5)

In [3]:

a.ndim

Out[3]:

2

In [4]:

a.dtype.name

Out[4]:

'int32'

In [5]:

a.itemsize

Out[5]:

4

In [6]:

a.size

Out[6]:

15

In [7]:

type(a)

Out[7]:

numpy.ndarray

In [9]:

b = np.array([6, 7, 8])

b

Out[9]:

array([6, 7, 8])

In [10]:

type(b)

Out[10]:

numpy.ndarray
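
As a quick check of how these attributes relate to each other, here is a minimal sketch that is not part of the original session. The explicit `np.int32` dtype is an assumption made for illustration, so that the result does not depend on the platform: the default integer type is 32-bit on some systems (as in the output above) and 64-bit on others.

```python
import numpy as np

# Explicit dtype so that itemsize is the same on every platform
a = np.arange(15, dtype=np.int32).reshape(3, 5)

print(a.ndim)       # number of axes
print(a.shape)      # size along each axis
print(a.size)       # total number of elements: 3 * 5
print(a.itemsize)   # bytes per element for int32
print(a.nbytes)     # total bytes: a.size * a.itemsize
```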

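Array Creation

There are several ways to create arrays.
For example, you can create an array from a regular Python list or tuple using the array function. The type of the resulting array is deduced from the type of the elements in the sequence.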

In [11]:

a = np.array([2,3,4])

a

Out[11]:

array([2, 3, 4])

In [12]:

a.dtype

Out[12]:

dtype('int32')

In [13]:

b = np.array([1.2, 3.5, 5.1])

b.dtype

Out[13]:

dtype('float64')

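array() transforms sequences of sequences into two-dimensional arrays, sequences of sequences of sequences into three-dimensional arrays, and so on.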

In [15]:

b = np.array([(1.5,2,3), (4,5,6)])

b

Out[15]:

array([[1.5, 2. , 3. ],
       [4. , 5. , 6. ]])

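NumPy offers several functions to create arrays with initial placeholder content. The function zeros() creates an array full of zeros, the function ones() creates an array full of ones, and the function empty() creates an array whose initial content is random and depends on the state of the memory. By default, the dtype of the created array is float64.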

In [16]:

np.zeros( (3,4) )

Out[16]:

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [17]:

np.ones( (2,3,4), dtype=np.int16 )                # dtype can also be specified

Out[17]:

array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int16)
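
NumPy's empty() function allocates an array without initializing it. The following is a minimal sketch with shapes chosen arbitrarily for illustration: because the contents are whatever happened to be in memory, they should be overwritten before being read.

```python
import numpy as np

e = np.empty((2, 3))        # uninitialized: values are whatever was in memory
e[:] = 7.0                  # fill the array before reading from it

z = np.zeros((2, 3))
o = np.ones((2, 3))

print(e.dtype, z.dtype, o.dtype)   # all three default to float64
print(e)
```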

To create sequences of numbers, NumPy provides the arange() function, which is analogous to the Python built-in range() but returns an array.

In [18]:

np.arange( 10, 30, 5 )

Out[18]:

array([10, 15, 20, 25])

In [19]:

np.arange( 0, 2, 0.3 )                 # it accepts float arguments

Out[19]:

array([0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])

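When arange() is used with floating-point arguments, the number of elements obtained is hard to predict because of finite floating-point precision. For this reason it is usually better to use the function linspace(), which takes as an argument the number of elements we want instead of the step: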

In [21]:

np.linspace( 0, 2, 9 ) # 9 numbers from 0 to 2

Out[21]:

array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  ])

In [25]:

x = np.linspace( 0, np.pi, 5 )        # useful to evaluate function at lots of points

print(x)

f = np.sin(x)

print(f)

[0.         0.78539816 1.57079633 2.35619449 3.14159265]
[0.00000000e+00 7.07106781e-01 1.00000000e+00 7.07106781e-01
 1.22464680e-16]
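
To make the difference between the two functions concrete, here is a small sketch. The values of `start`, `stop`, and `step` are assumptions chosen so that every number is exactly representable in binary floating point.

```python
import numpy as np

start, stop, step = 0.0, 2.0, 0.25

# With linspace the element count is explicit and the endpoint is included
n = int(round((stop - start) / step)) + 1
x = np.linspace(start, stop, n)

# With arange the count follows from floating-point arithmetic on
# (stop - start) / step, and the endpoint is excluded
y = np.arange(start, stop, step)

print(len(x), x[-1])
print(len(y), y[-1])
```

For steps that are not exactly representable in binary, such as 0.1, the length returned by arange() can differ from what the arithmetic on paper suggests, which is why the count-based linspace() is usually the safer choice.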

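Printing Arrays

One-dimensional arrays are printed as rows, two-dimensional arrays as matrices, and three-dimensional arrays as lists of matrices.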

In [27]:

a = np.arange(6)                         # 1d array

print(a)

[0 1 2 3 4 5]

In [28]:

b = np.arange(12).reshape(4,3)           # 2d array

print(b)

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]

In [29]:

c = np.arange(24).reshape(2,3,4)         # 3d array

print(c)

[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]

Basic Operations

Arithmetic operators on arrays apply elementwise. A new array is created and filled with the result.

In [3]:

import numpy as np

a = np.array( [20,30,40,50] )

b = np.arange( 4 )

b

Out[3]:

array([0, 1, 2, 3])

In [4]:

c = a-b

c

Out[4]:

array([20, 29, 38, 47])

In [5]:

b**2

Out[5]:

array([0, 1, 4, 9], dtype=int32)

In [7]:

10*np.sin(a)

Out[7]:

array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])

In [8]:

a<35

Out[8]:

array([ True,  True, False, False])

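Unlike in many matrix languages, the product operator * operates elementwise on NumPy arrays. The matrix product can be performed using the @ operator (in Python >= 3.5) or the dot function or method: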

In [9]:

A = np.array( [[1,1],

               [0,1]] )

B = np.array( [[2,0],

               [3,4]] )

A * B                       # elementwise product

Out[9]:

array([[2, 0],
       [0, 4]])

In [10]:

A @ B                       # matrix product

Out[10]:

array([[5, 4],
       [3, 4]])

In [11]:

A.dot(B)                    # another matrix product

Out[11]:

array([[5, 4],
       [3, 4]])

Some operations, such as += and *=, act in place to modify an existing array rather than create a new one.

In [12]:

a = np.ones((2,3), dtype=int)

a *= 3

a

Out[12]:

array([[3, 3, 3],
       [3, 3, 3]])
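
One consequence is worth a short sketch: an in-place operation cannot change the dtype of the array it writes into. Assuming a reasonably recent NumPy, adding a float array to an integer array in place raises an error (a subclass of TypeError), whereas the out-of-place form creates a new float array. The array `b` and its fill value are chosen here only for illustration.

```python
import numpy as np

a = np.ones((2, 3), dtype=int)
b = np.full((2, 3), 0.5)          # float64 array

try:
    a += b                         # a float result cannot be stored in an int array
except TypeError as exc:
    print("in-place add failed:", type(exc).__name__)

c = a + b                          # out of place: the result is a new float64 array
print(c.dtype)
```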

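When operating with arrays of different types, the type of the resulting array corresponds to the more general or more precise one (a behavior known as upcasting).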

In [14]:

a = np.ones(3, dtype=np.int32)

b = np.linspace(0,np.pi,3)

b.dtype.name

Out[14]:

'float64'

In [15]:

c = a+b

c

Out[15]:

array([1.        , 2.57079633, 4.14159265])

In [16]:

c.dtype.name

Out[16]:

'float64'
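
The result type can also be queried without performing the operation. This sketch uses `np.result_type`, which is not used elsewhere in this tutorial, to show which dtype a combination of types would produce.

```python
import numpy as np

# np.result_type reports the dtype that combining these types would produce
print(np.result_type(np.int32, np.float64))        # float64
print(np.result_type(np.float64, np.complex128))   # complex128

a = np.ones(3, dtype=np.int32)
d = np.exp(a * 1j)                                 # int32 times complex gives complex128
print(d.dtype.name)
```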

Many unary operations, such as computing the sum of all the elements in the array, are implemented as methods of the ndarray class.

In [19]:

a = np.arange(6).reshape(2,3)

a

Out[19]:

array([[0, 1, 2],
       [3, 4, 5]])

In [20]:

a.sum()

Out[20]:

15

In [21]:

a.min()

Out[21]:

0

In [22]:

a.max()

Out[22]:

5

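By default, these operations apply to the array as though it were a list of numbers, regardless of its shape. However, by specifying the axis parameter you can apply an operation along the specified axis of an array: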

In [23]:

b = np.arange(12).reshape(3,4)

b

Out[23]:

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [24]:

b.sum(axis=0)                            # sum of each column

Out[24]:

array([12, 15, 18, 21])

In [25]:

b.min(axis=1)                            # min of each row

Out[25]:

array([0, 4, 8])

In [26]:

b.cumsum(axis=1)                         # cumulative sum along each row

Out[26]:

array([[ 0,  1,  3,  6],
       [ 4,  9, 15, 22],
       [ 8, 17, 27, 38]], dtype=int32)
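
A way to read the axis parameter is that the named axis is the one that gets collapsed. The sketch below repeats the same array and also uses the `keepdims` argument, which does not appear in the examples above, to show how the result shape changes.

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

col_sums = b.sum(axis=0)                     # collapses axis 0: one value per column
row_sums = b.sum(axis=1)                     # collapses axis 1: one value per row
row_sums_2d = b.sum(axis=1, keepdims=True)   # keeps the reduced axis with length 1

print(col_sums.shape, row_sums.shape, row_sums_2d.shape)
print(row_sums)
```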

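Universal Functions

NumPy provides familiar mathematical functions such as sin, cos, and exp. In NumPy, these are called "universal functions" (ufunc). They operate elementwise on an array, producing an array as output.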

In [27]:

B = np.arange(3)

B

Out[27]:

array([0, 1, 2])

In [28]:

np.exp(B)

Out[28]:

array([1.        , 2.71828183, 7.3890561 ])

In [29]:

np.sqrt(B)

Out[29]:

array([0.        , 1.        , 1.41421356])

In [30]:

C = np.array([2., -1., 4.])

np.add(B, C)

Out[30]:

array([2., 0., 6.])

Indexing, Slicing and Iterating

One-dimensional arrays can be indexed, sliced, and iterated over, much like lists and other Python sequences.

In [31]:

a = np.arange(10)**3

a

Out[31]:

array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729], dtype=int32)

In [32]:

a[2:5]

Out[32]:

array([ 8, 27, 64], dtype=int32)

In [33]:

a[:6:2] = -1000    # equivalent to a[0:6:2] = -1000; from start to position 6, exclusive, set every 2nd element to -1000

a

Out[33]:

array([-1000,     1, -1000,    27, -1000,   125,   216,   343,   512,
         729], dtype=int32)

In [35]:

a[ : :-1]           # reversed

Out[35]:

array([  729,   512,   343,   216,   125, -1000,    27, -1000,     1,
       -1000], dtype=int32)

Multidimensional arrays can have one index per axis. These indices are given in a tuple separated by commas:

In [36]:

def f(x,y):

    return 10*x+y

b = np.fromfunction(f,(5,4),dtype=int)

b

Out[36]:

array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [30, 31, 32, 33],
       [40, 41, 42, 43]])

In [37]:

b[2,3]

Out[37]:

23

In [38]:

b[0:5, 1]                       # each row in the second column of b

Out[38]:

array([ 1, 11, 21, 31, 41])

In [39]:

b[ : ,1]                        # equivalent to the previous example

Out[39]:

array([ 1, 11, 21, 31, 41])

In [40]:

b[1:3, : ]                      # each column in the second and third row of b

Out[40]:

array([[10, 11, 12, 13],
       [20, 21, 22, 23]])

When fewer indices are provided than the number of axes, the missing indices are considered complete slices:

In [41]:

b[-1]                                  # the last row. Equivalent to b[-1,:]

Out[41]:

array([40, 41, 42, 43])

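The dots (...) stand for as many colons as are needed to produce a complete indexing tuple. For example, if x is an array with 5 axes, then:

x[1,2,...] is equivalent to x[1,2,:,:,:], x[...,3] to x[:,:,:,:,3], and x[4,...,5,:] to x[4,:,:,5,:].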

In [42]:

c = np.array( [[[  0,  1,  2],               # a 3D array (two stacked 2D arrays)

                [ 10, 12, 13]],

               [[100,101,102],

                [110,112,113]]])

c.shape

Out[42]:

(2, 2, 3)

In [43]:

c[1,...]                                   # same as c[1,:,:] or c[1]

Out[43]:

array([[100, 101, 102],
       [110, 112, 113]])

In [44]:

c[...,2]                                   # same as c[:,:,2]

Out[44]:

array([[  2,  13],
       [102, 113]])
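
The equivalence between the dots notation and explicit colons can be checked directly on an array with 5 axes. In this sketch the array `x` and its shape are assumptions chosen for illustration, and the index values are adapted to fit that shape.

```python
import numpy as np

x = np.arange(2 * 3 * 4 * 5 * 6).reshape(2, 3, 4, 5, 6)   # an array with 5 axes

print(np.array_equal(x[1, 2, ...], x[1, 2, :, :, :]))
print(np.array_equal(x[..., 3], x[:, :, :, :, 3]))
print(np.array_equal(x[1, ..., 4, :], x[1, :, :, 4, :]))

# Each integer index removes one axis; the dots keep all the others
print(x[1, 2, ...].shape, x[..., 3].shape, x[1, ..., 4, :].shape)
```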

Iterating over multidimensional arrays is done with respect to the first axis:

In [45]:

for row in b:

    print(row)

[0 1 2 3]
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]
[40 41 42 43]

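However, if you want to perform an operation on each element in the array, you can use the flat attribute, which is an iterator over all the elements of the array: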

In [46]:

for element in b.flat:

    print(element,end=' ')

0 1 2 3 10 11 12 13 20 21 22 23 30 31 32 33 40 41 42 43

Shape Manipulation

Changing the Shape of an Array

An array has a shape given by the number of elements along each axis:

In [2]:

import numpy as np

a = np.floor(10*np.random.random((3,4)))

a

Out[2]:

array([[1., 3., 0., 3.],
       [7., 8., 3., 0.],
       [3., 9., 1., 8.]])

In [3]:

a.shape

Out[3]:

(3, 4)

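The shape of an array can be changed with various commands. Note that the following three commands all return a modified array, but do not change the original array: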

In [4]:

a.ravel()  # returns the array, flattened

Out[4]:

array([1., 3., 0., 3., 7., 8., 3., 0., 3., 9., 1., 8.])

In [5]:

a.reshape(6,2)  # returns the array with a modified shape

Out[5]:

array([[1., 3.],
       [0., 3.],
       [7., 8.],
       [3., 0.],
       [3., 9.],
       [1., 8.]])

In [6]:

a.T  # returns the array, transposed

Out[6]:

array([[1., 7., 3.],
       [3., 8., 9.],
       [0., 3., 1.],
       [3., 0., 8.]])

In [7]:

a.T.shape

Out[7]:

(4, 3)

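If a dimension is given as -1 in a reshaping operation, its size is calculated automatically from the array size and the other dimensions: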

In [8]:

a.reshape(3,-1)

Out[8]:

array([[1., 3., 0., 3.],
       [7., 8., 3., 0.],
       [3., 9., 1., 8.]])

The reshape() function returns its argument with a modified shape, whereas the ndarray.resize() method modifies the array itself:

In [10]:

a.resize((2,6))

a

Out[10]:

array([[1., 3., 0., 3., 7., 8.],
       [3., 0., 3., 9., 1., 8.]])
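
The difference between the two calls can be seen from their return values. This is a minimal sketch that uses a deterministic array instead of the random one above; the `.copy()` is an assumption added so that `a` owns its data.

```python
import numpy as np

a = np.arange(12.0).reshape(3, 4).copy()   # copy so that a owns its data

r = a.reshape(2, -1)            # -1 is inferred as 12 / 2 = 6
shape_after_reshape = a.shape   # a itself is unchanged by reshape
print(r.shape, shape_after_reshape)

result = a.resize((2, 6))       # modifies a in place and returns None
print(result, a.shape)
```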

Stacking Together Different Arrays

Several arrays can be stacked together along different axes:

In [11]:

a = np.floor(10*np.random.random((2,2)))

a

Out[11]:

array([[7., 5.],
       [6., 3.]])

In [12]:

b = np.floor(10*np.random.random((2,2)))

b

Out[12]:

array([[6., 1.],
       [6., 2.]])

In [13]:

np.vstack((a,b))

Out[13]:

array([[7., 5.],
       [6., 3.],
       [6., 1.],
       [6., 2.]])

In [14]:

np.hstack((a,b))

Out[14]:

array([[7., 5., 6., 1.],
       [6., 3., 6., 2.]])

The function column_stack() stacks 1D arrays as columns into a 2D array. It is equivalent to hstack() only for 2D arrays:

In [15]:

np.column_stack((a,b))     # with 2D arrays

Out[15]:

array([[7., 5., 6., 1.],
       [6., 3., 6., 2.]])

In [16]:

a = np.array([4.,2.])

b = np.array([3.,8.])

np.column_stack((a,b))     # returns a 2D array

Out[16]:

array([[4., 3.],
       [2., 8.]])

In [17]:

np.hstack((a,b))           # the result is different from column_stack()

Out[17]:

array([4., 2., 3., 8.])
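
One way to see why the two results differ is to spell out what each stacking function does in terms of more basic operations. The sketch below relies on `np.newaxis` and `np.concatenate`, which are not introduced elsewhere in this tutorial; the 2D arrays `m` and `n` are illustrative values.

```python
import numpy as np

a = np.array([4., 2.])
b = np.array([3., 8.])

# For 1D inputs, column_stack first turns each array into a column
cols = np.column_stack((a, b))
same = np.hstack((a[:, np.newaxis], b[:, np.newaxis]))
print(np.array_equal(cols, same))

# On 2D arrays, vstack and hstack correspond to concatenate along axis 0 and 1
m = np.array([[7., 5.], [6., 3.]])
n = np.array([[6., 1.], [6., 2.]])
print(np.array_equal(np.vstack((m, n)), np.concatenate((m, n), axis=0)))
print(np.array_equal(np.hstack((m, n)), np.concatenate((m, n), axis=1)))
```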

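Splitting One Array into Several Smaller Ones

With hsplit(), you can split an array along its horizontal axis, either by specifying the number of equally shaped arrays to return, or by specifying the columns after which the division should occur: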

In [18]:

a = np.floor(10*np.random.random((2,12)))

a

Out[18]:

array([[4., 0., 8., 2., 8., 9., 8., 9., 9., 9., 1., 9.],
       [6., 5., 7., 6., 0., 6., 4., 6., 3., 2., 4., 7.]])

In [19]:

np.hsplit(a,3)   # Split a into 3 arrays

Out[19]:

[array([[4., 0., 8., 2.],
        [6., 5., 7., 6.]]), array([[8., 9., 8., 9.],
        [0., 6., 4., 6.]]), array([[9., 9., 1., 9.],
        [3., 2., 4., 7.]])]
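
The second form, splitting at given columns, takes a tuple of column indices instead of a count. Here is a short sketch on a deterministic array, chosen so that the pieces are easy to recognize.

```python
import numpy as np

a = np.arange(24).reshape(2, 12)

# Split after the third and the fourth column: columns [0:3], [3:4], [4:12]
left, middle, right = np.hsplit(a, (3, 4))

print(left.shape, middle.shape, right.shape)
print(middle)
```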

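Copies and Views

When operating on and manipulating arrays, their data is sometimes copied into a new array and sometimes not. This is often a source of confusion for beginners. There are three cases:

No Copy at All

Simple assignments make no copy of array objects or of their data.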

In [20]:

a = np.arange(12)

b = a            # no new object is created

b is a           # a and b are two names for the same ndarray object

Out[20]:

True

In [21]:

b.shape = 3,4    # changes the shape of a

a.shape

Out[21]:

(3, 4)

Python passes mutable objects as references, so function calls make no copy.

In [23]:

def f(x):

    print(id(x))

print(id(a))                           # id is a unique identifier of an object

f(a)

97782640
97782640
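
A practical consequence is that a function which modifies its argument also modifies the caller's array. In this sketch, `zero_first` is a hypothetical helper written only for illustration.

```python
import numpy as np

def zero_first(x):
    """Hypothetical helper: modifies its argument in place."""
    x[0] = 0

a = np.arange(1, 5)
zero_first(a)          # no copy is made, so the caller's array changes
print(a)

b = np.arange(1, 5)
zero_first(b.copy())   # pass an explicit copy to protect the original
print(b)
```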

View or Shallow Copy

Different array objects can share the same data. The view() method creates a new array object that looks at the same data.

In [24]:

c = a.view()

c is a

Out[24]:

False

In [25]:

c.base is a                        # c is a view of the data owned by a

Out[25]:

True

In [26]:

c.shape = 2,6                      # a's shape doesn't change

a.shape

Out[26]:

(3, 4)

In [27]:

c[0,4] = 999                      # a's data changes

a

Out[27]:

array([[  0,   1,   2,   3],
       [999,   5,   6,   7],
       [  8,   9,  10,  11]])

Slicing an array returns a view of it:

In [29]:

s = a[ : , 1:3]

s[:] = 10           # s[:] is a view of s; note the difference between s = 10 and s[:] = 10

a

Out[29]:

array([[  0,  10,  10,   3],
       [999,  10,  10,   7],
       [  8,  10,  10,  11]])

Deep Copy

The copy() method makes a complete copy of the array and its data.

In [30]:

d = a.copy()                          # a new array object with new data is created

d is a

Out[30]:

False

In [31]:

d.base is a

Out[31]:

False

In [32]:

d[0,0] = 888                          # a is not affected

a

Out[32]:

array([[  0,  10,  10,   3],
       [999,  10,  10,   7],
       [  8,  10,  10,  11]])
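
Whether two arrays share data can also be tested directly. The sketch below uses `np.shares_memory`, which the session above does not use, on a fresh array to contrast a slice (a view) with a copy.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

v = a[:, 1:3]          # slicing returns a view
d = a.copy()           # copy() allocates new data

print(np.shares_memory(a, v))   # True: v looks at a's buffer
print(np.shares_memory(a, d))   # False: d has its own buffer

v[:] = -1              # visible through a
d[0, 0] = 888          # not visible through a
print(a)
```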

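Fancy Indexing and Index Tricks

NumPy offers more indexing facilities than regular Python sequences. In addition to indexing by integers and slices, as we saw before, arrays can be indexed by arrays of integers and arrays of booleans.

Indexing with Arrays of Indices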

In [33]:

a = np.arange(12)**2                       # the first 12 square numbers

i = np.array( [ 1,1,3,8,5 ] )              # an array of indices

a[i]

Out[33]:

array([ 1,  1,  9, 64, 25], dtype=int32)

In [34]:

j = np.array( [ [ 3, 4], [ 9, 7 ] ] )      # a bidimensional array of indices

a[j]                                       # the same shape as j

Out[34]:

array([[ 9, 16],
       [81, 49]], dtype=int32)

When the indexed array a is multidimensional, a single array of indices refers to the first dimension of a.

In [35]:

palette = np.array( [ [0,0,0],                # black

                     [255,0,0],              # red

                     [0,255,0],              # green

                     [0,0,255],              # blue

                     [255,255,255] ] )       # white

image = np.array( [ [ 0, 1, 2, 0 ],           # each value corresponds to a color in the palette

                   [ 0, 3, 4, 0 ]  ] )

palette[image]

Out[35]:

array([[[  0,   0,   0],
        [255,   0,   0],
        [  0, 255,   0],
        [  0,   0,   0]],

       [[  0,   0,   0],
        [  0,   0, 255],
        [255, 255, 255],
        [  0,   0,   0]]])

You can also use indexing with arrays as a target to assign to:

In [37]:

a = np.arange(5)

a

Out[37]:

array([0, 1, 2, 3, 4])

In [38]:

a[[1,3,4]] = 9

a

Out[38]:

array([0, 9, 2, 9, 9])
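
Assignment through an index list behaves in a way that can surprise when the list contains repeated indices. The following sketch illustrates this, and shows `np.add.at` as one possible way to accumulate over repeats; the index lists are arbitrary examples.

```python
import numpy as np

a = np.arange(5)
a[[0, 0, 2]] = [1, 2, 3]     # index 0 appears twice: the last value is kept
print(a)

b = np.arange(5)
b[[0, 0, 2]] += 1            # index 0 is incremented only once
print(b)

c = np.arange(5)
np.add.at(c, [0, 0, 2], 1)   # unbuffered: index 0 is incremented twice
print(c)
```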

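Indexing with Boolean Arrays

When we index arrays with arrays of (integer) indices, we provide the list of indices to pick. With boolean indices the approach is different: we explicitly choose which items in the array we want and which ones we don't.

The most natural way to use boolean indexing is to use boolean arrays that have the same shape as the original array: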

In [40]:

a = np.arange(12).reshape(3,4)

b = a > 4

b                                          # b is a boolean with a's shape

Out[40]:

array([[False, False, False, False],
       [False,  True,  True,  True],
       [ True,  True,  True,  True]])

In [41]:

a[b]                                       # 1d array with the selected elements

Out[41]:

array([ 5,  6,  7,  8,  9, 10, 11])

This property can be very useful in assignments:

In [42]:

a[b] = 0                                   # All elements of 'a' higher than 4 become 0

a

Out[42]:

array([[0, 1, 2, 3],
       [4, 0, 0, 0],
       [0, 0, 0, 0]])
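
When the original array should stay unchanged, the same selection can be written as an expression that builds a new array. This sketch uses `np.where`, which is an alternative not covered above.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# np.where(condition, x, y) picks from x where the condition holds, else from y
clipped = np.where(a > 4, 0, a)

print(clipped)
print(a[2, 3])    # the original array is left untouched
```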

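The second way of indexing with booleans is more similar to integer indexing: for each dimension of the array we give a 1D boolean array selecting the slices we want: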

In [44]:

a = np.arange(12).reshape(3,4)

b1 = np.array([False,True,True])             # first dim selection

b2 = np.array([True,False,True,False])       # second dim selection

a[b1,:]                                   # selecting rows

Out[44]:

array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [45]:

a[b1]                                     # same thing

Out[45]:

array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [46]:

a[:,b2]                                   # selecting columns

Out[46]:

array([[ 0,  2],
       [ 4,  6],
       [ 8, 10]])

In [47]:

a[b1,b2]                                  # a weird thing to do

Out[47]:

array([ 4, 10])
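
The last result looks odd because the two boolean arrays are converted to index arrays and then paired element by element, rather than combined as rows crossed with columns. The sketch below makes that explicit and uses `np.ix_`, which is not introduced above, to obtain the rows-by-columns block instead.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])
b2 = np.array([True, False, True, False])

rows = np.nonzero(b1)[0]       # array([1, 2])
cols = np.nonzero(b2)[0]       # array([0, 2])

paired = a[rows, cols]         # elements (1, 0) and (2, 2), same as a[b1, b2]
block = a[np.ix_(b1, b2)]      # every selected row crossed with every selected column

print(paired)
print(block)
```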

Linear Algebra

Basic linear algebra operations are covered here.

In [48]:

a = np.array([[1.0, 2.0], [3.0, 4.0]])

a

Out[48]:

array([[1., 2.],
       [3., 4.]])

In [49]:

a.transpose()

Out[49]:

array([[1., 3.],
       [2., 4.]])

In [50]:

np.linalg.inv(a)

Out[50]:

array([[-2. ,  1. ],
       [ 1.5, -0.5]])

In [51]:

u = np.eye(2) # unit 2x2 matrix; "eye" represents "I"

u

Out[51]:

array([[1., 0.],
       [0., 1.]])

In [52]:

j = np.array([[0.0, -1.0], [1.0, 0.0]])

j @ j        # matrix product

Out[52]:

array([[-1.,  0.],
       [ 0., -1.]])

In [53]:

np.trace(u)  # trace

Out[53]:

2.0

In [54]:

y = np.array([[5.], [7.]])

np.linalg.solve(a, y)

Out[54]:

array([[-3.],
       [ 4.]])

In [56]:

np.linalg.eig(a)

Out[56]:

(array([-0.37228132,  5.37228132]), array([[-0.82456484, -0.41597356],
        [ 0.56576746, -0.90937671]]))
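
The results above can be verified numerically. This sketch checks each one with `np.allclose`, which compares arrays up to a floating-point tolerance.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([[5.0], [7.0]])

inv = np.linalg.inv(a)
x = np.linalg.solve(a, y)
w, v = np.linalg.eig(a)        # eigenvalues w, eigenvectors in the columns of v

print(np.allclose(a @ inv, np.eye(2)))   # inverse check
print(np.allclose(a @ x, y))             # solution check
print(np.allclose(a @ v, v * w))         # a v_i = w_i v_i for each column i
```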

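Histograms

The NumPy histogram() function applied to an array returns a pair of vectors: the histogram of the array and the vector of bin edges. Beware: matplotlib also has a function to build histograms (called hist(), as in Matlab) that differs from the one in NumPy. The main difference is that pylab.hist() plots the histogram automatically, while numpy.histogram() only generates the data.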

In [60]:

import numpy as np

import matplotlib.pyplot as plt

%matplotlib inline

# Build a vector of 10000 normal deviates with variance 0.5^2 and mean 2

mu, sigma = 2, 0.5

v = np.random.normal(mu,sigma,10000)

# Plot a normalized histogram with 50 bins

plt.hist(v, bins=50, density=True)    # matplotlib version (plot)

plt.show()

In [61]:

# Compute the histogram with numpy and then plot it

(n, bins) = np.histogram(v, bins=50, density=True)  # NumPy version (no plot)

plt.plot(.5*(bins[1:]+bins[:-1]), n)

plt.show()
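
The shape of the data returned by numpy.histogram() can be inspected without plotting. In this sketch the seeded generator from `np.random.default_rng` is an assumption added for repeatability; it is not what the session above uses.

```python
import numpy as np

rng = np.random.default_rng(0)           # seed chosen arbitrarily for repeatability
v = rng.normal(2, 0.5, 10000)

n, bins = np.histogram(v, bins=50, density=True)

print(len(n), len(bins))                 # 50 bin values, 51 bin edges
print(np.sum(n * np.diff(bins)))         # area under a density histogram
```

Because there is one more edge than there are bins, the plotting code above uses the bin midpoints, .5*(bins[1:]+bins[:-1]), as x coordinates.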
