《利用Python 进行数据分析》 - 笔记(2)

问题导读:

1.NumPy 的 ndarray:一种多维数数组对象
2.通用函数 - 快速的元素级数组函数
3.利用数组进行数据分析
4.用于数组的文件输入输出
5.线性代数
6.随机数生成




解决方案:


NumPy 的 ndarray:


(1)简介:

ndarray是numpy 的一个N维数组对象,该对象是一个快速而灵活的大数据集容器。我们可以利用这个数据结构对整块数据执行一些数学运算,


In [13]: data = np.random.rand(2,3)

In [14]: data
Out[14]: 
array([[ 0.37049662,  0.29125076,  0.62865859],
       [ 0.52419945,  0.68931012,  0.20101274]])

In [15]: data * 10
Out[15]: 
array([[ 3.70496616,  2.91250765,  6.2865859 ],
       [ 5.24199448,  6.89310123,  2.01012736]])

ndarray是一个通用的 同构数据多维容器,其中的所有 元素必须是相同类型的。每个数组都有一个 shape(表示各维度大小的元组)和一个 dtype(一个用于说明数组数据类型的对象)。


In [16]: data.shape
Out[16]: (2, 3)

In [17]: data.dtype
Out[17]: dtype('float64')

(2)创建ndarray


  • 使用array 函数。它可以接受一切序列型的对象(包括其他数组),然后产生一个ndarray数组,

In [18]: data1 = [2,2.5,10,0,7]

In [19]: arr1 = np.array(data1)

In [20]: arr1
Out[20]: array([  2. ,   2.5,  10. ,   0. ,   7. ])

In [21]: data2 = [[1,1,1],[2,2,2],[3,3,3]]

In [22]: arr2 = np.array(data2)

In [23]: arr2
Out[23]: 
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])

除非显示的说明否则np.array 会尝试为新建的这个数组推断出一个较位合适的数据类型 保存在dtype 对象中,

In [33]: arr1.shape
Out[33]: (5,)

In [34]: arr1.shape
Out[34]: (5,)

In [35]: arr1.dtype
Out[35]: dtype('float64')

In [36]: arr1.ndim
Out[36]: 1

In [37]: arr2.shape
Out[37]: (3, 3)

In [38]: arr2.dtype
Out[38]: dtype('int64')

In [39]: arr2.ndim
Out[39]: 2

  • zeros 可以创建全0数组
  • ones 可以创建全1数组
  • empty 可以创建一个没有任何具体值的数组(未初始化的垃圾值)
我们只需要传入一个表示形状的元组即可,

In [40]: np.zeros(10)
Out[40]: array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])

In [41]: np.zeros((3,2))
Out[41]: 
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])

In [42]: np.ones((2,3))
Out[42]: 
array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

In [43]: np.empty((1,1,1))
Out[43]: array([[[  6.90576859e-310]]])

  • arange 相当于内置的range,但返回的是一个ndarray 而不是列表
In [54]: arange(1,10)
Out[54]: array([1, 2, 3, 4, 5, 6, 7, 8, 9])


(3)ndarray 的数据类型

  • 在创建时声明
In [57]: arr1 = np.array([1,2,3],dtype=np.float64)

In [58]: arr2 = np.array([1,2,3],dtype=np.string_)

In [59]: arr1.dtype
Out[59]: dtype('float64')

In [60]: arr2.dtype
Out[60]: dtype('S1')

In [61]: arr2
Out[61]: 
array(['1', '2', '3'], 
      dtype='|S1')



NumPy 数据类型:

object                          O                                Python对象类型
string_                        S                                  固定长度的字符串类型


  • 通过ndarray 的 astype 方法显示地转换其dtype
In [63]: int_arr = np.array([1,2,3,4,])

In [64]: int_arr.dtype
Out[64]: dtype('int64')

In [65]: float_arr = int_arr.astype(np.float64)

In [66]: float_arr.dtype
Out[66]: dtype('float64')

In [67]: float_arr
Out[67]: array([ 1.,  2.,  3.,  4.])

In [68]: int_arr
Out[68]: array([1, 2, 3, 4])


  • 浮点型转换成整型将折断小数点;在转换过程中因为某种原因而失败了,就会引发一个TypeError
In [75]: float_arr = np.array([1.111,2.222,3.333])

In [76]: float_arr
Out[76]: array([ 1.111,  2.222,  3.333])

In [77]: int_arr = float_arr.astype(np.int32)

In [78]: int_arr
Out[78]: array([1, 2, 3], dtype=int32)

  • astype无论如何都会创建出一个新的数组,即使新dtype 与 原来的 dtype 相同也是如此
In [79]: int_arr.astype(float_arr.dtype)
Out[79]: array([ 1.,  2.,  3.])

In [80]: float_arr.astype(int_arr.dtype)
Out[80]: array([1, 2, 3], dtype=int32)

(4)数组与标量之间的运算

我们不需要编写循环即可对数据执行批量操作,这样通常叫做矢量化,大小相等的数组之间的任何算数运算都会将运算应用到元素级,

In [24]: arr
Out[24]: 
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])

In [25]: arr + arr
Out[25]: 
array([[  2.,   4.,   6.],
       [  8.,  10.,  12.]])

In [26]: arr / arr
Out[26]: 
array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

In [27]: 1/arr
Out[27]: 
array([[ 1.        ,  0.5       ,  0.33333333],
       [ 0.25      ,  0.2       ,  0.16666667]])

In [28]: arr ** 0.5
Out[28]: 
array([[ 1.        ,  1.41421356,  1.73205081],
       [ 2.        ,  2.23606798,  2.44948974]])

不同大小的数组之间的运算叫做广播,之后会介绍。

(5)基本的索引和切片


当标准值赋值给一个切片时,该值会自动传播(广播)到整个选区。跟列表的最重要的区别在于,数组切片是原始数组的视图。


In [29]: arr = np.arange(10)

In [30]: arr
Out[30]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [31]: arr[5]
Out[31]: 5

In [32]: arr[5:8]
Out[32]: array([5, 6, 7])

In [33]: arr[:-3] = 0

In [34]: arr
Out[34]: array([0, 0, 0, 0, 0, 0, 0, 7, 8, 9])

In [35]: arr = range(1,10)

In [36]: arr
Out[36]: [1, 2, 3, 4, 5, 6, 7, 8, 9]

In [37]: arr.dtype
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-37-5d17617c092d> in <module>()
----> 1 arr.dtype

AttributeError: 'list' object has no attribute 'dtype'

In [38]: arr
Out[38]: [1, 2, 3, 4, 5, 6, 7, 8, 9]

In [39]: arr[3:5]
Out[39]: [4, 5]

In [40]: arr[3:5] = 0
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-40-449b29d2513c> in <module>()
----> 1 arr[3:5] = 0

TypeError: can only assign an iterable


数组切片是原始数组的视图,数据不会被复制,视图上的任何修改都会直接映射到原数组上,


In [59]: arr
Out[59]: array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9])

In [60]: arr_slice = arr[5:8]

In [61]: arr_slice
Out[61]: array([12, 12, 12])

In [62]: arr_slice[1] = 10010

In [63]: arr
Out[63]: array([    0,     1,     2,     3,     4,    12, 10010,    12,     8,     9])

In [64]: arr_slice[:] = 10086

In [65]: arr
Out[65]: array([    0,     1,     2,     3,     4, 10086, 10086, 10086,     8,


如想得到的是ndarray切片的一份副本而非视图,就要显试的复制操作,


In [70]: arr[5:8].copy()
Out[70]: array([10086, 10086, 10086])

高维数组,

In [74]: arr2d = np.array([[1,2,3],[4,5,6],[7,8,9]])

In [75]: arr2d
Out[75]: 
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [76]: arr2d[2]
Out[76]: array([7, 8, 9])

In [77]: arr2d[0][2]
Out[77]: 3

In [78]: arr2d[0,2]
Out[78]: 3

切片索引,

In [79]: arr2d
Out[79]: 
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [80]: arr2d[:2,1:]
Out[80]: 
array([[2, 3],
       [5, 6]])

In [81]: arr2d[:,:1]
Out[81]: 
array([[1],
       [4],
       [7]])

(6)布尔型索引

In [83]: names = np.array(['a','b','c','d','e'])

In [84]: data = randn(5,3)

In [85]: names
Out[85]: 
array(['a', 'b', 'c', 'd', 'e'], 
      dtype='|S1')

In [86]: data
Out[86]: 
array([[-0.11552192, -1.02001624, -2.41543696],
       [-0.71050235, -1.05874464, -0.44239117],
       [ 0.75511336,  1.5671725 ,  0.26329833],
       [-1.18434069, -0.4040645 ,  0.20707161],
       [-0.1109921 ,  0.77768241, -0.74954289]])

In [87]: data[names == 'b']
Out[87]: array([[-0.71050235, -1.05874464, -0.44239117]])

In [88]: data[names == 'b',2]
Out[88]: array([-0.44239117])
In [90]: data[names != 'b',2]
Out[90]: array([-2.41543696,  0.26329833,  0.20707161, -0.74954289])
综合应用多个布尔条件。
In [97]: data[(names != 'b') & (names != 'c')]
Out[97]: 
array([[-0.11552192, -1.02001624, -2.41543696],
       [-1.18434069, -0.4040645 ,  0.20707161],
       [-0.1109921 ,  0.77768241, -0.74954289]])

In [98]: data[(names != 'b') | (names != 'c')]
Out[98]: 
array([[-0.11552192, -1.02001624, -2.41543696],
       [-0.71050235, -1.05874464, -0.44239117],
       [ 0.75511336,  1.5671725 ,  0.26329833],
       [-1.18434069, -0.4040645 ,  0.20707161],
       [-0.1109921 ,  0.77768241, -0.74954289]])

In [99]: data
Out[99]: 
array([[-0.11552192, -1.02001624, -2.41543696],
       [-0.71050235, -1.05874464, -0.44239117],
       [ 0.75511336,  1.5671725 ,  0.26329833],
       [-1.18434069, -0.4040645 ,  0.20707161],
       [-0.1109921 ,  0.77768241, -0.74954289]])


(7)花式索引


利用整数数组去索引,

In [110]: arr = np.empty((8,4))

In [111]: for i in range(8):
    arr[i] = i
   .....:     

In [112]: arr[[4,0,3,7]]
Out[112]: 
array([[ 4.,  4.,  4.,  4.],
       [ 0.,  0.,  0.,  0.],
       [ 3.,  3.,  3.,  3.],
       [ 7.,  7.,  7.,  7.]])

In [113]: arr = np.arange(16).reshape((4,4))

In [114]: arr
Out[114]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [115]: arr[[1,0,2],[2,0,1]]
Out[115]: array([6, 0, 9])

花式索引和切片不一样,它总是将数据复制到新数组中。

In [116]: arr[[1,0,2]][:,[2,0,1]]
Out[116]: 
array([[ 6,  4,  5],
       [ 2,  0,  1],
       [10,  8,  9]])

In [118]: arr[np.ix_([1,0,2],[2,0,1])]
Out[118]: 
array([[ 6,  4,  5],
       [ 2,  0,  1],
       [10,  8,  9]])

In [119]: np.ix_([1,0,2],[2,0,1])
Out[119]: 
(array([[1],
       [0],
       [2]]), array([[2, 0, 1]]))

(8)数组转置和轴对换


In [121]: arr
Out[121]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [122]: arr.T
Out[122]: 
array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

In [123]: np.dot(arr.T,arr)
Out[123]: 
array([[ 80,  92, 104, 116],
       [ 92, 107, 122, 137],
       [104, 122, 140, 158],
       [116, 137, 158, 179]])


transpose 需要得到一个由轴编号组成的元组才能对这些轴进行转置,


In [124]: arr = np.arange(16).reshape((2,2,4))

In [125]: arr
Out[125]: 
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]]])

In [126]: arr.transpose((1,0,2))
Out[126]: 
array([[[ 0,  1,  2,  3],
        [ 8,  9, 10, 11]],

       [[ 4,  5,  6,  7],
        [12, 13, 14, 15]]])

In [127]: arr.swapaxes(0,1)
Out[127]: 
array([[[ 0,  1,  2,  3],
        [ 8,  9, 10, 11]],

       [[ 4,  5,  6,  7],
        [12, 13, 14, 15]]])

两个函数都是返回原数据的视图,不会进行任何复制操作。



通用函数的介绍: 


这些函数都是元素级的函数,使用矢量化包装器。 

  • 一元函数

In [2]: arr = np.arange(10)

In [3]: arr
Out[3]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [4]: np.sqrt(arr)
Out[4]: 
array([ 0.        ,  1.        ,  1.41421356,  1.73205081,  2.        ,
        2.23606798,  2.44948974,  2.64575131,  2.82842712,  3.        ])

In [5]: np.exp(arr)
Out[5]: 
array([  1.00000000e+00,   2.71828183e+00,   7.38905610e+00,
         2.00855369e+01,   5.45981500e+01,   1.48413159e+02,
         4.03428793e+02,   1.09663316e+03,   2.98095799e+03,
         8.10308393e+03])

  • 二元函数
(1)maximum 求得元素级别最大值

In [24]: arr = randn(7)*5

In [25]: np.modf(arr)
Out[25]: 
(array([-0.38426056, -0.88663126,  0.54868246,  0.69865541, -0.8307569 ,
       -0.07793903,  0.07893149]),
 array([ -2.,  -3.,  13.,   0.,  -1.,  -4.,   6.]))



(2)modf 返回的是小数部分 和 整数部分的两个数组


In [24]: arr = randn(7)*5

In [25]: np.modf(arr)
Out[25]: 
(array([-0.38426056, -0.88663126,  0.54868246,  0.69865541, -0.8307569 ,
       -0.07793903,  0.07893149]),
 array([ -2.,  -3.,  13.,   0.,  -1.,  -4.,   6.]))

(3)ufunc表






利用数组进行数据分析:

  • meshgrid
接受两个一维数组并产生两个二维矩阵(数组与矩阵相对应,第一个数组对应第一个矩阵x轴;第二个数组对应第二个矩阵y轴)

In [31]: points = np.arange(-5,5,0.01)

In [32]: xs, ys = np.meshgrid(points,points)

In [33]: xs
Out[33]: 
array([[-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       ..., 
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99]])

In [34]: ys
Out[34]: 
array([[-5.  , -5.  , -5.  , ..., -5.  , -5.  , -5.  ],
       [-4.99, -4.99, -4.99, ..., -4.99, -4.99, -4.99],
       [-4.98, -4.98, -4.98, ..., -4.98, -4.98, -4.98],
       ..., 
       [ 4.97,  4.97,  4.97, ...,  4.97,  4.97,  4.97],
       [ 4.98,  4.98,  4.98, ...,  4.98,  4.98,  4.98],
       [ 4.99,  4.99,  4.99, ...,  4.99,  4.99,  4.99]])

In [35]: z = np.sqrt(xs ** 2 + ys ** 2)

In [36]: z
Out[36]: 
array([[ 7.07106781,  7.06400028,  7.05693985, ...,  7.04988652,
         7.05693985,  7.06400028],
       [ 7.06400028,  7.05692568,  7.04985815, ...,  7.04279774,
         7.04985815,  7.05692568],
       [ 7.05693985,  7.04985815,  7.04278354, ...,  7.03571603,
         7.04278354,  7.04985815],
       ..., 
       [ 7.04988652,  7.04279774,  7.03571603, ...,  7.0286414 ,
         7.03571603,  7.04279774],
       [ 7.05693985,  7.04985815,  7.04278354, ...,  7.03571603,
         7.04278354,  7.04985815],
       [ 7.06400028,  7.05692568,  7.04985815, ...,  7.04279774,
         7.04985815,  7.05692568]])

利用matplotlib 的 imshow 函数图像化数据,

In [43]: plt.imshow(z, cmap=plt.cm.gray);plt.colorbar()
Out[43]: <matplotlib.colorbar.Colorbar instance at 0x7f49b65dcd88>




  • 将条件逻辑表述维数组运算
(1) 两个值数组,一个布尔数组,通过布尔数组中的数据去取值

In [45]: xarr = np.array([1.1,1.2,1.3,1.4,1.5])

In [46]: yarr = np.array([2.1,2.2,2.3,2.4,2.5])

In [47]: cond = np.array([True,False,True,True,False])

In [48]: result = np.where(cond,xarr,yarr)

In [49]: result
Out[49]: array([ 1.1,  2.2,  1.3,  1.4,  2.5])


(2)传入数据中包含标量

In [51]: arr = randn(4,4)

In [52]: arr
Out[52]: 
array([[-0.69197635, -0.16654029, -0.404911  , -1.15366779],
       [-0.75540129,  0.4264011 ,  0.51838994, -1.17759759],
       [-0.02668043,  1.29681056, -0.21656743, -0.37266106],
       [-0.29146894,  0.30800829,  0.29399253,  2.15750223]])

In [53]: np.where(arr > 0, 2, -2)
Out[53]: 
array([[-2, -2, -2, -2],
       [-2,  2,  2, -2],
       [-2,  2, -2, -2],
       [-2,  2,  2,  2]])

In [54]: np.where(arr > 0, 2, arr)
Out[54]: 
array([[-0.69197635, -0.16654029, -0.404911  , -1.15366779],
       [-0.75540129,  2.        ,  2.        , -1.17759759],
       [-0.02668043,  2.        , -0.21656743, -0.37266106],
       [-0.29146894,  2.        ,  2.        ,  2.        ]])


(3)通过两个布尔数组的组合列出【1,2,3,4】四种结果

In [59]: np.where(cond01 & cond02, 0, np.where(cond01,1,np.where(cond02,2,3)))
Out[59]: array([0, 3])

  • 数据和统计方法
(1)基本数组统计方法:



(2)除了正常操作之外还可一进行维度上的计算,返回的少了一个维度的数组。

In [3]: arr = np.random.randn(5,4)

In [4]: arr
Out[4]: 
array([[-0.39363702, -0.58032882, -0.61961903,  0.65173318],
       [-2.82998064, -1.29573389,  1.26868564, -1.57221668],
       [-0.25741113,  1.11530682,  2.52294025, -0.2152994 ],
       [-1.35972807, -1.20833599, -1.39896103, -1.61382468],
       [-1.98407984, -0.56897573, -1.22872402, -0.95416608]])

In [5]: arr.std()
Out[5]: 1.2145328715136516

In [6]: arr.mean()
Out[6]: -0.6261178077018037

In [7]: arr.mean(axis = 0)
Out[7]: array([-1.36496734, -0.50761352,  0.10886436, -0.74075473])

In [8]: arr.sum(axis = 0)
Out[8]: array([-6.8248367 , -2.53806761,  0.54432182, -3.70377366])

In [9]: arr.sum(axis = 1)
Out[9]: array([-0.94185169, -4.42924556,  3.16553654, -5.58084977, -4.73594568])

  • 用于布尔数组的方法
(1)sum经常被用来对布尔型数组中的True值计数

In [10]: arr = randn(100)

In [11]: (arr>0).sum()
Out[11]: 53

In [12]: arr.sum()
Out[12]: 0.48998201181042778

In [13]: (arr>0)
Out[13]: 
array([ True, False,  True,  True,  True,  True, False,  True, False,
       False, False, False, False,  True, False,  True,  True,  True,
       False, False,  True,  True,  True,  True, False, False,  True,
        True,  True, False,  True, False, False, False, False,  True,
        True,  True,  True, False,  True,  True,  True,  True,  True,
        True,  True,  True, False,  True,  True, False,  True, False,
        True,  True,  True, False,  True,  True, False,  True, False,
       False, False,  True,  True, False, False, False, False, False,
        True, False,  True, False,  True, False, False,  True,  True,
        True, False,  True, False, False,  True,  True, False,  True,
       False, False, False, False,  True, False, False,  True, False, False], dtype=bool)

(2)any 测试数组中是否存在一个或者多个True; all 测试数组中是否全部为True

n [14]: bools = np.array([False,False,True,False])

In [15]: bools.any()
Out[15]: True

In [16]: bools.all()
Out[16]: False

  • 排序

(1)简单的排序

In [8]: arr
Out[8]: 
array([ 0.44042138,  0.27107379, -0.35692384,  0.95729632, -0.94924333,
        0.50478958,  0.88257156, -1.65782387])

In [9]: arr.sort()

In [10]: arr
Out[10]: 
array([-1.65782387, -0.94924333, -0.35692384,  0.27107379,  0.44042138,
        0.50478958,  0.88257156,  0.95729632])



(2)对选中的轴进行排序

In [11]: arr = randn(5,3)

In [12]: arr
Out[12]: 
array([[-0.82758627,  0.66299892, -0.38329584],
       [-1.64121696,  0.39644587, -0.35494659],
       [-1.25550146,  0.92088691,  0.24278241],
       [-1.20775834, -0.69017047,  0.08544922],
       [ 0.16352949, -0.11005891,  0.3249279 ]])

In [14]: arr.sort(1)

In [15]: arr
Out[15]: 
array([[-0.82758627, -0.38329584,  0.66299892],
       [-1.64121696, -0.35494659,  0.39644587],
       [-1.25550146,  0.24278241,  0.92088691],
       [-1.20775834, -0.69017047,  0.08544922],
       [-0.11005891,  0.16352949,  0.3249279 ]])

In [16]: arr.sort(0)

In [17]: arr
Out[17]: 
array([[-1.64121696, -0.69017047,  0.08544922],
       [-1.25550146, -0.38329584,  0.3249279 ],
       [-1.20775834, -0.35494659,  0.39644587],
       [-0.82758627,  0.16352949,  0.66299892],
       [-0.11005891,  0.24278241,  0.92088691]])

(3)np.sort() 返回的数组的已排序副本;sort 的就地排序会修改数组的本身

  • 唯一化以及其他的集合逻辑

(1)np.unique 找出数组中的唯一值并返回已排序的结果

In [39]: names = np.array(['Bob','Joe','Will','Bob','Will','Joe','Joe'])

In [40]: np.unique(names)
Out[40]: 
array(['Bob', 'Joe', 'Will'], 
      dtype='|S4')


原生python 代码:

In [43]: sorted(set(names))
Out[43]: ['Bob', 'Joe', 'Will']

(2)np.in1d 测试一个数组中的值是否在另个数组中出现过,返回一个布尔数组

In [45]: values = np.array([6,0,0,3,2,5,6])

In [46]: np.in1d(values,[2,3,6])
Out[46]: array([ True, False, False,  True,  True, False,  True], dtype=bool)

(3)数组的集合运算




用于文件的输入输出:

  • 将数组以二进制格式保存到磁盘
(1)save 将原始二进制格式保存在扩展名为.npy 的文件中

In [50]: arr
Out[50]: 
array([-1.72684184,  1.44283203, -0.38139619, -0.87512866,  1.19248012,
       -0.69037504, -0.54342753,  0.92962819,  1.37471742, -0.43971931])

In [51]: np.save('some_array',arr)

In [52]: ll
total 44
-rw-r--r-- 1 peerslee   87  4月 30 18:04 ma6174
drwxrw-r-- 5 peerslee 4096  4月 30 20:22 opt/
-rw-r--r-- 1 peerslee  160  5月  3 17:39 some_array.npy

(2)savez将多个数组保存到一个压缩文件当中

In [61]: np.savez('array_archive.npz',a=arr,b=arr)
In [66]: arch = np.load('array_archive.npz')


In [67]: arch['b']
Out[67]: 
array([-1.72684184,  1.44283203, -0.38139619, -0.87512866,  1.19248012,
       -0.69037504, -0.54342753,  0.92962819,  1.37471742, -0.43971931])


  • 存取文本文件
(1)取
In [48]: cat arr*.txt
-1.72684184,1.44283203,-0.38139619,-0.87512866,1.19248012
-0.69037504,-0.54342753,0.92962819,1.37471742,-0.43971931
-0.69037504,-0.54342753,0.92962819,1.37471742,-0.43971931
-0.69037504,-0.54342753,0.92962819,1.37471742,-0.43971931
-0.69037504,-0.54342753,0.92962819,1.37471742,-0.43971931
-0.69037504,-0.54342753,0.92962819,1.37471742,-0.43971931
-0.69037504,-0.54342753,0.92962819,1.37471742,-0.43971931
-0.69037504,-0.54342753,0.92962819,1.37471742,-0.43971931
-0.69037504,-0.54342753,0.92962819,1.37471742,-0.43971931
-0.69037504,-0.54342753,0.92962819,1.37471742,-0.43971931
-0.69037504,-0.54342753,0.92962819,1.37471742,-0.43971931
-0.69037504,-0.54342753,0.92962819,1.37471742,-0.43971931

In [49]: arr = np.loadtxt('array_ex.txt',delimiter=',')

In [50]: arr
Out[50]: 
array([[-1.72684184,  1.44283203, -0.38139619, -0.87512866,  1.19248012],
       [-0.69037504, -0.54342753,  0.92962819,  1.37471742, -0.43971931],
       [-0.69037504, -0.54342753,  0.92962819,  1.37471742, -0.43971931],
       [-0.69037504, -0.54342753,  0.92962819,  1.37471742, -0.43971931],
       [-0.69037504, -0.54342753,  0.92962819,  1.37471742, -0.43971931],
       [-0.69037504, -0.54342753,  0.92962819,  1.37471742, -0.43971931],
       [-0.69037504, -0.54342753,  0.92962819,  1.37471742, -0.43971931],
       [-0.69037504, -0.54342753,  0.92962819,  1.37471742, -0.43971931],
       [-0.69037504, -0.54342753,  0.92962819,  1.37471742, -0.43971931],
       [-0.69037504, -0.54342753,  0.92962819,  1.37471742, -0.43971931],
       [-0.69037504, -0.54342753,  0.92962819,  1.37471742, -0.43971931],
       [-0.69037504, -0.54342753,  0.92962819,  1.37471742, -0.43971931]])


(2)存

In [65]: arr = random.randn(10)

In [66]: arr
Out[66]: 
array([ 1.01909753,  1.50819412, -0.49293385,  0.30151565, -0.10602071,
        0.09623269, -1.14679074,  1.30595798, -0.34700756,  0.13555792])

In [67]: np.savetxt('array01.txt',arr,delimiter = '\n')

In [68]: cat array01.txt
1.019097530090067094e+00
1.508194120473395072e+00
-4.929338473000169363e-01
3.015156466063917406e-01
-1.060207072744920320e-01
9.623268773502001439e-02
-1.146790738344216187e+00
1.305957976175855517e+00
-3.470075635778481771e-01
1.355579200596107037e-01



线性代数:

  • 点积
In [78]: x
Out[78]: 
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])

In [79]: y
Out[79]: 
array([[  6.,  23.],
       [ -1.,   7.],
       [  8.,   9.]])

In [80]: x.dot(y)
Out[80]: 
array([[  28.,   64.],
       [  67.,  181.]])

In [87]: np.dot(x,np.ones(3))
Out[87]: array([  6.,  15.])

In [88]: x
Out[88]: 
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])

  • 常用的numpy.linalg 函数




随机数生成:

  • normal 得到一个标准正太分布的 4*4 的样本数组
In [90]: samples = np.random.normal(size=(4,4))

In [91]: samples
Out[91]: 
array([[ 0.65434274,  2.55328917, -0.16027389, -0.14723887],
       [-0.87555718,  0.33198742,  0.17313982,  1.34232683],
       [ 0.63119381,  0.86120016,  0.27243808,  0.12564236],
       [ 0.76887308,  1.62427568,  2.27735502, -0.56002867]])

  • numpy.random 函数





关联文章:




评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

PeersLee

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值