NumPy Exercises

潘诺西亚的火山

Tags: machine learning

Article link: https://blog.csdn.net/helldoger/article/details/107399353

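NumPy is the core library for scientific computing in Python. Other scientific Python packages, including Pandas and sklearn, are built on top of NumPy. NumPy is very powerful on its own, and together with SciPy it gives Python a comprehensive numerical computing stack with a rich and extensive set of function interfaces.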

In this course we focus on the following points:
1. Defining and using arrays
2. Indexing and selecting array elements
3. Array computation
4. Linear algebra operations

Arrays

An array stores a sequence of data of a single type and is indexed by non-negative integers. The number of dimensions is the rank of the array.

We can create an array from a Python list and use square brackets to access its elements.

import numpy as np 
a = np.array([1,3,4,6,10])

print(a)

print(a.size)
print(a.shape)
print(a[2])

[ 1  3  4  6 10]
5
(5,)
4

# Multidimensional array: three levels of nested brackets make this 3-D, with shape (1, 2, 4)

b = np.array([[[1,2,3,4],[5,6,7,8]]])
print(b.shape)
#b[0,1]

(1, 2, 4)
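As a quick check of the rank idea described above, the following minimal sketch uses the `ndim` attribute, which reports the number of dimensions, and indexes into the same array `b`. The specific indices chosen are only illustrative.

```python
import numpy as np

a = np.array([1, 3, 4, 6, 10])
b = np.array([[[1, 2, 3, 4], [5, 6, 7, 8]]])

# ndim is the number of axes, i.e. the rank of the array
print(a.ndim)        # 1
print(b.ndim)        # 3

# One index per axis: block 0, row 1, column 2
print(b[0, 1, 2])    # 7

# Supplying fewer indices than axes returns a sub-array
print(b[0, 1])       # [5 6 7 8]
```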

Creating Arrays

NumPy provides built-in functions for creating some special arrays.

np.zeros(3)

array([ 0.,  0.,  0.])

np.ones([3,3])

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

b.shape

(1, 2, 4)

np.zeros_like(b)

array([[[0, 0, 0, 0],
        [0, 0, 0, 0]]])

np.eye(3)

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])
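The outputs above hint at a detail worth checking: `np.zeros`, `np.ones` and `np.eye` produce floats by default, while `np.zeros_like` copies the shape and dtype of the array it is given. A small sketch, with illustrative variable names:

```python
import numpy as np

b = np.array([[[1, 2, 3, 4], [5, 6, 7, 8]]])

z = np.zeros(3)          # float64 by default
zl = np.zeros_like(b)    # same shape and integer dtype as b
e = np.eye(3)            # 3x3 identity matrix

print(z.dtype)                                   # float64
print(zl.shape)                                  # (1, 2, 4)
print(zl.dtype == b.dtype)                       # True
print(np.array_equal(e, np.diag(np.ones(3))))    # True
```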

Common Array Attributes and Methods

Statistics
Sorting
Finding indices by value
Conditional lookup
shape

a = np.random.rand(3,4)
a.shape

(3, 4)

a.size   # total number of elements: 12

len(a)   # length of the first axis: 3

a        # display the array itself

array([[ 0.36963134,  0.12590815,  0.52912576,  0.38604634],
       [ 0.98066039,  0.93271032,  0.30694261,  0.58081517],
       [ 0.85971519,  0.89180773,  0.39815457,  0.73372857]])

np.sum(a)            # sum of all elements
np.sum(a, axis=1)    # sum across each row
np.sum(a, axis=0)    # sum down each column (the output shown below)

array([ 2.21000691,  1.95042619,  1.23422294,  1.70059008])
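Because the array above is random, the effect of `axis` can be hard to verify by eye. The sketch below repeats the same three calls on a small fixed array (the values are chosen purely for illustration) so the row and column sums can be checked by hand.

```python
import numpy as np

m = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

print(np.sum(m))           # 78, all elements
print(np.sum(m, axis=1))   # [10 26 42], one sum per row
print(np.sum(m, axis=0))   # [15 18 21 24], one sum per column
```

The `axis` argument names the axis that is collapsed: `axis=0` removes the row axis and leaves one value per column.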

np.mean(a)   # mean of all elements
np.std(a)    # standard deviation of all elements (the output shown below)

0.27852909235886786

# Sorting (along each row)
np.sort(a,axis = 1)

array([[ 0.12590815,  0.36963134,  0.38604634,  0.52912576],
       [ 0.30694261,  0.58081517,  0.93271032,  0.98066039],
       [ 0.39815457,  0.73372857,  0.85971519,  0.89180773]])

# Returns the indices that would sort this array.
a.argsort()

array([[1, 0, 3, 2],
       [2, 3, 1, 0],
       [2, 3, 0, 1]], dtype=int64)
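To show how the indices returned by `argsort` relate to `np.sort`, here is a small sketch on a fixed array. It assumes NumPy 1.15 or later, where `np.take_along_axis` is available; the array values are illustrative.

```python
import numpy as np

m = np.array([[3, 1, 2],
              [9, 7, 8]])

idx = m.argsort()    # sorts along the last axis by default
print(idx)
# [[1 2 0]
#  [1 2 0]]

# Applying the indices row by row reproduces the sorted array
rebuilt = np.take_along_axis(m, idx, axis=1)
print(np.array_equal(rebuilt, np.sort(m, axis=1)))   # True
```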

# Returns the indices of the maximum values along an axis.
np.argmax(a,axis = 1)

array([2, 0, 1], dtype=int64)

np.max(a,axis = 1)

array([ 0.52912576,  0.98066039,  0.89180773])

# Return elements, either from `x` or `y`, depending on `condition`.
# If only `condition` is given, return ``condition.nonzero()``
np.where(a>0.5)

(array([0, 1, 1, 1, 2, 2, 2], dtype=int64),
 array([2, 0, 1, 3, 0, 1, 3], dtype=int64))
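The tuple returned by the one-argument form of `np.where` holds one index array per axis, so it can be fed straight back into the array. The sketch below also shows the three-argument form mentioned in the docstring comment; the input values are illustrative.

```python
import numpy as np

m = np.array([[0.2, 0.7],
              [0.9, 0.1]])

# One-argument form: row indices and column indices of the matches
rows, cols = np.where(m > 0.5)
print(rows, cols)       # [0 1] [1 0]
print(m[rows, cols])    # [0.7 0.9]

# Three-argument form: choose 1 where the condition holds, else 0
flags = np.where(m > 0.5, 1, 0)
print(flags)
# [[0 1]
#  [1 0]]
```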

Random Numbers

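NumPy can generate random numbers according to various rules. Random numbers will be used frequently later on, in probability theory and data mining.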

Official documentation: the numpy.random module

Some commonly used functions:

rand(d0, d1, …, dn) Random values in a given shape.
randn(d0, d1, …, dn) Return a sample (or samples) from the “standard normal” distribution.
randint(low[, high, size, dtype]) Return random integers from low (inclusive) to high (exclusive).
random([size]) Return random floats in the half-open interval [0.0, 1.0).
sample([size]) Return random floats in the half-open interval [0.0, 1.0).
choice(a[, size, replace, p]) Generates a random sample from a given 1-D array

np.random.rand(10)
np.random.rand(3,4)

array([[ 0.66871582,  0.41359784,  0.06186174,  0.91262814],
       [ 0.10415888,  0.74117872,  0.28998329,  0.73763488],
       [ 0.76904933,  0.92487812,  0.9111976 ,  0.00709124]])

np.random.randn(10)

array([ 2.45296079,  1.59713311,  0.84757927,  0.27085421,  0.62772085,
       -0.02441075, -1.79474675, -0.869072  , -0.74012579,  0.34411744])

np.random.randint(10)
np.random.randint(1,10,size = (3,4))

array([[4, 6, 5, 8],
       [8, 6, 7, 5],
       [6, 7, 4, 1]])

np.random.random((2,2))

array([[ 0.75932906,  0.71121568],
       [ 0.90087898,  0.48370479]])

np.random.choice(10,(3,4))

array([[5, 7, 2, 6],
       [1, 5, 3, 1],
       [8, 1, 8, 7]])

np.random.choice([1,4,5,7.08],(3,4))

array([[ 5.  ,  7.08,  7.08,  4.  ],
       [ 4.  ,  4.  ,  7.08,  4.  ],
       [ 7.08,  7.08,  7.08,  7.08]])
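The outputs above change on every run. One way to make them repeatable, sketched here with the same legacy `np.random` functions used in this section, is to call `np.random.seed` first; the seed value 0 is an arbitrary choice. The sketch also uses the `replace` parameter of `choice` from the list above.

```python
import numpy as np

# The same seed gives the same sequence of draws
np.random.seed(0)
first = np.random.randint(1, 10, size=(3, 4))
np.random.seed(0)
second = np.random.randint(1, 10, size=(3, 4))
print(np.array_equal(first, second))            # True

# low is inclusive, high is exclusive
print(first.min() >= 1 and first.max() < 10)    # True

# replace=False draws without repetition
picked = np.random.choice(10, size=4, replace=False)
print(len(set(picked.tolist())) == 4)           # True
```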

# IPython/Jupyter only: show the docstring (in plain Python, use help(np.random.choice))
?np.random.choice

Array Indexing

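Slicing works much as it does for lists, but because an array can have multiple dimensions, we need to specify the operation for each dimension.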

a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]]) # 2-D array, shape = (3, 4)

a[1:3,0:1]
#a[:,:1]

array([[5],
       [9]])
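Two properties of slicing are easy to miss in the example above: a slice keeps its axis while a plain integer drops it, and a basic slice is a view onto the original data rather than a copy. A minimal sketch, with illustrative variable names:

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

col_slice = a[1:3, 0:1]   # slices on both axes: result stays 2-D
col_int = a[1:3, 0]       # integer on the second axis: that axis is dropped
print(col_slice.shape)    # (2, 1)
print(col_int.shape)      # (2,)

# A basic slice is a view, so writing to it changes a as well
col_slice[0, 0] = 50
print(a[1, 0])            # 50
```

Call `.copy()` on the slice when an independent array is needed.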

Integer Array Indexing

a[[1,2],[0,1]]

array([ 5, 10])
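The result above pairs the two index lists element by element: it picks `a[1, 0]` and `a[2, 1]`, not a 2x2 block. The sketch below checks that equivalence and shows a common use, picking one element from each row; the `cols` array is an illustrative choice.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

picked = a[[1, 2], [0, 1]]
same = np.array([a[1, 0], a[2, 1]])
print(picked)                          # [ 5 10]
print(np.array_equal(picked, same))    # True

# Pick one element from each row: columns 0, 2 and 3
cols = np.array([0, 2, 3])
per_row = a[np.arange(3), cols]
print(per_row)                         # [ 1  7 12]
```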

Boolean Indexing

a >4
a[a>4]

array([ 5,  6,  7,  8,  9, 10, 11, 12])
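The expression `a > 4` yields a boolean array of the same shape as `a`, and that mask can be used for assignment as well as for selection. A small sketch; working on a copy is an illustrative choice so that `a` stays intact.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

mask = a > 4
print(mask.shape)    # (3, 4), same shape as a
print(mask.sum())    # 8, the number of True entries

# Assign through the mask on a copy
clipped = a.copy()
clipped[mask] = 0
print(clipped)
# [[1 2 3 4]
#  [0 0 0 0]
#  [0 0 0 0]]
```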

[Figure: indexing illustrated]

Array Math

a = np.random.random([3,4])
b = np.random.random([3,4])
a  # display a

array([[ 0.76004347,  0.4258463 ,  0.08326275,  0.93285095],
       [ 0.87100438,  0.89512213,  0.66405053,  0.37225536],
       [ 0.59545034,  0.41663924,  0.51195997,  0.77346328]])

a + 2

array([[ 2.76004347,  2.4258463 ,  2.08326275,  2.93285095],
       [ 2.87100438,  2.89512213,  2.66405053,  2.37225536],
       [ 2.59545034,  2.41663924,  2.51195997,  2.77346328]])

a * 10

array([[ 7.60043471,  4.25846302,  0.83262751,  9.32850955],
       [ 8.71004379,  8.95122127,  6.64050534,  3.72255362],
       [ 5.95450342,  4.16639238,  5.11959969,  7.73463281]])

b  # display b

array([[ 0.56674767,  0.83059901,  0.08406071,  0.60134785],
       [ 0.68305575,  0.85945331,  0.50625002,  0.65044408],
       [ 0.00539243,  0.39640508,  0.43254736,  0.94011285]])

# Elementwise
a  + b
a - b
a * b
a / b

array([[   1.34106147,    0.51269782,    0.99050729,    1.5512668 ],
       [   1.27515856,    1.04150175,    1.31170472,    0.57230955],
       [ 110.42337422,    1.05104414,    1.18359286,    0.82273451]])

# Elementwise
np.add(a,b)
np.subtract(a,b)
np.multiply(a,b)
np.divide(a,b)

array([[   1.34106147,    0.51269782,    0.99050729,    1.5512668 ],
       [   1.27515856,    1.04150175,    1.31170472,    0.57230955],
       [ 110.42337422,    1.05104414,    1.18359286,    0.82273451]])

* is an elementwise operation, not matrix multiplication. We use the dot function to compute inner products and matrix products.

# shape(a) = 3*4  shape(b.T) = 4*3
a.dot(b.T) # (3*4) * (4*3) = 3 * 3
np.dot(a,b.T)

array([[ 1.35242743,  1.53406623,  1.08590637],
       [ 1.51680278,  1.94256711,  0.99672314],
       [ 1.19168644,  1.52708211,  1.11695854]])
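The difference between `*` and `dot` is easier to verify with small integers. The sketch below uses two illustrative 2x2 arrays and also shows the `@` operator (Python 3.5 and later), which gives the same result as `np.dot` for 2-D arrays.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

elementwise = a * b     # multiplies matching entries
matrix = a.dot(b)       # rows of a times columns of b

print(elementwise)
# [[ 5 12]
#  [21 32]]
print(matrix)
# [[19 22]
#  [43 50]]
print(np.array_equal(a @ b, np.dot(a, b)))   # True
```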

Linear Algebra

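NumPy and SciPy can perform linear algebra computations, but we have not yet covered the necessary linear algebra background. This section is therefore moved to the chapter on linear algebra theory.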
