NumPy Explained: Xiao Ai's Notes

算法程序猿

Last modified 2023-03-26 10:50:29

Tags: python, artificial intelligence

First published 2022-04-14 15:19:31

Article link: https://blog.csdn.net/m0_47533197/article/details/124171372

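NumPy is a foundational library for scientific computing in Python. Its focus is numerical computation, it underlies most other Python scientific computing libraries, and it is mostly used for scientific operations on large, multi-dimensional arrays.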

I. Common data types in NumPy

II. array

array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0,
          like=None)

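Parameters:

Name	Description
object	An array or a nested sequence
dtype	Data type of the array elements, optional
copy	Whether the object is copied, optional (default True)
order	Memory layout of the new array: C is row-major, F is column-major, A uses F if the input is Fortran-contiguous and C otherwise, K (default) keeps the input's layout as closely as possible
subok	If True, subclasses are passed through; by default (False) the result is forced to be a base-class array
ndmin	Minimum number of dimensions of the resulting array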
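
The parameters in the table can be tried out with a minimal sketch like the one below. The variable names (`a`, `b`, `src`, `c`) are only illustrative and do not come from the original notes.

```python
import numpy as np

# dtype forces the element type: the integers are stored as 64-bit floats.
a = np.array([1, 2, 3], dtype=np.float64)

# ndmin prepends axes of length 1 until the array has at least ndmin dimensions.
b = np.array([1, 2, 3], ndmin=2)

# copy=True (the default) makes the result independent of the source array.
src = np.array([1, 2, 3])
c = np.array(src)
c[0] = 99

print(a.dtype)   # float64
print(b.shape)   # (1, 3)
print(src[0])    # 1, the source was not modified
```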

III. astype

a.astype(dtype, order='K', casting='unsafe', subok=True, copy=True)

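Parameters:

dtype, order, subok and copy work like the same-named parameters of array above. casting controls which kinds of data conversion are allowed; the default 'unsafe' permits any conversion.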
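
As a small illustration of astype (the sample values are made up for this sketch), converting floats to integers discards the fractional part, and astype returns a new array rather than changing the original:

```python
import numpy as np

f = np.array([1.7, -2.3, 0.0])

i = f.astype(np.int64)  # the fractional part is discarded (truncation toward zero)
b = i.astype(bool)      # non-zero values become True, zero becomes False

print(i)        # [ 1 -2  0]
print(b)        # [ True  True False]
print(f.dtype)  # float64, f itself is unchanged
```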

IV. shape

a.shape returns the shape of the array.


import numpy as np

a = np.array([3,4,1,5],order="F")

print(a)
print(a.shape)

V. reshape

Give a new shape to the array without changing its data.

Parameters
----------
shape : int or tuple of ints
    The new shape should be compatible with the original shape. If an
    integer is supplied, then the result will be a 1-D array of that
    length.
order : {'C', 'F'}, optional
    Determines whether the array data should be viewed as in C
    (row-major) or FORTRAN (column-major) order.

Returns
-------
reshaped_array : array
    A new view on the array.

import numpy as np

a = np.arange(24).reshape((2,3,4))

print(a)
print(a.shape)

VI. flatten

Flattens an array into one dimension.

import numpy as np

a = np.arange(24).reshape((2,3,4))

print(a)
print(a.shape)

print(a.flatten())
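
One detail worth checking by hand: flatten always returns a copy, whereas reshape returns a view whenever the memory layout allows it. The sketch below uses a small array chosen only for illustration.

```python
import numpy as np

a = np.arange(6).reshape((2, 3))

flat = a.flatten()   # always a copy
flat[0] = 100        # does not touch a

view = a.reshape(6)  # a view here, because a is contiguous in memory
view[1] = 200        # writes through to a

print(a)
# [[  0 200   2]
#  [  3   4   5]]
```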

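VII. Arithmetic with a scalar

The operation is applied to every element of the array. Multiplication, division and subtraction work the same way.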

import numpy as np

a = np.arange(24).reshape((2,3,4))
print(a)
a+=3
print(a)

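VIII. Broadcasting rules

Two arrays are broadcast-compatible if, for each trailing dimension (that is, comparing dimensions starting from the last one), the axis lengths match or one of them is 1. Broadcasting is then carried out over the missing dimensions and the dimensions of length 1.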
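
The rule can be checked with a short sketch. The arrays `row` and `col` are illustrative names, not part of the original notes.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))   # shape (3, 4)
row = np.array([10, 20, 30, 40])    # shape (4,): the missing leading axis is filled in
col = np.array([[1], [2], [3]])     # shape (3, 1): stretched along the length-1 axis

print((a + row).shape)  # (3, 4)
print((a + col).shape)  # (3, 4)

# Shapes (3, 4) and (3,) are not compatible: the trailing lengths 4 and 3
# differ and neither of them is 1.
try:
    a + np.array([1, 2, 3])
    compatible = True
except ValueError:
    compatible = False

print(compatible)  # False
```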

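IX. Reading data

CSV: Comma-Separated Values, a comma-delimited text file

Displayed as: a table

Source file: formatted text in which newlines separate rows and commas separate columns; each line represents one record

1. loadtxt

Function prototype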
loadtxt(fname, dtype=float, comments='#', delimiter=None,
            converters=None, skiprows=0, usecols=None, unpack=False,
            ndmin=0, encoding='bytes', max_rows=None, *, like=None):
    r"""
    Load data from a text file.

    Each row in the text file must have the same number of values.

    Parameters
    ----------
    fname : file, str, or pathlib.Path
        File, filename, or generator to read.  If the filename extension is
        ``.gz`` or ``.bz2``, the file is first decompressed. Note that
        generators should return byte strings.
    dtype : data-type, optional
        Data-type of the resulting array; default: float.  If this is a
        structured data-type, the resulting array will be 1-dimensional, and
        each row will be interpreted as an element of the array.  In this
        case, the number of columns used must match the number of fields in
        the data-type.
    comments : str or sequence of str, optional
        The characters or list of characters used to indicate the start of a
        comment. None implies no comments. For backwards compatibility, byte
        strings will be decoded as 'latin1'. The default is '#'.
    delimiter : str, optional
        The string used to separate values. For backwards compatibility, byte
        strings will be decoded as 'latin1'. The default is whitespace.
    converters : dict, optional
        A dictionary mapping column number to a function that will parse the
        column string into the desired value.  E.g., if column 0 is a date
        string: ``converters = {0: datestr2num}``.  Converters can also be
        used to provide a default value for missing data (but see also
        `genfromtxt`): ``converters = {3: lambda s: float(s.strip() or 0)}``.
        Default: None.
    skiprows : int, optional
        Skip the first `skiprows` lines, including comments; default: 0.
    usecols : int or sequence, optional
        Which columns to read, with 0 being the first. For example,
        ``usecols = (1,4,5)`` will extract the 2nd, 5th and 6th columns.
        The default, None, results in all columns being read.

        .. versionchanged:: 1.11.0
            When a single column has to be read it is possible to use
            an integer instead of a tuple. E.g ``usecols = 3`` reads the
            fourth column the same way as ``usecols = (3,)`` would.
    unpack : bool, optional (transposes the result)
        If True, the returned array is transposed, so that arguments may be
        unpacked using ``x, y, z = loadtxt(...)``.  When used with a
        structured data-type, arrays are returned for each field.
        Default is False.
    ndmin : int, optional
        The returned array will have at least `ndmin` dimensions.
        Otherwise mono-dimensional axes will be squeezed.
        Legal values: 0 (default), 1 or 2.

        .. versionadded:: 1.6.0
    encoding : str, optional
        Encoding used to decode the inputfile. Does not apply to input streams.
        The special value 'bytes' enables backward compatibility workarounds
        that ensures you receive byte arrays as results if possible and passes
        'latin1' encoded strings to converters. Override this value to receive
        unicode arrays and pass strings as input to converters.  If set to None
        the system default is used. The default value is 'bytes'.

        .. versionadded:: 1.14.0
    max_rows : int, optional
        Read `max_rows` lines of content after `skiprows` lines. The default
        is to read all the lines.

        .. versionadded:: 1.16.0
    like : array_like, optional
        Reference object to allow the creation of arrays which are not
        NumPy arrays.

        .. versionadded:: 1.20.0

    Returns
    -------
    out : ndarray
        Data read from the text file.

import numpy as np

file_path = "./trial.csv"

t1 = np.loadtxt(file_path,delimiter=',',dtype='int',unpack=False)

print(t1)
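
To try loadtxt without a file on disk, a minimal sketch can read from an in-memory buffer instead. The CSV content below is made-up sample data, and `io.StringIO` merely stands in for a real file such as ./trial.csv.

```python
import io
import numpy as np

# In-memory stand-in for a small CSV file (hypothetical data).
csv_text = "1,2,3\n4,5,6\n"

# One row of the result per line of the file.
rows = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype=int)

# unpack=True transposes the result, so each column can be unpacked separately.
cols = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype=int, unpack=True)

# usecols=1 reads only the second column.
second = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype=int, usecols=1)

print(rows.shape)  # (2, 3)
print(cols.shape)  # (3, 2)
print(second)      # [2 5]
```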

X. Transposing

1. transpose

2. arr.T

3. swapaxes(axis1, axis2)

import numpy as np

arr = np.arange(24).reshape(4,6)
print(arr,end="\n\n")

print(arr.transpose())
# print(arr.T) gives the same result
# so does print(arr.swapaxes(1, 0))

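XI. NumPy indexing and slicing

1. List slicing
Basic slice expression: [start_index:stop_index:step]
(1) start_index: the index where the slice begins; for a positive step it defaults to 0 if omitted.
(2) stop_index: the index where the slice ends; the element at this index is not included. If omitted, the slice runs to the end of the list, i.e. stop_index is len(list).
(3) step: the stride between consecutive elements; defaults to 1 if omitted.
(4) When step > 0, an omitted start_index means 0 and an omitted stop_index means len(list); the slice runs left to right.
(5) When step < 0, an omitted start_index means the last element (index len(list) - 1) and an omitted stop_index means the slice continues through index 0; the slice runs right to left.
All three values are optional.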
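
The slicing rules above can be sketched as follows. The second half is an addition to the notes: NumPy arrays accept the same start:stop:step syntax, with one slice per axis separated by commas. The sample values are illustrative.

```python
import numpy as np

nums = [0, 1, 2, 3, 4, 5]
print(nums[1:4])     # [1, 2, 3]  stop index is excluded
print(nums[::2])     # [0, 2, 4]  every second element
print(nums[::-1])    # [5, 4, 3, 2, 1, 0]  negative step reverses
print(nums[4:1:-1])  # [4, 3, 2]

arr = np.arange(12).reshape((3, 4))
second_row = arr[1, :]      # [4 5 6 7]
even_cols = arr[:, ::2]     # columns 0 and 2
first_col_rev = arr[::-1, 0]  # first column, bottom to top: [8 4 0]
print(second_row, first_col_rev)
```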

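XII. The dot function

Usage: x.dot(y) is equivalent to np.dot(x, y). If x is an m×n matrix and y is an n×m matrix, x.dot(y) is an m×m matrix.

1. Vector dot product

Multiplying two one-dimensional arrays gives $a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$.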

import numpy as np
x=np.array([0,1,2,3,4])  # equivalent to x = np.arange(0, 5)
y=x[::-1]
print(x)
print(y)
print(np.dot(x,y))

Output:

[0 1 2 3 4]
[4 3 2 1 0]
10

2. Matrix multiplication

import numpy as np
x=np.arange(0,5)
y=np.random.randint(0,10,size=(5,1))
print(x)
print(y)
print("x.shape:",x.shape)
print("y.shape",y.shape)
print(np.dot(x,y))

Output (y is random, so the values differ between runs):

[0 1 2 3 4]
[[3]
 [7]
 [2]
 [8]
 [1]]
x.shape: (5,)
y.shape (5, 1)
[39]

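XIII. The mean function

Signature of mean():
numpy.mean(a, axis, dtype, out, keepdims)

What mean() does: computes the arithmetic mean.
The parameter used most often is axis. Taking an m×n matrix as an example:

axis not set: the mean of all m*n values, returned as a single number
axis=0: the mean of each column, n values in total (shape (n,) for an ndarray, 1×n for np.matrix)
axis=1: the mean of each row, m values in total (shape (m,) for an ndarray, m×1 for np.matrix)

1. On arrays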

>>> a = np.array([[1, 2], [3, 4]])
>>> a
array([[1, 2],
       [3, 4]])
>>> np.mean(a)
2.5
>>> np.mean(a, axis=0)  # axis=0: mean of each column
array([2., 3.])
>>> np.mean(a, axis=1)  # axis=1: mean of each row
array([1.5, 3.5])
>>>

2. On matrices

>>> import numpy as np
>>> num1 = np.array([[1,2,3],[2,3,4],[3,4,5],[4,5,6]])
>>> num1
array([[1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 6]])
>>> num2 = np.asmatrix(num1)  # np.mat is an older alias that NumPy 2.0 removed
>>> num2
matrix([[1, 2, 3],
        [2, 3, 4],
        [3, 4, 5],
        [4, 5, 6]])
>>> np.mean(num2)  # mean of all elements
3.5
>>> np.mean(num2, 0)  # collapse the rows: mean of each column
matrix([[2.5, 3.5, 4.5]])
>>> np.mean(num2, 1)  # collapse the columns: mean of each row
matrix([[2.],
        [3.],
        [4.],
        [5.]])
>>>
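
The keepdims parameter from the signature is not exercised in the examples above. A minimal sketch, assuming the same 2×2 array: with keepdims=True the reduced axis is kept with length 1, so the result can be broadcast back against the input.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

col_means = np.mean(a, axis=0)                 # shape (2,)
row_means = np.mean(a, axis=1, keepdims=True)  # shape (2, 1)

# Subtracting the row means relies on broadcasting the (2, 1) result.
centered = a - row_means

print(col_means.shape, row_means.shape)  # (2,) (2, 1)
print(centered)
# [[-0.5  0.5]
#  [-0.5  0.5]]
```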

XIV. The np.random.shuffle function

Purpose: shuffles the array in place.

import numpy as np

# 1-D array: the elements are reordered in place
arr = np.array(range(0, 21, 2))
np.random.shuffle(arr)
print(arr)

# 2-D array
arr = np.array(range(12)).reshape(3, 4)
np.random.shuffle(arr)
print(arr)

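Output: the result differs on every run. For the 2-D array only the order of the rows changes; the contents of each row stay the same.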
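
A repeatable way to check this behavior is sketched below. The seed value is arbitrary and only makes the run reproducible; the check itself does not depend on it.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make the run repeatable

arr = np.arange(12).reshape(3, 4)
before = arr.copy()
result = np.random.shuffle(arr)  # shuffles in place and returns None

# Every original row is still present; only the row order can differ.
same_rows = sorted(before.tolist()) == sorted(arr.tolist())

print(result, same_rows)  # None True
```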

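XV. Normalization

np.linalg.norm()    # linalg = linear + algebra; norm means the norm of a vector or matrix

x_norm = np.linalg.norm(x, ord=None, axis=None, keepdims=False)

①x: the input matrix (a one-dimensional array also works)

②ord: the type of norm

Vector norms:

ord=None or ord=2: the square root of the sum of squares (Euclidean norm)

ord=1: the sum of the absolute values

ord=np.inf: the largest absolute value

Matrix norms:

ord=1: the largest column sum of absolute values

ord=2: the spectral norm; solve $|\lambda E - A^T A| = 0$ for the eigenvalues and take the square root of the largest one

ord=np.inf: the largest row sum of absolute values

ord=None: by default, the square root of the sum of the squares of all elements (the Frobenius norm). Note that None does not give the matrix 2-norm.

③axis: how the input is processed

axis=1 treats each row as a vector and returns the norm of every row vector

axis=0 treats each column as a vector and returns the norm of every column vector

axis=None computes the matrix norm.

④keepdims: whether to keep the result two-dimensional, avoiding shapes such as shape = (5, )

True keeps the two-dimensional shape; False does not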
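
The definitions above can be verified numerically. The following is a minimal sketch with made-up inputs; the last step, dividing each row by its norm, is one common way to use these norms for normalization and is an illustration rather than something taken from the original notes.

```python
import numpy as np

v = np.array([3.0, -4.0])
v_l2 = np.linalg.norm(v)               # Euclidean norm: sqrt(9 + 16) = 5
v_l1 = np.linalg.norm(v, ord=1)        # sum of absolute values: 7
v_inf = np.linalg.norm(v, ord=np.inf)  # largest absolute value: 4

m = np.array([[1.0, -2.0], [3.0, 4.0]])
m_fro = np.linalg.norm(m)              # Frobenius norm: sqrt(1 + 4 + 9 + 16)
m_one = np.linalg.norm(m, ord=1)       # largest column sum of absolute values: 6
m_inf = np.linalg.norm(m, ord=np.inf)  # largest row sum of absolute values: 7
m_two = np.linalg.norm(m, ord=2)       # spectral norm

# Normalize each row to unit length; keepdims=True keeps shape (2, 1) for broadcasting.
row_norms = np.linalg.norm(m, axis=1, keepdims=True)
unit_rows = m / row_norms

print(v_l2, v_l1, v_inf)  # 5.0 7.0 4.0
print(m_one, m_inf)       # 6.0 7.0
```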

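XVI. np.sum(axis=0/1/2/...n)

import numpy as np
# integers 0 to 26 with step 1
n = np.arange(0, 27, 1)
# reshape into a 3 x 3 x 3 array
n = n.reshape(3,3,3)
# sum over the outermost axis
a = n.sum(axis=0)
# sum over the middle axis
b = n.sum(axis=1)
# sum over the innermost axis
c = n.sum(axis=2)

Summation along axis=0: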

  n: [[[ 0  1  2][ 3  4  5][ 6  7  8]]
            +         +         +
      [[ 9 10 11][12 13 14][15 16 17]]
            +         +         +
      [[18 19 20][21 22 23][24 25 26]]]
            =         =         =
  a:  [[27 30 33][36 39 42][45 48 51]]

Summation along axis=1:

  n: [[[ 0  1  2] + [ 3  4  5] + [ 6  7  8]]
      [[ 9 10 11] + [12 13 14] + [15 16 17]]
      [[18 19 20] + [21 22 23] + [24 25 26]]]
  b: [[ 9 12 15]
      [36 39 42]
      [63 66 69]]

Summation along axis=2:

  n: [[[ 0 + 1 + 2]   [ 3 + 4 + 5]   [ 6 + 7 + 8]]
      [[ 9 +10 +11]   [12 +13 +14]   [15 +16 +17]]
      [[18 +19 +20]   [21 +22 +23]   [24 +25 +26]]]
  c:  [[     3             12             21    ]
       [    30             39             48    ]
       [    57             66             75    ]]

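XVII. np.outer(), np.dot(), np.multiply(), *

np.outer() multiplies two vectors: each element of the first vector is multiplied by every element of the second vector, producing one row of the result.

np.dot() computes the dot product (multiply corresponding elements, then sum) when given rank-1 arrays, i.e. one-dimensional arrays; when given arrays of higher rank it performs matrix multiplication. Note that a matrix times a matrix gives a rank-2 result, while a two-dimensional array times a one-dimensional vector gives a rank-1 result. "Rank" here means the number of array dimensions (ndim), not the linear-algebra rank.

np.multiply() multiplies arrays or matrices element by element; the inputs and the output have the same shape.

* multiplies element by element for arrays (ndarray) and performs matrix multiplication for matrices (np.matrix).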

import numpy as np

print("Multiplying one-dimensional arrays")
A=np.array([1,2,3])
B=np.array([2,3,4])
print("*:",A*B)  # element-wise product for arrays
print("np.dot():",np.dot(A,B))  # rank-1 inputs: multiply element-wise, then sum
print("np.multiply():",np.multiply(A,B))  # element-wise product
print("np.outer():",np.outer(A,B))  # each element of A times all of B gives one row

print("Multiplying two-dimensional arrays")
a=np.array([[1,2,3],[3,4,5]])
b=np.array([[1,1],[2,2],[3,3]])
c=np.array([[2,2,2],[3,3,3]])
# error, shapes do not match: print("*:",a*b)
print("*:",a*c)  # * is element-wise for arrays
print("np.dot():",np.dot(a,b))  # rank above 1: matrix multiplication, (2,3) x (3,2) = (2,2)
# error, shapes do not match: print("np.multiply():",np.multiply(a,b))
print("np.multiply():",np.multiply(a,c))  # element-wise; the result has the same shape as the inputs

print("Multiplying matrices (np.matrix)")
# np.asmatrix replaces np.mat, an older alias that NumPy 2.0 removed
A=np.asmatrix([[1,2,3],[3,4,5]])
B=np.asmatrix([[1,1],[2,2],[3,3]])
C=np.asmatrix([[2,2,2],[3,3,3]])
D=[1,2,3]
print("*:",A*B)  # * is matrix multiplication for np.matrix
print("np.dot():",np.dot(A,B))  # dot is matrix multiplication for np.matrix
print("np.dot():",np.dot(A,D))
# np.dot(A, D) multiplies each row of A with D and sums; because A is an np.matrix the result stays two-dimensional, shape (1, 2)
print("np.multiply():",np.multiply(A,C))  # multiply is element-wise even for np.matrix

Output:

Multiplying one-dimensional arrays
*: [ 2  6 12]
np.dot(): 20
np.multiply(): [ 2  6 12]
np.outer(): [[ 2  3  4]
 [ 4  6  8]
 [ 6  9 12]]
Multiplying two-dimensional arrays
*: [[ 2  4  6]
 [ 9 12 15]]
np.dot(): [[14 14]
 [26 26]]
np.multiply(): [[ 2  4  6]
 [ 9 12 15]]
Multiplying matrices (np.matrix)
*: [[14 14]
 [26 26]]
np.dot(): [[14 14]
 [26 26]]
np.dot(): [[14 26]]
np.multiply(): [[ 2  4  6]
 [ 9 12 15]]

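XVIII. np.absolute

np.absolute is NumPy's general absolute-value function; it is also available under the alias np.abs. This universal function also handles complex numbers, for which it returns the modulus.

import numpy as np

x = np.array([-2, -1, 0, 1, 2])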
abs(x)
# array([2, 1, 0, 1, 2])
 
 
np.absolute(x)
#  array([2, 1, 0, 1, 2])
np.abs(x)
#  array([2, 1, 0, 1, 2])
 
 
x = np.array([3 - 4j, 4 - 3j, 2 + 0j, 0 + 1j])
np.abs(x)
# array([ 5.,  5.,  2.,  1.])

XIX. numpy.square()

Returns the square of each element.
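
A one-line check, using an illustrative input array: np.square works element-wise and gives the same result as `x ** 2`.

```python
import numpy as np

x = np.array([-2, -1, 0, 1, 2])
sq = np.square(x)  # element-wise square, same as x ** 2

print(sq)  # [4 1 0 1 4]
```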

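XX. np.random

1. The np.random.rand() function

Syntax: np.random.rand(d0, d1, d2, ..., dn)
Note: it is called the same way as np.random.randn().
Purpose: returns one sample or an array of samples drawn from the uniform distribution over [0, 1); the value 1 is excluded.
Application: in the Dropout regularization method of deep learning it can generate the random dropout mask (dl).
Example: with keep_prob as the fraction of neurons to keep: dl = np.random.rand(al.shape[0], al.shape[1]) < keep_prob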

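import numpy as np

keep_prob = 0.8
dl = np.random.rand(2,3) < keep_prob
print(dl)
aa = np.random.rand(2,3)
print(aa)

# Dropout: each element is kept with probability 80% and dropped with probability 20%
a3 = np.array([[1,2,3],[4,5,6]], dtype=float)  # float dtype so the in-place division below works
a3 = np.multiply(a3,dl)
print(a3)

a3 /= keep_prob  # keeps the expected value of a3 unchanged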

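2. The np.random.randn() function

Syntax: np.random.randn(d0, d1, d2, ..., dn)
1. With no arguments it returns a single float.
2. With one argument it returns a rank-1 array of shape (n,), which is neither a row vector nor a column vector.
3. With two or more arguments it returns an array with the corresponding dimensions, which can represent a vector or a matrix.
4. np.random.standard_normal() is similar to np.random.randn(), but it takes the shape as a tuple.
5. The arguments of np.random.randn() should be integers. Older NumPy versions truncated floating-point arguments to integers; recent versions raise a TypeError instead.
Purpose: returns one sample or an array of samples drawn from the standard normal distribution.
Properties: the standard normal distribution is the normal distribution with mean 0 and standard deviation 1, written N(0, 1).

For the area under the standard normal curve: the area between -1.96 and +1.96 is 0.9500 (a value falls in this range with probability 95%), and the area between -2.58 and +2.58 is 0.9900 (probability 99%).
Therefore most samples produced by np.random.randn() fall between -1.96 and +1.96; larger values are possible, only less likely.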
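
The points above can be checked empirically with a rough sketch. The seed and the sample size of 100,000 are arbitrary choices for this illustration; the measured fraction is only approximately 0.95.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed, only to make the run repeatable

scalar = np.random.randn()                # no arguments: a single float
vec = np.random.randn(3)                  # one argument: shape (3,)
mat = np.random.randn(2, 3)               # two arguments: shape (2, 3)
same = np.random.standard_normal((2, 3))  # shape passed as a tuple

# Fraction of samples inside [-1.96, 1.96]; expected to be close to 0.95.
samples = np.random.randn(100_000)
frac = np.mean(np.abs(samples) < 1.96)

print(vec.shape, mat.shape, same.shape)
print(frac)
```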

3. np.random.permutation

This function randomly permutes one-dimensional data; for multi-dimensional data it only permutes along the first axis.

import numpy as np

data = np.array([1,2,3,4,5,6,7])
a = np.random.permutation(data)
b = np.random.permutation([5,0,9,0,1,1,1])
print(a)
print( "data:", data )
print(b)

print(np.random.permutation(5))
# a random permutation of the integers 0 to 4

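XXI. The np.empty function

Creates a new multi-dimensional array of the given shape without initializing its entries; the values are arbitrary until they are assigned.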
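
A minimal sketch of typical usage, with a shape chosen only for illustration: allocate with np.empty, then fill every entry before reading any of them.

```python
import numpy as np

e = np.empty((2, 3))  # contents are whatever happened to be in memory
e[:] = 0.0            # assign every entry before reading it

z = np.zeros((2, 3))  # np.zeros allocates and initializes in one step

print(e.shape, e.dtype)      # (2, 3) float64
print(np.array_equal(e, z))  # True
```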

XXIII. The np.argmin function

np.argmin() returns the index of the minimum value; np.argmax() returns the index of the maximum value.

def argmin(a, axis=None, out=None):
    """
    Returns the indices of the minimum values along an axis.

    Parameters
    ----------
    a : array_like
        Input array.
    axis : int, optional
        By default, the index is into the flattened array, otherwise
        along the specified axis.
    out : array, optional
        If provided, the result will be inserted into this array. It should
        be of the appropriate shape and dtype.
    """

For an array a, np.argmin(a) with no axis returns a single number: the position of the smallest element of a in the flattened array.
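
The flattened index and the per-axis behavior can be compared in a short sketch. The array is made up for illustration, and np.unravel_index is used here only to convert the flat position back into a (row, column) pair.

```python
import numpy as np

a = np.array([[4, 2, 8],
              [1, 7, 3]])

flat_idx = np.argmin(a)                   # 3: position of 1 in [4, 2, 8, 1, 7, 3]
pos = np.unravel_index(flat_idx, a.shape)  # (1, 0): row 1, column 0

per_col = np.argmin(a, axis=0)  # row index of the minimum in each column: [1 0 1]
per_row = np.argmin(a, axis=1)  # column index of the minimum in each row: [1 0]
max_idx = np.argmax(a)          # 2: position of 8 in the flattened array

print(flat_idx, pos, per_col, per_row, max_idx)
```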