[NumPy] 1. N-dimensional arrays, dtype, slicing, indexing

ProphetMo__

Last modified 2022-01-28 00:55:54


First published 2022-01-28 00:54:49

Article link: https://blog.csdn.net/u014568597/article/details/122725871


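NumPy is a third-party Python library whose name is short for "Numerical Python". It supports arithmetic and logical operations on arrays, linear algebra, and much more. I won't cover installation here. I had looked at this library before, during a mathematical modeling contest, but only superficially, so this time I want to go through it more systematically, starting with the most important object in NumPy: the n-dimensional array.

All examples in this post import numpy under the alias np.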

import numpy as np

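1. The ndarray object

The full name is N-dimensional array. It is a collection of elements that all have the same type.

ndarray instances can be constructed with many different array-creation routines. The basic ndarray is created with NumPy's array function, whose signature is: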

np.array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0, like=None)

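Parameters

Parameter	Type	Description
object	array_like	An array, any object exposing the array interface, an object whose __array__() method returns an array, or any (nested) sequence, i.e. a list or tuple that may itself contain lists or tuples, such as [[1, 2], [3, 4]]
dtype	data-type, optional	The desired data type of the array. If omitted, NumPy picks the minimum type needed to hold all the elements of the sequence
copy	bool, optional	If True (default), the object is copied, so the new array never shares data with the input. Otherwise a copy is made only when needed, for example when the input is a nested sequence or when dtype or order require it
order	{'K', 'A', 'C', 'F'}, optional	Memory layout of the array (how the elements are stored). C: row-major (C order). F: column-major (Fortran order). A: F order if the input is Fortran-contiguous and not C-contiguous, otherwise C order. K (default): keep the input's layout as closely as possible
subok	bool, optional	By default (False) the returned array is forced to be a base-class array; if True, sub-classes are passed through
ndmin	int, optional	Minimum number of dimensions of the returned array; axes of length 1 are prepended to the shape as needed
like	array_like	Reference object that allows creating arrays that are not NumPy arrays. If the object passed as like supports the __array_function__ protocol, the result is defined by it and is compatible with that object

That is my own reading of the parameters. For the authoritative wording, here is the docstring from the source: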

array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0,
              like=None)
  
'''
        Create an array.
  
        Parameters
        ----------
        object : array_like
            An array, any object exposing the array interface, an object whose
            __array__ method returns an array, or any (nested) sequence.
        dtype : data-type, optional
            The desired data-type for the array.  If not given, then the type will
            be determined as the minimum type required to hold the objects in the
            sequence.
        copy : bool, optional
            If true (default), then the object is copied.  Otherwise, a copy will
            only be made if __array__ returns a copy, if obj is a nested sequence,
            or if a copy is needed to satisfy any of the other requirements
            (`dtype`, `order`, etc.).
        order : {'K', 'A', 'C', 'F'}, optional
            Specify the memory layout of the array. If object is not an array, the
            newly created array will be in C order (row major) unless 'F' is
            specified, in which case it will be in Fortran order (column major).
            If object is an array the following holds.
  
            ===== ========= ===================================================
            order  no copy                     copy=True
            ===== ========= ===================================================
            'K'   unchanged F & C order preserved, otherwise most similar order
            'A'   unchanged F order if input is F and not C, otherwise C order
            'C'   C order   C order
            'F'   F order   F order
            ===== ========= ===================================================
  
            When ``copy=False`` and a copy is made for other reasons, the result is
            the same as if ``copy=True``, with some exceptions for 'A', see the
            Notes section. The default order is 'K'.
        subok : bool, optional
            If True, then sub-classes will be passed-through, otherwise
            the returned array will be forced to be a base-class array (default).
        ndmin : int, optional
            Specifies the minimum number of dimensions that the resulting
            array should have.  Ones will be pre-pended to the shape as
            needed to meet this requirement.
        like : array_like
            Reference object to allow the creation of arrays which are not
            NumPy arrays. If an array-like passed in as ``like`` supports
            the ``__array_function__`` protocol, the result will be defined
            by it. In this case, it ensures the creation of an array object
            compatible with that passed in via this argument.
'''
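To make a few of these parameters concrete, here is a small sketch of dtype inference, nested sequences, copy, ndmin and order. The variable names are mine and purely illustrative, and the integer dtype that NumPy infers can differ between platforms, so it is not printed.

```python
import numpy as np

# dtype inference: NumPy picks the smallest type that can hold every element.
ints = np.array([1, 2, 3])            # some integer dtype (platform dependent)
mixed = np.array([1, 2.5])            # one float promotes everything to float64
cplx = np.array([1, 2 + 0j])          # one complex promotes to complex128

# A "nested sequence" is simply lists or tuples inside a list or tuple;
# each level of nesting becomes one dimension.
nested = np.array([[1, 2, 3], (4, 5, 6)])

# copy=True (the default) produces a new array that is independent of the input.
src = np.array([1, 2, 3])
dup = np.array(src)                   # same as np.array(src, copy=True)
dup[0] = 99                           # src is not affected

# ndmin prepends axes of length 1 until the result has that many dimensions.
row = np.array([1, 2, 3], ndmin=2)

# order='F' asks for column-major (Fortran) memory layout.
f_arr = np.array([[1, 2], [3, 4]], order='F')

print(mixed.dtype, cplx.dtype)        # float64 complex128
print(nested.shape)                   # (2, 3)
print(src.tolist(), dup.tolist())     # [1, 2, 3] [99, 2, 3]
print(row.shape)                      # (1, 3)
print(f_arr.flags['F_CONTIGUOUS'], f_arr.flags['C_CONTIGUOUS'])  # True False
```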

example

x = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]], dtype=np.intc)
print(x)
'''
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
 '''

x1 = np.array([1, 0, True, False, np.True_, np.False_], dtype=np.bool_)
print(x1)
'''
[ True False  True False  True False]
'''

x2 = np.array([1, 2, 3, 4], dtype=np.float64)  # np.float_ was an alias removed in NumPy 2.0
print(x2)
'''
[1. 2. 3. 4.]
'''

x3 = np.array([1, 2, 3, 4], dtype=np.complex128)  # np.complex_ was an alias removed in NumPy 2.0
print(x3)
'''
[1.+0.j 2.+0.j 3.+0.j 4.+0.j]
'''

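This experiment also shows that NumPy not only has many data types of its own, it also accepts Python's built-in values: the ints and bools above were converted to the requested dtype.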

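An ndarray can be built in many other ways. The most commonly used are linspace() (linearly spaced values), logspace() (logarithmically spaced values), eye() (identity matrix) and zeros() (array of zeros), and there are many more functions that return an ndarray.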
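A quick sketch of those four constructors, with argument values chosen only for illustration:

```python
import numpy as np

# linspace(start, stop, num): num evenly spaced values, stop included by default.
lin = np.linspace(0, 1, 5)

# logspace(start, stop, num): 10**start ... 10**stop with evenly spaced exponents.
log = np.logspace(0, 3, 4)

# eye(n): n x n array with ones on the diagonal and zeros elsewhere.
ident = np.eye(3)

# zeros(shape): array of the given shape filled with 0.0.
zeros = np.zeros((2, 3))

print(lin)
print(log)
print(ident)
print(zeros)
```

All four return float64 arrays unless a dtype is passed explicitly.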

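2. Data types: dtype

The examples above already used several of NumPy's scalar data types.

Data type	Description
bool_	Boolean (True or False), stored as one byte
int_	Default integer type, historically equivalent to C long; usually int32 or int64
intc	Equivalent to C int; usually int32
intp	Integer used for indexing, equivalent to C ssize_t; usually int32 or int64
int8	8-bit integer (-128 to 127)
int16	16-bit integer (-32768 to 32767)
int32	32-bit integer (-2147483648 to 2147483647)
int64	64-bit integer (-9223372036854775808 to 9223372036854775807)
uint8	8-bit unsigned integer (0 to 255)
uint16	16-bit unsigned integer (0 to 65535)
uint32	32-bit unsigned integer (0 to 4294967295)
uint64	64-bit unsigned integer (0 to 18446744073709551615)
float_	Alias for float64 (removed in NumPy 2.0; use float64)
float16	Half-precision float: sign bit, 5-bit exponent, 10-bit mantissa
float32	Single-precision float: sign bit, 8-bit exponent, 23-bit mantissa
float64	Double-precision float: sign bit, 11-bit exponent, 52-bit mantissa
complex_	Alias for complex128 (removed in NumPy 2.0; use complex128)
complex64	Complex number made of two 32-bit floats (real and imaginary parts)
complex128	Complex number made of two 64-bit floats (real and imaginary parts)

Each of these scalar types corresponds to a dtype instance, and each dtype has its own characteristics (size, kind, byte order).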

# Build a dtype from an array scalar type
import numpy as np 
dt = np.dtype(np.int32)  
print(dt)
'''
int32
'''
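A dtype object carries more than its name. The following sketch inspects a few attributes, checks the integer ranges from the table with np.iinfo, and converts an array with astype. The sample values are made up for illustration.

```python
import numpy as np

dt = np.dtype(np.int32)
print(dt.name, dt.itemsize, dt.kind)   # int32 4 i

# np.iinfo reports the representable range of an integer type.
info8 = np.iinfo(np.int8)
print(info8.min, info8.max)            # -128 127

info_u16 = np.iinfo(np.uint16)
print(info_u16.min, info_u16.max)      # 0 65535

# np.finfo does the same job for floating-point types.
print(np.finfo(np.float32).bits)       # 32

# astype returns a converted copy; float -> int truncates toward zero.
floats = np.array([1.7, 2.2, 3.9])
as_int = floats.astype(np.int32)
print(as_int.tolist())                 # [1, 2, 3]
```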

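3. Slicing and indexing

3.1 Basic slicing

The contents of an ndarray can be read and modified through indexing or slicing. When I first met Python this felt like one of its most magical features. Once you know how it works it is less mysterious, but it is still very convenient.

Elements of an ndarray follow zero-based indexing.

Three kinds of indexing are available: field access, basic slicing, and advanced indexing.

3.2 Slice syntax

Let's start with slicing in plain Python.

For a built-in Python sequence (a str, for example), you can access its elements with a slice of the following form: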

str1[start:end:step]

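start: index of the first element to take. If omitted, slicing starts at the first element of the container (or at the last one when step is negative).

end: index at which slicing stops; the element at end itself is not included. If omitted, slicing runs through the last element of the container (or through the first one when step is negative).

step: the stride, i.e. how far to move between picked elements. In the sequence a1, a2, a3, a4, going from a1 to a2 is a step of 1 and going from a1 to a3 is a step of 2. If omitted, it defaults to 1.

Note: start does not have to be smaller than end, and step does not have to be positive. If stepping from start never reaches end, the result is simply empty.

If you omit start but want to give end or step, you must still write the ":" so that the position of each value is unambiguous. The examples below show this better than I can explain it.

Two values must always be separated by ":".

You cannot write three ":" in one slice.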

a[::]        valid
a[:::]       invalid (SyntaxError)
a[:5]        valid
a[:5:]       valid
a[::2]       valid
a[2::1]      valid
a[9:2:2]     valid (but the result is empty)
a[9:2:-1]    valid

example

import numpy as np
a = np.array(range(0, 10))

print(a[::])
"""[0 1 2 3 4 5 6 7 8 9]"""
print(a[5:])
"""[5 6 7 8 9]"""
print(a[5:6])
"""[5]"""
print(a[:6])
"""[0 1 2 3 4 5]"""
print(a[::2])
"""[0 2 4 6 8]"""
print(a[2:7:2])
"""[2 4 6]"""
print(a[8:3:-1])
"""[8 7 6 5 4]"""

The description above also applies to multi-dimensional ndarrays. Here is a two-dimensional array:

import numpy as np
a = np.array([[1,2,3],[3,4,5],[4,5,6]])
print(a)
# Slice from index 1 onward along the first axis (rows)
print('Now slice the array starting from a[1:]')
print(a[1:])

[[1 2 3]
 [3 4 5]
 [4 5 6]]
Now slice the array starting from a[1:]
[[3 4 5]
 [4 5 6]]

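A slice can also contain an ellipsis (...), which stands for all elements of the dimensions it replaces. Used in the row position, it selects every row, so for a 3x3 matrix [..., 1] selects every element of the second column. (Rows and columns can also both be sliced at once, e.g. [0:2, 1:3].)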

import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])
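print('Our array is:')
print(a)
# This returns an array of the elements in the second column:
print('The elements of the second column are:')
print(a[..., 1])
# Now take all elements of the second row:
print('The elements of the second row are:')
print(a[1, ...])
# Now take everything from the second column onward:
print('The second column and everything after it:')
print(a[..., 1:])

Our array is:
[[1 2 3]
 [3 4 5]
 [4 5 6]]
The elements of the second column are:
[2 4 5]
The elements of the second row are:
[3 4 5]
The second column and everything after it: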
[[2 3]
 [4 5]
 [5 6]]
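The ellipsis becomes more useful with more than two dimensions, where it expands to as many full slices as needed. The sketch below also shows the [0:2, 1:3] form mentioned above; the 3-D array cube is my own example, not one from the text.

```python
import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])

# Slice rows and columns at the same time: rows 0-1, columns 1-2.
block = a[0:2, 1:3]
print(block.tolist())            # [[2, 3], [4, 5]]

# A 3-D array of shape (2, 3, 4) holding 0..23.
cube = np.arange(24).reshape(2, 3, 4)

last_axis_first = cube[..., 0]   # same as cube[:, :, 0]
first_block = cube[0, ...]       # same as cube[0, :, :]

print(last_axis_first.shape)     # (2, 3)
print(last_axis_first.tolist())  # [[0, 4, 8], [12, 16, 20]]
print(first_block.shape)         # (3, 4)
```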

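4. Advanced indexing

An ndarray can be indexed with a non-tuple sequence, with an ndarray of integer or boolean dtype, or with a tuple that contains at least one sequence object. This is called advanced indexing. Advanced indexing always returns a copy of the data, whereas basic slicing only returns a view.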
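The view-versus-copy difference can be checked directly. This sketch uses np.shares_memory and a write-through test; the variable names are illustrative only.

```python
import numpy as np

a = np.arange(6)

view = a[1:4]             # basic slice: a view onto a's memory
copied = a[[1, 2, 3]]     # integer-list index: advanced indexing, a copy

print(np.shares_memory(a, view))     # True
print(np.shares_memory(a, copied))   # False

view[0] = 100             # writes through to a
copied[0] = -1            # does not touch a

print(a.tolist())         # [0, 100, 2, 3, 4, 5]
```

If you need an independent array from a slice, call .copy() on it.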

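There are two types of advanced indexing: integer and boolean.

4.1 Integer indexing

This mechanism selects arbitrary elements of an array based on N-dimensional indices. Each integer array holds the index values for one dimension. When the number of index arrays equals the number of dimensions of the target ndarray, the result is straightforward: the arrays are paired up element by element to form coordinates.

The following example picks one element from a specified column in each row of the ndarray. The row index therefore contains every row number, and the column index says which element to pick from each row.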

import numpy as np 

x = np.array([[1,  2],  [3,  4],  [5,  6]]) 
y = x[[0,1,2],  [0,1,0]]  
print(y)
"""
Prints the elements at positions (0,0), (1,1) and (2,0)
[1 4 5]
"""

4.2 Boolean indexing

This one is easier to understand, so let's go straight to an example.

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(a[a > 5])
"""[6 7 8 9]"""

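Note: the index is just a boolean expression that evaluates to an array of True/False values. You can also write such a boolean array out by hand, which means any predicate that produces a mask of matching shape can be used to filter out exactly the elements you want, much like filter() in plain Python.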
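A short sketch of what such masks look like in practice. Conditions are combined element-wise with & and | (each side in parentheses), a hand-written boolean array selects whole rows, and a mask can also be used on the left side of an assignment. The conditions are arbitrary examples.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

mask = a > 5                        # boolean array with the same shape as a
print(mask.tolist())
# [[False, False, False], [False, False, True], [True, True, True]]

both = a[(a > 2) & (a % 2 == 0)]    # greater than 2 AND even
either = a[(a < 2) | (a > 8)]       # less than 2 OR greater than 8
print(both.tolist())                # [4, 6, 8]
print(either.tolist())              # [1, 9]

# A hand-written mask over the first axis keeps rows 0 and 2.
row_mask = np.array([True, False, True])
kept_rows = a[row_mask]
print(kept_rows.tolist())           # [[1, 2, 3], [7, 8, 9]]

# Masks also work for assignment.
b = a.copy()
b[b > 5] = 0
print(b.tolist())                   # [[1, 2, 3], [4, 5, 0], [0, 0, 0]]
```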
