AI：Numpy的多维数组ndarray的基本使用

程序员无羡

于 2024-04-26 17:18:19 发布

阅读量599

点赞数 25

文章标签： numpy 人工智能

本文链接：https://blog.csdn.net/weixin_45427648/article/details/138224831

版权

本文详细介绍了NumPy库中ndarray对象的属性（如shape、ndim、dtype等），展示了如何理解和操作数组的形状以及数据类型。此外，还涵盖了逻辑运算、通用判断函数（如np.all、np.any）、统计运算（如np.max、np.min等）的使用方法。

摘要由CSDN通过智能技术生成

N维数组-ndarray

1 ndarray的属性

数组属性反映了数组本身固有的信息。

属性名字	属性解释
ndarray.shape	数组维度的元组
ndarray.ndim	数组维数
ndarray.size	数组中的元素数量
ndarray.itemsize	一个数组元素的长度（字节）
ndarray.dtype	数组元素的类型

2 ndarray的形状

首先创建一些数组。

# 创建不同形状的数组
>>> a = np.array([[1,2,3],[4,5,6]])
>>> b = np.array([1,2,3,4])
>>> c = np.array([[[1,2,3],[4,5,6]],[[1,2,3],[4,5,6]]])

分别打印出形状

>>> a.shape
>>> b.shape
>>> c.shape

(2, 3)  # 二维数组
(4,)	# 一维数组
(2, 2, 3) # 三维数组

如何理解数组的形状？

二维数组：

在这里插入图片描述

三维数组：

在这里插入图片描述

3 ndarray的类型

>>> type(score.dtype)

<type 'numpy.dtype'>

dtype是numpy.dtype类型，先看看对于数组来说都有哪些类型

名称	描述	简写
np.bool	用一个字节存储的布尔类型（True或False）	‘b’
np.int8	一个字节大小，-128 至 127	‘i’
np.int16	整数，-32768 至 32767	‘i2’
np.int32	整数，-2^31 至 2^32 -1	‘i4’
np.int64	整数，-2^63 至 2^63 - 1	‘i8’
np.uint8	无符号整数，0 至 255	‘u’
np.uint16	无符号整数，0 至 65535	‘u2’
np.uint32	无符号整数，0 至 2^32 - 1	‘u4’
np.uint64	无符号整数，0 至 2^64 - 1	‘u8’
np.float16	半精度浮点数：16位，正负号1位，指数5位，精度10位	‘f2’
np.float32	单精度浮点数：32位，正负号1位，指数8位，精度23位	‘f4’
np.float64	双精度浮点数：64位，正负号1位，指数11位，精度52位	‘f8’
np.complex64	复数，分别用两个32位浮点数表示实部和虚部	‘c8’
np.complex128	复数，分别用两个64位浮点数表示实部和虚部	‘c16’
np.object_	python对象	‘O’
np.string_	字符串	‘S’
np.unicode_	unicode类型	‘U’

创建数组的时候指定类型

>>> a = np.array([[1, 2, 3],[4, 5, 6]], dtype=np.float32)
>>> a.dtype
dtype('float32')

>>> arr = np.array(['python', 'tensorflow', 'scikit-learn', 'numpy'], dtype = np.string_)
>>> arr
array([b'python', b'tensorflow', b'scikit-learn', b'numpy'], dtype='|S12')

注意：若不指定，整数默认int64，小数默认float64

4 总结

数组的基本属性【知道】

属性名字	属性解释
ndarray.shape	数组维度的元组
ndarray.ndim	数组维数
ndarray.size	数组中的元素数量
ndarray.itemsize	一个数组元素的长度（字节）
ndarray.dtype	数组元素的类型

ndarray运算

问题

如果想要操作符合某一条件的数据，应该怎么做？

1 逻辑运算

# 生成10名同学，5门功课的数据
>>> score = np.random.randint(40, 100, (10, 5))

# 取出最后4名同学的成绩，用于逻辑判断
>>> test_score = score[6:, 0:5]

# 逻辑判断, 如果成绩大于60就标记为True 否则为False
>>> test_score > 60
array([[ True,  True,  True, False,  True],
       [ True,  True,  True, False,  True],
       [ True,  True, False, False,  True],
       [False,  True,  True,  True,  True]])

# BOOL赋值, 将满足条件的设置为指定的值-布尔索引
>>> test_score[test_score > 60] = 1
>>> test_score
array([[ 1,  1,  1, 52,  1],
       [ 1,  1,  1, 59,  1],
       [ 1,  1, 44, 44,  1],
       [59,  1,  1,  1,  1]])

2 通用判断函数

np.all()

# 判断前两名同学的成绩[0:2, :]是否全及格
>>> np.all(score[0:2, :] > 60)
False

np.any()

# 判断前两名同学的成绩[0:2, :]是否有大于90分的
>>> np.any(score[0:2, :] > 80)
True

3 np.where（三元运算符）

通过使用np.where能够进行更加复杂的运算

np.where()

# 判断前四名学生,前四门课程中，成绩中大于60的置为1，否则为0
temp = score[:4, :4]
np.where(temp > 60, 1, 0)

复合逻辑需要结合np.logical_and和np.logical_or使用

# 判断前四名学生,前四门课程中，成绩中大于60且小于90的换为1，否则为0
np.where(np.logical_and(temp > 60, temp < 90), 1, 0)

# 判断前四名学生,前四门课程中，成绩中大于90或小于60的换为1，否则为0
np.where(np.logical_or(temp > 90, temp < 60), 1, 0)

4 统计运算

如果想要知道学生成绩最大的分数，或者做小分数应该怎么做？

4.1 统计指标

在数据挖掘/机器学习领域，统计指标的值也是我们分析问题的一种方式。常用的指标如下：

min(a, axis)
- Return the minimum of an array or minimum along an axis.
max(a, axis])
- Return the maximum of an array or maximum along an axis.
median(a, axis)
- Compute the median along the specified axis.
mean(a, axis, dtype)
- Compute the arithmetic mean along the specified axis.
std(a, axis, dtype)
- Compute the standard deviation along the specified axis.
var(a, axis, dtype)
- Compute the variance along the specified axis.

4.2 案例：学生成绩统计运算

进行统计的时候，axis 轴的取值并不一定，Numpy中不同的API轴的值都不一样，在这里，axis 0代表列, axis 1代表行去进行统计

# 接下来对于前四名学生,进行一些统计运算
# 指定列 去统计
temp = score[:4, 0:5]
print("前四名学生,各科成绩的最大分：{}".format(np.max(temp, axis=0)))
print("前四名学生,各科成绩的最小分：{}".format(np.min(temp, axis=0)))
print("前四名学生,各科成绩波动情况：{}".format(np.std(temp, axis=0)))
print("前四名学生,各科成绩的平均分：{}".format(np.mean(temp, axis=0)))

结果：

前四名学生,各科成绩的最大分：[96 97 72 98 89]
前四名学生,各科成绩的最小分：[55 57 45 76 77]
前四名学生,各科成绩波动情况：[16.25576821 14.92271758 10.40432602  8.0311892   4.32290412]
前四名学生,各科成绩的平均分：[78.5  75.75 62.5  85.   82.25]

如果需要统计出某科最高分对应的是哪个同学？

np.argmax(temp, axis=)
np.argmin(temp, axis=)

print("前四名学生，各科成绩最高分对应的学生下标：{}".format(np.argmax(temp, axis=0)))

结果：

前四名学生，各科成绩最高分对应的学生下标：[0 2 0 0 1]

5 小结

逻辑运算【知道】
- 直接进行大于,小于的判断
- 合适之后,可以直接进行赋值
通用判断函数【知道】
- np.all()
- np.any()
统计运算【掌握】
- np.max()
- np.min()
- np.median()
- np.mean()
- np.std()
- np.var()
- np.argmax(axis=) — 最大元素对应的下标
- np.argmin(axis=) — 最小元素对应的下标