NumPy Study Notes: Array Data Types

小小硕、

Article link: https://blog.csdn.net/weixin_41676930/article/details/108826875

More data types

Casting

In mixed-type operations, NumPy automatically upcasts the result to the higher-precision type:

>>> np.array([1, 2, 3]) + 1.5
array([2.5, 3.5, 4.5])
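
As a small sketch of the upcasting rule above, `np.result_type` can be used to ask which dtype NumPy would choose for a given combination. The variable names `ints` and `mixed` are illustrative only.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int32)

# Adding a Python float to an integer array produces a float64 array.
mixed = ints + 1.5
print(mixed.dtype)                             # float64

# result_type reports the dtype NumPy would pick for a combination of types.
print(np.result_type(np.int32, np.float64))    # float64
print(np.result_type(np.int8, np.int16))       # int16
```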

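Assignment never changes an array's dtype; the assigned value is cast to the existing type (here 1.9 is truncated to 1):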

>>> a = np.array([1, 2, 3])
>>> a.dtype  # default integer type is platform-dependent (int64 on most 64-bit Linux/macOS builds)
dtype('int32')
>>> a[0] = 1.9
>>> a
array([1, 2, 3])
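
If the fractional value needs to be kept, one option is to convert the array explicitly first. The sketch below uses `astype`, which returns a new array and leaves the original untouched; the name `b` is illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
a[0] = 1.9                 # cast to the integer dtype: stored as 1
print(a)                   # [1 2 3]

b = a.astype(float)        # explicit conversion creates a new float64 array
b[0] = 1.9                 # now the fractional part survives
print(b)                   # [1.9 2.  3. ]
print(a.dtype.kind, b.dtype)   # i float64
```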

Rounding:

>>> a = np.array([1.2, 1.5, 1.6, 2.5, 3.5, 4.5])
>>> b = np.around(a)  # rounds to nearest; ties go to the nearest even value
>>> b
array([1., 2., 2., 2., 4., 4.])
>>> c = np.around(a).astype(int)
>>> c
array([1, 2, 2, 2, 4, 4])
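
Note that 2.5 and 4.5 round down above because `np.around` sends ties to the nearest even value. If schoolbook "round half up" is wanted instead, one possible workaround (an assumption of this sketch, valid for non-negative inputs) is `np.floor(a + 0.5)`:

```python
import numpy as np

a = np.array([0.5, 1.5, 2.5, 3.5])

half_even = np.around(a)        # ties go to the nearest even integer
half_up = np.floor(a + 0.5)     # ties always go up (non-negative inputs)

print(half_even)                # [0. 2. 2. 4.]
print(half_up)                  # [1. 2. 3. 4.]
```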

Sizes of different data types

Signed integers:

Type	Bits
int8	8 bits
int16	16 bits
int32	32 bits
int64	64 bits

>>> np.array([1], dtype=int).dtype  # default integer type is platform-dependent (int64 on most 64-bit Linux/macOS builds)
dtype('int32')
>>> np.iinfo(np.int32).max, 2**31 - 1
(2147483647, 2147483647)
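
One practical consequence of the fixed sizes is that array arithmetic wraps around silently when a result does not fit. A minimal sketch with `int8`, whose range is -128 to 127 (the names `x`, `one` and `y` are illustrative):

```python
import numpy as np

info = np.iinfo(np.int8)
print(info.min, info.max)          # -128 127

x = np.array([127], dtype=np.int8)
one = np.ones_like(x)              # int8 array holding 1
y = x + one                        # result stays int8 and wraps around
print(y)                           # [-128]
```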

Unsigned integers:

Type	Bits
uint8	8 bits
uint16	16 bits
uint32	32 bits
uint64	64 bits

>>> np.iinfo(np.uint32).max, 2**32 - 1
(4294967295, 4294967295)
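
The limits of every integer type listed above can be printed in one loop with `np.iinfo`. This is only a convenience sketch; the `limits` dictionary is an illustrative name.

```python
import numpy as np

int_types = (np.int8, np.int16, np.int32, np.int64,
             np.uint8, np.uint16, np.uint32, np.uint64)

limits = {}
for t in int_types:
    info = np.iinfo(t)
    limits[np.dtype(t).name] = (info.bits, info.min, info.max)
    print(f"{np.dtype(t).name:>6}  {info.bits:2d} bits  [{info.min}, {info.max}]")
```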

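Floating-point types:

Type	Bits
float16	16 bits
float32	32 bits
float64	64 bits
float96	96 bits (platform-dependent; same as np.longdouble)
float128	128 bits (platform-dependent; same as np.longdouble)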

>>> np.finfo(np.float32).eps
1.1920929e-07
>>> np.finfo(np.float64).eps
2.220446049250313e-16
>>> np.float32(1e-8) + np.float32(1) == 1
True
>>> np.float64(1e-8) + np.float64(1) == 1
False
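
The `eps` values explain the comparison above: `eps` is the gap between 1 and the next representable number, so adding anything well below it to 1 changes nothing. A small sketch that checks this for both types (the `results` dictionary is an illustrative name):

```python
import numpy as np

results = {}
for t in (np.float32, np.float64):
    eps = np.finfo(t).eps
    one = t(1)
    grows = bool(one + eps > one)             # eps is large enough to register
    lost = bool(one + eps / t(4) == one)      # a quarter of eps is rounded away
    results[t.__name__] = (grows, lost)
    print(t.__name__, grows, lost)
```

Both types should print `True True`; only the size of `eps` differs.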

Complex floating-point types:

Type	Bits
complex64	two 32-bit floats
complex128	two 64-bit floats
complex192	two 96-bit floats (platform-dependent)
complex256	two 128-bit floats (platform-dependent)
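
The "two floats" layout can be checked directly: the item size of a complex type is twice that of its component type, and the real and imaginary parts come back as that component type. A short sketch (the array `z` is illustrative):

```python
import numpy as np

z = np.array([1 + 2j, 3 - 4j], dtype=np.complex64)

print(z.itemsize)                          # 8 bytes: two 4-byte floats
print(z.real.dtype, z.imag.dtype)          # float32 float32
print(np.dtype(np.complex128).itemsize)    # 16 bytes: two 8-byte floats
```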

Structured data types

Field	Type
sensor_code	4-character string
position	float
value	float

>>> samples = np.zeros((6,), dtype=[('sensor_code', 'S4'), ('position', float), ('value', float)])
>>> samples.ndim
1
>>> samples.shape
(6,)
>>> samples.dtype.names
('sensor_code', 'position', 'value')
>>> samples[:] = [('ALFA',   1, 0.37), ('BETA', 1, 0.11), ('TAU', 1,   0.13),('ALFA', 1.5, 0.37), ('ALFA', 3, 0.11), ('TAU', 1.2, 0.13)]
>>> samples
array([(b'ALFA', 1. , 0.37), (b'BETA', 1. , 0.11), (b'TAU', 1. , 0.13),
       (b'ALFA', 1.5, 0.37), (b'ALFA', 3. , 0.11), (b'TAU', 1.2, 0.13)],
      dtype=[('sensor_code', 'S4'), ('position', '<f8'), ('value', '<f8')])

Access a field by indexing with its name:

>>> samples['sensor_code']
array([b'ALFA', b'BETA', b'TAU', b'ALFA', b'ALFA', b'TAU'], dtype='|S4')
>>> samples['value']
array([0.37, 0.11, 0.13, 0.37, 0.11, 0.13])
>>> samples[0]
(b'ALFA', 1., 0.37)
>>> samples[0]['sensor_code'] = 'TAU'
>>> samples[0]
(b'TAU', 1., 0.37)
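
Indexing with a single field name gives a view onto that column rather than a copy, so writing through it updates the structured array itself. A minimal sketch with a smaller array of the same dtype (the name `values` is illustrative):

```python
import numpy as np

dt = [('sensor_code', 'S4'), ('position', float), ('value', float)]
samples = np.zeros(3, dtype=dt)

values = samples['value']          # view onto the 'value' field
values[:] = [0.37, 0.11, 0.13]     # writes go through to samples

print(samples['value'])            # [0.37 0.11 0.13]
print(samples[1]['value'])         # 0.11
```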

Access multiple fields at once:

>>> samples[['position', 'value']]
array([(1. , 0.37), (1. , 0.11), (1. , 0.13), (1.5, 0.37), (3. , 0.11),
       (1.2, 0.13)],
      dtype={'names':['position','value'], 'formats':['<f8','<f8'], 'offsets':[4,12], 'itemsize':20})

Fancy indexing (here with a boolean mask):

>>> samples[samples['sensor_code'] == b'ALFA']
array([(b'ALFA', 1.5, 0.37), (b'ALFA', 3. , 0.11)],
      dtype=[('sensor_code', 'S4'), ('position', '<f8'), ('value', '<f8')])
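
The selection above can be split into two steps, which makes it easier to combine with field access. The sketch below rebuilds the same six records (with the first one already renamed to TAU) and picks out the positions of the ALFA sensor; `mask` and `alfa` are illustrative names.

```python
import numpy as np

dt = [('sensor_code', 'S4'), ('position', float), ('value', float)]
samples = np.array([(b'TAU', 1.0, 0.37), (b'BETA', 1.0, 0.11), (b'TAU', 1.0, 0.13),
                    (b'ALFA', 1.5, 0.37), (b'ALFA', 3.0, 0.11), (b'TAU', 1.2, 0.13)],
                   dtype=dt)

mask = samples['sensor_code'] == b'ALFA'   # boolean array, one entry per record
alfa = samples[mask]                       # records where the mask is True

print(mask.sum())                          # 2
print(alfa['position'])                    # [1.5 3. ]
```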

Note: there are several other syntaxes for constructing structured arrays.
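
As a sketch of what such alternative syntaxes look like, the same dtype can be written as a list of tuples, as a dictionary, or as a comma-separated format string (which assigns default field names). The variable names here are illustrative.

```python
import numpy as np

# List of (name, format) tuples, as used in the text.
dt_list = np.dtype([('sensor_code', 'S4'), ('position', 'f8'), ('value', 'f8')])

# Dictionary form with parallel lists of names and formats.
dt_dict = np.dtype({'names': ['sensor_code', 'position', 'value'],
                    'formats': ['S4', 'f8', 'f8']})

# Comma-separated string; fields get the default names f0, f1, f2.
dt_str = np.dtype('S4, f8, f8')

print(dt_list == dt_dict)      # True
print(dt_str.names)            # ('f0', 'f1', 'f2')

# Building the array directly from a list of tuples.
samples = np.array([('ALFA', 1.0, 0.37), ('BETA', 1.0, 0.11)], dtype=dt_list)
print(samples['sensor_code'])  # [b'ALFA' b'BETA']
```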

Handling missing data

For floating-point data we could use NaN to mark missing values, but masks work for all data types:

>>> x = np.ma.array([1, 2, 3, 4], mask=[0, 1, 0, 1])
>>> x
masked_array(data=[1, --, 3, --],
             mask=[False,  True, False,  True],
       fill_value=999999)
>>> y = np.ma.array([1, 2, 3, 4], mask=[0, 1, 1, 1])
>>> x + y
masked_array(data=[2, --, --, --],
             mask=[False,  True,  True,  True],
       fill_value=999999)
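
Reductions on a masked array skip the masked entries, and the valid data can be extracted or filled in when a plain array is needed. A short sketch using the same `x` as above:

```python
import numpy as np

x = np.ma.array([1, 2, 3, 4], mask=[0, 1, 0, 1])

print(x.count())         # 2 unmasked entries
print(x.sum())           # 4   (1 + 3)
print(x.mean())          # 2.0
print(x.compressed())    # [1 3]      only the unmasked values
print(x.filled(0))       # [1 0 3 0]  masked entries replaced by 0
```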

Masked versions of common functions mask invalid results instead of returning NaN:

>>> np.ma.sqrt([1, -1, 2, -2])
masked_array(data=[1.0, --, 1.4142135623730951, --],
             mask=[False,  True, False,  True],
       fill_value=1e+20)
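
To compare the two approaches mentioned at the start of this section, the sketch below handles a missing float with NaN and a missing integer with a mask. The sentinel value -1 and the array names are assumptions made for illustration.

```python
import numpy as np

# Floats: NaN marks the missing reading, but plain reductions propagate it.
readings = np.array([1.0, np.nan, 3.0])
print(np.isnan(readings.mean()))       # True
print(np.nanmean(readings))            # 2.0

# The same data as a masked array: NaN entries become masked.
masked = np.ma.masked_invalid(readings)
print(masked.mean())                   # 2.0

# Integers cannot hold NaN, so a mask is used instead (here -1 means "missing").
counts = np.ma.masked_equal(np.array([4, -1, 6]), -1)
print(counts.mean())                   # 5.0
```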