Python NumPy data types (dtype): defining custom data types

青盏

Category: machine learning. Tags: numpy, python

Article link: https://blog.csdn.net/qq_16234613/article/details/65935279

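This article describes how NumPy data types are defined, used, and converted. It covers the difference between big-endian and little-endian byte order, how to define custom compound data types, and how the argument passed to dtype creates data types in different formats. It also explains how to convert between data types with astype.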

1. Examples

Difference between big-endian and little-endian
Defining a data type:

>>> import numpy as np
>>> dt = np.dtype('>i4')  # a big-endian integer of 4 bytes (4*8 = 32 bits)
>>> dt
dtype('>i4')
>>> dt.byteorder    # byte order: '>' is big-endian, '<' is little-endian
'>'
>>> dt.itemsize    # size in bytes
4
>>> dt.name       # name of the type
'int32'
>>> dt.type is np.int32
True
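
To make the byte-order difference concrete, here is a minimal sketch that stores the same integer with both byte orders and inspects the raw bytes. The variable names are only for illustration.

```python
import numpy as np

# The same value stored with two different byte orders.
big = np.array([1], dtype='>i4')      # most significant byte first
little = np.array([1], dtype='<i4')   # least significant byte first

big_bytes = big.tobytes()
little_bytes = little.tobytes()

print(big_bytes)      # b'\x00\x00\x00\x01'
print(little_bytes)   # b'\x01\x00\x00\x00'

# The logical value is identical; only the in-memory layout differs.
same_value = bool(big[0] == little[0])
print(same_value)     # True
```

The byte order matters mainly when reading binary files or network data produced on another platform.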

Custom data types:
Defining dt:

>>> dt = np.dtype([('name', np.str_, 16), ('grades', np.float64, (2,))])   # name is a string of up to 16 characters; grades is a sub-array of 2 float64 values
>>> dt['name']
dtype('<U16')
>>> dt['grades']
dtype(('<f8', (2,)))

Usage:

>>> x = np.array([('Sarah', (8.0, 7.0)), ('John', (6.0, 7.0))], dtype=dt)
>>> x[1]
('John', [6.0, 7.0])
>>> x[1]['grades']
array([ 6.,  7.])
>>> type(x[1])
<class 'numpy.void'>
>>> type(x[1]['grades'])
<class 'numpy.ndarray'>
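
Besides indexing one record at a time, a structured array can be indexed by field name to get a whole column. The sketch below reuses the dt defined above; the derived names (names, grades, mean_grades) are only for illustration.

```python
import numpy as np

dt = np.dtype([('name', np.str_, 16), ('grades', np.float64, (2,))])
x = np.array([('Sarah', (8.0, 7.0)), ('John', (6.0, 7.0))], dtype=dt)

# Indexing by field name returns that field for every record.
names = x['name']        # 1-D array of strings
grades = x['grades']     # 2-D array: one row of 2 grades per record

# Ordinary NumPy operations work on the extracted field.
mean_grades = grades.mean(axis=1)

print(names.tolist())        # ['Sarah', 'John']
print(grades.shape)          # (2, 2)
print(mean_grades.tolist())  # [7.5, 6.5]
print(dt.itemsize)           # 80 = 16 characters * 4 bytes + 2 * 8 bytes
```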

2. The dtype argument

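When a data type is declared, np.dtype automatically converts its argument to the corresponding type. The default data type is float64 (formerly aliased as float_; that alias was removed in NumPy 2.0).
The 24 built-in array-scalar types can be passed directly: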

>>> dt = np.dtype(np.int32)      # 32-bit int; note that 32 is the number of bits
>>> dt = np.dtype(np.complex128) # 128-bit complex number

String arguments that appear in numpy.sctypeDict.keys():

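>>> dt = np.dtype('uint32')   # 32-bit unsigned int; note that 32 is the number of bits
>>> dt = np.dtype('float64')  # 64-bit float (the capitalized spelling 'Float64' is rejected by newer NumPy versions)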

Python type arguments:

Python type	NumPy type
int	int_
bool	bool_
float	float64
complex	complex128
bytes	bytes_
str	str_
memoryview	void
(all others)	object_

>>> dt = np.dtype(float)   # Python float
>>> dt = np.dtype(int)     # Python int
>>> dt = np.dtype(object)  # Python object
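
The mapping from Python types to NumPy types can be checked directly. This is a small sketch; the variable names are illustrative, and the size of the default integer depends on the platform and NumPy version, so it is printed rather than assumed.

```python
import numpy as np

float_dt = np.dtype(float)      # equivalent to np.float64
complex_dt = np.dtype(complex)  # equivalent to np.complex128
bool_dt = np.dtype(bool)        # equivalent to np.bool_
object_dt = np.dtype(object)    # generic Python objects
int_dt = np.dtype(int)          # platform default integer (4 or 8 bytes)

print(float_dt == np.float64)       # True
print(complex_dt == np.complex128)  # True
print(object_dt.kind)               # 'O'
print(int_dt.name, int_dt.itemsize)
```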

Short character codes:

'?'     boolean ('b1' is also accepted; a bare 'b' means a signed byte)
'i'     (signed) integer
'u'     unsigned integer
'f'     floating-point
'c'     complex-floating point
'm'     timedelta
'M'     datetime
'O'     (Python) objects
'S', 'a'    (byte-)string
'U'     Unicode
'V'     raw data (void)

>>> dt = np.dtype('f8')   # 64-bit float; note that 8 is the number of bytes
>>> dt = np.dtype('c16')  # 128-bit complex number
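
As a quick check of how these codes are interpreted, the following sketch builds a dtype from several code strings and prints the resulting name, kind, and size in bytes. The particular list of codes is just an illustrative selection.

```python
import numpy as np

codes = ['b1', 'i4', 'u2', 'f8', 'c16', 'm8[s]', 'M8[D]', 'O', 'S5', 'U5', 'V8']
examples = {code: np.dtype(code) for code in codes}

for code, d in examples.items():
    # kind is the one-character category; itemsize is the size in bytes
    print(f"{code:6} name={d.name:16} kind={d.kind} itemsize={d.itemsize}")

# A bare 'b' is a signed byte, while 'b1' is a boolean.
signed_byte = np.dtype('b')
```

Note that the number after the letter is a size in bytes (or in characters for 'U'), so 'U5' occupies 20 bytes.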

Comma-separated string arguments:

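>>> dt = np.dtype("S3, 3u8, (3,4)S10")  # a 3-byte string, a sub-array of 3 unsigned 64-bit integers, and a 3x4 sub-array of 10-byte strings; note that 8 is the number of bytes ('S' replaces the older alias 'a', which newer NumPy versions deprecate)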
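
With a comma-separated string, NumPy generates the field names itself. The sketch below inspects the resulting fields; it uses 'S' instead of 'a' for byte strings, on the assumption that a recent NumPy version is installed where 'a' is deprecated.

```python
import numpy as np

dt = np.dtype("S3, 3u8, (3,4)S10")

# Field names are generated automatically as f0, f1, f2, ...
field_names = dt.names
print(field_names)        # ('f0', 'f1', 'f2')

# Each field keeps its own base type and sub-array shape.
print(dt['f0'])           # 3-byte string
print(dt['f1'].shape)     # (3,)
print(dt['f2'].shape)     # (3, 4)

# Total size: 3 + 3*8 + 3*4*10 bytes
print(dt.itemsize)        # 147
```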

Other forms:
(flexible_dtype, itemsize): the first element is a data type without a fixed size, and the second element gives its size:

>>> dt = np.dtype((np.void, 10))  # 10-byte block of raw data
>>> dt = np.dtype((str, 35))      # 35-character string
>>> dt = np.dtype(('U', 10))      # 10-character Unicode string
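
The size given in the (flexible_dtype, itemsize) form is counted in bytes for void and in characters for strings, which the following sketch makes visible through itemsize. The variable names are illustrative.

```python
import numpy as np

void_dt = np.dtype((np.void, 10))  # 10 bytes of raw data
str_dt = np.dtype((str, 35))       # 35 Unicode characters
uni_dt = np.dtype(('U', 10))       # 10 Unicode characters

print(void_dt.itemsize)  # 10
print(str_dt.itemsize)   # 140, because each character takes 4 bytes
print(uni_dt.itemsize)   # 40
```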

(fixed_dtype, shape): the first element is a fixed-size data type, and the second element is the shape of the sub-array:

>>> dt = np.dtype((np.int32, (2,2)))          # 2x2 sub-array of int32
>>> dt = np.dtype(('S10', 1))                 # 10-byte string
>>> dt = np.dtype(('i4, (2,3)f8, f4', (2,3))) # 2x3 sub-array of structures
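
A sub-array dtype records both a base type and a shape. A minimal sketch of how that shows up, including what happens when such a dtype is used directly to create an array (the sub-array dimensions are appended to the array shape):

```python
import numpy as np

dt = np.dtype((np.int32, (2, 2)))

print(dt.base)      # int32
print(dt.shape)     # (2, 2)
print(dt.itemsize)  # 16 = 4 elements * 4 bytes

# Used as the dtype of a whole array, the sub-array shape is
# absorbed into the array's own shape.
arr = np.zeros(3, dtype=dt)
print(arr.shape)    # (3, 2, 2)
print(arr.dtype)    # int32
```

Sub-array dtypes are therefore most useful as fields inside a structured dtype, where the shape stays attached to the field.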

[(field_name, field_dtype, field_shape), ...]:

>>> dt = np.dtype([('big', '>i4'), ('little', '<i4')])
>>> dt = np.dtype([('R','u1'), ('G','u1'), ('B','u1'), ('A','u1')])
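
As an illustration of the list-of-tuples form, the sketch below uses the RGBA dtype for a tiny 2x2 image and then looks at the raw bytes. The image contents and variable names are made up for the example.

```python
import numpy as np

rgba = np.dtype([('R', 'u1'), ('G', 'u1'), ('B', 'u1'), ('A', 'u1')])

# A 2x2 "image" where every pixel is one 4-byte record.
pixels = np.zeros((2, 2), dtype=rgba)
pixels['R'] = 255   # set the red channel of every pixel
pixels['A'] = 128   # set the alpha channel of every pixel

# The fields are laid out one after another: R, G, B, A.
raw = np.frombuffer(pixels.tobytes(), dtype=np.uint8).reshape(2, 2, 4)

print(rgba.itemsize)        # 4
print(raw[0, 0].tolist())   # [255, 0, 0, 128]
```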

{'names': ..., 'formats': ..., 'offsets': ..., 'titles': ..., 'itemsize': ...}:

>>> dt = np.dtype({'names': ['r','g','b','a'], 'formats': [np.uint8, np.uint8, np.uint8, np.uint8]})

>>> dt = np.dtype({'names': ['r','b'], 'formats': ['u1', 'u1'], 'offsets': [0, 2],'titles': ['Red pixel', 'Blue pixel']})
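
The 'offsets' and 'titles' keys can be inspected through dt.fields, which maps each field name (and each title) to its type, byte offset, and title. A minimal sketch using the dtype above:

```python
import numpy as np

dt = np.dtype({'names': ['r', 'b'],
               'formats': ['u1', 'u1'],
               'offsets': [0, 2],
               'titles': ['Red pixel', 'Blue pixel']})

# Each entry is (dtype, byte offset, title).
r_info = dt.fields['r']
b_info = dt.fields['b']
print(r_info)
print(b_info)

# The byte at offset 1 is unused padding, so the record is 3 bytes long.
print(dt.itemsize)             # 3

# Titles act as alternative keys for the same fields.
print('Red pixel' in dt.fields)  # True
```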

{'field1': ..., 'field2': ..., ...}:
This form is not recommended, because it can conflict with the other dict-based form.

>>> dt = np.dtype({'col1': ('S10', 0), 'col2': (np.float32, 10), 'col3': (int, 14)})  # col1 starts at byte 0, col2 at byte 10, col3 at byte 14

(base_dtype, new_dtype):

>>> dt = np.dtype((np.int32, {'real': (np.int16, 0), 'imag': (np.int16, 2)}))  # the first two bytes of base_dtype hold real, the last two bytes hold imag
>>> dt = np.dtype((np.int32, (np.int8, 4)))  # base_dtype is split into a sub-array of 4 int8 values
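
The byte layout that the (base_dtype, new_dtype) form describes can be illustrated on a plain integer array with ndarray.getfield, which reads a value of a given type at a given byte offset. This is a sketch under the assumption of explicit little-endian types ('<i4', '<i2'), chosen so the result does not depend on the machine; the sample value 0x00020001 is made up.

```python
import numpy as np

# Same construction as above, with explicit little-endian types.
dt = np.dtype((np.dtype('<i4'),
               {'real': (np.dtype('<i2'), 0), 'imag': (np.dtype('<i2'), 2)}))
print(dt.itemsize)            # 4, the size of the base type
print(dt.fields['imag'][1])   # 2, the byte offset of 'imag'

# One 32-bit value whose low half is 1 and high half is 2.
packed = np.array([0x00020001], dtype='<i4')

# Read the two 16-bit halves at the offsets named in dt.
real = packed.getfield(np.dtype('<i2'), 0)
imag = packed.getfield(np.dtype('<i2'), 2)
print(real.tolist(), imag.tolist())   # [1] [2]
```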

3. Converting types

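Use astype to convert. Do not assign directly to an array's dtype attribute, because that reinterprets the underlying bytes instead of converting the values.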

>>> b = np.array([1., 2., 3., 4.])
>>> b.dtype
dtype('float64')
>>> c = b.astype(np.int32)   # returns a converted copy; b itself is unchanged
>>> c
array([1, 2, 3, 4], dtype=int32)
>>> c.shape
(4,)
>>> c.dtype
dtype('int32')

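>>> b
array([1., 2., 3., 4.])
>>> b.dtype = np.int32   # reinterprets the same bytes instead of converting the values
>>> b.dtype
dtype('int32')
>>> b                    # each 8-byte float64 is now read as two int32 values, so the array length doubles (little-endian output shown)
array([         0, 1072693248,          0, 1073741824,          0,
       1074266112,          0, 1074790400], dtype=int32)
>>> b.shape
(8,)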
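
The same contrast can be shown without modifying the original array by using view, which reinterprets the bytes in the same way as assigning to dtype. A minimal sketch, with illustrative variable names:

```python
import numpy as np

b = np.array([1., 2., 3., 4.])

# astype converts the values and returns a new, independent array.
converted = b.astype(np.int32)

# view reinterprets the existing bytes; no values are converted.
reinterpreted = b.view(np.int32)

print(converted.tolist())     # [1, 2, 3, 4]
print(reinterpreted.shape)    # (8,): two int32 values per float64
print(b.dtype)                # float64, b is left untouched

print(np.shares_memory(b, converted))       # False
print(np.shares_memory(b, reinterpreted))   # True
```

In short, astype is the right tool when the numeric values should be preserved, while view (or assigning to dtype) only changes how the same memory is read.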