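Chapter 2: NumPy Arrays
Advantages of NumPy arrays
# The interactive sessions in this chapter call NumPy functions unqualified,
# which assumes the names were imported beforehand, e.g. with: from numpy import *
# Create an array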
In [16]: a=arange(5)
In [17]: a.dtype
Out[17]: dtype('int32')
In [18]: a
Out[18]: array([0, 1, 2, 3, 4])
# shape returns a tuple holding the length of each dimension
In [19]: a.shape
Out[19]: (5,)
Creating a multidimensional array
In [20]: m=array([arange(2),arange(2)])
In [21]: m
Out[21]:
array([[0, 1],
       [0, 1]])
In [22]: m.shape
Out[22]: (2, 2)
Selecting NumPy array elements
In [24]: a=array([[1,2],[3,4]])
In [25]: a
Out[25]:
array([[1, 2],
       [3, 4]])
In [26]: a[0,0]
Out[26]: 1
In [27]: a[0,1]
Out[27]: 2
In [28]: a[1,0]
Out[28]: 3
In [29]: a[1,1]
Out[29]: 4
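The indexing shown above can be sketched as a standalone script. This is a minimal example, assuming NumPy is imported as np (the sessions above use unqualified names); the variable names top_right, first_row and second_col are made up for illustration. a[i, j] addresses row i and column j, and a colon selects everything along an axis.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# a[row, column] picks a single element
top_right = a[0, 1]

# A colon selects everything along that axis
first_row = a[0, :]     # row 0
second_col = a[:, 1]    # column 1

print(top_right, first_row, second_col)
```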
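NumPy numeric types
bool        Boolean (True or False)
int_        platform-dependent integer (normally int32 or int64)
int8        byte, -128 to 127
int16       integer, -32768 to 32767
int32       integer, -2^31 to 2^31-1
int64       integer, -2^63 to 2^63-1
uint8       unsigned integer, 0 to 255
uint16      unsigned integer, 0 to 65535
uint32      unsigned integer, 0 to 2^32-1
uint64      unsigned integer, 0 to 2^64-1
float16     half-precision float
float32     single-precision float
float64     double-precision float
complex64   complex number made of two 32-bit floats
complex128  complex number made of two 64-bit floats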
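The ranges in the list above do not have to be memorized; NumPy can report them. The following is a small sketch using np.iinfo and np.finfo, where the dictionary name limits is an illustrative choice rather than anything from the original notes.

```python
import numpy as np

# Integer limits are reported by np.iinfo
limits = {}
for t in (np.int8, np.int16, np.int32, np.int64, np.uint8, np.uint16):
    info = np.iinfo(t)
    limits[t.__name__] = (info.min, info.max)
    print(t.__name__, info.min, info.max)

# Floating-point width and machine epsilon are reported by np.finfo
for t in (np.float16, np.float32, np.float64):
    info = np.finfo(t)
    print(t.__name__, info.bits, 'bits, eps =', info.eps)
```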
# Bytes per element (itemsize) and the data type of the array
In [30]: a.dtype.itemsize
Out[30]: 4
In [31]: a.dtype
Out[31]: dtype('int32')
Note: if PyCharm's Python console starts IPython automatically, you can turn this off:
File -> Settings -> Console -> uncheck "Use IPython if available"
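Character codes
i  integer
u  unsigned integer
f  single-precision float
d  double-precision float
b  Boolean (as a dtype kind; when passed to dtype() as a character code, 'b' means int8 and '?' means Boolean)
D  complex
S  string
U  Unicode
V  void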
In [1]: arange(7,dtype='f')
Out[1]: array([ 0., 1., 2., 3., 4., 5., 6.], dtype=float32)
In [3]: arange(7,dtype='D')
Out[3]: array([ 0.+0.j, 1.+0.j, 2.+0.j, 3.+0.j, 4.+0.j, 5.+0.j, 6.+0.j])
dtype constructors
# Python's built-in float type
In [4]: dtype(float)
Out[4]: dtype('float64')
In [5]: dtype('f')
Out[5]: dtype('float32')
In [6]: dtype('d')
Out[6]: dtype('float64')
In [7]: dtype('f8')
Out[7]: dtype('float64')
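The different spellings accepted by the dtype constructor can be compared side by side. This is a minimal sketch, assuming NumPy is imported as np; the list name specs and the flag same are illustrative only.

```python
import numpy as np

# Several ways to name a floating-point type
specs = [float, 'f', 'd', 'f4', 'f8', np.float64]
for spec in specs:
    dt = np.dtype(spec)
    print(repr(spec), '->', dt.name, dt.itemsize, 'bytes')

# The Python float, the character code 'd' and the string 'f8'
# all describe the same 8-byte type
same = np.dtype(float) == np.dtype('d') == np.dtype('f8')
print(same)
```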
# List the names and character codes of all data types
In [8]: sctypeDict.keys()
Out[8]:
[0,
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20,
21,
'unicode',
23,
'cfloat',
'longfloat',
'Int32',
'Complex64',
'unicode_',
'complex',
'timedelta64',
'uint16',
'c16',
'float32',
'int32',
'D',
'H',
'void',
'unicode0',
'L',
'P',
'half',
'void0',
'd',
'h',
'l',
'p',
22,
'Timedelta64',
'object0',
'b1',
'M8',
'String0',
'float16',
'ulonglong',
'i1',
'uint32',
'?',
'Void0',
'complex64',
'G',
'O',
'UInt8',
'S',
'byte',
'UInt64',
'g',
'float64',
'ushort',
'float_',
'uint',
'object_',
'Float16',
'complex_',
'Unicode0',
'uintp',
'intc',
'csingle',
'datetime64',
'float',
'bool8',
'Bool',
'intp',
'uintc',
'bytes_',
'u8',
'u4',
'int_',
'cdouble',
'u1',
'complex128',
'u2',
'f8',
'Datetime64',
'ubyte',
'm8',
'B',
'uint0',
'F',
'bool_',
'uint8',
'c8',
'Int64',
'Int8',
'Complex32',
'V',
'int8',
'uint64',
'b',
'f',
'double',
'UInt32',
'clongdouble',
'str',
'f2',
'f4',
'int',
'longdouble',
'single',
'string',
'q',
'Int16',
'Float64',
'longcomplex',
'UInt16',
'bool',
'Float32',
'string0',
'longlong',
'i8',
'int16',
'str_',
'I',
'object',
'M',
'i4',
'singlecomplex',
'Q',
'string_',
'U',
'a',
'short',
'e',
'i',
'clongfloat',
'm',
'Object0',
'int64',
'i2',
'int0']
dtype attributes
# Get the character code of the data type
In [9]: t=dtype('float64')
In [10]: t.char
Out[10]: 'd'
# The type attribute corresponds to the type of the array's elements
In [11]: t.type
Out[11]: numpy.float64
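# str gives the data type as a string: '<' is the byte order (little-endian), 'f' is the character code, and 8 is the number of bytes per element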
In [12]: t.str
Out[12]: '<f8'
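As a short illustration of these attributes, the sketch below also builds a dtype with an explicit big-endian byte order. The name big is hypothetical, and the value of t.str depends on the byte order of the machine, so it is printed rather than assumed.

```python
import numpy as np

t = np.dtype('float64')
# Character code, element type, type string and bytes per element
print(t.char, t.type, t.str, t.itemsize)

# The same 8-byte float, but with an explicit big-endian byte order
big = np.dtype('>f8')
print(big.str, big.itemsize)
```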
Slicing and indexing one-dimensional arrays
In [13]: a=arange(9)
# Elements at indices 3 through 6 (the stop index 7 is excluded)
In [14]: a[3:7]
Out[14]: array([3, 4, 5, 6])
# Indices 0 through 6 with a step of 2
In [15]: a[:7:2]
Out[15]: array([0, 2, 4, 6])
# Reverse the array
In [16]: a[::-1]
Out[16]: array([8, 7, 6, 5, 4, 3, 2, 1, 0])
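One detail the session above does not show is that a basic slice is a view of the original array, not a copy. The following sketch, with illustrative names middle and evens, demonstrates this by writing through a slice.

```python
import numpy as np

a = np.arange(9)
middle = a[3:7]     # indices 3, 4, 5, 6
evens = a[:7:2]     # indices 0, 2, 4, 6

# Writing through the slice also changes a, because the slice shares its data
middle[0] = 99
print(a)            # a[3] is now 99
print(np.shares_memory(a, middle))
```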
Manipulating array shapes
Example code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2016/12/7 11:45
# @Author  : Retacn
# @Site    : Adjusting array shapes
# @File    : array_reshap.py
# @Software: PyCharm
__author__ = "retacn"
__copyright__ = "property of mankind."
__license__ = "CN"
__version__ = "0.0.1"
__maintainer__ = "retacn"
__email__ = "zhenhuayue@sina.com"
__status__ = "Development"

import numpy as np

print('In: b = arange(24).reshape(2, 3, 4)')
b = np.arange(24).reshape(2, 3, 4)
print('In: b')
print(b)
# [[[ 0  1  2  3]
#   [ 4  5  6  7]
#   [ 8  9 10 11]]
#
#  [[12 13 14 15]
#   [16 17 18 19]
#   [20 21 22 23]]]

# Ravel: turn the multidimensional array into a one-dimensional array
print('In: b.ravel()')
print(b.ravel())
# [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]

# Flatten: same result as ravel(), but flatten() always returns a copy,
# whereas ravel() returns a view when it can
print('In: b.flatten()')
print(b.flatten())
# [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]

# Set the shape directly by assigning a tuple
print('In: b.shape = (6, 4)')
b.shape = (6, 4)
print(b)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]
#  [12 13 14 15]
#  [16 17 18 19]
#  [20 21 22 23]]

# Transpose: rows become columns and columns become rows
print('In: b.transpose()')
print(b.transpose())
# [[ 0  4  8 12 16 20]
#  [ 1  5  9 13 17 21]
#  [ 2  6 10 14 18 22]
#  [ 3  7 11 15 19 23]]

# Resize: works like reshape(), but modifies the array in place
print('In: b.resize((2, 12))')
b.resize((2, 12))
print(b)
# [[ 0  1  2  3  4  5  6  7  8  9 10 11]
#  [12 13 14 15 16 17 18 19 20 21 22 23]]
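ravel() and flatten() print the same values, so the difference between them is easy to miss. The sketch below is one way to make it visible; the names r and f are illustrative. It relies on b being contiguous, in which case ravel() returns a view.

```python
import numpy as np

b = np.arange(24).reshape(2, 3, 4)

r = b.ravel()      # a view here, because b is contiguous in memory
f = b.flatten()    # always a new copy

r[0] = -1          # also changes b
f[1] = -2          # does not change b

print(b[0, 0, :2])
print(np.shares_memory(b, r), np.shares_memory(b, f))

# reshape() accepts -1 for one dimension and infers its length
print(b.reshape(6, -1).shape)
```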
Stacking arrays
In [17]: a=arange(9).reshape(3,3)
In [18]: a
Out[18]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
In [19]: b=2*a
In [20]: b
Out[20]:
array([[ 0,  2,  4],
       [ 6,  8, 10],
       [12, 14, 16]])
Horizontal stacking
In [21]: hstack((a,b))
Out[21]:
array([[ 0,  1,  2,  0,  2,  4],
       [ 3,  4,  5,  6,  8, 10],
       [ 6,  7,  8, 12, 14, 16]])
In [22]: concatenate((a,b),axis=1)
Out[22]:
array([[ 0,  1,  2,  0,  2,  4],
       [ 3,  4,  5,  6,  8, 10],
       [ 6,  7,  8, 12, 14, 16]])
Vertical stacking
In [23]: vstack((a,b))
Out[23]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 0,  2,  4],
       [ 6,  8, 10],
       [12, 14, 16]])
In [24]: concatenate((a,b),axis=0)
Out[24]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 0,  2,  4],
       [ 6,  8, 10],
       [12, 14, 16]])
Depth stacking
In [25]: dstack((a,b))
Out[25]:
array([[[ 0,  0],
        [ 1,  2],
        [ 2,  4]],

       [[ 3,  6],
        [ 4,  8],
        [ 5, 10]],

       [[ 6, 12],
        [ 7, 14],
        [ 8, 16]]])
Column stacking
# One-dimensional arrays
In [26]: oned=arange(2)
In [27]: oned
Out[27]: array([0, 1])
In [29]: twice_oned=2*oned
In [30]: twice_oned
Out[30]: array([0, 2])
In [31]: column_stack((oned,twice_oned))
Out[31]:
array([[0, 0],
       [1, 2]])
# Two-dimensional arrays: column_stack() behaves like hstack()
In [32]: column_stack((a,b))
Out[32]:
array([[ 0,  1,  2,  0,  2,  4],
       [ 3,  4,  5,  6,  8, 10],
       [ 6,  7,  8, 12, 14, 16]])
In [33]: column_stack((a,b))==hstack((a,b))
Out[33]:
array([[ True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True]], dtype=bool)
Row stacking
# One-dimensional arrays
In [34]: row_stack((oned,twice_oned))
Out[34]:
array([[0, 1],
       [0, 2]])
# Two-dimensional arrays: row_stack() behaves like vstack()
In [35]: row_stack((a,b))
Out[35]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 0,  2,  4],
       [ 6,  8, 10],
       [12, 14, 16]])
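The equivalences shown in the sessions above can be checked in one place. The sketch below is only an illustration: np.stack and np.array_equal do not appear in the original notes and are used here as assumptions of convenience, and np.vstack stands in for row_stack, which is an alias of it.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = 2 * a

h = np.hstack((a, b))   # side by side: shape (3, 6)
v = np.vstack((a, b))   # one above the other: shape (6, 3)
d = np.dstack((a, b))   # along a new third axis: shape (3, 3, 2)
print(h.shape, v.shape, d.shape)

# Each helper matches a more general call
same_h = np.array_equal(h, np.concatenate((a, b), axis=1))
same_v = np.array_equal(v, np.concatenate((a, b), axis=0))
same_d = np.array_equal(d, np.stack((a, b), axis=2))
same_c = np.array_equal(np.column_stack((a, b)), h)
print(same_h, same_v, same_d, same_c)
```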
Splitting NumPy arrays
Vertical splitting
In [39]: vsplit(a,3)
Out[39]: [array([[0, 1, 2]]), array([[3, 4, 5]]), array([[6, 7, 8]])]
In [41]: split(a,3,axis=0)
Out[41]: [array([[0, 1, 2]]), array([[3, 4, 5]]), array([[6, 7, 8]])]
Horizontal splitting
In [36]: a
Out[36]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
In [37]: hsplit(a,3)
Out[37]:
[array([[0],
        [3],
        [6]]), array([[1],
        [4],
        [7]]), array([[2],
        [5],
        [8]])]
In [38]: split(a,3,axis=1)
Out[38]:
[array([[0],
        [3],
        [6]]), array([[1],
        [4],
        [7]]), array([[2],
        [5],
        [8]])]
Depth-wise splitting
In [42]: c=arange(27).reshape(3,3,3)
In [43]: c
Out[43]:
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])
In [44]: dsplit(c,3)
Out[44]:
[array([[[ 0],
         [ 3],
         [ 6]],

        [[ 9],
         [12],
         [15]],

        [[18],
         [21],
         [24]]]), array([[[ 1],
         [ 4],
         [ 7]],

        [[10],
         [13],
         [16]],

        [[19],
         [22],
         [25]]]), array([[[ 2],
         [ 5],
         [ 8]],

        [[11],
         [14],
         [17]],

        [[20],
         [23],
         [26]]])]
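Splitting is the inverse of stacking, which the following sketch tries to make concrete. It also shows that the second argument may be a list of split positions instead of a count of equal parts. The names rows, cols, left and right are illustrative.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

rows = np.vsplit(a, 3)    # three arrays of shape (1, 3)
cols = np.hsplit(a, 3)    # three arrays of shape (3, 1)

# Stacking the pieces again restores the original array
restored_v = np.vstack(rows)
restored_h = np.hstack(cols)
print(np.array_equal(restored_v, a), np.array_equal(restored_h, a))

# A list argument gives the column positions at which to cut
left, right = np.hsplit(a, [1])
print(left.shape, right.shape)
```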
NumPy array attributes
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2016/12/7 13:32
# @Author  : Retacn
# @Site    : Array attributes
# @File    : array_attribute.py
# @Software: PyCharm
__author__ = "retacn"
__copyright__ = "property of mankind."
__license__ = "CN"
__version__ = "0.0.1"
__maintainer__ = "retacn"
__email__ = "zhenhuayue@sina.com"
__status__ = "Development"

import numpy as np

b = np.arange(24).reshape(2, 12)
print('In: b')
print(b)
# [[ 0  1  2  3  4  5  6  7  8  9 10 11]
#  [12 13 14 15 16 17 18 19 20 21 22 23]]

# Number of dimensions
print('In: b.ndim')
print(b.ndim)
# 2

# Number of elements
print('In: b.size')
print(b.size)
# 24

# Number of bytes used by each element
print('In: b.itemsize')
print(b.itemsize)
# 4 (for int32 elements; 8 where the default integer type is int64)

# Number of bytes needed to store the whole array
print('In: b.nbytes')
print(b.nbytes)
# 96 (24 elements of 4 bytes each)
print('In: b.size * b.itemsize')
print(b.size * b.itemsize)
# 96

print('In: b.resize(6, 4)')
b.resize(6, 4)
print(b)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]
#  [12 13 14 15]
#  [16 17 18 19]
#  [20 21 22 23]]

# The T attribute gives the same result as transpose()
print('In: b.T')
print(b.T)
# [[ 0  4  8 12 16 20]
#  [ 1  5  9 13 17 21]
#  [ 2  6 10 14 18 22]
#  [ 3  7 11 15 19 23]]

# Create an array of complex numbers
print('In: b = array([1.j + 1, 2.j + 3])')
b = np.array([1.j + 1, 2.j + 3])
print(b)
# [ 1.+1.j  3.+2.j]

# Real part of the array
print('In: b.real')
print(b.real)
# [ 1.  3.]

# Imaginary part of the array
print('In: b.imag')
print(b.imag)
# [ 1.  2.]

# If the array contains complex numbers, its data type is automatically complex
print('In: b.dtype')
print(b.dtype)
# complex128
print('In: b.dtype.str')
print(b.dtype.str)
# <c16

print('In: b = arange(4).reshape(2, 2)')
b = np.arange(4).reshape(2, 2)
print(b)
# [[0 1]
#  [2 3]]

# The flat attribute returns a numpy.flatiter object
print('In: f = b.flat')
f = b.flat
print(f)
# <numpy.flatiter object at 0x029B8438>
print('In: for it in f: print(it)')
for it in f:
    print(it)
# 0
# 1
# 2
# 3

# Access a single element
print('In: b.flat[2]')
print(b.flat[2])
# 2

# Access several elements
print('In: b.flat[[1, 3]]')
print(b.flat[[1, 3]])
# [1 3]
print('In: b')
print(b)
# [[0 1]
#  [2 3]]

# Assign through flat
print('In: b.flat[[1, 3]] = 1')
b.flat[[1, 3]] = 1
print('In: b')
print(b)
# [[0 1]
#  [2 1]]
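The itemsize and nbytes values depend on the element type, which in turn depends on the platform when no dtype is given. The sketch below fixes the dtype explicitly so the relation nbytes = size * itemsize can be seen for several types. The dictionary name report is illustrative.

```python
import numpy as np

report = {}
for dt in (np.int32, np.int64, np.complex128):
    b = np.arange(24, dtype=dt).reshape(2, 12)
    # ndim, size, bytes per element, total bytes
    report[b.dtype.name] = (b.ndim, b.size, b.itemsize, b.nbytes)
    print(b.dtype.name, b.ndim, b.size, b.itemsize, b.nbytes)
```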
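Converting arrays
import numpy as np

b = np.array([1.j + 1, 2.j + 3])
print(b)
# [ 1.+1.j  3.+2.j]

# Convert the NumPy array to a Python list; tolist() returns a new list and leaves b unchanged
print(b.tolist())
# [(1+1j), (3+2j)]

# Convert the array elements to a given type; astype() returns a new array.
# Converting complex numbers to int discards the imaginary part
# (NumPy issues a ComplexWarning)
print(b.astype(int))
# [1 3]

print(b.astype('complex'))
# [ 1.+1.j  3.+2.j]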
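Both tolist() and astype() return new objects and leave the original array untouched, so their results need to be captured. The sketch below shows this; the warning filter is an addition made here to keep the output quiet, and the names as_list and as_int are illustrative.

```python
import warnings
import numpy as np

b = np.array([1.j + 1, 2.j + 3])

# tolist() builds a plain Python list of Python complex numbers
as_list = b.tolist()

# astype(int) drops the imaginary part; NumPy warns about this,
# so the warning is silenced for this one call
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    as_int = b.astype(int)

print(as_list)
print(as_int)
print(b.dtype)    # b itself is still complex
```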
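Creating array views and copies
from scipy import misc
import matplotlib.pyplot as plt

# Load the sample image
# (recent SciPy releases provide it as scipy.datasets.ascent() instead)
ascent = misc.ascent()
# Create a copy of the array
acopy = ascent.copy()
# Create a view of the array
aview = ascent.view()

# Show the images
plt.subplot(221), plt.imshow(ascent)
plt.title('ascent'), plt.xticks([]), plt.yticks([])
plt.subplot(222), plt.imshow(acopy)
plt.title('acopy'), plt.xticks([]), plt.yticks([])
plt.subplot(223), plt.imshow(aview)
plt.title('aview'), plt.xticks([]), plt.yticks([])

# Use the flat iterator to set every value in the view to 0.
# The view shares its data with ascent, so ascent is zeroed as well;
# acopy has its own data and is unaffected
aview.flat = 0
plt.subplot(224), plt.imshow(aview)
plt.title('aview1'), plt.xticks([]), plt.yticks([])
plt.show()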
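The same view-versus-copy behavior can be observed without an image or a plotting library. The following is a minimal sketch on a small hand-made array; the name original is illustrative.

```python
import numpy as np

original = np.array([[0, 1, 2], [3, 4, 5]])
acopy = original.copy()    # independent data
aview = original.view()    # shares data with original

# Zero everything through the view's flat iterator
aview.flat = 0

print(original)    # zeroed, because the view shares its data
print(acopy)       # unchanged
print(np.shares_memory(original, aview), np.shares_memory(original, acopy))
```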
Fancy indexing
from scipy import misc
from matplotlib import pyplot as plt

# Load the image
ascent = misc.ascent()
# print(ascent)

# Get the lengths along the x and y axes
xmax = ascent.shape[0]
ymax = ascent.shape[1]
# print(range(xmax))
# print(range(ymax))
# print(range(xmax - 1, -1, -1))
# print(ascent[range(xmax), range(ymax)])

# Set the values on one diagonal to 0
ascent[range(xmax), range(ymax)] = 0
# print(ascent[range(xmax), range(ymax)])

# Set the values on the other diagonal to 0
ascent[range(xmax - 1, -1, -1), range(ymax)] = 0

plt.imshow(ascent)
plt.show()
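The diagonal trick above pairs the i-th row index with the i-th column index. A small sketch on a 5x5 array of ones, chosen here purely for illustration, makes the selected positions easy to see.

```python
import numpy as np

n = 5
grid = np.ones((n, n), dtype=int)
idx = np.arange(n)

# Pairs (0,0), (1,1), ... select the main diagonal
grid[idx, idx] = 0
# Pairs (4,0), (3,1), ... select the other diagonal
grid[idx[::-1], idx] = 0

print(grid)
```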
Indexing with a list of locations
from scipy import misc
from matplotlib import pyplot as plt
import numpy as np

# Load the image
ascent = misc.ascent()
# Get the size of the image
xmax = ascent.shape[0]
ymax = ascent.shape[1]


# Return the indices 0..size-1 in random order
def shuffle_indices(size):
    arr = np.arange(size)
    np.random.shuffle(arr)
    return arr


xindices = shuffle_indices(xmax)
print(xindices, len(xindices), xmax)
np.testing.assert_equal(len(xindices), xmax)
yindices = shuffle_indices(ymax)
np.testing.assert_equal(len(yindices), ymax)

# Show the shuffled image; what is actually shuffled is the index positions
plt.imshow(ascent[np.ix_(xindices, yindices)])
plt.show()
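np.ix_ combines a list of row positions with a list of column positions into an index that selects every combination of the two. The sketch below uses fixed index lists instead of random ones so the result is reproducible; the arrays rows and cols are illustrative.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)
rows = np.array([2, 0, 1])
cols = np.array([3, 1, 0, 2])

# shuffled[i, j] == m[rows[i], cols[j]]
shuffled = m[np.ix_(rows, cols)]
print(shuffled)

# The same selection written with explicit broadcasting of the row indices
same = np.array_equal(shuffled, m[rows[:, None], cols])
print(same)
```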
Indexing NumPy arrays with Booleans
from scipy import misc
from matplotlib import pyplot as plt
import numpy as np

ascent = misc.ascent()


# Boolean array that is True where the index is divisible by 4
def get_indices(size):
    arr = np.arange(size)
    return arr % 4 == 0


# Set the points on the diagonal whose index is divisible by 4 to 0
ascent1 = ascent.copy()
xindices = get_indices(ascent.shape[0])
yindices = get_indices(ascent.shape[1])
ascent1[xindices, yindices] = 0

# Set the values between 1/4 and 3/4 of the maximum to 0
ascent2 = ascent.copy()
ascent2[(ascent > ascent.max() / 4) & (ascent < 3 * ascent.max() / 4)] = 0

# Show the images
plt.subplot(131), plt.imshow(ascent)
plt.title('ascent'), plt.xticks([]), plt.yticks([])
plt.subplot(132), plt.imshow(ascent1)
plt.title('ascent1'), plt.xticks([]), plt.yticks([])
plt.subplot(133), plt.imshow(ascent2)
plt.title('ascent2'), plt.xticks([]), plt.yticks([])
plt.show()
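The two masks used above can be tried on a short one-dimensional array, where the effect is easier to read. This is a sketch with made-up names (values, mask, clipped), not part of the original script.

```python
import numpy as np

values = np.arange(10)

# Mask that is True where the value is divisible by 4
mask = values % 4 == 0
print(values[mask])

# Zero the values strictly between 1/4 and 3/4 of the maximum (2.25 and 6.75)
clipped = values.copy()
lo, hi = values.max() / 4, 3 * values.max() / 4
clipped[(clipped > lo) & (clipped < hi)] = 0
print(clipped)
```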
Broadcasting NumPy arrays
Reading a WAV file in Python; example code:
import wave
from matplotlib import pyplot as plt
import numpy as np

# Open the file
f = wave.open(r"si2323.wav", 'rb')
# Read the format information
params = f.getparams()
nchannels, sampwidth, framerate, nframes = params[:4]
# Read the waveform data
str_data = f.readframes(nframes)
f.close()

# Convert the raw bytes to an array with one row per channel
# (np.short assumes 16-bit samples, that is sampwidth == 2)
wave_data = np.frombuffer(str_data, dtype=np.short).reshape(-1, nchannels).T
# One time stamp per frame, so time and each channel have the same length
time = np.arange(nframes) * (1.0 / framerate)

# Plot the waveform of each channel
for i, channel in enumerate(wave_data):
    plt.subplot(nchannels, 1, i + 1)
    plt.plot(time, channel)
plt.xlabel('time')
plt.show()
Example code:
from scipy.io import wavfile
from matplotlib import pyplot as plt
import numpy as np

# Optional download step (needs "import urllib.request"):
# response = urllib.request.urlopen('http://www.thesoundarchive.com/austinpowers/smashingbaby.wav')
# print(response.info())
# WAV_FILE = r'si2323.wav'
# filehandle = open(WAV_FILE, 'wb')
# filehandle.write(response.read())
# filehandle.close()

# Read the audio file
sample_rate, data = wavfile.read('si2323.wav')
print('Data type', data.dtype, 'Shape', data.shape)

# Plot the original sound
plt.subplot(211), plt.title('Original')
plt.plot(data)

# Broadcasting: the scalar 0.2 is multiplied with every sample.
# The result is converted back to 16-bit integers and saved as a new WAV file
newdata = data * 0.2
newdata = newdata.astype(np.int16)
print('Data type', newdata.dtype, 'Shape', newdata.shape)
wavfile.write('quiet.wav', sample_rate, newdata)

# Plot the quieter sound
plt.subplot(212), plt.title('Quiet')
plt.plot(newdata)
plt.show()
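The broadcasting step in the script is the multiplication of an array by a scalar. The sketch below reproduces it on a few made-up stereo samples so it runs without a WAV file, and adds a second case where a two-element array of per-channel gains is broadcast across every row. The sample values and the gain factors are assumptions chosen for illustration.

```python
import numpy as np

# Three frames of hypothetical 16-bit stereo audio: one row per frame
samples = np.array([[1000, -2000],
                    [3000, -4000],
                    [5000, 6000]], dtype=np.int16)

# Scalar broadcasting: 0.5 is applied to every sample
quiet = (samples * 0.5).astype(np.int16)

# Shape (3, 2) times shape (2,): the gains are applied to each row,
# leaving the first channel as is and halving the second
gains = np.array([1.0, 0.5])
balanced = (samples * gains).astype(np.int16)

print(quiet)
print(balanced)
```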