Chapter 2 Introduction to NumPy
Tip
Reminder About Built-In Documentation
As you read through this chapter, don’t forget that IPython gives you the ability to
quickly explore the contents of a package (by using the tab-completion feature) as
well as the documentation of various functions (using the ? character). Refer back to
“Help and Documentation in IPython” on page 3 if you need a refresher on this.
IPython makes it quick to browse everything a package provides, for example numpy:
import numpy as np
np.<tab>
# Typing the above in IPython brings up a drop-down list of the names available in the np package.
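Outside an interactive IPython session, the built-in dir() can serve as a rough stand-in for tab completion. The following is a minimal sketch; the prefix "lin" is only an illustrative choice and does not come from the notes above.

```python
import numpy as np

# dir() lists the attribute names of a module; filtering by prefix
# roughly mimics typing np.lin<TAB> in IPython.
prefix = "lin"  # illustrative prefix (an assumption for this example)
matches = [name for name in dir(np) if name.startswith(prefix)]
print(matches)

# The text that np.linspace? would display comes from the docstring.
first_line = np.linspace.__doc__.strip().splitlines()[0]
print(first_line)
```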
Understanding Data Types in Python
A Python Integer is More Than Just an Integer
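- The standard Python implementation (CPython) is written in C, so every Python object is a C structure under the hood, and each variable refers to a location in memory.
- For example, in x = 1000, x is not just a raw integer: it is a pointer to a compound C structure that holds the value together with extra information such as a reference count and the object's type.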
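The overhead of a Python integer object can be observed directly. This is a small sketch, assuming CPython; the exact byte count printed by sys.getsizeof varies by version and platform, so no specific value is claimed for it.

```python
import sys
import numpy as np

x = 1000
# A CPython int is a full object (reference count, type pointer, size, digits),
# so it occupies more memory than the raw value alone.
print(type(x))
print(sys.getsizeof(x))   # whole-object size in bytes (implementation-dependent)

# A fixed-type NumPy array stores only the raw value for each element.
arr = np.array([1000], dtype=np.int64)
print(arr.itemsize)       # 8 bytes per int64 element
```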
A Python List Is More Than Just a List
Trick 1
L = list(range(10))
L
''' [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] '''
S = [str(i) for i in L]
S
''' ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'] '''
In NumPy's ndarray class, all elements must be of the same type.
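The contrast between a flexible list and a fixed-type array can be sketched as follows. The sample values are illustrative assumptions, not taken from the notes.

```python
import numpy as np

# A Python list can hold objects of different types, because every item
# carries its own type information.
mixed = [True, "2", 3.0, 4]   # illustrative values
print([type(item).__name__ for item in mixed])   # ['bool', 'str', 'float', 'int']

# An ndarray has a single dtype for all elements, so mixed numeric
# input is upcast to a common type.
arr = np.array([1, 2.5, 3])
print(arr.dtype)   # float64
```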
Fixed-Type Arrays in Python
Creating Arrays from Python Lists
import numpy as np
np.array([1,3,4,5,6])
'''array([1,3,4,5,6])'''
# If the element types do not match, NumPy upcasts them to a common type where possible.
np.array([3.14,4,2,3])
'''
array([3.14,4.,2.,3.])
'''
# The data type can also be set explicitly
np.array([1,2,3,4], dtype='float32')
'''
array([1.,2.,3.,4.], dtype=float32)
'''
# NumPy arrays can have multiple dimensions
# nested lists result in multidimensional arrays
np.array([range(i, i+3) for i in [2,4,6]])
'''
array([ [2, 3, 4],
[4, 5, 6],
[6, 7, 8]])
'''
Creating Arrays from Scratch
Especially for larger arrays, it is more efficient to create them with NumPy's built-in array creation routines.
# create a length-10 integer array filled with zeros
np.zeros(10, dtype=int)
'''array([0,0,0,0,0,0,0,0,0,0])'''
# create a 3*5 floating-point array filled with 1s
np.ones((3, 5), dtype=float)
'''
array([ [1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]])
'''
# create a 3*5 array filled with 3.14
np.full((3, 5), 3.14)
'''
array([ [3.14, 3.14, 3.14, 3.14, 3.14],
[3.14, 3.14, 3.14, 3.14, 3.14],
[3.14, 3.14, 3.14, 3.14, 3.14]])
'''
# Create an array filled with a linear sequence
# Starting at 0, ending at 20, stepping by 2
# (this is similar to the built-in range() function)
np.arange(0, 20, 2)
'''
array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18])
'''
# Create an array of five values evenly spaced between 0 and 1
np.linspace(0, 1, 5)
'''
array([ 0. , 0.25, 0.5 , 0.75, 1. ])
'''
# Create a 3x3 array of uniformly distributed
# random values between 0 and 1
np.random.random((3, 3))
'''
array([ [ 0.99844933, 0.52183819, 0.22421193],
[ 0.08007488, 0.45429293, 0.20941444],
[ 0.14360941, 0.96910973, 0.946117 ]])
'''
# Create a 3x3 array of normally distributed random values
# with mean 0 and standard deviation 1
np.random.normal(0, 1, (3, 3))
'''
array([ [ 1.51772646, 0.39614948, -0.10634696],
[ 0.25671348, 0.00732722, 0.37783601],
[ 0.68446945, 0.15926039, -0.70744073]])
'''
# Create a 3x3 array of random integers in the interval [0, 10)
np.random.randint(0, 10, (3, 3))
'''
array([ [2, 3, 4],
[5, 7, 8],
[0, 5, 0]])
'''
# Create a 3x3 identity matrix
np.eye(3)
'''
array([ [ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])
'''
# Create an uninitialized array of three values (float64 by default)
# The values will be whatever happens to already exist at that
# memory location
np.empty(3)
'''array([ 1., 1., 1.])'''
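Two of the routines above are easy to confuse: np.arange takes a step and excludes the stop value, while np.linspace takes a count and includes both endpoints by default. The sketch below checks this. It also uses np.zeros_like, which is not mentioned in the notes above, as one example of creating an array that matches an existing one.

```python
import numpy as np

# arange takes a step and excludes the stop value, like range().
a = np.arange(0, 20, 2)
print(a[-1], len(a))         # 18 10

# linspace takes a number of points and includes both endpoints by default.
b = np.linspace(0, 1, 5)
print(b[0], b[-1], len(b))   # 0.0 1.0 5

# zeros_like reuses the shape and dtype of an existing array.
c = np.zeros_like(a)
print(c.shape, c.dtype == a.dtype)
```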
NumPy Standard Data Types
When specifying the data type of an ndarray, you can use a string, for example:
np.zeros(10, dtype='int16')
or the corresponding NumPy object, for example:
np.zeros(10, dtype=np.int16)
The standard data types are:
Data type | Description |
---|---|
bool_ | Boolean (True or False) stored as a byte |
int_ | Default integer type (same as C long; normally either int64 or int32) |
intc | Identical to C int (normally int32 or int64) |
intp | Integer used for indexing (same as C ssize_t; normally either int32 or int64) |
int8 | Byte (–128 to 127) |
int16 | Integer (–32768 to 32767) |
int32 | Integer (–2147483648 to 2147483647) |
int64 | Integer (–9223372036854775808 to 9223372036854775807) |
uint8 | Unsigned integer (0 to 255) |
uint16 | Unsigned integer (0 to 65535) |
uint32 | Unsigned integer (0 to 4294967295) |
uint64 | Unsigned integer (0 to 18446744073709551615) |
float_ | Shorthand for float64 (alias removed in NumPy 2.0; use float64) |
float16 | Half-precision float: sign bit, 5 bits exponent, 10 bits mantissa |
float32 | Single-precision float: sign bit, 8 bits exponent, 23 bits mantissa |
float64 | Double-precision float: sign bit, 11 bits exponent, 52 bits mantissa |
complex_ | Shorthand for complex128 (alias removed in NumPy 2.0; use complex128) |
complex64 | Complex number, represented by two 32-bit floats |
complex128 | Complex number, represented by two 64-bit floats |
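The ranges in the table need not be memorized, since NumPy can report them. A minimal sketch using np.iinfo and np.finfo follows; the selection of types is arbitrary.

```python
import numpy as np

# np.iinfo reports the representable range of an integer type.
for name in ("int8", "int16", "uint8", "uint16"):
    info = np.iinfo(np.dtype(name))
    print(name, info.min, info.max)

# np.finfo reports properties of a floating-point type.
print(np.finfo(np.float32).bits)   # 32
print(np.finfo(np.float64).bits)   # 64

# A string name and the NumPy type object describe the same dtype.
print(np.dtype("int16") == np.dtype(np.int16))   # True
```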
The Basics of NumPy Arrays
This section covers how to manipulate data with NumPy.
There are five basic categories of operations:
Attributes of arrays
Determining the size, shape, memory consumption, and data types of arrays
Indexing of arrays
Getting and setting the value of individual array elements
Slicing of arrays
Getting and setting smaller subarrays within a larger array
Reshaping of arrays
Changing the shape of a given array
Joining and splitting of arrays
Combining multiple arrays into one, and splitting one array into many
NumPy Array Attributes
import numpy as np
np.random.seed(0) # seed for reproducibility
x1 = np.random.randint(10, size=6) # One-dimensional array
x2 = np.random.randint(10, size=(3, 4)) # Two-dimensional array
x3 = np.random.randint(10, size=(3, 4, 5)) # Three-dimensional array
'''
ndim: number of dimensions
shape: the size of each dimension
size: the total number of elements in the array
dtype: data type of the array elements
itemsize: size in bytes of each array element
nbytes: total size in bytes of the array
'''
print("x3 ndim: ", x3.ndim)
print("x3 shape:", x3.shape)
print("x3 size: ", x3.size)
'''
x3 ndim: 3
x3 shape: (3, 4, 5)
x3 size: 60
'''
print("dtype:", x3.dtype)
print("itemsize:", x3.itemsize, "bytes")
print("nbytes:", x3.nbytes, "bytes")
'''
dtype: int64
itemsize: 8 bytes
nbytes: 480 bytes
'''
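The attributes are related to one another, which the sketch below verifies for x3. Note that the dtype of np.random.randint output, and therefore itemsize and nbytes, depends on the platform's default integer, so only the relationships are asserted here, not the absolute byte counts.

```python
import numpy as np

np.random.seed(0)  # seed for reproducibility
x3 = np.random.randint(10, size=(3, 4, 5))

# ndim is the length of shape, and size is the product of its entries.
print(x3.ndim == len(x3.shape))             # True
print(x3.size == 3 * 4 * 5)                 # True

# nbytes is the per-element size times the number of elements.
print(x3.nbytes == x3.itemsize * x3.size)   # True
```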
Array Indexing: Accessing Single Elements
Elements are accessed in much the same way as in a Python list.
x1
'''array([5, 0, 3, 3, 7, 9])'''
x1[0]
'''5'''
x1[4]
'''7'''
x1[-1]
'''9'''
x1[-2]
'''7'''
x2
'''
array([ [3, 5, 2, 4],
[7, 6, 8, 8],
[1, 6, 7, 7]])
'''
x2[0,0]
'''3'''
x2[2,0]
'''1'''
x2[2,-1]
'''7'''
# Modify a value
x2[0,0] = 12
x2
'''
array([ [12, 5, 2, 4],
[7, 6, 8, 8],
[1, 6, 7, 7]])
'''
Keep in mind that, unlike Python lists, NumPy arrays have a fixed type. This means,
for example, that if you attempt to insert a floating-point value to an integer array, the
value will be silently truncated. Don’t be caught unaware by this behavior!
x1[0] = 3.14159
x1
'''
array([3, 0, 3, 3, 7, 9])
'''
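The truncation drops the fractional part rather than rounding, which the following sketch makes visible. The array a and the sample values are illustrative. Converting with astype(float) is one way to keep the fraction when that is what you want.

```python
import numpy as np

a = np.array([1, 2, 3])   # integer dtype
a[0] = 3.99               # fractional part is dropped, not rounded
a[1] = -2.7               # truncation is toward zero
print(a.tolist())         # [3, -2, 3]

# Converting to a float dtype first preserves the fractional part.
b = a.astype(float)
b[0] = 3.99
print(b[0])               # 3.99
```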
Array Slicing: Accessing Subarrays
One-dimensional subarrays
The NumPy slicing syntax follows that of the standard Python list; to access a slice of an array x, use this:
x[start:stop:step]
'''
default setting:
start=0, stop=size of dimension, step=1
'''
examples:
x = np.arange(10)
x
'''array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])'''
x[:5] # first five elements
'''array([0, 1, 2, 3, 4])'''
x[5:] # elements after index 5
'''array([5, 6, 7, 8, 9])'''
x[4:7] # middle subarray
'''array([4, 5, 6])'''
x[::2] # every other element
'''array([0, 2, 4, 6, 8])'''
x[1::2] # every other element, starting at index 1
'''array([1, 3, 5, 7, 9])'''
# Trick: a negative step reverses the order
x[::-1] # all elements, reversed
'''array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])'''
x[5::-2] # reversed every other from index 5
'''array([5, 3, 1])'''
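Since the syntax follows that of Python lists, the same slice applied to an array and to a list selects the same elements. A small sketch that checks this for the slices used above (the helper list lst is introduced only for the comparison):

```python
import numpy as np

x = np.arange(10)
lst = list(range(10))   # plain-list counterpart, for comparison only

# The same start:stop:step expressions select the same elements from
# an ndarray and from a plain list.
for s in (slice(None, 5), slice(4, 7), slice(None, None, 2),
          slice(None, None, -1), slice(5, None, -2)):
    assert x[s].tolist() == lst[s]

# With a negative step the defaults for start and stop are swapped:
# x[::-1] starts at the last element and runs back to the first.
print(x[::-1].tolist())   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(x[5::-2].tolist())  # [5, 3, 1]
```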
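Multidimensional subarrays
Multidimensional slices work the same way as one-dimensional ones, with one slice per dimension, separated by commas.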
x2
'''
array([ [12, 5, 2, 4],
[7, 6, 8, 8],
[1, 6, 7, 7]])
'''
x2[:2, :3] # two rows, three columns
'''
array([ [12, 5, 2],
[ 7, 6, 8]])
'''
x2[:3, ::2] # all rows, every other column
'''
array([ [12, 2],
[ 7, 8],
[ 1, 7]])
'''
# subarray dimensions can even be reversed together:
x2[::-1, ::-1]
'''
array([ [ 7, 7, 6, 1],
[ 8, 8, 6, 7],
[ 4, 2, 5, 12]])
'''
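The shape of a multidimensional slice follows from applying each slice to its own axis. The sketch below rebuilds x2 by hand from the values shown above, so that it runs on its own, and compares the double reversal with np.flip, a function not used elsewhere in these notes.

```python
import numpy as np

# x2 as shown above, after x2[0, 0] was set to 12
x2 = np.array([[12, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

# One slice per dimension: rows first, then columns.
print(x2[:2, :3].shape)    # (2, 3)
print(x2[:3, ::2].shape)   # (3, 2)

# Reversing both axes gives the same result as np.flip over all axes.
print(np.array_equal(x2[::-1, ::-1], np.flip(x2)))   # True
```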
Accessing array rows and columns
print(x2[:, 0]) # first column of x2
'''
[12 7 1]
'''
print(x2[0, :]) # first row of x2
# which is equivalent to
print(x2[0]) # equivalent to x2[0, :]
'''
[12 5 2