Python Data Science Handbook: Chapter 2, Introduction to NumPy (Notes)

Chapter 2 Introduction to NumPy

Tip

Reminder About Built-In Documentation
As you read through this chapter, don’t forget that IPython gives you the ability to
quickly explore the contents of a package (by using the tab-completion feature) as
well as the documentation of various functions (using the ? character). Refer back to
“Help and Documentation in IPython” on page 3 if you need a refresher on this.

IPython lets us quickly browse everything a package provides. For example, with numpy:

import numpy as np
np.<tab>

# Typing the above in IPython brings up a drop-down list of the names available in the np package.
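
Outside IPython, something close to tab completion and the ? lookup can be had with the built-in dir() and the __doc__ attribute. A minimal sketch, where the prefix "lin" is only an illustrative choice:

```python
import numpy as np

# List the names in the np namespace that start with a given prefix,
# roughly what typing `np.lin<tab>` would show in IPython.
prefix = "lin"  # illustrative choice, not from the notes
matches = [name for name in dir(np) if name.startswith(prefix)]
print(matches)

# The text that `np.linspace?` displays comes from the docstring.
first_line = np.linspace.__doc__.strip().splitlines()[0]
print(first_line)
```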

Understanding Data Types in Python

A Python Integer is More Than Just an Integer

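  • The standard Python implementation (CPython) is written in C, so every Python object is really a C structure in disguise. A variable is a reference to such a structure in memory, not a raw value.
  • For example, in x = 1000, x is not just a "raw" integer: it is a pointer to a compound C structure that holds the value together with extra information such as a reference count and the type.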
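
The overhead of that extra information can be observed directly. The sketch below compares the size of a Python int object with the size of one raw 64-bit integer as NumPy stores it; the exact object size is an implementation detail and varies with Python version and platform.

```python
import sys
import numpy as np

x = 1000
# Size of the whole Python int object (header plus digits), in bytes.
# The exact number depends on the Python version and platform.
obj_bytes = sys.getsizeof(x)

# Size of one raw 64-bit integer as stored inside a NumPy array.
raw_bytes = np.dtype(np.int64).itemsize

print(obj_bytes, raw_bytes)
```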

A Python List Is More Than Just a List

  • Tip 1

  • L = list(range(10))
    L
    '''
    [0,1,2,3,4,5,6,7,8,9]
    '''
    S = [str(i) for i in L]
    S
    '''
    ['0','1','2','3','4','5','6','7','8','9']
    '''
    
  • In NumPy's ndarray class, all elements must be of the same type.
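
To make the contrast concrete, here is a small sketch: a list can hold items of different types, while an ndarray carries a single dtype for all of its elements. The variable names are illustrative.

```python
import numpy as np

# A list can mix types because each item is a full Python object.
mixed = [True, "2", 3.0, 4]
item_types = [type(item).__name__ for item in mixed]
print(item_types)  # ['bool', 'str', 'float', 'int']

# An ndarray stores one dtype for the whole array.
arr = np.array([1, 2, 3])
print(arr.dtype)  # an integer dtype; the exact width is platform-dependent
```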

Fixed-Type Arrays in Python

Creating Arrays from Python Lists

import numpy as np
np.array([1,3,4,5,6])
'''array([1,3,4,5,6])'''

# If the element types do not match, NumPy upcasts them to a common type where possible.
np.array([3.14,4,2,3])
'''
array([3.14,4.,2.,3.])
'''

# The data type can also be set explicitly
np.array([1,2,3,4], dtype='float32')
'''
array([1.,2.,3.,4.], dtype=float32)
'''

# NumPy arrays can have multiple dimensions
# nested lists result in multidimensional arrays
np.array([range(i, i+3) for i in [2,4,6]])
'''
array([	[2, 3, 4],
		[4, 5, 6],
		[6, 7, 8]])
'''
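
The upcasting and dtype behavior above can be checked by inspecting the dtype and shape attributes of each result. This is a small sketch with illustrative variable names:

```python
import numpy as np

ints = np.array([1, 3, 4, 5, 6])
mixed = np.array([3.14, 4, 2, 3])                  # ints are upcast to float
forced = np.array([1, 2, 3, 4], dtype="float32")   # explicit dtype
nested = np.array([range(i, i + 3) for i in [2, 4, 6]])

print(ints.dtype.kind)   # i (an integer dtype)
print(mixed.dtype)       # float64
print(forced.dtype)      # float32
print(nested.shape)      # (3, 3)
```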

Creating Arrays from Scratch

For larger arrays in particular, it is more efficient to use NumPy's built-in array-creation routines.

# create a length-10 integer array filled with zeros
np.zeros(10, dtype=int)
'''array([0,0,0,0,0,0,0,0,0,0])'''

# create a 3x5 floating-point array filled with 1s
np.ones((3, 5), dtype=float)
'''
array([	[1., 1., 1., 1., 1.],
		[1., 1., 1., 1., 1.],
		[1., 1., 1., 1., 1.]])
'''

# create a 3x5 array filled with 3.14
np.full((3, 5), 3.14)
'''
array([	[3.14, 3.14, 3.14, 3.14, 3.14],
		[3.14, 3.14, 3.14, 3.14, 3.14],
		[3.14, 3.14, 3.14, 3.14, 3.14]])
'''

# Create an array filled with a linear sequence
# Starting at 0, ending at 20, stepping by 2
# (this is similar to the built-in range() function)
np.arange(0, 20, 2)
'''
array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18])
'''

# Create an array of five values evenly spaced between 0 and 1
np.linspace(0, 1, 5)
'''
array([ 0. , 0.25, 0.5 , 0.75, 1. ])
'''

# Create a 3x3 array of uniformly distributed
# random values between 0 and 1
np.random.random((3, 3))
'''
array([	[ 0.99844933, 0.52183819, 0.22421193],
		[ 0.08007488, 0.45429293, 0.20941444],
		[ 0.14360941, 0.96910973, 0.946117 ]])
'''

# Create a 3x3 array of normally distributed random values
# with mean 0 and standard deviation 1
np.random.normal(0, 1, (3, 3))
'''
array([	[ 1.51772646, 0.39614948, -0.10634696],
		[ 0.25671348, 0.00732722, 0.37783601],
		[ 0.68446945, 0.15926039, -0.70744073]])
'''

# Create a 3x3 array of random integers in the interval [0, 10)
np.random.randint(0, 10, (3, 3))
'''
array([	[2, 3, 4],
		[5, 7, 8],
		[0, 5, 0]])
'''

# Create a 3x3 identity matrix
np.eye(3)
'''
array([	[ 1., 0., 0.],
		[ 0., 1., 0.],
		[ 0., 0., 1.]])
'''

# Create an uninitialized array of three values (float64 by default)
# The values will be whatever happens to already exist at that
# memory location
np.empty(3)
'''array([ 1., 1., 1.])'''
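
Two details of the routines above are easy to miss: arange excludes its stop value while linspace includes it, and the random arrays differ on every run unless a seed is fixed. The sketch below uses np.random.default_rng, NumPy's newer generator interface, as an alternative to the np.random.* functions used in these notes; the seed value is arbitrary.

```python
import numpy as np

# arange excludes the stop value; linspace includes it by default.
a = np.arange(0, 20, 2)
b = np.linspace(0, 1, 5)
print(a[-1], b[-1])  # 18 1.0

# Seeded generator: the same seed gives the same numbers on every run.
rng = np.random.default_rng(seed=0)   # seed value is arbitrary
u = rng.random((3, 3))                # uniform on [0, 1)
n = rng.normal(0, 1, (3, 3))          # mean 0, standard deviation 1
k = rng.integers(0, 10, (3, 3))       # integers in [0, 10)
```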

NumPy Standard Data Types

When specifying the data type of an ndarray, you can use a string, for example:

np.zeros(10, dtype='int16')

Or you can use the corresponding NumPy object, for example:

np.zeros(10, dtype=np.int16)

Commonly used data types:

Data type    Description
bool_        Boolean (True or False) stored as a byte
int_         Default integer type (same as C long; normally either int64 or int32)
intc         Identical to C int (normally int32 or int64)
intp         Integer used for indexing (same as C ssize_t; normally either int32 or int64)
int8         Byte (-128 to 127)
int16        Integer (-32768 to 32767)
int32        Integer (-2147483648 to 2147483647)
int64        Integer (-9223372036854775808 to 9223372036854775807)
uint8        Unsigned integer (0 to 255)
uint16       Unsigned integer (0 to 65535)
uint32       Unsigned integer (0 to 4294967295)
uint64       Unsigned integer (0 to 18446744073709551615)
float_       Shorthand for float64
float16      Half-precision float: sign bit, 5 bits exponent, 10 bits mantissa
float32      Single-precision float: sign bit, 8 bits exponent, 23 bits mantissa
float64      Double-precision float: sign bit, 11 bits exponent, 52 bits mantissa
complex_     Shorthand for complex128
complex64    Complex number, represented by two 32-bit floats
complex128   Complex number, represented by two 64-bit floats
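
The ranges and bit widths in the table can be confirmed from NumPy itself with np.iinfo and np.finfo. A short sketch, which also checks that the string and object spellings name the same dtype:

```python
import numpy as np

# String and object spellings name the same dtype.
same = np.dtype("int16") == np.dtype(np.int16)
print(same)  # True

# Integer limits, matching the ranges in the table.
i8 = np.iinfo(np.int8)
u16 = np.iinfo(np.uint16)
print(i8.min, i8.max, u16.max)  # -128 127 65535

# Exponent and mantissa widths (in bits) of the floating-point types.
for float_type in (np.float16, np.float32, np.float64):
    info = np.finfo(float_type)
    print(float_type.__name__, info.nexp, info.nmant)
```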

The Basics of NumPy Arrays

This section covers manipulating data with NumPy.

There are five basic categories of array operations:

  • Attributes of arrays: determining the size, shape, memory consumption, and data types of arrays
  • Indexing of arrays: getting and setting the value of individual array elements
  • Slicing of arrays: getting and setting smaller subarrays within a larger array
  • Reshaping of arrays: changing the shape of a given array
  • Joining and splitting of arrays: combining multiple arrays into one, and splitting one array into many
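
As a quick orientation, the sketch below touches each of the five categories once on a small array. The array and variable names are illustrative, and np.concatenate and np.split are just one way to join and split.

```python
import numpy as np

a = np.arange(6)                 # array([0, 1, 2, 3, 4, 5])

print(a.shape, a.dtype.kind)     # attributes: (6,) i
print(a[2])                      # indexing: 2
print(a[1:4])                    # slicing: [1 2 3]
grid = a.reshape(2, 3)           # reshaping: 2 rows, 3 columns
joined = np.concatenate([a, a])  # joining: length 12
left, right = np.split(a, [3])   # splitting at index 3
print(grid.shape, joined.size, left, right)
```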

NumPy Array Attributes

import numpy as np
np.random.seed(0) # seed for reproducibility

x1 = np.random.randint(10, size=6) # One-dimensional array
x2 = np.random.randint(10, size=(3, 4)) # Two-dimensional array
x3 = np.random.randint(10, size=(3, 4, 5)) # Three-dimensional array
'''
ndim: number of dimensions
shape: the size of each dimension
size: the total number of elements in the array
dtype: data type of the array elements
itemsize: size of each array element, in bytes
nbytes: total size of the array, in bytes
'''

print("x3 ndim: ", x3.ndim)
print("x3 shape:", x3.shape)
print("x3 size: ", x3.size)
'''
x3 ndim: 3
x3 shape: (3, 4, 5)
x3 size: 60
'''

print("dtype:", x3.dtype)
print("itemsize:", x3.itemsize, "bytes")
print("nbytes:", x3.nbytes, "bytes")
'''
dtype: int64
itemsize: 8 bytes
nbytes: 480 bytes
'''
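
These attributes are related: nbytes is simply itemsize times size. The sketch below pins the dtype to int64 explicitly, since the default integer width is platform-dependent and the 8-byte itemsize shown above assumes int64.

```python
import numpy as np

# dtype is pinned to int64 here because the default integer width is
# platform-dependent (the 8-byte itemsize above assumes int64).
x3 = np.zeros((3, 4, 5), dtype=np.int64)

assert x3.size == 3 * 4 * 5                  # 60 elements
assert x3.nbytes == x3.itemsize * x3.size    # 8 * 60 = 480 bytes
print(x3.ndim, x3.size, x3.itemsize, x3.nbytes)  # 3 60 8 480
```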

Array Indexing: Accessing Single Elements

Accessing elements works much as it does for a Python list.

x1
'''array([5, 0, 3, 3, 7, 9])'''
x1[0]
'''5'''
x1[4]
'''7'''
x1[-1]
'''9'''
x1[-2]
'''7'''

x2
'''
array([	[3, 5, 2, 4],
		[7, 6, 8, 8],
		[1, 6, 7, 7]])
'''
x2[0,0]
'''3'''
x2[2,0]
'''1'''
x2[2,-1]
'''7'''

# Modify a value
x2[0,0] = 12
x2
'''
array([	[12, 5, 2, 4],
		[7, 6, 8, 8],
		[1, 6, 7, 7]])
'''

Keep in mind that, unlike Python lists, NumPy arrays have a fixed type. This means,
for example, that if you attempt to insert a floating-point value to an integer array, the
value will be silently truncated. Don’t be caught unaware by this behavior!

x1[0] = 3.14159
x1
'''
array([3, 0, 3, 3, 7, 9])
'''
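
The truncation drops the fractional part rather than rounding. A minimal sketch of the behavior and of one way around it, converting to a float array first with astype; the variable names are illustrative:

```python
import numpy as np

ints = np.array([5, 0, 3, 3, 7, 9])
ints[0] = 3.14159    # fractional part is dropped
ints[1] = -2.7       # truncation is toward zero, not rounding
print(ints[:2].tolist())  # [3, -2]

# Converting to a float array first keeps the fractional part.
floats = ints.astype(float)
floats[0] = 3.14159
print(floats[0])     # 3.14159
```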

Array Slicing: Accessing Subarrays

One-dimensional subarrays

The NumPy slicing syntax follows that of the standard Python list; to access a slice of an array x, use this:

x[start:stop:step]
'''
default setting:
start=0, stop=size of dimension, step=1
'''

examples:

x = np.arange(10)
x
'''array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])'''
x[:5] # first five elements
'''array([0, 1, 2, 3, 4])'''
x[5:] # elements from index 5 onward
'''array([5, 6, 7, 8, 9])'''
x[4:7] # middle subarray
'''array([4, 5, 6])'''
x[::2] # every other element
'''array([0, 2, 4, 6, 8])'''
x[1::2] # every other element, starting at index 1
'''array([1, 3, 5, 7, 9])'''

# Tip: a negative step reverses the order
x[::-1] # all elements, reversed
'''array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])'''
x[5::-2] # reversed every other from index 5
'''array([5, 3, 1])'''
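
The defaults listed earlier apply to a positive step; with a negative step, the defaults for start and stop are swapped. The sketch below spells out the omitted parts and shows that the bracket syntax is shorthand for a slice object.

```python
import numpy as np

x = np.arange(10)

# Omitted parts fall back to their defaults.
assert np.array_equal(x[:5], x[0:5:1])
assert np.array_equal(x[5:], x[5:len(x):1])

# With a negative step the defaults flip: start at the end, run to the front.
assert np.array_equal(x[::-1], x[len(x) - 1::-1])

# The bracket syntax is shorthand for a slice object.
assert np.array_equal(x[5::-2], x[slice(5, None, -2)])
print(x[5::-2])  # [5 3 1]
```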

Multidimensional subarrays

This works just like one-dimensional slicing, with one slice per dimension, separated by commas.

x2
'''
array([	[12, 5, 2, 4],
		[7, 6, 8, 8],
		[1, 6, 7, 7]])
'''

x2[:2, :3] # two rows, three columns
'''
array([	[12, 5, 2],
		[ 7, 6, 8]])
'''

x2[:3, ::2] # all rows, every other column
'''
array([	[12, 2],
		[ 7, 8],
		[ 1, 7]])
'''

# subarray dimensions can even be reversed together:
x2[::-1, ::-1]
'''
array([	[ 7, 7, 6, 1],
		[ 8, 8, 6, 7],
		[ 4, 2, 5, 12]])
'''
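
One property of these slices is worth keeping in mind: a NumPy slice is a view onto the original data, not a copy, so writing to the slice changes the original array. A small sketch, rebuilding x2 by hand so that it runs on its own:

```python
import numpy as np

x2 = np.array([[12, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

view = x2[:2, :2]            # a view onto x2's data
view[0, 0] = 99              # writes through to x2
print(x2[0, 0])              # 99

copy = x2[:2, :2].copy()     # an independent copy
copy[0, 0] = 42              # x2 is unaffected
print(x2[0, 0])              # 99
print(np.shares_memory(x2, view), np.shares_memory(x2, copy))  # True False
```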

Accessing array rows and columns

print(x2[:, 0]) # first column of x2
'''
[12 7 1]
'''
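
Note that the column comes back as a one-dimensional array, because an integer index drops that axis. If a two-dimensional column is wanted, a slice keeps the axis. A short sketch with x2 rebuilt by hand:

```python
import numpy as np

x2 = np.array([[12, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

col_1d = x2[:, 0]     # integer index drops the axis
col_2d = x2[:, :1]    # a slice keeps it, giving a column vector
print(col_1d.shape, col_2d.shape)  # (3,) (3, 1)
```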
print(x2[0, :]) # first row of x2
# x2[0] is equivalent to x2[0, :]
print(x2[0])
'''
[12 5 2 4]
[12 5 2 4]
'''