1. NumPy介绍

NumPy

NumPy is a Linear Algebra Library for Python, the reason it is so important for data science with Python is that almost all of th e libraries in the PyData Ecosystem rely on NumPy as one of their main building blocks.

Numpy is also increditbly fast, as it has bindings to C libraries.For more info on why you would want to use Arrays instead of lists, check out the following link: https://stackoverflow.com/questions/993984/what-are-the-advantages-of-numpy-over-regular-python-lists.

Using Numpy
import numpy as np
Numpy Arrays

Numpy arrays essentially come in two flavors: vectors and matrices. Vectors are strictly 1-d arrays and matrices are 2-d (but a matrix can still have only one row or one column)

  1. Creating Numpy Arrays
    my_list = [1,2,3]
    np.array(my_list) 
    # array([1,2,3])
    my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
    np.array(my_matrix)
    # array([1,2,3],
    #       [4,5,6],
    #       [7,8,9]])
    
    Built-in Methods
    arange: return evenly spaced values within a given interval
    np.arange(0,10)
    # array([0,1,2,3,4,5,6,7,8,9])
    np.arange(0,11,2)
    # array([0,2,4,6,8,10])
    
    zeros and ones: generate arrays of zeros or ones
    np.zeros(3)
        # array([0., 0., 0.])
        np.zeors((5,5))
        #array([[ 0.,  0.,  0.,  0.,  0.],
        #       [ 0.,  0.,  0.,  0.,  0.],
        #       [ 0.,  0.,  0.,  0.,  0.],
        #       [ 0.,  0.,  0.,  0.,  0.],
        #       [ 0.,  0.,  0.,  0.,  0.]])
        np.ones(3)
        #array([1., 1., 1.])
        np.ones((3,3))
        # array([[ 1.,  1.,  1.],
        #        [ 1.,  1.,  1.],
        #        [ 1.,  1.,  1.]])
    
    linspace: return evenly spaced numbers over a specified interval
    np.linspace(0,10,3)
    # array([0., 5., 10.])
    
    eye: creates an identity matrix
    np.eye(4)
    # array([[ 1., 0., 0., 0.],
    #[ 0., 1., 0., 0.],
    #[ 0., 0., 1., 0.],
    #[ 0., 0., 0., 1.]])
    
  2. Random
    Numpy also has lots of ways to create random number arrays.

    rand: create an array of the given shape and populate it with random samples from a uniform distribution over [0,1)
    np.random.rand(2)
    # array([ 0.11570539, 0.35279769])
    np.random.rand(5,5)
    # array([[ 0.66660768, 0.87589888, 0.12421056, 0.65074126, 0.60260888],
    #       [ 0.70027668, 0.85572434, 0.8464595 , 0.2735416 , 0.10955384],
    #       [ 0.0670566 , 0.83267738, 0.9082729 , 0.58249129, 0.12305748],
    #       [ 0.27948423, 0.66422017, 0.95639833, 0.34238788, 0.9578872 ],
    #       [ 0.72155386, 0.3035422 , 0.85249683, 0.30414307, 0.79718816]])
    
    randn: return a sample(or samples) from the standard normal distritbution
    np.random.randn(2)
    # array([-0.27954018, 0.90078368])
    np.random.randn(5,5)
    # array([[ 0.70154515, 0.22441999, 1.33563186, 0.82872577, -0.28247509],
    #       [ 0.64489788, 0.61815094, -0.81693168, -0.30102424, -0.29030574],
    #       [ 0.8695976 , 0.413755 , 2.20047208, 0.17955692, -0.82159344],
    #       [ 0.59264235, 1.29869894, -1.18870241, 0.11590888, -0.09181687],
    #       [-0.96924265, -1.62888685, -2.05787102, -0.29705576, 0.68915542]])
    
    randint: return random integers from low(inclusive) to high(exclusive)
    np.random.randint(1,100)
    # 44
    np.random.randint(1,100,10)
    # array([13, 64, 27, 63, 46, 68, 92, 10, 58, 24])
    
  3. Array Attributes and Methods
    Reshape: returns an array containing the same data with a new shape
    arr.reshape(5,5)
    
    max,min,argmax,argmin: these are useful methods for finding max or min values. Or to find their index locations using argmin or argmax
    ranarr = array([10,12,41,17,49,2,46,3,19,39])
    ranarr.max() #49
    ranarr.argmax() #4
    ranarr.min() #2
    ranarr.argmin() #5
    
    Shape: Shape is an attribute that arrays have
    arr = array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24])
    arr.shape # (25,)
    arr.reshape(1,25).shape #(1,25)
    
    dtype: grab the data type of the object in the array
    arr.dtype #dtype('int64')
    
Quick Note on Array Indexing

Bracket Indexing and Selection

arr = np.arange(0,11)
arr[8] #8
arr[1:5] #array([1,2,3,4])

Broadcasting: setting a value with index range

arr[0:5] = 100
#array([100, 100, 100, 100, 100, 5, 6, 7, 8, 9, 10])

important notes on slices

arr = np.arange(0,11)
slice_of_arr = arr[0:6] # array([0,1,2,3,4,5])
slice_of_arr[:]=99 #array([99,99,99,99,99])
arr #array([99, 99, 99, 99, 99, 99, 6, 7, 8, 9, 10])
#Data is not copied, it's a view of the original array. This avoids memory problems.
# to get a copy, need to be explicit
arr_copy = arr.copy()

Indexing a 2D array(matrices)

arr_2d[row][col] or arr_2d[row,col]

arr_2d = np.array(([5,10,15],[20,25,30],[35,40,45]))
# array([[ 5, 10, 15],
#       [20, 25, 30],
#       [35, 40, 45]])
arr_2d[1] #array([20,25,30])
arr_2d[1][0] #20
arr_2d[1,0] #20
arr_2d[:2,1:]
# array([[10, 15],
[25, 30]])
arr_2d[2] #array([35,40,45])
arr_2d[2,:] #array([35,40,45])

Fancy Indexing
Fancy indexing allows you to select entire rows or columns out of order.

arr2d = np.zeros((10,10))
arr_length = arr2d.shape[1]
for i in range(arr_length):
        arr2d[i] = i
#array([[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
#       [ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
#       [ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
#       [ 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
#       [ 4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
#       [ 5., 5., 5., 5., 5., 5., 5., 5., 5., 5.],
#       [ 6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
#       [ 7., 7., 7., 7., 7., 7., 7., 7., 7., 7.],
#       [ 8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
#       [ 9., 9., 9., 9., 9., 9., 9., 9., 9., 9.]])
arr2d[[2,4,6,8]]
#array([[ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
[ 4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
[ 6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
[ 8., 8., 8., 8., 8., 8., 8., 8., 8., 8.]])
arr2d[[6,4,2,7]]
#array([[ 6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
[ 4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
[ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
[ 7., 7., 7., 7., 7., 7., 7., 7., 7., 7.]])

Selection

arr= np.arrange(1,11)
bool_arr = arr>4 #array([False, False, False, False, True, True, True, True, True, True], dtype=bool)
arr[boo_arr] #array([ 5, 6, 7, 8, 9, 10])
arr[arr>2] #array([ 3, 4, 5, 6, 7, 8, 9, 10])
Numpy Operations

Arithmetic

import numpy as nparr= np.arange(0,10)
arr+arr #array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18])
arr*arr #array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81])
arr-arr #array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
arr/arr #array([ nan, 1., 1., 1., 1., 1., 1., 1., 1., 1.])# Warning on division by zero, but not an error. Replaced with nan(not a number)
1/arr #array([ inf, 1. , 0.5 , 0.33333333, 0.25 , 0.2 , 0.16666667, 0.14285714, 0.125 , 0.11111111])
#Also warning, but not an error instead infinity
arr**3 #array([ 0, 1, 8, 27, 64, 125, 216, 343, 512, 729])

Universal Array Functions

#Taking Square Roots
np.sqrt(arr)
#array([ 0. , 1. , 1.41421356, 1.73205081, 2. ,
2.23606798, 2.44948974, 2.64575131, 2.82842712, 3. ])
#Calculating exponential(e')
np.exp(arr)
#array([ 1.00000000e+00, 2.71828183e+00, 7.38905610e+00,
2.00855369e+01, 5.45981500e+01, 1.48413159e+02,
4.03428793e+02, 1.09663316e+03, 2.98095799e+03,
8.10308393e+03])
np.max(arr) #same as arr.max()
np.sin(arr)
#array([ 0. , 0.84147098, 0.90929743, 0.14112001, -0.7568025 ,
-0.95892427, -0.2794155 , 0.6569866 , 0.98935825, 0.41211849])
np.log(arr)
#array([ -inf, 0. , 0.69314718, 1.09861229, 1.38629436,
1.60943791, 1.79175947, 1.94591015, 2.07944154, 2.19722458])
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值