python numpy basics

Python knowledge notes (6)

This post is a set of study notes that organizes selected Python topics.
This installment summarizes the basics of NumPy.

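Compared with a Python list, a NumPy array is faster to process.
Reasons: 1. fixed element type 2. contiguous memory
Some advantages of NumPy: the ndarray multidimensional array / math functions / tools for reading from and writing to disk / additional mathematical processing capabilities / a C API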
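
The two reasons above can be observed directly on an array. The following is a minimal sketch; the variable names are illustrative only and do not come from the original notes.

```python
import numpy as np

values = list(range(5))
arr = np.array(values, dtype=np.int64)

# Fixed type: every element shares one dtype and one size in bytes.
print(arr.dtype)      # int64
print(arr.itemsize)   # 8

# Contiguous memory: the elements occupy one block of itemsize * size bytes.
print(arr.flags['C_CONTIGUOUS'])  # True
print(arr.nbytes)     # 40
```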

creating ndarrays

np.array

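numpy.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)

copy: whether the object is copied when the array is built
order: memory layout; C is row-major, F is column-major, A means any (F if the input is Fortran-contiguous, otherwise C), K (the default) keeps the input's layout as closely as possible
subok: if True, subclasses are passed through; if False (the default), the returned array is forced to be a base-class ndarray
ndmin: minimum number of dimensions of the generated array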
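
As a rough illustration of the parameters listed above, the sketch below exercises ndmin, order and copy. The variable names (padded, arr_f, source, copied) are made up for this example.

```python
import numpy as np

data = [1, 2, 3]

# ndmin pads the shape with leading axes until there are at least 2 dimensions.
padded = np.array(data, ndmin=2)
print(padded.shape)  # (1, 3)

# order='F' requests column-major (Fortran-style) memory layout.
arr_f = np.array([[1, 2], [3, 4]], order='F')
print(arr_f.flags['F_CONTIGUOUS'])  # True

# copy=True (the default) makes the result independent of the source array.
source = np.array(data)
copied = np.array(source, copy=True)
copied[0] = 99
print(source)  # [1 2 3]
```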

import numpy as np
data2=[[1,2,3,4],[5,6,7,8]]
arr2=np.array(data2)
print(arr2)
print(arr2.ndim)
print(arr2.shape)
[[1 2 3 4]
 [5 6 7 8]]
2
(2, 4)

np.dtype

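numpy.dtype(object, align, copy)

align: if True, pad the fields so the layout matches what a C compiler would produce for an equivalent struct
copy: if True, make a new copy of the dtype object; if False, the result may simply be a reference to a built-in data type object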
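
To make the align parameter more concrete, here is a small sketch with a structured dtype. The field names flag and value are hypothetical, and the aligned size assumes a common platform where a 4-byte integer is aligned to 4 bytes.

```python
import numpy as np

# A dtype built from a built-in scalar type.
dt_int = np.dtype(np.int32)
print(dt_int.name, dt_int.itemsize)  # int32 4

# A structured dtype: a 1-byte field followed by a 4-byte field.
fields = [('flag', np.int8), ('value', np.int32)]
packed = np.dtype(fields)                # align=False: fields packed back to back
aligned = np.dtype(fields, align=True)   # align=True: padding inserted like a C struct
print(packed.itemsize)   # 5
print(aligned.itemsize)  # 8
```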

data3=np.random.randn(2,3)
print(data3)
print(data3*10)
print(data3+data3)
print(data3.shape)
print(data3.dtype)
[[ 1.75302313  1.11403706 -0.44986101]
 [-0.51547876 -0.22752194  0.44738767]]
[[17.53023126 11.14037063 -4.49861006]
 [-5.15478762 -2.2752194   4.47387673]]
[[ 3.50604625  2.22807413 -0.89972201]
 [-1.03095752 -0.45504388  0.89477535]]
(2, 3)
float64

np.empty

numpy.empty(shape, dtype = float, order = 'C')

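Creates an uninitialized array. This differs from an all-zeros array: the contents are whatever values happen to be in memory.
shape: shape of the array (an int or a tuple of ints)
order: row-major ('C') or column-major ('F')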

a=np.empty((1,2))
a
array([[3.18547019e+283, 4.78629679e-185]])
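
The arbitrary values in the output above are why np.empty should only be used when every element is assigned before it is read. A minimal sketch contrasting it with np.zeros (the names zeros and buf are illustrative):

```python
import numpy as np

# np.zeros guarantees that every element is 0.
zeros = np.zeros((2, 3))
print(zeros.sum())  # 0.0

# np.empty only allocates memory; assign every element before reading any.
buf = np.empty((2, 3))
buf[:] = 1.5
print(buf)
```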

data type

arr3=np.array([1,2,3],dtype=np.float64)
arr4=np.array([1,2,3],dtype=np.int32)
print(arr3.dtype,arr4.dtype)
float64 int32

Use .astype to convert an array to another data type

arr3_int=arr3.astype(np.int32)
arr3_int
array([1, 2, 3])
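
Two further behaviors of .astype are worth keeping in mind: converting floats to integers drops the fractional part, and the conversion returns a new array. A short sketch with made-up values:

```python
import numpy as np

floats = np.array([1.7, -2.9, 3.2])

# Float -> int drops the fractional part (truncates toward zero).
ints = floats.astype(np.int32)
print(ints)  # [ 1 -2  3]

# Strings that hold numbers can be converted to floats.
strings = np.array(['1.25', '-9.6', '42'])
as_floats = strings.astype(np.float64)
print(as_floats)

# astype returns a new array; the original keeps its dtype.
print(floats.dtype)  # float64
```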

Arithmetic with NumPy Arrays

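Arithmetic on arrays is applied element by element.
Arrays of the same shape can also be compared, which yields a boolean array.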

arr5=np.array([[1,2,3],[4,5,6]])
print(arr5*arr5)
print(arr5-arr5)
print(1/arr5)
print(arr5*0.5)
arr5_compare=np.array([[0,4,1],[7,2,12]])
print(arr5>arr5_compare)
[[ 1  4  9]
 [16 25 36]]
[[0 0 0]
 [0 0 0]]
[[1.         0.5        0.33333333]
 [0.25       0.2        0.16666667]]
[[0.5 1.  1.5]
 [2.  2.5 3. ]]
[[ True False  True]
 [False  True False]]

Indexing and Slicing

one dimension
arr6=np.arange(10)
print(arr6)
print(arr6[4])
print(arr6[4:7])
arr6[5:7]=99  # change the item values
print(arr6)
[0 1 2 3 4 5 6 7 8 9]
4
[4 5 6]
[ 0  1  2  3  4 99 99  7  8  9]
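
The slice assignment above modifies arr6 in place because a slice is a view on the original data, not a copy. The sketch below shows this, along with .copy() for an independent array; the names view and independent are illustrative.

```python
import numpy as np

arr = np.arange(10)

# A slice is a view: writing through it changes the original array.
view = arr[5:8]
view[:] = 12
print(arr)  # [ 0  1  2  3  4 12 12 12  8  9]

# .copy() gives an independent array.
independent = arr[5:8].copy()
independent[:] = 0
print(arr[5:8])  # [12 12 12]
```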
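
high dimension

When indexing a higher-dimensional array, each index no longer selects a single scalar but a smaller sub-array.
An individual scalar can be reached by indexing repeatedly.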

arr_2d=np.array([[1,2,3],[4,5,6],[7,8,9]])
print(arr_2d[1])
print(arr_2d[0,1])
print(arr_2d[0][1])
[4 5 6]
2
2

3-D

arr_3d=np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])
print(arr_3d)
arr_3d[0]=99
arr_3d
[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
array([[[99, 99, 99],
        [99, 99, 99]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

index with slices

print(arr_2d)
arr_2d[1,:2]  # row at index 1, first two columns
[[1 2 3]
 [4 5 6]
 [7 8 9]]
array([4, 5])
arr_2d[:2,2]  # first two rows, third column
array([3, 6])
arr_2d[:,:1]  # all rows, first column
array([[1],
       [4],
       [7]])
arr_2d[:2,1:]=99
arr_2d
array([[ 1, 99, 99],
       [ 4, 99, 99],
       [ 7,  8,  9]])
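
The outputs above differ in dimensionality depending on whether an axis is indexed with an integer or a slice. A small sketch of that rule, using a separate array named grid so it does not depend on the modified arr_2d:

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# An integer index removes that axis ...
print(grid[:, 0].shape)   # (3,)
# ... while a slice of length 1 keeps it.
print(grid[:, :1].shape)  # (3, 1)

# Mixing both: rows 0-1 (slice) of column 2 (integer) gives a 1-D result.
print(grid[:2, 2])        # [3 6]
```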

boolean indexing

Comparing the array of names with a value yields a boolean array, which can then be used as an index in later operations.

name =np.array(['Bob','Joe','Will','Bob','Will','Joe','Joe'])
data =np.random.randn(7,4)
print(data)
print(name=='Bob')
print(data[name=='Bob'])
print(data[name=='Bob',2:])
print(data[name=='Bob',3])

[[ 0.30105086  0.88459349  0.4694958  -0.132868  ]
 [ 0.36000461  1.1606915  -1.80743514  0.48794727]
 [ 0.12488756  0.55243284  0.39060315  0.53376568]
 [ 1.57220325  0.53586929 -0.71008634  0.32867587]
 [-0.13595332 -0.59006423  0.01553604 -0.65240547]
 [-0.2549401  -0.24149723  0.60159715 -0.22450749]
 [-1.06174782 -1.04370202 -0.14431594 -0.29345469]]
[ True False False  True False False False]
[[ 0.30105086  0.88459349  0.4694958  -0.132868  ]
 [ 1.57220325  0.53586929 -0.71008634  0.32867587]]
[[ 0.4694958  -0.132868  ]
 [-0.71008634  0.32867587]]
[-0.132868    0.32867587]

Conditions can be combined with | (or) and & (and); the Python keywords or and and do not work on boolean arrays.

mask=(name=='Bob')|(name=='Will')
mask
array([ True, False,  True,  True,  True, False, False])
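
Masks can also be negated with ~. The sketch below repeats the names from the example above and shows why the parentheses around each comparison matter; the variable names are illustrative.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])

# Parentheses are required: | and & bind more tightly than ==.
bob_or_will = (names == 'Bob') | (names == 'Will')

# ~ negates a mask; here it selects everyone who is neither Bob nor Will.
others = ~bob_or_will
print(others)         # [False  True False False False  True  True]
print(names[others])  # ['Joe' 'Joe' 'Joe']

# & requires both conditions to hold, so this mask is all False.
both = (names == 'Bob') & (names == 'Will')
print(both.any())     # False
```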

Assigning values through a boolean mask

data[data<0]=0
data
array([[0.30105086, 0.88459349, 0.4694958 , 0.        ],
       [0.36000461, 1.1606915 , 0.        , 0.48794727],
       [0.12488756, 0.55243284, 0.39060315, 0.53376568],
       [1.57220325, 0.53586929, 0.        , 0.32867587],
       [0.        , 0.        , 0.01553604, 0.        ],
       [0.        , 0.        , 0.60159715, 0.        ],
       [0.        , 0.        , 0.        , 0.        ]])
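
A one-dimensional mask can be used in the same way to overwrite whole rows. The sketch below uses a small fixed array instead of random data so the result is reproducible; names and scores are hypothetical.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will'])
scores = np.array([[1.0, -2.0], [-3.0, 4.0], [5.0, -6.0]])

# Element-wise mask: replace every negative value with 0.
scores[scores < 0] = 0
print(scores)

# Row mask: a 1-D boolean array selects whole rows to overwrite.
scores[names != 'Joe'] = 7
print(scores)
```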

fancy indexing

Fancy indexing means indexing an array with arrays (or lists) of integers.
The selected data is copied into a new array.

arr7=np.empty((8,4))
for i in range(8):
    arr7[i]=i
print(arr7)
# select rows in the given order
print(arr7[[4,3,0,6]])
# negative indices count from the end
print(arr7[[-3,-5,-7]])

[[0. 0. 0. 0.]
 [1. 1. 1. 1.]
 [2. 2. 2. 2.]
 [3. 3. 3. 3.]
 [4. 4. 4. 4.]
 [5. 5. 5. 5.]
 [6. 6. 6. 6.]
 [7. 7. 7. 7.]]
[[4. 4. 4. 4.]
 [3. 3. 3. 3.]
 [0. 0. 0. 0.]
 [6. 6. 6. 6.]]
[[5. 5. 5. 5.]
 [3. 3. 3. 3.]
 [1. 1. 1. 1.]]
arr8=np.arange(32).reshape(8,4)
print(arr8)
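# pick four individual elements at positions (1,0), (5,3), (7,1), (2,2)
print(arr8[[1,5,7,2],[0,3,1,2]])
# select the given rows, then reorder the columns as requested
print(arr8[[1,5,7,2]][:,[0,3,1,2]])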
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]
 [24 25 26 27]
 [28 29 30 31]]
[ 4 23 29 10]
[[ 4  7  5  6]
 [20 23 21 22]
 [28 31 29 30]
 [ 8 11  9 10]]
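
The rectangular selection in the last step can also be written in a single indexing operation with np.ix_, which is an alternative to the chained form used above rather than something from the original notes. The sketch also checks that fancy indexing returns a copy.

```python
import numpy as np

arr = np.arange(32).reshape(8, 4)
rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

# Two chained selections: pick rows first, then reorder the columns.
chained = arr[rows][:, cols]

# np.ix_ builds an open mesh so one indexing step selects the same block.
meshed = arr[np.ix_(rows, cols)]
print(np.array_equal(chained, meshed))  # True

# Fancy indexing copies: editing the result leaves arr untouched.
meshed[0, 0] = -1
print(arr[1, 0])  # 4
```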