(1)ndarray
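Creating an array (this section imports numpy under the alias pd; np is the conventional alias):
import numpy as pd
aArray=pd.array([(1,2,3),(4,5,6),(7,8,9)])
bArray=pd.array([1,2,4])
# Every row must have the same length: a ragged input such as [(1,2,3),(4,5)] does not produce a 2-D array (recent NumPy versions raise ValueError)
aArray
Out[15]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
bArray
Out[16]: array([1, 2, 4])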
pd.arange(1,10)
Out[18]: array([1, 2, 3, 4, 5, 6, 7, 8, 9])
pd.random.random((3,2)) # creates an N x M array of random numbers drawn uniformly from [0, 1)
Out[19]:
array([[ 0.94750778, 0.93591796],
       [ 0.04284112, 0.51099218],
       [ 0.75403804, 0.21734349]])
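pd.linspace(1,2,20,endpoint=False) # creates an arithmetic sequence of 20 evenly spaced values; endpoint=False excludes the stop value 2. For integer values, add dtype=int
Out[24]:
array([ 1.  , 1.05, 1.1 , 1.15, 1.2 , 1.25, 1.3 , 1.35, 1.4 , 1.45,
        1.5 , 1.55, 1.6 , 1.65, 1.7 , 1.75, 1.8 , 1.85, 1.9 , 1.95])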
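list(range(1,50,2)) and pd.linspace(1,49,25,dtype=int) produce the same arithmetic sequence (the odd numbers from 1 to 49). The third argument of linspace is the number of points, not the step, so pd.linspace(1,49,2,dtype=int) would return only array([ 1, 49]). pd.arange(1,50,2) takes the same arguments as range.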
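The difference between a step and a point count can be checked directly. The following is a minimal sketch, not part of the original notes; it uses the conventional alias np instead of the pd alias used in this section, and the variable names are illustrative.

```python
import numpy as np

# Odd numbers from 1 to 49, built three ways.
from_range = list(range(1, 50, 2))                   # start, stop (exclusive), step
from_arange = np.arange(1, 50, 2)                    # same arguments as range
from_linspace = np.linspace(1, 49, 25, dtype=int)    # start, stop (inclusive), number of points

# All three hold the same 25 values.
same = from_range == from_arange.tolist() == from_linspace.tolist()
print(len(from_range), same)

# With 2 as the third argument, linspace returns only the two end points.
two_points = np.linspace(1, 49, 2, dtype=int)
print(two_points)
```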
pd.fromfunction(lambda x,y:(x+1)*(y+1),(5,5),dtype=int) # builds a multiplication table
Out[29]:
array([[ 1,  2,  3,  4,  5],
       [ 2,  4,  6,  8, 10],
       [ 3,  6,  9, 12, 15],
       [ 4,  8, 12, 16, 20],
       [ 5, 10, 15, 20, 25]])
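It may help to see what fromfunction passes to the lambda. As far as I understand it, the function is called once with whole index arrays rather than once per cell, so the table can be rebuilt by hand from np.indices. This is a sketch under that reading, using the conventional alias np; rows, cols and table_manual are illustrative names.

```python
import numpy as np

# rows[i, j] == i and cols[i, j] == j for every cell of a 5 x 5 grid.
rows, cols = np.indices((5, 5))

# The same expression as the lambda, applied to the index arrays directly.
table_manual = (rows + 1) * (cols + 1)

table = np.fromfunction(lambda x, y: (x + 1) * (y + 1), (5, 5), dtype=int)
print(np.array_equal(table, table_manual))
```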
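shape, reshape() and resize() are used to inspect and change the row/column layout: shape is an attribute holding the dimensions, reshape() returns a reshaped array, and resize() changes the array in place. Arithmetic such as addition and subtraction works element by element, with one special case:
aArray=pd.array([1,2,3])
bArray=pd.array([(4,5,6),(7,8,9)])
aArray+bArray # aArray is added to every row of bArray
Out[36]:
array([[ 5,  7,  9],
       [ 8, 10, 12]])
If aArray were pd.array([1,2]), it could not be added to bArray, because shape (2,) cannot be stretched to match shape (2,3). This is the idea of broadcasting.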
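The layout methods and the broadcasting rule can be tried side by side. A minimal sketch, using the conventional alias np and illustrative variable names that do not appear in the notes:

```python
import numpy as np

b = np.array([(4, 5, 6), (7, 8, 9)])
print(b.shape)                 # shape is an attribute, not a method

c = b.reshape(3, 2)            # returns a reshaped array; b keeps its own shape
d = b.copy()
d.resize((3, 2))               # ndarray.resize changes d in place and returns None

a = np.array([1, 2, 3])
total = a + b                  # shape (3,) broadcasts against (2, 3): a is added to each row

try:
    np.array([1, 2]) + b       # shape (2,) cannot be broadcast against (2, 3)
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# Reshaped to a column, the same two values do broadcast: one value per row.
col = np.array([1, 2]).reshape(2, 1)
by_row = col + b
print(total, broadcast_failed, by_row, sep="\n")
```

Broadcasting compares shapes from the last dimension backwards, which is why a length-2 array fails against (2, 3) while a (2, 1) column succeeds.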
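Aggregation methods: sum(axis=0) sums down each column and sum(axis=1) sums across each row; min(), max(), var() (variance) and std() (standard deviation) accept the same axis argument.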
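The axis argument is easiest to read on a small array. A minimal sketch with the conventional alias np; the ddof remark concerns NumPy's defaults and goes slightly beyond what the notes state.

```python
import numpy as np

a = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)])

col_sums = a.sum(axis=0)    # collapses the rows: one sum per column
row_sums = a.sum(axis=1)    # collapses the columns: one sum per row
total = a.sum()             # no axis: sum over every element

# var() and std() default to the population formulas (ddof=0);
# pass ddof=1 for the sample versions.
pop_var = a.var()
sample_var = a.var(ddof=1)
pop_std = a.std()

print(col_sums, row_sums, total)
print(pop_var, sample_var, pop_std)
```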
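(2) Creating a DataFrame
import numpy as np
import pandas as pd
# data is not defined in the original notes; the next line is an assumed definition consistent with the output shown (NumPy stores the mixed values as strings, so pay holds text)
data=np.array([('Dave',400),('Vera',500),('Jane',300)])
frame=pd.DataFrame(data,index=range(1,4),columns=['name','pay']) # the class is DataFrame, not dataframe
frame
Out:
   name  pay
1  Dave  400
2  Vera  500
3  Jane  300
frame.index
Out: RangeIndex(start=1, stop=4, step=1)
frame.name
Out:
1    Dave
2    Vera
3    Jane
Name: name, dtype: object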
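If numeric pay values are wanted, one option is to build the frame from a dict of columns, so that each column keeps its own dtype. This is a hypothetical reconstruction: the values come from the output above, but the dict layout is my assumption, not the notes' original data.

```python
import pandas as pd

# Assumed layout: one list per column, values taken from the output above.
frame = pd.DataFrame(
    {"name": ["Dave", "Vera", "Jane"], "pay": [400, 500, 300]},
    index=range(1, 4),
)

print(frame.dtypes)       # name holds text, pay holds integers
print(frame.pay.sum())    # numeric operations now work on pay
```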
(3) Basic operations: inspecting data
Viewing the first and last rows
frame.tail(2)
Out:
   name  pay
2  Vera  500
3  Jane  300
frame.head(2)
Out:
   name  pay
1  Dave  400
2  Vera  500
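Selecting rows with pay greater than 200
frame[frame.pay.astype(int)>200] # pay is stored as text here (describe() below reports unique/top/freq for it), and comparing against the string '200' would go character by character, so convert to integers first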
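The difference between comparing text and comparing numbers shows up as soon as the values have different lengths. A minimal sketch; the fourth row ("Hypo", "1000") is a hypothetical addition made only to expose the problem, and the variable names are illustrative.

```python
import pandas as pd

# pay is deliberately stored as text, as in the notes; the "Hypo" row is hypothetical.
frame = pd.DataFrame(
    {"name": ["Dave", "Vera", "Jane", "Hypo"],
     "pay": ["400", "500", "300", "1000"]},
    index=range(1, 5),
)

# Text comparison goes character by character, so "1000" sorts before "200".
as_text = frame[frame.pay >= "200"]

# Converting to integers first gives the intended numeric comparison.
as_number = frame[frame.pay.astype(int) > 200]

print(as_text.name.tolist())
print(as_number.name.tolist())
```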
Using describe:
frame.describe # without parentheses this is only a reference to the method
Out:
<bound method NDFrame.describe of    name  pay
1  Dave  400
2  Vera  500
3  Jane  300>
frame.describe() # call the method to get the summary
Out:
        name  pay
count      3    3
unique     3    3
top     Jane  500
freq       1    1
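The count/unique/top/freq rows above are what describe() reports for text columns; a numeric column gets a statistical summary instead. A minimal sketch on the pay values alone, assuming they are stored as strings as the output suggests:

```python
import pandas as pd

pay_text = pd.Series(["400", "500", "300"], name="pay")

text_summary = pay_text.describe()               # count, unique, top, freq
num_summary = pay_text.astype(int).describe()    # count, mean, std, min, quartiles, max

print(text_summary)
print(num_summary)
```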
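frame.iloc[:2,1] # select by position: the first two rows of the second column
frame.pay.min() # minimum of the pay column (for text values this is the alphabetical minimum; frame.pay.astype(int).min() gives the numeric one)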
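Position-based and label-based selection can be compared on the same table. A minimal sketch, assuming numeric pay values and the dict-based construction, which is my assumption rather than the notes' original data:

```python
import pandas as pd

frame = pd.DataFrame(
    {"name": ["Dave", "Vera", "Jane"], "pay": [400, 500, 300]},
    index=range(1, 4),
)

by_position = frame.iloc[:2, 1]         # rows at positions 0 and 1, column at position 1
by_label = frame.loc[1:2, "pay"]        # rows labelled 1 through 2 (inclusive), column "pay"

lowest = frame.pay.min()                     # smallest pay value
lowest_row = frame.loc[frame.pay.idxmin()]   # the whole row that holds it

print(by_position.equals(by_label), lowest, lowest_row["name"])
```

Note that iloc slices exclude the end position, while loc slices include the end label.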
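Deleting:
frame.drop(frame.index[2],inplace=True) # deletes the third row; frame.index is zero-based, so frame.index[3] would raise IndexError. inplace=True modifies frame itself and returns None
frame.drop(['pay'],axis=1,inplace=True) # axis=1 deletes a column, which must be named by its label; [1] is not a column label in this frame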
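The effect of inplace is easiest to see by comparing return values. A minimal sketch, again assuming the dict-based construction of the example table; the variable names are illustrative.

```python
import pandas as pd

frame = pd.DataFrame(
    {"name": ["Dave", "Vera", "Jane"], "pay": [400, 500, 300]},
    index=range(1, 4),
)

# Without inplace, drop returns a new frame and leaves the original untouched.
without_last = frame.drop(frame.index[2])     # third row, identified by its label
without_pay = frame.drop(columns=["pay"])     # same as drop(["pay"], axis=1)
rows_before = len(frame)

# With inplace=True, frame itself changes and the call returns None.
result = frame.drop(frame.index[0], inplace=True)

print(rows_before, len(frame), result)
```

Assigning the returned frame to a name is often easier to follow than inplace=True, since nothing is modified behind the scenes.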