1. Series
Compared with numpy's ndarray, a pandas Series offers more built-in methods, for example describe():
import pandas as pd
s1 = pd.Series([1, 2, 3, 4, 5, 6])
print(s1.describe())
Output:
count    6.000000
mean     3.500000
std      1.870829
min      1.000000
25%      2.250000
50%      3.500000
75%      4.750000
max      6.000000
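Supports subscript access: s1[0]
Supports for loops: for item in s1
Supports vectorized arithmetic and comparison, with the same rules as ndarray: +, -, *, /, **, >, <, >=, <=, !=, ==
Supports aggregate computations: mean, sum, max, min
Vectorized operations run faster than the equivalent loops over a Python list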
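A minimal sketch of the capabilities listed above, using the same six values as s1. The variable names (s, doubled, squared, mask, and so on) are illustrative only and are not part of the original notes.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5, 6])

# Subscript access and iteration behave much like a list
first = s[0]                   # label 0 of the default integer index
items = [item for item in s]

# Arithmetic and comparison are applied element by element
doubled = s * 2
squared = s ** 2
mask = s > 3                   # a boolean Series

# Aggregations reduce the Series to a single value
total = s.sum()
average = s.mean()

print(doubled.tolist())        # [2, 4, 6, 8, 10, 12]
print(mask.tolist())           # [False, False, False, True, True, True]
print(total, average)          # 21 3.5
```

Each operator returns a new Series of the same length, which is why no explicit loop is needed.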
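Note: a Series of True/False values can also be summed or combined arithmetically with numbers.
In that case True counts as 1 and False counts as 0.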
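def dir_judge(var1, var2):
    # Count how many pairs lie on the same side of their respective means
    mean1 = var1.mean()
    mean2 = var2.mean()
    # True where both values are above their mean, or both are below it
    same_dir = (((var1 > mean1) & (var2 > mean2)) |
                ((var1 < mean1) & (var2 < mean2)))
    print(type(same_dir))
    print(same_dir)
    # sum() counts the True values; the remainder is everything else
    return same_dir.sum(), len(var1) - same_dir.sum()

house_area = pd.Series([67.5, 32, 135, 84, 200, 62, 101, 25])
house_price = pd.Series([550, 268, 850, 652, 1300, 906, 1100, 400])
print(dir_judge(house_area, house_price))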
A custom index can be attached:
pd.Series([ ], index=[ ])
scores = pd.Series([81, 90, 57, 100],
                   index=['Guo Jing', 'Huang Rong', 'Xiao Longnv', 'Yang Guo'])
print(scores)
Output:
Guo Jing        81
Huang Rong      90
Xiao Longnv     57
Yang Guo       100
dtype: int64
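describe(): prints summary statistics of the Series, such as the maximum, minimum, and mean
loc[ ]: selects elements by the labels defined in the index
iloc[ ]: selects elements by integer position (0, 1, 2, ...)
Series[position]: behaves like iloc[ ] only when the index labels are not integers, and recent pandas versions deprecate this fallback; prefer iloc[ ] for positional access
idxmax(): returns the index label of the largest element
idxmin(): returns the index label of the smallest element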
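The accessors above can be sketched on the scores Series from earlier. The variable names (by_label, by_position, best, worst) are illustrative and not from the original notes.

```python
import pandas as pd

scores = pd.Series([81, 90, 57, 100],
                   index=['Guo Jing', 'Huang Rong', 'Xiao Longnv', 'Yang Guo'])

# loc selects by index label, iloc by integer position
by_label = scores.loc['Huang Rong']
by_position = scores.iloc[2]

# idxmax / idxmin return the label, not the value
best = scores.idxmax()
worst = scores.idxmin()

print(by_label, by_position)   # 90 57
print(best, worst)             # Yang Guo Xiao Longnv
```

Using loc and iloc explicitly keeps label-based and position-based access unambiguous, which matters most when the index labels are themselves integers.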