pandas: DataFrame Methods and Functions

Original link: https://blog.csdn.net/m0_46926492/article/details/124316537

DataFrame Methods and Functions

1. DataFrame methods
2. Aggregate functions

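1. DataFrame methods

1.1 where()

where: keeps the values where `cond` is True and replaces the rest with `other` (NaN if `other` is not given).

Operation	Code
Keep values greater than 60 and replace the rest (60 or below) with 'fail'	`df.where(cond=df>60,other='fail')`
Keep the rows where column A is greater than 60 and replace every value in the other rows with 'other'	`df.where(cond=df['A']>60,other='other')`
Where a value is 60 or below, fill 100 in column A and 200 in column B; columns C and D have no entry in `other`, so they get NaN	`df.where(df>60,other=pd.Series(data=[100,200],index=['A','B']),axis=1)`
Where a value is 60 or below, fill 100 in row 1 and 200 in row 2; the remaining rows have no entry in `other`, so they get NaN	`df.where(df>60,other=pd.Series(data=[100,200],index=[1,2]),axis=0)`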

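Code

import numpy as np
import pandas as pd

# 5x4 frame of random integers in [0, 100); values differ on every run
df = pd.DataFrame(data=np.random.randint(0, 100, size=(5, 4)), columns=list('ABCD'))
df.where(cond=df > 60, other='fail')        # element-wise condition
df.where(cond=df['A'] > 60, other='other')  # Series condition, applied row by row
df.where(df > 60, other=pd.Series(data=[100, 200], index=['A', 'B']), axis=1)  # align `other` with columns
df.where(df > 60, other=pd.Series(data=[100, 200], index=[1, 2]), axis=0)      # align `other` with rows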
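
Because the frame above is filled with random integers, its output changes on every run. The following is a minimal sketch on a small fixed frame. The name `scores` and its values are made up for illustration and are not from the original post. It shows the default `other` (NaN), a per-column replacement aligned with `axis=1`, and a row-wise condition built from a single column.

```python
import pandas as pd

# Hypothetical fixed scores so the result is reproducible.
scores = pd.DataFrame({"A": [90, 40, 75], "B": [55, 80, 61]})

# Default `other`: values that fail the condition become NaN.
kept = scores.where(scores > 60)

# Per-column replacement: `other` is a Series indexed by column label,
# aligned with the columns through axis=1.
per_column = scores.where(
    scores > 60,
    other=pd.Series([100, 200], index=["A", "B"]),
    axis=1,
)
# A -> [90, 100, 75], B -> [200, 80, 61]

# A boolean Series as `cond` is applied row by row:
# rows where A <= 60 are replaced entirely.
by_row = scores.where(scores["A"] > 60, other=-1)
# A -> [90, -1, 75], B -> [55, -1, 61]

print(kept)
print(per_column)
print(by_row)
```

Note that the comparison is strict: a value of exactly 60 does not satisfy `> 60` and is replaced.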

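1.2 mask()

mask: the inverse of where. It keeps the values where `cond` is False and replaces the rest with `other`.

Operation	Code
Replace values greater than 60 with 'pass'	`df.mask(df>60,other='pass')`
Where a value is greater than 60, fill 100 in column A, 200 in column B, 300 in column C, and 400 in column D	`df.mask(df>60,other=pd.Series(data=[100,200,300,400],index=['A','B','C','D']),axis=1)`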

Code

df = pd.DataFrame(data=np.random.randint(0, 100, size=(5, 4)), columns=list('ABCD'))
df.mask(df > 60, other='pass')  # replace where the condition is True
df.mask(df > 60, other=pd.Series(data=[100, 200, 300, 400], index=['A', 'B', 'C', 'D']), axis=1)  # one fill value per column
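
To make the relationship with `where()` concrete, here is a small sketch on a fixed frame (the frame `scores` and the fill value 0 are illustrative assumptions, not from the original post). It checks that masking with a condition gives the same result as `where()` with the negated condition.

```python
import pandas as pd

# Hypothetical fixed scores so the result is reproducible.
scores = pd.DataFrame({"A": [90, 40, 75], "B": [55, 80, 61]})

# mask(): replace the values where the condition is True.
masked = scores.mask(scores > 60, other=0)
# A -> [0, 40, 0], B -> [55, 0, 0]

# where() with the negated condition does the same thing.
via_where = scores.where(~(scores > 60), other=0)

same = masked.equals(via_where)
print(masked)
print(same)
```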

1.3 query()

query(): selects rows with a condition written as a string expression

Operation	Code
Select the rows where both A and B are greater than 60	`df.query('A>60 and B>60')`
Select the rows where B equals 71	`df.query('B==71')`

Code

df = pd.DataFrame(data=np.random.randint(0, 100, size=(5, 4)), columns=list('ABCD'))
df.query('A > 60 and B > 60')  # both conditions must hold
df.query('B == 71')            # empty result if no row of the random data has B == 71
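
The sketch below, on a made-up fixed frame named `scores`, compares `query()` with the equivalent boolean indexing and shows how a Python variable can be referenced inside the expression with the `@` prefix. The variable name `threshold` is an assumption for illustration.

```python
import pandas as pd

# Hypothetical fixed scores so the result is reproducible.
scores = pd.DataFrame({"A": [90, 40, 75], "B": [55, 80, 61]})

# String expression: only row 2 has both A > 60 and B > 60.
both = scores.query("A > 60 and B > 60")

# The same filter written with boolean indexing.
boolean = scores[(scores["A"] > 60) & (scores["B"] > 60)]

# A Python variable from the calling scope is referenced with '@'.
threshold = 60
with_var = scores.query("A > @threshold and B > @threshold")

print(both)
```

`query()` tends to read better when several column conditions are combined, while boolean indexing needs no string parsing and works with any column name.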

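1.4 filter()

filter() method
- Filters by row or column labels, not by the values in the frame
- Use `items` to select row or column labels by exact name, together with `axis`
- Use `regex` to match labels against a regular expression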

Operation	Code
Select the rows labelled 0 and 1	`df.filter(items=[0,1],axis=0)`
Select the columns whose name contains B	`df.filter(regex='B',axis=1)`
Select the columns whose name starts with B	`df.filter(regex='^B',axis=1)`

Code

df = pd.DataFrame(data=np.random.randint(0, 100, size=(5, 4)), columns=['AA', 'BB', 'CB', 'DB'])
df.filter(items=[0, 1], axis=0)  # rows labelled 0 and 1
df.filter(regex='B', axis=1)     # BB, CB, DB
df.filter(regex='^B', axis=1)    # BB only
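
As a quick check of which labels each call selects, here is a sketch on a fixed frame with the same column names. The cell values are arbitrary placeholders. It also uses the `like` parameter, which the post does not mention, for plain substring matching.

```python
import pandas as pd

# Same column names as above; the cell values are arbitrary placeholders.
df = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    columns=["AA", "BB", "CB", "DB"],
)

rows = df.filter(items=[0, 1], axis=0)     # exact row labels 0 and 1
contains_b = df.filter(regex="B", axis=1)  # regex found anywhere in the name
starts_b = df.filter(regex="^B", axis=1)   # name must start with B
like_b = df.filter(like="B", axis=1)       # substring match, no regex

print(list(contains_b.columns))  # ['BB', 'CB', 'DB']
print(list(starts_b.columns))    # ['BB']
```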

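2. Aggregate functions

Function	Description
`count`	Number of non-NA values (use `nunique` for the number of distinct values)
`sum`	Sum
`mean`	Mean
`mad`	Mean absolute deviation (deprecated in pandas 1.5 and removed in 2.0)
`median`	Median
`min`	Minimum
`max`	Maximum
`mode`	Mode (most frequent value; may return more than one)
`abs`	Absolute value (element-wise, not an aggregation)
`prod`	Product
`std`	Standard deviation (sample, `ddof=1` by default)
`var`	Variance (sample, `ddof=1` by default)
`quantile`	Quantile
`cumsum`	Cumulative sum (same shape as the input)
`cumprod`	Cumulative product (same shape as the input)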
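
A short sketch of a few of these functions on a small made-up frame (the data is purely illustrative). It shows the difference between `count` and `nunique`, several reductions at once through `agg`, a cumulative function, and a hand-written mean absolute deviation for pandas versions where `mad` is no longer available.

```python
import pandas as pd

# Hypothetical data; column A contains one missing value.
df = pd.DataFrame({"A": [1, 2, 2, None], "B": [10, 20, 30, 40]})

counts = df.count()      # non-NA values per column: A=3, B=4
distinct = df.nunique()  # distinct non-NA values per column: A=2, B=4

# Several reductions at once; each one returns a single value per column.
summary = df.agg(["sum", "mean", "median", "min", "max"])

# Cumulative functions keep the shape of the input.
running = df["B"].cumsum()  # 10, 30, 60, 100

# Mean absolute deviation computed by hand: mean of |x - mean(x)|.
mad_b = (df["B"] - df["B"].mean()).abs().mean()  # 10.0

print(summary)
```

By default these reductions skip NaN and work column by column (`axis=0`); passing `axis=1` aggregates across each row instead.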