https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.GroupBy.apply.html?highlight=apply#pandas.core.groupby.GroupBy.apply
GroupBy.apply(self, func, *args, **kwargs)
Applies a function to each group and combines the per-group results into a single DataFrame.
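Parameter | Description |
---|---|
func | callable; its first argument is the DataFrame holding one group |
args, kwargs | tuple and dict; additional arguments passed on to func |
return | Series or DataFrame; the results of func for each group, combined |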
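
A minimal sketch of how the extra positional and keyword arguments reach `func`. The helper `top_n` and its parameters `n` and `column` are hypothetical names chosen for illustration; they are not part of pandas. Selecting `[["age"]]` before `apply` is an assumption made here so that each group contains only the `age` column.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [15, 16, 14, 13, 17, 16],
    "gender": ["man", "woman", "man", "man", "woman", "man"],
})

def top_n(group, n, column="age"):
    # Hypothetical helper: `group` is the DataFrame for one gender;
    # `n` and `column` are forwarded unchanged by apply.
    return group.nlargest(n, column)

# 2 is passed positionally (args), column="age" by keyword (kwargs)
result = df.groupby("gender")[["age"]].apply(top_n, 2, column="age")
print(result)
```

The result is indexed by gender first and by the original row label second, with the two oldest rows of each group.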
Group, then sort
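import pandas as pd
import numpy as np
# Sample data: six people, each with an age and a gender
df = pd.DataFrame({
    'age': [15, 16, 14, 13, 17, 16],
    'gender': ["man", "woman", "man", "man", "woman", "man"]
})
# Sort the rows of each gender group by age, then discard the group index
df.groupby('gender').apply(lambda x: x.sort_values('age')).reset_index(drop=True)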
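
For comparison, the same row order can be obtained without `apply` by sorting on the group key first and on `age` second. This is an alternative technique, shown here as a sketch; the variable name `sorted_df` is an assumption for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [15, 16, 14, 13, 17, 16],
    "gender": ["man", "woman", "man", "man", "woman", "man"],
})

# Sort by gender, then by age within each gender, and renumber the rows
sorted_df = df.sort_values(["gender", "age"]).reset_index(drop=True)
print(sorted_df)
```

`apply` is still the more general tool when the per-group step is more than a sort.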
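def rank(x):
    # Work on a copy so the group passed in by apply is not modified
    x = x.copy()
    # Rank ages within the group, oldest first; ties are ranked in order of appearance
    x['rank'] = x['age'].rank(method='first', ascending=False)
    x = x.sort_values('age')
    # Index by age (keeping the column) and expand to every age from 13 to 17
    x = x.set_index('age', drop=False)
    x = x.reindex([13, 14, 15, 16, 17])
    # Fill the ages left missing by reindex; gaps at either end take the nearest known age
    x['age'] = x['age'].interpolate(method='linear', limit_direction='both')
    return x
# Keep the result: its outer index level is gender, so .loc can select one group
result = df.groupby('gender').apply(rank)
result.loc["man"]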
se = pd.Series([5, np.nan, np.nan, 2, 3, 4])
# Linear interpolation fills the two gaps between 5 and 2 with 4.0 and 3.0
se.interpolate(limit_direction='both', method='linear')
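
The Series above only has gaps in the middle. As a small sketch of what `limit_direction='both'` does when the gaps sit at the ends, which is the situation `reindex` creates inside `rank`, consider the following; the names `edges` and `filled` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Missing values at the start and at the end of the Series
edges = pd.Series([np.nan, np.nan, 16.0, 17.0, np.nan])

# With method='linear', edge gaps are not extrapolated:
# they are filled with the nearest valid value.
filled = edges.interpolate(method="linear", limit_direction="both")
print(filled.tolist())  # [16.0, 16.0, 16.0, 17.0, 17.0]
```

So values filled at the ends repeat the nearest known value and do not continue the trend.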