1. Grouping:
group by: aggregate the factor (data1) over all rows that share the same time (key1)
df.groupby('key1')['data1'].mean()
df['data1'].groupby(df['key1']).mean()
df['rank_g'] = df.groupby(['gender'])['age'].rank()
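A minimal sketch of the three calls above on a small frame. The sample values are made up for illustration; only the column names come from the notes.

```python
import pandas as pd

# Hypothetical sample data; column names follow the notes above.
df = pd.DataFrame({
    "key1": ["a", "a", "b", "b"],
    "data1": [1.0, 3.0, 10.0, 20.0],
    "gender": ["F", "M", "F", "M"],
    "age": [30, 25, 22, 40],
})

# Both spellings produce the same per-group mean of data1.
mean_by_col = df.groupby("key1")["data1"].mean()
mean_by_series = df["data1"].groupby(df["key1"]).mean()

# rank() returns one value per row, ranked within that row's own group.
df["rank_g"] = df.groupby("gender")["age"].rank()
```

Note that mean() collapses each group to one row, while rank() keeps the original row count, which is why its result can be assigned straight back as a new column.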
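Grouping columns with a dict:
# keys are column labels (banana, apple, orange, eyeshadow, eyeliner); values are group names (fruit, cosmetics)
mapping = {'香蕉':'水果','苹果':'水果','橘子':'水果','眼影':'化妆品','眼线':'化妆品'}
# groupby(..., axis=1) is deprecated since pandas 2.1; transpose, group, then transpose back
data = people.T.groupby(mapping).mean().T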
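To make the dict-based grouping concrete, here is a small sketch. The frame, the English column labels, and the person names are hypothetical stand-ins, not data from these notes.

```python
import pandas as pd

# Hypothetical data: one row per person, one column per product.
people = pd.DataFrame(
    {"banana": [1.0, 2.0], "apple": [3.0, 4.0], "eyeshadow": [10.0, 20.0]},
    index=["Ann", "Bob"],
)
mapping = {"banana": "fruit", "apple": "fruit", "eyeshadow": "cosmetics"}

# Transpose so the column labels become the index, group them through
# the dict, average within each group, then transpose back.
data = people.T.groupby(mapping).mean().T
```

The result has one column per dict value (here "cosmetics" and "fruit"), each holding the row-wise mean of the original columns mapped to it.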
Grouping by multiple columns:
# columns: 指数代码 = index code, 日期 = date, 市值 = market cap
idc_sec2 = idc_sec1.groupby(['指数代码', '日期', 'sec_name_collect'])['市值'].sum()
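A sketch of what a multi-column groupby returns. The table and its English column names are hypothetical stand-ins for the columns used above.

```python
import pandas as pd

# Hypothetical holdings table; names and values are illustrative only.
holdings = pd.DataFrame({
    "index_code": ["000300", "000300", "000300", "000905"],
    "date": ["2020-01-31"] * 4,
    "sector": ["Bank", "Bank", "Tech", "Tech"],
    "market_cap": [100.0, 50.0, 30.0, 20.0],
})

# Summing over three keys yields a Series with a three-level MultiIndex.
cap_by_sector = holdings.groupby(["index_code", "date", "sector"])["market_cap"].sum()

# reset_index() turns the group keys back into ordinary columns.
cap_table = cap_by_sector.reset_index()
```

Rows that agree on all three keys are merged into one, so the two "Bank" rows of index 000300 collapse into a single total.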
2. Applying custom functions:
Custom function:
def quantile25(x):
    return x.quantile(0.25)
Renaming the index (after converting the dates to year strings):
indus_pos_new.index = indus_pos_new.index.strftime("%Y").rename("time")
Aggregating with several functions at once:
stat_all_frame = indus_pos_new.groupby('time').agg(["mean", "max", "min", "median", quantile25])
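The whole of step 2 can be sketched end to end as follows. The dates, the values, and the column name "pos" are assumptions made for illustration; the function and variable names are taken from the notes.

```python
import pandas as pd

def quantile25(x):
    """Return the 25th percentile of a Series."""
    return x.quantile(0.25)

# Hypothetical data spanning two calendar years.
dates = pd.to_datetime(
    ["2020-01-01", "2020-06-01", "2020-12-01", "2021-03-01", "2021-09-01"]
)
indus_pos_new = pd.DataFrame({"pos": [1.0, 2.0, 3.0, 10.0, 20.0]}, index=dates)

# Replace the DatetimeIndex with year strings and name the index "time".
indus_pos_new.index = indus_pos_new.index.strftime("%Y").rename("time")

# Built-in aggregations are given by name; the custom one is passed as a
# function and appears in the result under its function name.
stat_all_frame = indus_pos_new.groupby("time").agg(
    ["mean", "max", "min", "median", quantile25]
)
```

The result has one row per year and a two-level column index, so a single statistic is read with a tuple such as stat_all_frame.loc["2020", ("pos", "quantile25")].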