我正在尝试编写一个函数来汇总并在Pandas中的数据框上执行各种统计数据计算,然后将其合并到原始数据框,但是,我遇到了问题。这与SQL中的代码等效:
SELECT EID,
PCODE,
SUM(PVALUE) AS PVALUE,
SUM(SQRT(SC*EXP(SC-1))) AS SC,
SUM(SI) AS SI,
SUM(EE) AS EE
INTO foo_bar_grp
FROM foo_bar
GROUP BY EID, PCODE
然后加入原始表:
SELECT *
FROM foo_bar_grp INNER JOIN
foo_bar ON foo_bar.EID = foo_bar_grp.EID
AND foo_bar.PCODE = foo_bar_grp.PCODE
步骤如下:将数据加载到 :>>
pol_dict = {'PID':[1,1,2,2],
'EID':[123,123,123,123],
'PCODE':['GU','GR','GU','GR'],
'PVALUE':[100,50,150,300],
'SI':[400,40,140,140],
'SC':[230,23,213,213],
'EE':[10000,10000,2000,30000],
}
pol_df = DataFrame(pol_dict)