How to process data with Python and write it to another column: how do I operate on a Python data frame based on specific values in another column?

I am new to data analysis in Python. Here is an example dataset:

import pandas as pd

d2 = {
    'Index': [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    'journey_time': [95.546, 132.945, 147.538, 301.307, 42.907, 129.008, 102.900, 112.620,
                     234.334, 103.321, 82.337, 154.817, 20.076, 85.717, 94.362, 45.032],
    'edge': ['s_b', 'c_d', 'b_d', 'c_e', 'd_f', 's_a', 'a_c', 'd_c',
             'c_e', 'a_c', 'd_c', 's_a', 'd_f', 's_b', 'b_d', 'c_d'],
}

df2 = pd.DataFrame(data=d2)

I want to create a new data frame with one row per index and a set of new columns. The rules for the new columns are as follows:

se1 = s_a + a_c + c_e

se2 = s_b + b_d + d_c + c_e

sf1 = s_b + b_d + d_f

sf2 = s_a + a_c + c_d + d_f

I also have further variations in my calculations, such as:

eq_time1 = (200/(s_a + a_c)) + c_e

eq_time2 = (200/(s_b + b_d + d_c)) + c_e

Each edge name in the rules stands for that edge's journey time within the given index. I am not sure how to write this with a pandas data frame. The following is my expected output:

df3 = {
    'Index': [0, 1],
    'se1': [129.008+102.900+301.307, 154.817+103.321+234.334],
    'se2': [95.546+147.538+112.620+301.307, 85.717+94.362+82.337+234.334],
    'sf1': [95.546+147.538+42.907, 85.717+94.362+20.076],
    'sf2': [129.008+102.900+132.945+42.907, 154.817+103.321+45.032+20.076],
    'eq_time1': [(200/(129.008+102.900))+301.307, (200/(154.817+103.321))+234.334],
    'eq_time2': [(200/(95.546+147.538+112.620))+301.307, (200/(85.717+94.362+82.337))+234.334],
}

Please help!

Solution

If you have just those 4 paths in your data, you can calculate the times in pandas as follows:

paths = {
    'se1': ['s_a', 'a_c', 'c_e'],
    'se2': ['s_b', 'b_d', 'd_c', 'c_e'],
    'sf1': ['s_b', 'b_d', 'd_f'],
    'sf2': ['s_a', 'a_c', 'c_d', 'd_f']
}

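# One row per Index; each path adds two columns.
df3 = pd.DataFrame({'Index': df2['Index'].unique()}).set_index('Index')

for k, v in paths.items():
    # Total journey time over all edges of path k, per Index.
    df3[k] = df2[df2.edge.isin(v)].groupby('Index')['journey_time'].sum()
    # Journey time of the path's last edge, per Index.
    last_edge_times = df2[df2.edge == v[-1]].set_index('Index')
    # 200 / (sum of all edges except the last) + last edge.
    df3['eq_time_' + k] = 200.0 / (df3[k] - last_edge_times.journey_time) + last_edge_times.journey_time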

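For any path p, the column eq_time_p stores the eq_time given by your equations: 200 divided by the summed journey time of every edge except the last, plus the journey time of the last edge. Subtracting the last edge's time from the path total gives that denominator. Note that this produces an eq_time column for all four paths, not only se1 and se2.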
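As an alternative, the same result can be reached by first reshaping the data so that each edge becomes its own column. The following is a minimal sketch of that idea, not part of the original answer. The names `wide` and `result` are made up for illustration, and it assumes every edge appears at most once per Index. `DataFrame.pivot` raises a `ValueError` if that assumption is violated, whereas the `isin` approach would silently add the duplicates together.

```python
import pandas as pd

d2 = {
    'Index': [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    'journey_time': [95.546, 132.945, 147.538, 301.307, 42.907, 129.008, 102.900, 112.620,
                     234.334, 103.321, 82.337, 154.817, 20.076, 85.717, 94.362, 45.032],
    'edge': ['s_b', 'c_d', 'b_d', 'c_e', 'd_f', 's_a', 'a_c', 'd_c',
             'c_e', 'a_c', 'd_c', 's_a', 'd_f', 's_b', 'b_d', 'c_d'],
}
df2 = pd.DataFrame(data=d2)

paths = {
    'se1': ['s_a', 'a_c', 'c_e'],
    'se2': ['s_b', 'b_d', 'd_c', 'c_e'],
    'sf1': ['s_b', 'b_d', 'd_f'],
    'sf2': ['s_a', 'a_c', 'c_d', 'd_f'],
}

# Reshape to one row per Index and one column per edge.
# Assumption: each (Index, edge) pair is unique; pivot raises ValueError otherwise.
wide = df2.pivot(index='Index', columns='edge', values='journey_time')

result = pd.DataFrame(index=wide.index)
for name, edges in paths.items():
    # skipna=False makes a missing edge show up as NaN instead of being ignored.
    result[name] = wide[edges].sum(axis=1, skipna=False)
    # Sum of every edge except the last one in the path.
    head = wide[edges[:-1]].sum(axis=1, skipna=False)
    result['eq_time_' + name] = 200.0 / head + wide[edges[-1]]

print(result.round(3))
```

With the wide layout, each rule reads almost like the formula in the question, for example `wide['s_a'] + wide['a_c'] + wide['c_e']` for se1, so one-off variations can be written directly without going through the `paths` dictionary.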
