Pandas, DataFrame: Splitting one column into multiple columns

I have the following DataFrame. I am wondering whether it is possible to break the data column into multiple columns. E.g., from this:

ID  Date        data
6   21/05/2016  A: 7, B: 8, C: 5, D: 5, A: 8
6   21/01/2014  B: 5, C: 5, D: 7
6   02/04/2013  B: 4, D: 7, B: 6
7   05/06/2014  C: 25
7   12/08/2014  D: 20
8   18/04/2012  A: 2, B: 3, C: 3, E: 5, B: 4
8   21/03/2012  F: 6, B: 4, F: 5, D: 6, B: 4

into this:

ID  Date        data                          A   B   C   D   E   F
6   21/05/2016  A: 7, B: 8, C: 5, D: 5, A: 8  15  8   5   5   0   0
6   21/01/2014  B: 5, C: 5, D: 7              0   5   5   7   0   0
6   02/04/2013  B: 4, D: 7, B: 6              0   10  0   7   0   0
7   05/06/2014  C: 25                         0   0   25  0   0   0
7   12/08/2014  D: 20                         0   0   0   20  0   0
8   18/04/2012  A: 2, B: 3, C: 3, E: 5, B: 4  2   7   3   0   5   0
8   21/03/2012  F: 6, B: 4, F: 5, D: 6, B: 4  0   8   0   6   0   11

EDIT

There is one complication: the data column can contain duplicate keys. For example, in the first row A appears twice, so its values are summed under the A column (see the second table).

Solution

Here is a function that converts the string to a dictionary and sums the values by key. After the conversion, the new columns are easy to build with pd.Series:

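    import re
    from collections import defaultdict

    import pandas as pd

    def str_to_dict(str1):
        # Sum the numbers that follow each capital-letter key,
        # e.g. "A: 7, A: 8" -> {"A": 15}
        d = defaultdict(int)
        for k, v in zip(re.findall(r'[A-Z]', str1), re.findall(r'\d+', str1)):
            d[k] += int(v)
        return d

    # Expand each row's dictionary into columns; keys missing from a row become 0
    counts = df['data'].apply(str_to_dict).apply(pd.Series).fillna(0).astype(int)
    pd.concat([df, counts], axis=1)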
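For comparison, here is a self-contained sketch of an alternative that is not part of the original answer. It uses Series.str.extractall with named groups, so each key is matched together with its own number instead of being paired by position from two separate re.findall calls. The sample rows are copied from the expected-output table in the question; the group names key and value and the variable names pairs, counts, and result are chosen here for illustration only.

```python
import pandas as pd

# Sample rows copied from the question's expected-output table.
df = pd.DataFrame({
    "ID": [6, 6, 6, 7, 7, 8, 8],
    "Date": ["21/05/2016", "21/01/2014", "02/04/2013", "05/06/2014",
             "12/08/2014", "18/04/2012", "21/03/2012"],
    "data": [
        "A: 7, B: 8, C: 5, D: 5, A: 8",
        "B: 5, C: 5, D: 7",
        "B: 4, D: 7, B: 6",
        "C: 25",
        "D: 20",
        "A: 2, B: 3, C: 3, E: 5, B: 4",
        "F: 6, B: 4, F: 5, D: 6, B: 4",
    ],
})

# One row per "key: value" pair, indexed by (original row, match number).
pairs = df["data"].str.extractall(r"(?P<key>[A-Z])\s*:\s*(?P<value>\d+)")
pairs["value"] = pairs["value"].astype(int)

# Sum duplicate keys within each original row, then pivot keys into columns.
counts = (
    pairs.groupby([pairs.index.get_level_values(0), "key"])["value"]
    .sum()
    .unstack(fill_value=0)
    .reindex(df.index, fill_value=0)  # keep rows that had no matches at all
)

result = pd.concat([df, counts], axis=1)
print(result.drop(columns="data"))
```

Whether this is preferable to the apply-based function probably depends on the data: it avoids a Python-level loop per row, but it assumes every entry has the single-capital-letter "key: number" shape shown in the question.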
