I have several Python lists of tuples:
[(0, 61), (1, 30), (5, 198), (4, 61), (0, 30), (5, 200)]
[(1, 72), (2, 19), (3, 31), (4, 192), (6, 72), (5, 75)]
[(3, 12), (0, 51)]
...
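Each of these tuples has the form (key, value).
There are seven possible keys: 0, 1, 2, 3, 4, 5, 6
The desired output is a pandas DataFrame with one row per list and one column per key, where each column is named after its key: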
import pandas as pd
print(df)
 0   1   2   3    4    5   6
91  30   0   0   61  398   0
 0  72  19  31  192   75  72
51   0   0  12    0    0   0
My conceptual problem is how to add up the "values" of several tuples when their keys are the same.
I can access the keys of a given list, for example:
mylist = [(0, 61), (1, 30), (5, 198), (4, 61), (0, 30), (5, 200)]
keys = [x[0] for x in mylist]
and
print(keys)
[0, 1, 5, 4, 0, 5]
I'm not sure how to build, for example, a dictionary of key: value pairs that I can load into a pandas DataFrame.
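You can first use itertools.groupby to sum the values by key within each list, then convert the results to a DataFrame with pandas. Note that you must sort each list by key before grouping, because groupby only merges consecutive items that share a key.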
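To illustrate why the sort matters, here is a minimal sketch using the list from the question. The helper name key_of is only for illustration and is not part of the original answer.

```python
from itertools import groupby

mylist = [(0, 61), (1, 30), (5, 198), (4, 61), (0, 30), (5, 200)]

def key_of(pair):
    """Return the key part of a (key, value) tuple."""
    return pair[0]

# Without sorting, keys 0 and 5 each start two separate groups.
unsorted_keys = [k for k, _ in groupby(mylist, key_of)]

# Building a dict from those groups silently keeps only the last group per key.
naive_sums = {k: sum(v for _, v in group) for k, group in groupby(mylist, key_of)}

# After sorting by key, equal keys are adjacent and form a single group.
sorted_sums = {k: sum(v for _, v in group)
               for k, group in groupby(sorted(mylist, key=key_of), key_of)}

print(unsorted_keys)  # [0, 1, 5, 4, 0, 5]
print(naive_sums)     # {0: 30, 1: 30, 5: 200, 4: 61}
print(sorted_sums)    # {0: 91, 1: 30, 4: 61, 5: 398}
```

The full solution applies the same sort-then-group step to every list: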
import pandas as pd
from itertools import groupby
data = [
    [(0, 61), (1, 30), (5, 198), (4, 61), (0, 30), (5, 200)],
    [(1, 72), (2, 19), (3, 31), (4, 192), (6, 72), (5, 75)],
    [(0, 71), (1, 40), (5, 98), (4, 21), (0, 10), (5, 21200)],
    [(1, 702), (2, 190), (3, 310), (4, 1092), (6, 702), (5, 705)],
]  # copying example from @PatrickArtnerz solution

def group_sum(data):
    """given list, return dictionary of summation based on initial key"""
    # groupby only merges adjacent items, so sort by key first
    by_key = sorted(data, key=lambda x: x[0])
    data_dict = {
        k: sum(v_[1] for v_ in v)
        for k, v in groupby(by_key, lambda x: x[0])
    }
    return data_dict
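# fillna(0) fills in keys that are absent from a list; astype(int) undoes the
# float conversion caused by NaN; sort_index(axis=1) puts the key columns in order
df = pd.DataFrame(list(map(group_sum, data))).fillna(0).astype(int).sort_index(axis=1)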
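As an alternative that needs no sorting, the per-key sums can also be accumulated in a collections.Counter. This is only a sketch, not part of the original answer: the names sum_by_key and df_alt are hypothetical, and the seven keys 0 to 6 are taken from the question. It uses the three lists from the question, where the first list gives 91 for key 0 and 398 for key 5.

```python
from collections import Counter

import pandas as pd

data = [
    [(0, 61), (1, 30), (5, 198), (4, 61), (0, 30), (5, 200)],
    [(1, 72), (2, 19), (3, 31), (4, 192), (6, 72), (5, 75)],
    [(3, 12), (0, 51)],
]

def sum_by_key(pairs):
    """Return a plain dict mapping each key to the sum of its values."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value  # missing keys start at 0
    return dict(totals)

# reindex forces columns 0..6 in order, even for a key that never occurs;
# fillna(0) fills the gaps and astype(int) restores an integer dtype.
df_alt = (
    pd.DataFrame([sum_by_key(pairs) for pairs in data])
    .reindex(columns=list(range(7)))
    .fillna(0)
    .astype(int)
)
print(df_alt)
```

Because reindex is given the full key range explicitly, the column layout does not depend on which keys happen to appear in the data.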