I am looking for a way to concatenate the values of two Python dictionaries that contain NumPy arrays of arbitrary size, without having to loop over the dictionary keys manually. For example:
import numpy as np
# Create first dictionary
n1 = 3
s = np.random.randint(1, 101, n1)
n2 = 2
r = np.random.rand(n2)
d = {"r": r, "s": s}
print("d = ", d)
# Create second dictionary
n3 = 1
s = np.random.randint(1, 101, n3)
n4 = 3
r = np.random.rand(n4)
d2 = {"r": r, "s": s}
print("d2 = ", d2)
# Some operation to combine the two dictionaries
# (SomeOperation is a placeholder for the function I am looking for)
d = SomeOperation(d, d2)
# Updated dictionary
print("d3 = ", d)
to give the output
>> d = {'s': array([75, 25, 88]), 'r': array([ 0.1021227 , 0.99454874])}
>> d2 = {'s': array([78]), 'r': array([ 0.27610587, 0.57037473, 0.59876391])}
>> d3 = {'s': array([75, 25, 88, 78]), 'r': array([ 0.1021227 , 0.99454874, 0.27610587, 0.57037473, 0.59876391])}
That is, if a key already exists, the new array is appended to the NumPy array already stored under that key.
The solution proposed in the previous discussion using the package pandas does not work as it requires arrays having the same length (n1=n2 and n3=n4).
Does anybody know the best way to do this, whilst minimising the use of slow, manual for loops? (I would like to avoid loops because the dictionaries I would like to combine could have hundreds of keys).
Thanks (also to "Aim" for formulating a very clear question)!
Solution
One way to go is to use a dictionary of Series (i.e. the values are pandas Series rather than arrays):
In [11]: d2
Out[11]: {'r': array([ 0.3536318 , 0.29363604, 0.91307454]), 's': array([46])}
In [12]: d2 = {name: pd.Series(arr) for name, arr in d2.items()}
In [13]: d2
Out[13]:
{'r': 0    0.353632
1    0.293636
2    0.913075
dtype: float64,
 's': 0    46
dtype: int64}
That way you can pass it into the DataFrame constructor:
In [14]: pd.DataFrame(d2)
Out[14]:
          r    s
0  0.353632   46
1  0.293636  NaN
2  0.913075  NaN
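The answer above stops at building a DataFrame and does not show the combined dictionary the question asks for. As a minimal sketch of one way to get there with plain NumPy (not the answerer's method), the arrays can be joined per key with `np.concatenate` inside a dict comprehension. The helper name `concat_dicts` and the fixed sample values are assumptions made for illustration. The comprehension still visits each key once, but the per-key work is done by NumPy, so a few hundred keys should not be a bottleneck.

```python
import numpy as np


def concat_dicts(a, b):
    """Return a new dict where each key maps to a[key] followed by b[key].

    Hypothetical helper for illustration. Keys present in only one of the
    two dictionaries are kept, with their array copied unchanged.
    """
    # Keep the key order of `a`, then append keys that only exist in `b`.
    keys = list(a) + [k for k in b if k not in a]
    return {
        k: np.concatenate([x[k] for x in (a, b) if k in x])
        for k in keys
    }


# Fixed sample data (instead of random values) so the result is reproducible.
d = {"r": np.array([0.1, 0.9]), "s": np.array([75, 25, 88])}
d2 = {"r": np.array([0.2, 0.5, 0.6]), "s": np.array([78])}

d3 = concat_dicts(d, d2)
print("d3 = ", d3)
```

Because only the arrays that actually exist for a key are passed to `np.concatenate`, the arrays under different keys can have different lengths and integer arrays keep their integer dtype.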