Concatenating objects
In [3]: pieces = [df[:3], df[3:7], df[7:]]
In [4]: concatenated = concat(pieces)
In [5]: concatenated
Out[5]:
0 1 2 3
0 0.469112 -0.282863 -1.509059 -1.135632
1 1.212112 -0.173215 0.119209 -1.044236
2 -0.861849 -2.104569 -0.494929 1.071804
3 0.721555 -0.706771 -1.039575 0.271860
4 -0.424972 0.567020 0.276232 -1.087401
5 -0.673690 0.113648 -1.478427 0.524988
6 0.404705 0.577046 -1.715002 -1.039268
7 -0.370647 -1.157892 -1.344312 0.844885
8 1.075770 -0.109050 1.643563 -1.469388
9 0.357021 -0.674600 -1.776904 -0.968914
In [6]: concatenated = concat(pieces, keys=['first', 'second', 'third'])
In [7]: concatenated
Out[7]:
                 0         1         2         3
first  0  0.469112 -0.282863 -1.509059 -1.135632
       1  1.212112 -0.173215  0.119209 -1.044236
       2 -0.861849 -2.104569 -0.494929  1.071804
second 3  0.721555 -0.706771 -1.039575  0.271860
       4 -0.424972  0.567020  0.276232 -1.087401
       5 -0.673690  0.113648 -1.478427  0.524988
       6  0.404705  0.577046 -1.715002 -1.039268
third  7 -0.370647 -1.157892 -1.344312  0.844885
       8  1.075770 -0.109050  1.643563 -1.469388
       9  0.357021 -0.674600 -1.776904 -0.968914
In [8]: concatenated.ix['second']
Out[8]:
0 1 2 3
3 0.721555 -0.706771 -1.039575 0.271860
4 -0.424972 0.567020 0.276232 -1.087401
5 -0.673690 0.113648 -1.478427 0.524988
6 0.404705 0.577046 -1.715002 -1.039268
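The session above can be reproduced with a self-contained sketch. It assumes pandas is imported as pd, uses illustrative integer data rather than the random values shown, and selects with .loc because .ix is not available in recent pandas releases.

```python
import numpy as np
import pandas as pd

# Illustrative 10x4 frame; the values differ from the session above.
df = pd.DataFrame(np.arange(40).reshape(10, 4))

# Split row-wise into three pieces, then glue them back together.
pieces = [df[:3], df[3:7], df[7:]]
plain = pd.concat(pieces)

# With keys, each piece is labelled by an outer index level.
keyed = pd.concat(pieces, keys=["first", "second", "third"])

# Selecting on the outer level recovers the matching piece.
second = keyed.loc["second"]
```

Here plain carries the same rows as df, while keyed has a two-level row index whose outer level is the key.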
frames = [process_your_file(f) for f in files]
result = pd.concat(frames)
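A runnable version of this pattern might look like the sketch below. The in-memory CSV sources and the body of process_your_file are hypothetical stand-ins for real files and a real loader. The point is to build all the frames first and call concat once, since each concat call generally builds a new object.

```python
import io
import pandas as pd

# Hypothetical stand-ins for files on disk: three in-memory CSV sources.
files = [
    io.StringIO("x,y\n1,2\n3,4\n"),
    io.StringIO("x,y\n5,6\n"),
    io.StringIO("x,y\n7,8\n9,10\n"),
]


def process_your_file(f):
    """Hypothetical loader: parse one CSV source into a DataFrame."""
    return pd.read_csv(f)


# Build every frame first, then concatenate once, rather than
# growing a result frame inside a loop.
frames = [process_your_file(f) for f in files]
result = pd.concat(frames)
```

Each piece keeps its own row labels, so result contains repeated labels unless ignore_index=True is passed.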
In [11]: df
Out[11]:
a b c d
mPXqv -1.294524 0.413738 0.276662 -0.472035
AH4pW -0.013960 -0.362543 -0.006154 -0.923061
c30Fm 0.895717 0.805244 -1.206412 2.565646
3EWtQ 1.431256 1.340309 -1.170299 -0.226169
1gQh9 0.410835 0.813850 0.132003 -0.827317
KQwv8 -0.076467 -1.187678 1.130127 -1.436737
8UDGh -1.413681 1.607920 1.024180 0.569605
KA8Vn 0.875906 -2.211372 0.974466 -2.006747
KDDLI -0.410001 -0.078638 0.545952 -1.219217
yZsRv -1.226825 0.769804 -1.281247 -0.727707
In [12]: concat([df.ix[:7, ['a', 'b']], df.ix[2:-2, ['c']],
....: df.ix[-7:, ['d']]], axis=1)
....:
Out[12]:
a b c d
1gQh9 0.410835 0.813850 0.132003 -0.827317
3EWtQ 1.431256 1.340309 -1.170299 -0.226169
8UDGh -1.413681 1.607920 1.024180 0.569605
AH4pW -0.013960 -0.362543 NaN NaN
KA8Vn NaN NaN 0.974466 -2.006747
KDDLI NaN NaN NaN -1.219217
KQwv8 -0.076467 -1.187678 1.130127 -1.436737
c30Fm 0.895717 0.805244 -1.206412 NaN
mPXqv -1.294524 0.413738 NaN NaN
yZsRv NaN NaN NaN -0.727707
In [13]: concat([df.ix[:7, ['a', 'b']], df.ix[2:-2, ['c']],
....: df.ix[-7:, ['d']]], axis=1, join='inner')
....:
Out[13]:
a b c d
3EWtQ 1.431256 1.340309 -1.170299 -0.226169
1gQh9 0.410835 0.813850 0.132003 -0.827317
KQwv8 -0.076467 -1.187678 1.130127 -1.436737
8UDGh -1.413681 1.607920 1.024180 0.569605
In [14]: concat([df.ix[:7, ['a', 'b']], df.ix[2:-2, ['c']],
....: df.ix[-7:, ['d']]], axis=1, join_axes=[df.index])
....:
Out[14]:
a b c d
mPXqv -1.294524 0.413738 NaN NaN
AH4pW -0.013960 -0.362543 NaN NaN
c30Fm 0.895717 0.805244 -1.206412 NaN
3EWtQ 1.431256 1.340309 -1.170299 -0.226169
1gQh9 0.410835 0.813850 0.132003 -0.827317
KQwv8 -0.076467 -1.187678 1.130127 -1.436737
8UDGh -1.413681 1.607920 1.024180 0.569605
KA8Vn NaN NaN 0.974466 -2.006747
KDDLI NaN NaN NaN -1.219217
yZsRv NaN NaN NaN -0.727707
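The three alignment behaviours shown above (union, intersection, and the original index) can be sketched on a small frame with predictable values. This assumes positional slicing with .iloc in place of .ix. Newer pandas releases no longer accept join_axes, so the last case reindexes the outer result against df.index instead, which is a substitute for the call in the session, not the same one.

```python
import numpy as np
import pandas as pd

# Illustrative frame: ten labelled rows, four columns.
df = pd.DataFrame(np.arange(40.0).reshape(10, 4),
                  index=list("abcdefghij"), columns=["a", "b", "c", "d"])

# Three overlapping row ranges, each carrying different columns.
parts = [df.iloc[:7][["a", "b"]], df.iloc[2:-2][["c"]], df.iloc[-7:][["d"]]]

# Default join is "outer": keep the union of row labels, fill gaps with NaN.
outer = pd.concat(parts, axis=1)

# join="inner": keep only the row labels present in every piece.
inner = pd.concat(parts, axis=1, join="inner")

# Assumption: reindexing against df.index stands in for join_axes=[df.index].
aligned = pd.concat(parts, axis=1).reindex(df.index)
```

Only rows d through g appear in all three pieces, so inner has four rows, matching the shape of the inner-join output in the session.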
Concatenating using append
A useful shortcut to concat is the append instance method on Series and DataFrame. These methods actually predate concat. They concatenate along axis=0, namely the index:
In [22]: df1
Out[22]:
A B C D
2000-01-01 0.176444 0.403310 -0.154951 0.301624
2000-01-02 -2.179861 -1.369849 -0.954208 1.462696
2000-01-03 -1.743161 -0.826591 -0.345352 1.314232
In [23]: df2
Out[23]:
A B C
2000-01-04 0.690579 0.995761 2.396780
2000-01-05 3.357427 -0.317441 -1.236269
2000-01-06 -0.487602 -0.082240 -2.182937
In [24]: df1.append(df2)
Out[24]:
A B C D
2000-01-01 0.176444 0.403310 -0.154951 0.301624
2000-01-02 -2.179861 -1.369849 -0.954208 1.462696
2000-01-03 -1.743161 -0.826591 -0.345352 1.314232
2000-01-04 0.690579 0.995761 2.396780 NaN
2000-01-05 3.357427 -0.317441 -1.236269 NaN
2000-01-06 -0.487602 -0.082240 -2.182937 NaN
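Recent pandas releases have removed the append method from Series and DataFrame, so the same row-wise stacking is usually written with concat. The sketch below uses illustrative constant data rather than the values shown above. It demonstrates that a column missing from one frame is filled with NaN.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2000-01-01", periods=6)

# df1 has columns A-D; df2 lacks column D.
df1 = pd.DataFrame(np.ones((3, 4)), index=dates[:3], columns=list("ABCD"))
df2 = pd.DataFrame(np.zeros((3, 3)), index=dates[3:], columns=list("ABC"))

# Equivalent of df1.append(df2): stack rows along axis=0.
# The rows that come from df2 get NaN in column D.
stacked = pd.concat([df1, df2])
```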
Ignoring indexes on the concatenation axis
In [33]: concat([df1, df2], ignore_index=True)
In [34]: df1.append(df2, ignore_index=True)
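To show what ignore_index changes, here is a minimal sketch with two small illustrative frames whose row labels overlap. The frame contents are assumptions made for the example.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2]}, index=["x", "y"])
df2 = pd.DataFrame({"A": [3, 4]}, index=["x", "z"])

# Default: original labels are kept, so "x" appears twice.
kept = pd.concat([df1, df2])

# ignore_index=True discards the labels and numbers the rows 0..n-1.
renumbered = pd.concat([df1, df2], ignore_index=True)
```

This is useful when the labels on the concatenation axis carry no meaning of their own.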
More concatenating with group keys
In [43]: pieces = [df.ix[:, [0, 1]], df.ix[:, [2]], df.ix[:, [3]]]
In [44]: result = concat(pieces, axis=1, keys=['one', 'two', 'three'])
In [45]: result
Out[45]:
        one                 two     three
          0         1         2         3
0 -0.014805 -0.284319  0.650776 -1.461665
1 -1.137707 -0.891060 -0.693921  1.613616
2  0.464000  0.227371 -0.496922  0.306389
3 -2.290613 -1.134623 -1.561819 -0.260838
4  0.281957  1.523962 -0.902937  0.068159
5 -0.057873 -0.368204 -1.144073  0.861209
6  0.800193  0.782098 -1.069094 -1.099248
7  0.255269  0.009750  0.661084  0.379319
8 -0.008434  1.952541 -1.056652  0.533946
9 -1.226970  0.040403 -0.507516 -0.230096
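The same column-wise grouping can be sketched with small illustrative data. The example assumes .iloc for positional column selection in place of .ix, and shows that the keys become the outer level of a two-level column index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4))

# Split column-wise: two columns, then one, then one.
pieces = [df.iloc[:, [0, 1]], df.iloc[:, [2]], df.iloc[:, [3]]]
result = pd.concat(pieces, axis=1, keys=["one", "two", "three"])

# The columns are now two-level: (key, original column label).
# Selecting a key returns the columns of the matching piece.
first_group = result["one"]
```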
Database-style DataFrame joining/merging
left Use keys from left frame only
right Use keys from right frame only
outer Use union of keys from both frames
inner Use intersection of keys from both frames
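Assuming the four methods listed above are the values accepted by the how argument of pd.merge, the following sketch compares them on two small illustrative frames that share a key column. The frame and column names are made up for the example.

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "rval": [4, 5, 6]})

# Run the same merge once per method and collect the results by name.
merged = {
    how: pd.merge(left, right, on="key", how=how)
    for how in ["left", "right", "outer", "inner"]
}
```

Key "a" exists only in left and key "d" only in right, so they survive only in the joins that keep that side's keys. Unmatched rows get NaN in the columns that come from the other frame.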