Using join and merge with pandas DataFrames

 

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: df1 = pd.DataFrame(np.ones((2,4)),columns=list("abcd"), index=list("AB"))

In [4]: df1
Out[4]:
     a    b    c    d
A  1.0  1.0  1.0  1.0
B  1.0  1.0  1.0  1.0

In [5]: df2 = pd.DataFrame(np.zeros((3,3)), columns=list("xyz"), index=list("ABC"))

In [6]: df2
Out[6]:
     x    y    z
A  0.0  0.0  0.0
B  0.0  0.0  0.0
C  0.0  0.0  0.0

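join combines two DataFrames by aligning them on the row index. By default it is a left join, so the result keeps the index of the DataFrame that join is called on.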

In [7]: df1.join(df2)
Out[7]:
     a    b    c    d    x    y    z
A  1.0  1.0  1.0  1.0  0.0  0.0  0.0
B  1.0  1.0  1.0  1.0  0.0  0.0  0.0

In [8]: df2.join(df1)
Out[8]:
     x    y    z    a    b    c    d
A  0.0  0.0  0.0  1.0  1.0  1.0  1.0
B  0.0  0.0  0.0  1.0  1.0  1.0  1.0
C  0.0  0.0  0.0  NaN  NaN  NaN  NaN
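
The two results above differ because join keeps the caller's index by default. The following is a minimal sketch, rebuilding the same df1 and df2, of how the `how` parameter of join changes which index labels survive. The variable names `left`, `outer` and `inner` are only illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.ones((2, 4)), columns=list("abcd"), index=list("AB"))
df2 = pd.DataFrame(np.zeros((3, 3)), columns=list("xyz"), index=list("ABC"))

# Default is how="left": keep only the caller's index labels (A, B).
left = df1.join(df2)

# how="outer": keep the union of both indexes (A, B, C); missing cells become NaN.
outer = df1.join(df2, how="outer")

# how="inner": keep only the labels present in both indexes (A, B).
inner = df2.join(df1, how="inner")

print(left.shape, outer.shape, inner.shape)
```

With `how="outer"`, df1.join(df2) also contains row C, with NaN in columns a through d, much like the df2.join(df1) result above but with the columns in the opposite order.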

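merge: combines two DataFrames on the values of one or more specified columns, using a chosen join method (inner, outer, left or right).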

 

In [41]: df3 = pd.DataFrame(np.arange(9).reshape(3,3), columns=list("fax"))

In [42]: df3
Out[42]:
   f  a  x
0  0  1  2
1  3  4  5
2  6  7  8

In [44]: df1 = pd.DataFrame(np.ones((2,4)), columns=list("abcd"), index=list("AB"))

In [45]: df1
Out[45]:
     a    b    c    d
A  1.0  1.0  1.0  1.0
B  1.0  1.0  1.0  1.0

In [47]: df1.merge(df3, on="a")
Out[47]:
     a    b    c    d  f  x
0  1.0  1.0  1.0  1.0  0  2
1  1.0  1.0  1.0  1.0  0  2




In [49]: df1.loc["A", "a"] = 100

In [50]: df1
Out[50]:
       a    b    c    d
A  100.0  1.0  1.0  1.0
B    1.0  1.0  1.0  1.0

In [51]: df1.merge(df3, on="a")
Out[51]:
     a    b    c    d  f  x
0  1.0  1.0  1.0  1.0  0  2
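The merge is performed on column "a". After the change, the only value that appears in column "a" of both df1 and df3 is 1, so the result contains just that one row.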
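
To make the matching rule concrete, here is a minimal sketch that rebuilds df1 and df3 as above and counts the rows produced by each merge. The names `before` and `after` are only illustrative. While both rows of df1 have a == 1, each of them pairs with the single df3 row where a == 1; once one key is changed to 100, only one row still matches.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.ones((2, 4)), columns=list("abcd"), index=list("AB"))
df3 = pd.DataFrame(np.arange(9).reshape(3, 3), columns=list("fax"))

# Both rows of df1 have a == 1.0, so each pairs with df3's row where a == 1.
before = df1.merge(df3, on="a")

# Change one key so that it no longer appears in df3["a"].
df1.loc["A", "a"] = 100
after = df1.merge(df3, on="a")

print(len(before), len(after))  # 2 1
```

Note that merging on a column does not carry over df1's A/B index labels; the result gets a fresh 0, 1, ... index, as the outputs above show.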

In [52]: df3
Out[52]:
   f  a  x
0  0  1  2
1  3  4  5
2  6  7  8




In [53]: df1
Out[53]:
       a    b    c    d
A  100.0  1.0  1.0  1.0
B    1.0  1.0  1.0  1.0

In [54]: df3
Out[54]:
   f  a  x
0  0  1  2
1  3  4  5
2  6  7  8

The default is an inner join, so passing how="inner" explicitly gives the same result:
In [55]: df1.merge(df3, on="a", how="inner")
Out[55]:
     a    b    c    d  f  x
0  1.0  1.0  1.0  1.0  0  2




In [56]: df3
Out[56]:
   f  a  x
0  0  1  2
1  3  4  5
2  6  7  8

In [57]: df1
Out[57]:
       a    b    c    d
A  100.0  1.0  1.0  1.0
B    1.0  1.0  1.0  1.0


Outer join:

In [58]: df1.merge(df3, on="a", how="outer")
Out[58]:
       a    b    c    d    f    x
0  100.0  1.0  1.0  1.0  NaN  NaN
1    1.0  1.0  1.0  1.0  0.0  2.0
2    4.0  NaN  NaN  NaN  3.0  5.0
3    7.0  NaN  NaN  NaN  6.0  8.0





In [59]: df1
Out[59]:
       a    b    c    d
A  100.0  1.0  1.0  1.0
B    1.0  1.0  1.0  1.0

In [60]: df3
Out[60]:
   f  a  x
0  0  1  2
1  3  4  5
2  6  7  8

Left join:
In [61]: df1.merge(df3, on="a", how="left")
Out[61]:
       a    b    c    d    f    x
0  100.0  1.0  1.0  1.0  NaN  NaN
1    1.0  1.0  1.0  1.0  0.0  2.0
Right join:

In [62]: df1.merge(df3, on="a", how="right")
Out[62]:
     a    b    c    d  f  x
0  1.0  1.0  1.0  1.0  0  2
1  4.0  NaN  NaN  NaN  3  5
2  7.0  NaN  NaN  NaN  6  8

 

 

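The default merge method is inner: only keys present in both DataFrames are kept (intersection).

merge with how="outer": keys from both DataFrames are kept (union), and missing values are filled with NaN.

merge with how="left": all keys from the left DataFrame are kept, and missing values from the right are filled with NaN.

merge with how="right": all keys from the right DataFrame are kept, and missing values from the left are filled with NaN.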
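
The four join types summarized above can be checked side by side. This is a minimal sketch using the same df1 (with the 100 already set) and df3. The `counts` and `tagged` names are only illustrative, and `indicator=True` is an extra merge option not used earlier in this post. Row order in the outer result may vary between pandas versions, so only row counts are noted.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.ones((2, 4)), columns=list("abcd"), index=list("AB"))
df1.loc["A", "a"] = 100
df3 = pd.DataFrame(np.arange(9).reshape(3, 3), columns=list("fax"))

# Row counts per join type; keys are {100, 1} on the left and {1, 4, 7} on the right.
counts = {how: len(df1.merge(df3, on="a", how=how))
          for how in ("inner", "outer", "left", "right")}
print(counts)  # {'inner': 1, 'outer': 4, 'left': 2, 'right': 3}

# indicator=True adds a "_merge" column recording where each row came from:
# "left_only", "right_only" or "both".
tagged = df1.merge(df3, on="a", how="outer", indicator=True)
print(tagged[["a", "_merge"]])
```

The `_merge` column is a quick way to see which rows were filled with NaN because their key was missing on one side.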

 
