pd.concat and pd.merge: test notes

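Summary:
pd.concat stacks frames vertically by default (axis=0). pd.merge is a key-based join rather than a stacking operation: when no keys are given it joins on every column name the two frames share, so an outer merge of frames with identical columns only looks like a vertical stack.
pd.concat defaults to the union (join='outer'),
and it keeps the original index labels, even when that produces duplicates.
With join='inner' only the columns common to both frames are kept; with join='outer' the non-shared columns are kept too, and cells with no data are filled with NaN.
pd.merge defaults to the intersection (how='inner').
With how='inner' only rows whose values match exactly in all shared columns are kept.
The result gets a fresh integer index starting at 0, whatever the original labels were and whether or not they were duplicated.
Pass left_index=True or right_index=True to use that side's index as the join key. Setting only one of them also requires a key for the other side (right_on or left_on); set both to True to join index on index and keep the original index labels.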
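
As a quick check of the summary, here is a minimal sketch using the same two 5x4 frames as the tests below. The names stacked, renumbered and merged exist only for this illustration, and ignore_index=True is a standard pd.concat option that the tests below do not exercise.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.arange(20).reshape(5, 4))
b = pd.DataFrame(np.arange(20, 40).reshape(5, 4))

# concat stacks along axis=0 and keeps each frame's own index labels,
# so the labels 0..4 appear twice.
stacked = pd.concat([a, b])
print(stacked.index.tolist())

# ignore_index=True drops the old labels and renumbers from 0.
renumbered = pd.concat([a, b], ignore_index=True)
print(renumbered.index.tolist())

# merge joins on the shared columns and always returns a fresh 0-based index.
merged = pd.merge(a, b, how="outer")
print(merged.index.tolist())
```

If unique labels are needed after concat, ignore_index=True is usually simpler than resetting the index afterwards.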

import numpy as np
import pandas as pd

a=pd.DataFrame(np.arange(20).reshape(5,4))
a
    0   1   2   3
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15
4  16  17  18  19

b=pd.DataFrame(np.arange(20,40).reshape(5,4))
b
    0   1   2   3
0  20  21  22  23
1  24  25  26  27
2  28  29  30  31
3  32  33  34  35
4  36  37  38  39

pd.merge(a,b)
Empty DataFrame
Columns: [0, 1, 2, 3]
Index: []

pd.merge(a,b,how='outer')
    0   1   2   3
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15
4  16  17  18  19
5  20  21  22  23
6  24  25  26  27
7  28  29  30  31
8  32  33  34  35
9  36  37  38  39

pd.merge(a,b,how='inner')
Empty DataFrame
Columns: [0, 1, 2, 3]
Index: []
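
Both inner merges above come back empty because, with no keys given, pd.merge uses every shared column as the join key, and no row of a equals a row of b. The sketch below illustrates this by copying one row of a into a modified copy of b; the names empty, b2 and one_match are hypothetical and not part of the original tests.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.arange(20).reshape(5, 4))
b = pd.DataFrame(np.arange(20, 40).reshape(5, 4))

# No keys given: all four shared columns form the key, and no row matches.
empty = pd.merge(a, b)
print(empty.empty)

# Overwrite the first row of a copy of b with the third row of a.
b2 = b.copy()
b2.iloc[0] = a.iloc[2].to_numpy()

# Exactly one row now matches on every shared column, so it survives.
one_match = pd.merge(a, b2)
print(one_match)
```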

pd.concat([a,b],how='inner')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: concat() got an unexpected keyword argument 'how'
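
The error shows that pd.concat has no how parameter; the union/intersection choice is made with join, and the stacking direction with axis. A small sketch of both, where error_text, rows and cols are illustrative names and the axis=1 call goes beyond what was tested here:

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.arange(20).reshape(5, 4))
b = pd.DataFrame(np.arange(20, 40).reshape(5, 4))

# `how` belongs to merge; passing it to concat raises TypeError.
try:
    pd.concat([a, b], how="inner")
except TypeError as exc:
    error_text = str(exc)
print(error_text)

# join= picks intersection or union of the columns (for axis=0).
rows = pd.concat([a, b], join="inner")
print(rows.shape)

# axis=1 places the frames side by side, aligned on the index.
cols = pd.concat([a, b], axis=1)
print(cols.shape)
```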

pd.concat([a,b],join='outer')
    0   1   2   3
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15
4  16  17  18  19
0  20  21  22  23
1  24  25  26  27
2  28  29  30  31
3  32  33  34  35
4  36  37  38  39

pd.concat([a,b],join='inner')
    0   1   2   3
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15
4  16  17  18  19
0  20  21  22  23
1  24  25  26  27
2  28  29  30  31
3  32  33  34  35
4  36  37  38  39

b.columns=[2,3,4,5]
pd.concat([a,b])
      0     1   2   3     4     5
0   0.0   1.0   2   3   NaN   NaN
1   4.0   5.0   6   7   NaN   NaN
2   8.0   9.0  10  11   NaN   NaN
3  12.0  13.0  14  15   NaN   NaN
4  16.0  17.0  18  19   NaN   NaN
0   NaN   NaN  20  21  22.0  23.0
1   NaN   NaN  24  25  26.0  27.0
2   NaN   NaN  28  29  30.0  31.0
3   NaN   NaN  32  33  34.0  35.0
4   NaN   NaN  36  37  38.0  39.0

pd.concat([a,b],join='inner')
    2   3
0   2   3
1   6   7
2  10  11
3  14  15
4  18  19
0  20  21
1  24  25
2  28  29
3  32  33
4  36  37
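
The two concat results above differ only in which columns survive. The following sketch repeats them and inspects the column labels and dtypes directly; outer and inner are illustrative names. The dtype check reflects the float columns visible in the outer output, where NaN forces an integer column to float.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.arange(20).reshape(5, 4))
b = pd.DataFrame(np.arange(20, 40).reshape(5, 4), columns=[2, 3, 4, 5])

outer = pd.concat([a, b])                # union of the column labels
inner = pd.concat([a, b], join="inner")  # only labels present in both

print(outer.columns.tolist())
print(inner.columns.tolist())

# Columns missing from one frame get NaN and become float;
# columns present in both frames stay integer.
print(outer.dtypes)
```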

pd.merge(a,b)
Empty DataFrame
Columns: [0, 1, 2, 3, 4, 5]
Index: []

pd.merge(a,b,how='outer')
      0     1   2   3     4     5
0   0.0   1.0   2   3   NaN   NaN
1   4.0   5.0   6   7   NaN   NaN
2   8.0   9.0  10  11   NaN   NaN
3  12.0  13.0  14  15   NaN   NaN
4  16.0  17.0  18  19   NaN   NaN
5   NaN   NaN  20  21  22.0  23.0
6   NaN   NaN  24  25  26.0  27.0
7   NaN   NaN  28  29  30.0  31.0
8   NaN   NaN  32  33  34.0  35.0
9   NaN   NaN  36  37  38.0  39.0
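
To see which side each row of an outer merge came from, pd.merge accepts indicator=True, which adds a _merge column. This is a minimal sketch and not part of the original tests; the name tagged is illustrative.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.arange(20).reshape(5, 4))
b = pd.DataFrame(np.arange(20, 40).reshape(5, 4), columns=[2, 3, 4, 5])

# The shared columns 2 and 3 are the join keys. No key pair matches,
# so every row is tagged left_only or right_only, never both.
tagged = pd.merge(a, b, how="outer", indicator=True)
print(tagged["_merge"].value_counts())
```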

a.index=[0,1,2,3,4]
a
    0   1   2   3
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15
4  16  17  18  19

pd.merge(a,b,how='outer')
      0     1   2   3     4     5
0   0.0   1.0   2   3   NaN   NaN
1   4.0   5.0   6   7   NaN   NaN
2   8.0   9.0  10  11   NaN   NaN
3  12.0  13.0  14  15   NaN   NaN
4  16.0  17.0  18  19   NaN   NaN
5   NaN   NaN  20  21  22.0  23.0
6   NaN   NaN  24  25  26.0  27.0
7   NaN   NaN  28  29  30.0  31.0
8   NaN   NaN  32  33  34.0  35.0
9   NaN   NaN  36  37  38.0  39.0

pd.merge(a,b,how='outer',left_index=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/share/data1/fengjie/software/miniconda3/envs/RGI_new/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 86, in merge
    validate=validate,
  File "/share/data1/fengjie/software/miniconda3/envs/RGI_new/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 620, in __init__
    self._validate_specification()
  File "/share/data1/fengjie/software/miniconda3/envs/RGI_new/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 1183, in _validate_specification
    raise MergeError("Must pass right_on or right_index=True")
pandas.errors.MergeError: Must pass right_on or right_index=True

pd.merge(a,b,how='outer',right_index=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/share/data1/fengjie/software/miniconda3/envs/RGI_new/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 86, in merge
    validate=validate,
  File "/share/data1/fengjie/software/miniconda3/envs/RGI_new/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 620, in __init__
    self._validate_specification()
  File "/share/data1/fengjie/software/miniconda3/envs/RGI_new/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 1186, in _validate_specification
    raise MergeError("Must pass left_on or left_index=True")
pandas.errors.MergeError: Must pass left_on or left_index=True
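
The two errors say the same thing from opposite sides: once one side is told to use its index, the other side needs an explicit key too, either its own index or a column named with right_on / left_on. The sketch below uses two small hypothetical frames (left, right) rather than a and b, because it needs a column whose values match the other frame's index labels.

```python
import pandas as pd

left = pd.DataFrame({"value": [10, 20, 30]}, index=["x", "y", "z"])
right = pd.DataFrame({"key": ["x", "y", "w"], "other": [1, 2, 3]})

# Only one side specified: pandas refuses, as in the tracebacks above.
try:
    pd.merge(left, right, left_index=True)
    raised = False
except pd.errors.MergeError:
    raised = True
print(raised)

# Pair the left index with the right frame's "key" column.
joined = pd.merge(left, right, left_index=True, right_on="key")
print(joined)
```

Only the labels x and y appear on both sides, so the default inner join keeps those two rows.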

# The letter labels in the output imply that b's index had been set to ['q','w','e','r','t'] before this call; that step is missing from the record.
pd.merge(a,b,how='outer',left_index=True,right_index=True)
      0     1   2_x   3_x   2_y   3_y     4     5
0   0.0   1.0   2.0   3.0   NaN   NaN   NaN   NaN
1   4.0   5.0   6.0   7.0   NaN   NaN   NaN   NaN
2   8.0   9.0  10.0  11.0   NaN   NaN   NaN   NaN
3  12.0  13.0  14.0  15.0   NaN   NaN   NaN   NaN
4  16.0  17.0  18.0  19.0   NaN   NaN   NaN   NaN
e   NaN   NaN   NaN   NaN  28.0  29.0  30.0  31.0
q   NaN   NaN   NaN   NaN  20.0  21.0  22.0  23.0
r   NaN   NaN   NaN   NaN  32.0  33.0  34.0  35.0
t   NaN   NaN   NaN   NaN  36.0  37.0  38.0  39.0
w   NaN   NaN   NaN   NaN  24.0  25.0  26.0  27.0

pd.merge(a,b,how='inner',left_index=True,right_index=True)
Empty DataFrame
Columns: [0, 1, 2_x, 3_x, 2_y, 3_y, 4, 5]
Index: []
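
The inner index-on-index merge is empty here because the two indexes share no labels. When they do overlap, the matching labels are kept as the result's index, and columns that exist on both sides are no longer keys, so they get the _x / _y suffixes (or whatever is passed via suffixes). A sketch under those assumptions; by_index, b_shifted and overlap are illustrative names, and the integer labels on b_shifted are made up for the example.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.arange(20).reshape(5, 4))
b = pd.DataFrame(np.arange(20, 40).reshape(5, 4), columns=[2, 3, 4, 5])

# Identical indexes: every row pairs up and the labels 0..4 are kept.
by_index = pd.merge(a, b, left_index=True, right_index=True,
                    suffixes=("_a", "_b"))
print(by_index.columns.tolist())

# Partially overlapping indexes: only the shared labels 3 and 4 survive.
b_shifted = b.copy()
b_shifted.index = [3, 4, 5, 6, 7]
overlap = pd.merge(a, b_shifted, left_index=True, right_index=True)
print(overlap)
```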

pd.merge(a,a)
    0   1   2   3
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15
4  16  17  18  19
