Data wrangling: merging with pd.merge / joining with data.join / concatenating with pd.concat / reshaping with stack

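Data merging (pd.merge)

  • Joins the rows of different DataFrames on one or more keys

  • Similar to a database join operation

  • pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None)

    left: the DataFrame on the left side of the merge

    right: the DataFrame on the right side of the merge

    how: the join type, one of 'inner' (the default), 'outer', 'left', or 'right'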

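  • import pandas as pd
    # left and right are DataFrames that share the key column '地区' ("region")
    alll = pd.merge(left, right, on='地区', how='left')   # left join: keeps every key from the left table
    allr = pd.merge(left, right, on='地区', how='right')  # right join: keeps every key from the right table
    alli = pd.merge(left, right, on='地区', how='inner')  # inner join: keeps only keys present in both tables (intersection)
    allo = pd.merge(left, right, on='地区', how='outer')  # full outer join: keeps keys present in either table (union)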

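  • on: the column name(s) to join on; they must exist in both DataFrames. If on is omitted and no other key argument is given, the intersection of the column names in left and right is used as the join key

  • left_on: the column(s) of the left DataFrame to use as the join key

  • right_on: the column(s) of the right DataFrame to use as the join key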
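The four join types and the difference between on and left_on/right_on can be sketched roughly as follows. The sample tables, the column names (region, area, sales, population) and the values are all made up for illustration and do not come from the original post.

```python
import pandas as pd

# Hypothetical sample data, invented purely for this illustration.
left = pd.DataFrame({"region": ["North", "South", "East"],
                     "sales": [100, 200, 300]})
right = pd.DataFrame({"region": ["South", "East", "West"],
                      "population": [20, 30, 40]})

# Inner join: only keys found in both tables (South, East).
inner = pd.merge(left, right, on="region", how="inner")

# Left join: every key from the left table; unmatched rows get NaN.
left_join = pd.merge(left, right, on="region", how="left")

# Right join: every key from the right table.
right_join = pd.merge(left, right, on="region", how="right")

# Outer join: the union of keys from both tables.
outer = pd.merge(left, right, on="region", how="outer")

# When the key column is named differently on each side,
# use left_on / right_on instead of on.
right_renamed = right.rename(columns={"region": "area"})
by_diff_names = pd.merge(left, right_renamed,
                         left_on="region", right_on="area", how="inner")

print(outer)
```

Note that with left_on/right_on both key columns (region and area here) are kept in the result, so one of them usually gets dropped afterwards.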

Official documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html
pd.merge(
    left,
    right,
    how="inner",
    on=None,
    left_on=None,
    right_on=None,
    left_index=False,
    right_index=False,
    sort=False,
    suffixes=("_x", "_y"),
    copy=True,
    indicator=False,
    validate=None,
)
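As a small sketch of two keyword arguments from the signature above, the snippet below merges on the index with left_index/right_index and renames overlapping columns with suffixes. The frames, the column names key and value, and the suffix strings are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical frames that share a key column and a non-key column "value".
left = pd.DataFrame({"key": ["a", "b", "c"], "value": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b", "d"], "value": [10, 20, 40]})

# Both sides have a "value" column, so pandas appends the suffixes
# to tell them apart (the default would be "_x" and "_y").
merged = pd.merge(left, right, on="key", how="inner",
                  suffixes=("_left", "_right"))

# The same data merged on the index instead of a column.
left_idx = left.set_index("key")
right_idx = right.set_index("key")
by_index = pd.merge(left_idx, right_idx,
                    left_index=True, right_index=True,
                    how="outer", suffixes=("_left", "_right"))

print(merged)
print(by_index)
```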
  • left: A DataFrame or named Series object.

  • right: Another DataFrame or named Series object.

  • on: Column or index level names to join on. Must be found in both the left and right DataFrame and/or Series objects. If not passed and left_index and right_index are False, the intersection of the columns in the DataFrames and/or Series will be inferred to be the join keys.

  • left_on: Columns or index levels from the left DataFrame or Series to use as keys. Can either be column names, index level names, or arrays with length equal to the length of the DataFrame or Series.

  • right_on: Columns or index levels from the right DataFrame or Series to use as keys. Can either be column names, index level names, or arrays with length equal to the length of the DataFrame or Series.

  • left_index: If True, use the index (row labels) from the left DataFrame or Series as its join key(s). In the case of a DataFrame or Series with a MultiIndex (hierarchical), the number of levels must match the number of join keys from the right DataFrame or Series.

  • right_index: Same usage as left_index for the right DataFrame or Series

  • how: One of 'left', 'right', 'outer', 'inner'. Defaults to inner. See below for more detailed description of each method.

  • sort: Sort the result DataFrame by the join keys in lexicographical order. The function's actual default is False; sorting adds cost, so leaving it off will improve performance substantially in many cases.

  • suffixes: A tuple of string suffixes to apply to overlapping columns. Defaults to ('_x', '_y').

  • copy: Always copy data (default True) from the passed DataFrame or named Series objects, even when reindexing is not necessary. Cannot be avoided in many cases but may improve performance / memory usage. The cases where copying can be avoided are somewhat pathological but this option is provided nonetheless.

  • indicator: Add a column to the output DataFrame called _merge with information on the source of each row. _merge is Categorical-type and takes on a value of left_only for observations whose merge key only appears in 'left' DataFrame or Series, right_only for observations whose merge key only appears in 'right' DataFrame or Series, and both if the observation’s merge key is found in both.

  • validate : string, default None. If specified, checks if merge is of specified type.

    • “one_to_one” or “1:1”: checks if merge keys are unique in both left and right datasets.

    • “one_to_many” or “1:m”: checks if merge keys are unique in left dataset.

    • “many_to_one” or “m:1”: checks if merge keys are unique in right dataset.

    • “many_to_many” or “m:m”: allowed, but does not result in checks.
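The indicator and validate options described above can be tried out with a sketch like the one below. The customers and orders tables and their columns are hypothetical; pd.errors.MergeError is the exception pandas raises when a validate check fails.

```python
import pandas as pd

# Hypothetical tables: customer_id is unique in customers but repeated in orders.
customers = pd.DataFrame({"customer_id": [1, 2, 3],
                          "name": ["Ann", "Bob", "Cy"]})
orders = pd.DataFrame({"customer_id": [1, 1, 4],
                       "amount": [50, 75, 20]})

# indicator=True adds a categorical _merge column with the values
# left_only, right_only, or both for each row.
tagged = pd.merge(customers, orders, on="customer_id",
                  how="outer", indicator=True)

# The keys are unique on the left side, so a one_to_many check passes.
checked = pd.merge(customers, orders, on="customer_id",
                   how="inner", validate="one_to_many")

# The keys repeat on the right side, so a one_to_one check raises MergeError.
try:
    pd.merge(customers, orders, on="customer_id", validate="one_to_one")
    validation_failed = False
except pd.errors.MergeError:
    validation_failed = True

print(tagged)
print("one_to_one check failed:", validation_failed)
```

Using validate this way is a cheap guard against accidental row duplication when a key that was assumed unique turns out not to be.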

Join