The join method in Spark SQL

楓尘林间

Published 2019-08-07 13:57:13

Category: Spark SQL

Article link: https://blog.csdn.net/bowenlaw/article/details/98743372

join(other, on=None, how=None)

Joins with another DataFrame, using the given join expression.

Parameters

        other – Right side of the join

        on – a string for the join column name, a list of column names, a join expression (Column), or a list of Columns. If on is a string or a list of strings indicating the name of the join column(s), the column(s) must exist on both sides, and this performs an equi-join.

        how – str, default inner. Must be one of: inner, cross, outer, full, full_outer, left, left_outer, right, right_outer, left_semi, and left_anti.
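
To make the `how` values above more concrete, here is a minimal plain-Python sketch of what each join type returns for an equi-join on a single column. It is not Spark code and not how Spark implements joins: `join_rows`, `people`, and `heights` are hypothetical names made up for illustration, rows are plain dicts, `cross` is left out, and key values are assumed to be non-null (in Spark a null key never matches).

```python
def join_rows(left, right, key, how="inner"):
    """Equi-join two lists of dicts on `key`, mimicking the meaning of `how`."""
    # Several `how` strings are synonyms for the same join type.
    aliases = {"outer": "full", "full_outer": "full",
               "left_outer": "left", "right_outer": "right"}
    how = aliases.get(how, how)
    if how not in {"inner", "full", "left", "right", "left_semi", "left_anti"}:
        raise ValueError(f"join type not covered by this sketch: {how}")

    left_keys = {row[key] for row in left}
    right_keys = {row[key] for row in right}

    # Semi and anti joins only filter the left side; no right columns are added.
    if how == "left_semi":
        return [dict(row) for row in left if row[key] in right_keys]
    if how == "left_anti":
        return [dict(row) for row in left if row[key] not in right_keys]

    left_cols = sorted({col for row in left for col in row})
    right_cols = sorted({col for row in right for col in row})

    # Every join type below starts from the matching pairs (the inner join).
    out = [{**l, **r} for l in left for r in right if l[key] == r[key]]
    if how in ("left", "full"):
        # Unmatched left rows are kept, with None in the right-side columns.
        out += [{**dict.fromkeys(right_cols), **l}
                for l in left if l[key] not in right_keys]
    if how in ("right", "full"):
        # Unmatched right rows are kept, with None in the left-side columns.
        out += [{**dict.fromkeys(left_cols), **r}
                for r in right if r[key] not in left_keys]
    return out


# Hypothetical sample rows, used only for this illustration.
people = [{"name": "Alice", "age": 2}, {"name": "Bob", "age": 5}]
heights = [{"name": "Tom", "height": 80}, {"name": "Bob", "height": 85}]

for how in ("inner", "left", "outer", "left_semi", "left_anti"):
    print(how, join_rows(people, heights, "name", how))
```

Only `Bob` appears on both sides, so `inner` and `left_semi` each return one row (the semi join without the `height` column), `left_anti` returns only `Alice`, and `outer` returns three rows with `None` filling the missing side.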

The following performs a full outer join between df and df2.

>>> df.join(df2, df.name == df2.name, 'outer').select(df.name, df2.height).collect()
[Row(name=None, height=80), Row(name='Bob', height=85), Row(nam