When two RDDs both hold key/value pairs (two-element tuples), they can be joined on their keys.
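from pyspark import SparkContext

sc = SparkContext.getOrCreate()  # reuses the existing context, e.g. inside the pyspark shell
x = sc.parallelize([(1001, "zhangsan"), (1002, "lisi"), (1003, "wangwu"), (1004, "zhangliu")])
y = sc.parallelize([(1001, "sales"), (1002, "tech")])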
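# ==========join==========
# Inner join: keeps only keys present in both RDDs (1001, 1002) as (key, (x_value, y_value))
joined = x.join(y)
# ==========leftOuterJoin==========
# Keeps every key of x; keys with no match in y (1003, 1004) get None as the right-hand value
leftOuterjoined = x.leftOuterJoin(y)
# ==========rightOuterJoin==========
# Keeps every key of y; every key of y also exists in x here, so the pairs match the inner join
rightOuterjoined = x.rightOuterJoin(y)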
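
To see what each of the three joins returns without starting Spark, the semantics can be imitated in plain Python. This is only a rough sketch: `group_by_key` and `local_join` are hypothetical helpers written for illustration, not part of the PySpark API, and the output is sorted by key for readability, whereas a real RDD makes no ordering guarantee when collected.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Collect the values of (key, value) pairs into lists, one list per key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def local_join(left, right, how="inner"):
    """Imitate join / leftOuterJoin / rightOuterJoin on plain lists of pairs.

    Hypothetical helper for illustration only; `how` is "inner", "left" or "right".
    """
    left_groups = group_by_key(left)
    right_groups = group_by_key(right)

    if how == "inner":
        keys = left_groups.keys() & right_groups.keys()  # keys found on both sides
    elif how == "left":
        keys = left_groups.keys()   # every key of the left side survives
    elif how == "right":
        keys = right_groups.keys()  # every key of the right side survives
    else:
        raise ValueError("how must be 'inner', 'left' or 'right'")

    result = []
    for key in sorted(keys):
        # A side with no value for this key contributes a single None.
        left_values = left_groups.get(key, [None])
        right_values = right_groups.get(key, [None])
        for left_value in left_values:
            for right_value in right_values:
                result.append((key, (left_value, right_value)))
    return result


x_pairs = [(1001, "zhangsan"), (1002, "lisi"), (1003, "wangwu"), (1004, "zhangliu")]
y_pairs = [(1001, "sales"), (1002, "tech")]

for how in ("inner", "left", "right"):
    print(how, local_join(x_pairs, y_pairs, how))
```

With this data the inner and right outer results contain the same two pairs, because every key in `y_pairs` also appears in `x_pairs`; the left outer result additionally keeps 1003 and 1004, each paired with `None`.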