Spark single-table join (self-join)
Problem: produce the list of grandchild-grandparent relationships
Data:
child parent
Tom Lucy
Tom Jack
Jone Lucy
Jone Jack
Lucy Mary
Lucy Ben
Jack Alice
Jack Jesse
Terry Alice
Terry Jesse
Philip Terry
Philip Alma
Mark Terry
Mark Alma
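Before turning to Spark, the join logic can be sketched on the sample data with plain Scala collections. This is only an illustration under stated assumptions, not the Spark job itself: the object name GrandparentSketch and the function grandRelations are hypothetical, and the data is hard-coded rather than read from a file. The idea is to join the table with itself, matching the parent column of one copy against the child column of the other.

```scala
object GrandparentSketch {
  // (child, parent) pairs copied from the sample data; the header row is omitted.
  val childParent: Seq[(String, String)] = Seq(
    "Tom" -> "Lucy", "Tom" -> "Jack",
    "Jone" -> "Lucy", "Jone" -> "Jack",
    "Lucy" -> "Mary", "Lucy" -> "Ben",
    "Jack" -> "Alice", "Jack" -> "Jesse",
    "Terry" -> "Alice", "Terry" -> "Jesse",
    "Philip" -> "Terry", "Philip" -> "Alma",
    "Mark" -> "Terry", "Mark" -> "Alma"
  )

  // Self-join: pair every (child, parent) row with each row whose child
  // is that parent, and keep (grandchild, grandparent).
  def grandRelations(pairs: Seq[(String, String)]): Seq[(String, String)] =
    for {
      (child, parent) <- pairs
      (p, grandparent) <- pairs
      if p == parent
    } yield (child, grandparent)

  def main(args: Array[String]): Unit =
    grandRelations(childParent).foreach { case (grandchild, grandparent) =>
      println(s"$grandchild $grandparent")
    }
}
```

On the sample data this produces 12 pairs, for example Tom Mary and Philip Alice. Alma has no parent rows in the data, so she contributes no grandparents. In Spark the same matching is usually expressed as a key-based join on pair RDDs rather than a nested loop.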
Spark code:
import org.apache.spark.{SparkConf, SparkContext}

object danbiaoRelation {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("paixu").setMaster("local")
    val sc = new SparkContext(conf)
    // Read the input files and drop the "child parent" header line
    val child_parent_rdd = sc.textFile("D:\\wc\\danbiaoInput\\*.txt")
      .filter(x => !x.contains("child"))
      .map(x=>{val str=x.replaceA