In Spark, data flows can be merged with union.
First, let's test union on batch data:
scala> Seq("1","2","3","4").toDS.union(Seq("a","b","c","d").toDS).show
+-----+
|value|
+-----+
|    1|
|    2|
|    3|
|    4|
|    a|
|    b|
|    c|
|    d|
+-----+
Now let's test union on streaming data:
val lines1 = spark.readStream.format("socket").option("host", "localhost")
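The rest of the streaming setup is not shown above, so here is a minimal end-to-end sketch. It assumes two local socket sources on ports 9999 and 9998 (hypothetical ports, e.g. fed with "nc -lk 9999" and "nc -lk 9998") and a console sink; adjust these to your own environment.

// A minimal sketch, assuming two local socket sources (ports are hypothetical).
val lines1 = spark.readStream
  .format("socket")
  .option("host", "localhost")
  .option("port", 9999)   // assumed port
  .load()

val lines2 = spark.readStream
  .format("socket")
  .option("host", "localhost")
  .option("port", 9998)   // assumed port
  .load()

// union works on streaming Datasets/DataFrames just as on batch ones,
// provided both sides have the same schema (here a single "value" column).
val merged = lines1.union(lines2)

// Write the merged stream to the console to observe the result.
val query = merged.writeStream
  .format("console")
  .outputMode("append")
  .start()

query.awaitTermination()

Typing lines on either socket should then show up in the same console output, confirming that the two streams have been merged.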