Structured Streaming integration for Kafka 0.10 lets you read data from Kafka and write data to Kafka.
-
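The Kafka source and sink are not bundled with Spark's core SQL module, so a project normally has to declare the Kafka 0.10 connector as a dependency. Below is a minimal sbt sketch; the version number is an assumption chosen for illustration and should match the Spark version you actually run.

```scala
// build.sbt (sketch). The version below is an assumed placeholder;
// replace it with the Spark version of your cluster.
val sparkVersion = "3.5.0"

libraryDependencies ++= Seq(
  // Spark SQL is usually supplied by the cluster, hence "provided".
  "org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
  // Structured Streaming connector for Kafka 0.10 and later brokers.
  "org.apache.spark" %% "spark-sql-kafka-0-10" % sparkVersion
)
```

When submitting with spark-submit instead of building a fat JAR, the same artifact can be supplied through the --packages option.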
Reading data from Kafka
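def run(): Unit = {
  // Needed for .as[(String, String)]; assumes `spark` is a SparkSession val in scope.
  import spark.implicits._

  // Subscribe to a single topic.
  val df = spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "host1:port1,host2:port2")
    .option("subscribe", "topic1")
    .load()
  // Kafka keys and values arrive as binary, so cast them to strings.
  df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)").as[(String, String)]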
  // Subscribe to multiple topics. Named df2 because df is already defined in this scope.
  val df2 = spark
    .readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "host1:port1,host2:port2")
    .option("subscribe", "topic1,topic2")
    .load()
  df2.selectExpr(