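Real-time computation on socket data:
Preparation: start a netcat listener on port 9999, then type a sample line into it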
nc -lk 9999
hadoop spark sqoop hadoop spark hive hadoop
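Here `nc -lk 9999` opens a listening socket on port 9999 (`-k` keeps it listening after a client disconnects), and every line typed into it is sent to whatever reader connects. To make the sample line concrete, the following is a minimal sketch in plain Scala collections, with no Spark involved, of what a word count over that line would produce. It assumes the streaming job is meant to count words; the object name `SampleLineWordCount` and the helper `countWords` are hypothetical and do not come from the original code.

```scala
object SampleLineWordCount {
  // Hypothetical helper: split a line on single spaces (the same delimiter
  // the streaming code uses) and count how often each word occurs.
  def countWords(line: String): Map[String, Int] =
    line.split(" ").groupBy(identity).map { case (word, hits) => (word, hits.length) }

  def main(args: Array[String]): Unit = {
    // The sample line typed into the netcat session.
    val sample = "hadoop spark sqoop hadoop spark hive hadoop"
    // Print the counts sorted by word so the output order is stable.
    countWords(sample).toSeq.sortBy(_._1).foreach { case (word, n) => println(s"$word $n") }
  }
}
```

For the sample line this gives hadoop 3, hive 1, spark 2, sqoop 1. A streaming job would keep updating such counts as more lines arrive on the socket.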
Code demonstration:
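import org.apache.spark.SparkContext
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

def main(args: Array[String]): Unit = {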
// 1. Create the SparkSession
val spark: SparkSession = SparkSession.builder()
.master("local[*]")
.appName("StructStreaming_socket")
.getOrCreate()
val sc: SparkContext = spark.sparkContext
sc.setLogLevel("WARN")
// 2. Read the real-time data; each record arrives as a Row
val socketDatasRow: DataFrame = spark.readStream
.format("socket")
.option("host", "hadoop01")
.option("port", "9999")
.load()
// 3. Process and compute on the data
import spark.implicits._
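// The socket source exposes a single string column named "value", so each Row converts to a String
val socketDatasString: Dataset[String] = socketDatasRow.as[String]
// Split each line into words on single spaces
val Word: Dataset[String] = socketDatasString.flatMap(_.split(" "))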