Structured Streaming outputMode: the difference between complete and append

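1. complete requires an aggregation: on every trigger it aggregates the data from earlier batches together with the current batch and writes the full result. append writes only the new rows and, in the Spark version used here, cannot be combined with a streaming aggregation.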
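
To make the difference concrete, here is a minimal plain-Scala sketch that models what each mode would emit over a sequence of micro-batches. It does not use Spark at all; `OutputModeModel`, `completeOutputs` and `appendOutputs` are hypothetical names introduced only for this illustration, and the model ignores everything except which rows reach the sink.

```scala
// Plain-Scala model of the two output modes (no Spark required).
// All names here are hypothetical and exist only for illustration.
object OutputModeModel {
  // complete: the running word count over ALL batches seen so far
  // is emitted in full after every batch.
  def completeOutputs(batches: Seq[Seq[String]]): Seq[Map[String, Long]] =
    batches
      .scanLeft(Map.empty[String, Long]) { (state, batch) =>
        batch.foldLeft(state) { (acc, word) =>
          acc.updated(word, acc.getOrElse(word, 0L) + 1L)
        }
      }
      .tail // drop the empty initial state

  // append: only the rows that arrived in the current batch are emitted;
  // rows from earlier batches are never revisited.
  def appendOutputs(batches: Seq[Seq[String]]): Seq[Seq[(String, Int)]] =
    batches.map(_.map(word => (word, 1)))
}
```

With the batches `Seq("a", "b")` and `Seq("a")`, the complete model emits the whole count table twice (the second time with the count for `a` updated), while the append model emits only each batch's own rows.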

2. Replacing complete with append, demonstrated in code:

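import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.ProcessingTime

def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").getOrCreate()
    import spark.implicits._

    // Word count over a directory of text files: groupBy + count is a streaming aggregation.
    val wordCounts = spark.readStream.text("D:\\tmp\\streaming\\struct")
        .as[String].flatMap(_.split(" "))
        .groupBy("value").count()

    // TestForeachWriter is the author's own ForeachWriter implementation (not shown here).
    val query = wordCounts.writeStream
        .foreach(new TestForeachWriter())
        .outputMode("complete") // "complete" works here; "append" fails
        .trigger(ProcessingTime("10 seconds"))
        .start()
    query.awaitTermination()
}
// Runs normally. Changing complete to append produces the following error: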
org.apache.spark.sql.AnalysisException: Append output mode not supported when there are streaming aggregations on streaming DataFrames/DataSets;;

3. Replacing append with complete, demonstrated in code:

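import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.ProcessingTime

def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").getOrCreate()
    import spark.implicits._

    // Only flatMap and map are applied: this query contains no streaming aggregation.
    val wordCounts = spark.readStream.text("D:\\tmp\\streaming\\struct")
        .as[String].flatMap(_.split(" ")).map(T1(_, 1)).toDF()

    // TestForeachWriter is the author's own ForeachWriter implementation (not shown here).
    val query = wordCounts.writeStream
        .foreach(new TestForeachWriter())
        .outputMode("append") // "append" works here; "complete" fails
        .trigger(ProcessingTime("10 seconds"))
        .start()
    query.awaitTermination()
}
case class T1(value: String, num: Int)
// Changing append to complete produces the following error: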
org.apache.spark.sql.AnalysisException: Complete output mode not supported when there are no streaming aggregations on streaming DataFrames/Datasets;;
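
The two errors above amount to one rule that can be sketched in plain Scala. This is only a model of the behaviour observed in sections 2 and 3, not Spark's actual checking code; `OutputModeRule` and `check` are hypothetical names, and the messages are shortened paraphrases of the exceptions shown above.

```scala
// Plain-Scala model of the rule observed above (not Spark's own checker).
// `OutputModeRule` and `check` are hypothetical names.
object OutputModeRule {
  // Right(()) when the combination is accepted, Left(reason) otherwise.
  def check(mode: String, hasStreamingAggregation: Boolean): Either[String, Unit] =
    (mode.toLowerCase, hasStreamingAggregation) match {
      case ("append", true) =>
        Left("Append output mode not supported when there are streaming aggregations")
      case ("complete", false) =>
        Left("Complete output mode not supported when there are no streaming aggregations")
      case ("append", false) | ("complete", true) =>
        Right(())
      case (other, _) =>
        Left(s"Unknown output mode $other")
    }
}
```

For example, `OutputModeRule.check("complete", true)` models the working query in section 2, and `OutputModeRule.check("complete", false)` models the failing variant of section 3.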

4. Source code:

/**
   * Specifies how data of a streaming DataFrame/Dataset is written to a streaming sink.
   *   - `append`:   only the new rows in the streaming DataFrame/Dataset will be written to
   *                 the sink
   *   - `complete`: all the rows in the streaming DataFrame/Dataset will be written to the sink
   *                 every time these is some updates
   *
   * @since 2.0.0
   */
  def outputMode(outputMode: String): DataStreamWriter[T] = {
    this.outputMode = outputMode.toLowerCase match {
      case "append" =>
        OutputMode.Append
      case "complete" =>
        OutputMode.Complete
      case _ =>
        throw new IllegalArgumentException(s"Unknown output mode $outputMode. " +
          "Accepted output modes are 'append' and 'complete'")
    }
    this
  }
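
Two consequences of the matching logic above are easy to miss: the name is lower-cased before matching, so "Complete" or "APPEND" are accepted, and any other string fails immediately with an IllegalArgumentException. The stand-alone sketch below reproduces just that matching so it can be tried without Spark; `Mode`, `Append`, `Complete` and `parseOutputMode` are hypothetical stand-ins, not the Spark classes.

```scala
// Stand-alone model of the string-to-mode matching quoted above.
// `Mode`, `Append`, `Complete` and `parseOutputMode` are hypothetical
// stand-ins defined here, not Spark's OutputMode.
object OutputModeParsing {
  sealed trait Mode
  case object Append extends Mode
  case object Complete extends Mode

  def parseOutputMode(name: String): Mode = name.toLowerCase match {
    case "append"   => Append
    case "complete" => Complete
    case _ =>
      throw new IllegalArgumentException(
        s"Unknown output mode $name. Accepted output modes are 'append' and 'complete'")
  }
}
```

Under this model, a misspelling such as "complet" is rejected at the point where the mode is set, before any validation of the query itself takes place.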

 
