The Way of Spark Mastery (Advanced): Spark from Beginner to Expert, Section 14: Spark Streaming Caching and the Checkpoint Mechanism

Main contents
http://blog.csdn.net/lovehuangjiaju/article/details/50102831

This section is based on the official documentation: http://spark.apache.org/docs/latest/streaming-programming-guide.html

  1. Spark Streaming caching
  2. Checkpoint
  3. Example

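1. Spark Streaming caching

From the earlier lessons we know that a DStream is made up of a series of RDDs. Like an ordinary RDD, a DStream can persist its streaming data in memory, using the same persist method; once it is called, every RDD of the DStream is persisted. This is especially useful for DStreams that are computed more than once or whose data is used repeatedly. Window-based operations such as reduceByWindow and reduceByKeyAndWindow persist their data by default. The source of the reduceByKeyAndWindow method is as follows: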

<code class="hljs scala">def reduceByKeyAndWindow(
      reduceFunc: (V, V) => V,
      invReduceFunc: (V, V) => V,
      windowDuration: Duration,
      slideDuration: Duration,
      partitioner: Partitioner,
      filterFunc: ((K, V)) => Boolean
    ): DStream[(K, V)] = ssc.withScope {

    val cleanedReduceFunc = ssc.sc.clean(reduceFunc)
    val cleanedInvReduceFunc = ssc.sc.clean(invReduceFunc)
    val cleanedFilterFunc = if (filterFunc != null) Some(ssc.sc.clean(filterFunc)) else None
    new ReducedWindowedDStream[K, V](
      self, cleanedReduceFunc, cleanedInvReduceFunc, cleanedFilterFunc,
      windowDuration, slideDuration, partitioner
    )
  }</code>

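As the method shows, it ultimately returns a ReducedWindowedDStream object. Jumping to the source of that class, we can see that its primary constructor contains the following two statements: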

<code class="hljs scala">private[streaming]
class ReducedWindowedDStream[K: ClassTag, V: ClassTag](
    parent: DStream[(K, V)],
    reduceFunc: (V, V) => V,
    invReduceFunc: (V, V) => V,
    filterFunc: Option[((K, V)) => Boolean],
    _windowDuration: Duration,
    _slideDuration: Duration,
    partitioner: Partitioner
  ) extends DStream[(K, V)](parent.ssc) {
  // other non-essential code omitted

  // cached in memory by default
  // Persist RDDs to memory by default as these RDDs are going to be reused.
  super.persist(StorageLevel.MEMORY_ONLY_SER)
  reducedStream.persist(StorageLevel.MEMORY_ONLY_SER)
}</code>

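The code above shows that for DStreams produced by window operations, developers do not need to call persist manually: Spark caches the data in memory automatically. As with ordinary RDDs, persist on a DStream accepts the standard Spark storage levels, for example MEMORY_ONLY, MEMORY_ONLY_SER, MEMORY_AND_DISK, MEMORY_AND_DISK_SER and DISK_ONLY.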
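
To see why a windowed stream reuses earlier RDDs (and therefore benefits from being persisted), here is a minimal plain-Scala sketch of the incremental idea behind the reduceFunc / invReduceFunc pair. It is not Spark's implementation: `IncrementalWindow`, `slidingSums` and the sample batches are hypothetical names and data chosen for illustration, with each number standing in for one batch. Each new window value is derived from the previous one by adding the batch that enters the window and removing the batch that leaves it.

```scala
object IncrementalWindow {
  // Each element of `batches` stands in for the reduced value of one batch.
  // The window length is measured in batches and the window slides by one batch.
  def slidingSums(
      batches: Vector[Int],
      window: Int,
      reduceFunc: (Int, Int) => Int,
      invReduceFunc: (Int, Int) => Int): Vector[Int] = {
    require(window > 0 && batches.length >= window, "need at least one full window")
    // First window: reduce it from scratch.
    val first = batches.take(window).reduce(reduceFunc)
    // Later windows: previous result, plus the entering batch, minus the leaving batch.
    (window until batches.length).scanLeft(first) { (prev, i) =>
      invReduceFunc(reduceFunc(prev, batches(i)), batches(i - window))
    }.toVector
  }

  def main(args: Array[String]): Unit = {
    val batches = Vector(3, 1, 4, 1, 5, 9)
    val incremental = slidingSums(batches, 3, _ + _, _ - _)
    // Recomputing every window from scratch must give the same answer.
    val naive = batches.sliding(3).map(_.sum).toVector
    println(incremental) // prints Vector(8, 6, 10, 15)
    assert(incremental == naive)
  }
}
```

Because every step needs the previous window's result, that result has to stay available between batches, which is consistent with the persist calls shown in the constructor.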

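2. The Checkpoint mechanism

From our earlier look at Spark Streaming we know that a streaming application keeps running until it is stopped manually, and in practice it usually runs 24/7. Streaming must therefore be resilient to failures that are unrelated to the application logic, such as system errors or JVM crashes. Spark Streaming's checkpoint mechanism is designed for exactly this: it saves enough information to a fault-tolerant storage system such as HDFS so that the application can recover quickly after a failure. Two kinds of data can be checkpointed: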

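(1) Metadata checkpointing
Saves the information that defines the streaming computation to fault-tolerant storage such as HDFS. Metadata checkpointing is used to recover when the node running the driver of the streaming application fails. The metadata includes:
Configuration - the configuration used to create the streaming application
DStream operations - the DStream operations defined in the streaming application
Incomplete batches - batches whose jobs are queued but have not yet completed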

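(2) Data checkpointing
Saves the generated RDDs to reliable external storage. This is necessary for stateful transformations that combine data across multiple batches: the RDDs they generate depend on the RDDs of previous batches, so the dependency chain keeps growing over time. Checkpointing cuts the chain by periodically saving intermediate RDDs to reliable storage, so that recovery after a failure can start directly from the checkpoint.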

In short, metadata checkpointing is mainly for recovering from driver failures, while data checkpointing is needed when stateful transformations are used.

Checkpointing is enabled with the following call:

<code class="hljs scala">// checkpointDirectory is the directory where checkpoint files are saved
streamingContext.checkpoint(checkpointDirectory)</code>
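
As a hedged sketch of how this call fits into a stateful job, the snippet below enables checkpointing and then uses updateStateByKey, a stateful transformation that requires a checkpoint directory to be set. The object name `StatefulWordCount`, the host and port, the local path `/tmp/spark-checkpoint` and the 10-second checkpoint interval are assumptions for illustration; on a cluster the directory would normally live on a fault-tolerant file system such as HDFS.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StatefulWordCount {
  // Merge the counts seen in the current batch into the running total for a word.
  def updateCount(newValues: Seq[Int], running: Option[Int]): Option[Int] =
    Some(newValues.sum + running.getOrElse(0))

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StatefulWordCount").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))
    // Hypothetical local directory; use a fault-tolerant store such as HDFS in production.
    ssc.checkpoint("/tmp/spark-checkpoint")

    val pairs = ssc.socketTextStream("localhost", 9999)
      .flatMap(_.split(" "))
      .map(word => (word, 1))
    // Stateful transformation: each batch's RDD depends on the previous state RDD.
    val totals = pairs.updateStateByKey[Int](updateCount _)
    // Optional: how often the RDDs of this DStream are checkpointed (assumed value).
    totals.checkpoint(Seconds(10))
    totals.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The interval passed to `totals.checkpoint` is a trade-off: checkpointing too often adds write overhead to every batch, while checkpointing too rarely lets the lineage grow. The official guide suggests trying an interval of roughly 5 to 10 sliding intervals of the DStream.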

3. Example

Program source: https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/RecoverableNetworkWordCount.scala
with minor modifications.

<code class="hljs scala">import java.io.File
import java.nio.charset.Charset

import com.google.common.io.Files

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Time, Seconds, StreamingContext}

/**
 * Counts words in text encoded with UTF8 received from the network every second.
 *
 * Usage: RecoverableNetworkWordCount <hostname> <port> <checkpoint-directory> <output-file>
 *   <hostname> and <port> describe the TCP server that Spark Streaming would connect to receive
 *   data. <checkpoint-directory> directory to HDFS-compatible file system which checkpoint data
 *   <output-file> file to which the word counts will be appended
 *
 * <checkpoint-directory> and <output-file> must be absolute paths
 *
 * To run this on your local machine, you need to first run a Netcat server
 *
 *      `$ nc -lk 9999`
 *
 * and run the example as
 *
 *      `$ ./bin/run-example org.apache.spark.examples.streaming.RecoverableNetworkWordCount \
 *              localhost 9999 ~/checkpoint/ ~/out`
 *
 * If the directory ~/checkpoint/ does not exist (e.g. running for the first time), it will create
 * a new StreamingContext (will print "Creating new context" to the console). Otherwise, if
 * checkpoint data exists in ~/checkpoint/, then it will create StreamingContext from
 * the checkpoint data.
 *
 * Refer to the online documentation for more details.
 */
object RecoverableNetworkWordCount {

  def createContext(ip: String, port: Int, outputPath: String, checkpointDirectory: String)
    : StreamingContext = {

    // Printed only when a new context is created (e.g. on the first run). When the
    // application is restarted after a failure, the context is restored from the
    // checkpoint and this statement is not executed.
    println("Creating new context")
    val outputFile = new File(outputPath)
    if (outputFile.exists()) outputFile.delete()
    val sparkConf = new SparkConf().setAppName("RecoverableNetworkWordCount").setMaster("local[4]")
    // Create the context with a 1 second batch size
    val ssc = new StreamingContext(sparkConf, Seconds(1))
    ssc.checkpoint(checkpointDirectory)

    // Use a socket as the data source
    val lines = ssc.socketTextStream(ip, port)
    val words = lines.flatMap(_.split(" "))
    val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
    wordCounts.foreachRDD((rdd: RDD[(String, Int)], time: Time) => {
      val counts = "Counts at time " + time + " " + rdd.collect().mkString("[", ", ", "]")
      println(counts)
      println("Appending to " + outputFile.getAbsolutePath)
      Files.append(counts + "\n", outputFile, Charset.defaultCharset())
    })
    ssc
  }

  // Extractor that converts a String to an Int
  private object IntParam {
    def unapply(str: String): Option[Int] = {
      try {
        Some(str.toInt)
      } catch {
        case e: NumberFormatException => None
      }
    }
  }

  def main(args: Array[String]): Unit = {
    if (args.length != 4) {
      System.err.println("Your arguments were " + args.mkString("[", ", ", "]"))
      System.err.println(
        """
          |Usage: RecoverableNetworkWordCount <hostname> <port> <checkpoint-directory>
          |     <output-file>. <hostname> and <port> describe the TCP server that Spark
          |     Streaming would connect to receive data. <checkpoint-directory> directory to
          |     HDFS-compatible file system which checkpoint data <output-file> file to which the
          |     word counts will be appended
          |
          |In local mode, <master> should be 'local[n]' with n > 1
          |Both <checkpoint-directory> and <output-file> must be absolute paths
        """.stripMargin
      )
      System.exit(1)
    }
    val Array(ip, IntParam(port), checkpointDirectory, outputPath) = args
    // getOrCreate either rebuilds the StreamingContext from the checkpoint data
    // or, if none exists, creates a new StreamingContext
    val ssc = StreamingContext.getOrCreate(checkpointDirectory,
      () => {
        createContext(ip, port, outputPath, checkpointDirectory)
      })
    ssc.start()
    ssc.awaitTermination()
  }
}</code>
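
The argument parsing in main relies on Scala extractor patterns: in a pattern such as `Array(ip, IntParam(port), ...)`, `IntParam.unapply` is called on the second array element and the match succeeds only if it returns `Some`. The standalone sketch below shows the same idea in isolation; `ExtractorDemo`, the `parse` helper and the sample arguments are hypothetical and not part of the Spark example.

```scala
import scala.util.Try

object ExtractorDemo {
  // Same idea as the IntParam object in the example: String => Option[Int].
  object IntParam {
    def unapply(str: String): Option[Int] = Try(str.toInt).toOption
  }

  // Returns Some((host, port)) only when there are exactly two arguments
  // and the second one parses as an Int; otherwise None.
  def parse(args: Array[String]): Option[(String, Int)] = args match {
    case Array(host, IntParam(port)) => Some((host, port))
    case _                           => None
  }

  def main(args: Array[String]): Unit = {
    println(parse(Array("localhost", "9999")))       // prints Some((localhost,9999))
    println(parse(Array("localhost", "not-a-port"))) // prints None
  }
}
```

Note the difference from the example program: there the pattern appears in a `val` definition, so a port that is not a number makes the match fail with a scala.MatchError instead of falling through to a default case.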

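The input arguments are configured as follows:
[Figure: input argument configuration]

The running state is shown below:
[Figure: running state]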

On the first run:

<code class="hljs">// A new StreamingContext is created
Creating new context
15/11/30 07:20:32 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
15/11/30 07:20:33 WARN SizeEstimator: Failed to check whether UseCompressedOops is set; assuming yes
Counts at time 1448896840000 ms []
Appending to /root/out2
15/11/30 07:20:47 WARN BlockManager: Block input-0-1448896847000 replicated to only 0 peer(s) instead of 1 peers
Counts at time 1448896850000 ms [(Spark,1), (Context,1)]</code>

Stop the program manually, then run it again.

<code class="hljs vhdl has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">// On restart, the metadata is read from the checkpoint directory and the StreamingContext is recovered from it
Counts at <span class="hljs-typename" style="color: rgb(102, 0, 102); box-sizing: border-box;">time</span> <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1448897070000</span> ms []
Appending <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">to</span> /root/out2
Counts at <span class="hljs-typename" style="color: rgb(102, 0, 102); box-sizing: border-box;">time</span> <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1448897080000</span> ms []
Appending <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">to</span> /root/out2
Counts at <span class="hljs-typename" style="color: rgb(102, 0, 102); box-sizing: border-box;">time</span> <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1448897090000</span> ms []
Appending <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">to</span> /root/out2
<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">15</span>/<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">11</span>/<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">30</span> <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">07</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">24</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">58</span> WARN BlockManager: <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">Block</span> input-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1448897098600</span> replicated <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">to</span> only <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span> peer(s) instead <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">of</span> <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span> peers
[Stage <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">8</span>:>                                                          (<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span> + <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>) / <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">4</span>]Counts at <span class="hljs-typename" style="color: rgb(102, 0, 102); box-sizing: border-box;">time</span> <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1448897100000</span> ms [(Spark,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>), (<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">Context</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>)]
Appending <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">to</span> /root/out2</code>