Spark代码2之Transformation:union,distinct,join

更多代码请见:https://github.com/xubo245/SparkLearning


Spark代码2之Transformation:union,distinct,join

代码:

package LocalSpark

/**
  * Created by xubo on 2016/3/3.
  */
import org.apache.spark._
import org.apache.spark.network.netty.SparkTransportConf

object Transformation1 {
  def main(args:Array[String]): Unit ={
    val conf =new SparkConf().setAppName("Transformation1").setMaster("local")
    val spark=new SparkContext(conf)
    //union
    var a1=spark.parallelize(List(('a',1),('b',1)))
    var a2=spark.parallelize(List(('c',1),('d',1)))
     val result= a1.union(a2)
    println(result.count())
    println(result.collect().length)
    for(i<-0 until result.collect().length) (result.collect())(i)
    for((i,j)<- result.collect()) println(i+":"+j)

    //distinct
    var a3=spark.parallelize(List(('a',1),('b',1),('a',1)))
    var a4=spark.parallelize(List(('c',1),('d',1),('b',1),('b',2),('b',3),('a',1),('a',2)))
    val r2=a3.distinct()
    for((i,j)<-a3) println("a3:"+i+":"+j)
    for((i,j)<-r2) println("r2:"+i+":"+j)

    //
    var j1=a3.join(a4)
    for((i,(j,k))<-j1) println("j1:"+i+":"+j+":"+k)
  }
}

运行结果:

D:\1win7\java\jdk\bin\java -Didea.launcher.port=7533 "-Didea.launcher.bin.path=D:\1win7\idea\IntelliJ IDEA Community Edition 15.0.4\bin" -Dfile.encoding=UTF-8 -classpath "D:\1win7\java\jdk\jre\lib\charsets.jar;D:\1win7\java\jdk\jre\lib\deploy.jar;D:\1win7\java\jdk\jre\lib\ext\access-bridge-64.jar;D:\1win7\java\jdk\jre\lib\ext\dnsns.jar;D:\1win7\java\jdk\jre\lib\ext\jaccess.jar;D:\1win7\java\jdk\jre\lib\ext\localedata.jar;D:\1win7\java\jdk\jre\lib\ext\sunec.jar;D:\1win7\java\jdk\jre\lib\ext\sunjce_provider.jar;D:\1win7\java\jdk\jre\lib\ext\sunmscapi.jar;D:\1win7\java\jdk\jre\lib\ext\zipfs.jar;D:\1win7\java\jdk\jre\lib\javaws.jar;D:\1win7\java\jdk\jre\lib\jce.jar;D:\1win7\java\jdk\jre\lib\jfr.jar;D:\1win7\java\jdk\jre\lib\jfxrt.jar;D:\1win7\java\jdk\jre\lib\jsse.jar;D:\1win7\java\jdk\jre\lib\management-agent.jar;D:\1win7\java\jdk\jre\lib\plugin.jar;D:\1win7\java\jdk\jre\lib\resources.jar;D:\1win7\java\jdk\jre\lib\rt.jar;D:\1win7\scala;D:\1win7\scala\lib;D:\all\idea\scala2\out\production\scala2;G:\149\spark-assembly-1.5.2-hadoop2.6.0.jar;D:\1win7\scala\lib\scala-actors-migration.jar;D:\1win7\scala\lib\scala-actors.jar;D:\1win7\scala\lib\scala-library.jar;D:\1win7\scala\lib\scala-reflect.jar;D:\1win7\scala\lib\scala-swing.jar;D:\1win7\idea\IntelliJ IDEA Community Edition 15.0.4\lib\idea_rt.jar" com.intellij.rt.execution.application.AppMain LocalSpark.Transformation1
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/03/03 22:19:55 INFO SparkContext: Running Spark version 1.5.2
16/03/03 22:19:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/03 22:19:56 INFO SecurityManager: Changing view acls to: xubo
16/03/03 22:19:56 INFO SecurityManager: Changing modify acls to: xubo
16/03/03 22:19:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(xubo); users with modify permissions: Set(xubo)
16/03/03 22:19:57 INFO Slf4jLogger: Slf4jLogger started
16/03/03 22:19:57 INFO Remoting: Starting remoting
16/03/03 22:19:57 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@202.38.84.241:50601]
16/03/03 22:19:57 INFO Utils: Successfully started service 'sparkDriver' on port 50601.
16/03/03 22:19:57 INFO SparkEnv: Registering MapOutputTracker
16/03/03 22:19:57 INFO SparkEnv: Registering BlockManagerMaster
16/03/03 22:19:57 INFO DiskBlockManager: Created local directory at C:\Users\xubo\AppData\Local\Temp\blockmgr-d0b75ef9-8481-448f-8f00-149fde368635
16/03/03 22:19:57 INFO MemoryStore: MemoryStore started with capacity 730.6 MB
16/03/03 22:19:57 INFO HttpFileServer: HTTP File server directory is C:\Users\xubo\AppData\Local\Temp\spark-c2b62943-1802-4092-a678-704b95d428ca\httpd-38a16729-4063-4ff9-a3e9-336e27bd386f
16/03/03 22:19:57 INFO HttpServer: Starting HTTP Server
16/03/03 22:19:57 INFO Utils: Successfully started service 'HTTP file server' on port 50602.
16/03/03 22:19:57 INFO SparkEnv: Registering OutputCommitCoordinator
16/03/03 22:19:57 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/03/03 22:19:57 INFO SparkUI: Started SparkUI at http://202.38.84.241:4040
16/03/03 22:19:58 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
16/03/03 22:19:58 INFO Executor: Starting executor ID driver on host localhost
16/03/03 22:19:58 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 50609.
16/03/03 22:19:58 INFO NettyBlockTransferService: Server created on 50609
16/03/03 22:19:58 INFO BlockManagerMaster: Trying to register BlockManager
16/03/03 22:19:58 INFO BlockManagerMasterEndpoint: Registering block manager localhost:50609 with 730.6 MB RAM, BlockManagerId(driver, localhost, 50609)
16/03/03 22:19:58 INFO BlockManagerMaster: Registered BlockManager
16/03/03 22:19:58 INFO SparkContext: Starting job: count at Transformation1.scala:17
16/03/03 22:19:58 INFO DAGScheduler: Got job 0 (count at Transformation1.scala:17) with 2 output partitions
16/03/03 22:19:58 INFO DAGScheduler: Final stage: ResultStage 0(count at Transformation1.scala:17)
16/03/03 22:19:58 INFO DAGScheduler: Parents of final stage: List()
16/03/03 22:19:58 INFO DAGScheduler: Missing parents: List()
16/03/03 22:19:58 INFO DAGScheduler: Submitting ResultStage 0 (UnionRDD[2] at union at Transformation1.scala:16), which has no missing parents
16/03/03 22:19:59 INFO MemoryStore: ensureFreeSpace(1784) called with curMem=0, maxMem=766075207
16/03/03 22:19:59 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1784.0 B, free 730.6 MB)
16/03/03 22:19:59 INFO MemoryStore: ensureFreeSpace(1202) called with curMem=1784, maxMem=766075207
16/03/03 22:19:59 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1202.0 B, free 730.6 MB)
16/03/03 22:19:59 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:50609 (size: 1202.0 B, free: 730.6 MB)
16/03/03 22:19:59 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:861
16/03/03 22:19:59 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (UnionRDD[2] at union at Transformation1.scala:16)
16/03/03 22:19:59 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
16/03/03 22:19:59 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:19:59 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
16/03/03 22:19:59 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 953 bytes result sent to driver
16/03/03 22:19:59 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:19:59 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
16/03/03 22:19:59 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 953 bytes result sent to driver
16/03/03 22:19:59 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 89 ms on localhost (1/2)
16/03/03 22:19:59 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 17 ms on localhost (2/2)
16/03/03 22:19:59 INFO DAGScheduler: ResultStage 0 (count at Transformation1.scala:17) finished in 0.110 s
16/03/03 22:19:59 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
16/03/03 22:19:59 INFO DAGScheduler: Job 0 finished: count at Transformation1.scala:17, took 0.619731 s
4
16/03/03 22:19:59 INFO SparkContext: Starting job: collect at Transformation1.scala:18
16/03/03 22:19:59 INFO DAGScheduler: Got job 1 (collect at Transformation1.scala:18) with 2 output partitions
16/03/03 22:19:59 INFO DAGScheduler: Final stage: ResultStage 1(collect at Transformation1.scala:18)
16/03/03 22:19:59 INFO DAGScheduler: Parents of final stage: List()
16/03/03 22:19:59 INFO DAGScheduler: Missing parents: List()
16/03/03 22:19:59 INFO DAGScheduler: Submitting ResultStage 1 (UnionRDD[2] at union at Transformation1.scala:16), which has no missing parents
16/03/03 22:19:59 INFO MemoryStore: ensureFreeSpace(1928) called with curMem=2986, maxMem=766075207
16/03/03 22:19:59 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 1928.0 B, free 730.6 MB)
16/03/03 22:19:59 INFO MemoryStore: ensureFreeSpace(1246) called with curMem=4914, maxMem=766075207
16/03/03 22:19:59 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1246.0 B, free 730.6 MB)
16/03/03 22:19:59 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:50609 (size: 1246.0 B, free: 730.6 MB)
16/03/03 22:19:59 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:861
16/03/03 22:19:59 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 1 (UnionRDD[2] at union at Transformation1.scala:16)
16/03/03 22:19:59 INFO TaskSchedulerImpl: Adding task set 1.0 with 2 tasks
16/03/03 22:19:59 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:19:59 INFO Executor: Running task 0.0 in stage 1.0 (TID 2)
16/03/03 22:19:59 INFO Executor: Finished task 0.0 in stage 1.0 (TID 2). 1057 bytes result sent to driver
16/03/03 22:19:59 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:19:59 INFO Executor: Running task 1.0 in stage 1.0 (TID 3)
16/03/03 22:19:59 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 15 ms on localhost (1/2)
16/03/03 22:19:59 INFO Executor: Finished task 1.0 in stage 1.0 (TID 3). 1057 bytes result sent to driver
16/03/03 22:19:59 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 13 ms on localhost (2/2)
16/03/03 22:19:59 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
16/03/03 22:19:59 INFO DAGScheduler: ResultStage 1 (collect at Transformation1.scala:18) finished in 0.030 s
16/03/03 22:19:59 INFO DAGScheduler: Job 1 finished: collect at Transformation1.scala:18, took 0.047917 s
4
16/03/03 22:19:59 INFO SparkContext: Starting job: collect at Transformation1.scala:19
16/03/03 22:19:59 INFO DAGScheduler: Got job 2 (collect at Transformation1.scala:19) with 2 output partitions
16/03/03 22:19:59 INFO DAGScheduler: Final stage: ResultStage 2(collect at Transformation1.scala:19)
16/03/03 22:19:59 INFO DAGScheduler: Parents of final stage: List()
16/03/03 22:19:59 INFO DAGScheduler: Missing parents: List()
16/03/03 22:19:59 INFO DAGScheduler: Submitting ResultStage 2 (UnionRDD[2] at union at Transformation1.scala:16), which has no missing parents
16/03/03 22:19:59 INFO MemoryStore: ensureFreeSpace(1928) called with curMem=6160, maxMem=766075207
16/03/03 22:19:59 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 1928.0 B, free 730.6 MB)
16/03/03 22:19:59 INFO MemoryStore: ensureFreeSpace(1246) called with curMem=8088, maxMem=766075207
16/03/03 22:19:59 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1246.0 B, free 730.6 MB)
16/03/03 22:19:59 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:50609 (size: 1246.0 B, free: 730.6 MB)
16/03/03 22:19:59 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:861
16/03/03 22:19:59 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 2 (UnionRDD[2] at union at Transformation1.scala:16)
16/03/03 22:19:59 INFO TaskSchedulerImpl: Adding task set 2.0 with 2 tasks
16/03/03 22:19:59 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 4, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:19:59 INFO Executor: Running task 0.0 in stage 2.0 (TID 4)
16/03/03 22:19:59 INFO Executor: Finished task 0.0 in stage 2.0 (TID 4). 1057 bytes result sent to driver
16/03/03 22:19:59 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 5, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:19:59 INFO Executor: Running task 1.0 in stage 2.0 (TID 5)
16/03/03 22:19:59 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 4) in 15 ms on localhost (1/2)
16/03/03 22:19:59 INFO Executor: Finished task 1.0 in stage 2.0 (TID 5). 1057 bytes result sent to driver
16/03/03 22:19:59 INFO TaskSetManager: Finished task 1.0 in stage 2.0 (TID 5) in 11 ms on localhost (2/2)
16/03/03 22:19:59 INFO DAGScheduler: ResultStage 2 (collect at Transformation1.scala:19) finished in 0.022 s
16/03/03 22:19:59 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool 
16/03/03 22:19:59 INFO DAGScheduler: Job 2 finished: collect at Transformation1.scala:19, took 0.038511 s
16/03/03 22:19:59 INFO SparkContext: Starting job: collect at Transformation1.scala:19
16/03/03 22:19:59 INFO DAGScheduler: Got job 3 (collect at Transformation1.scala:19) with 2 output partitions
16/03/03 22:19:59 INFO DAGScheduler: Final stage: ResultStage 3(collect at Transformation1.scala:19)
16/03/03 22:19:59 INFO DAGScheduler: Parents of final stage: List()
16/03/03 22:19:59 INFO DAGScheduler: Missing parents: List()
16/03/03 22:19:59 INFO DAGScheduler: Submitting ResultStage 3 (UnionRDD[2] at union at Transformation1.scala:16), which has no missing parents
16/03/03 22:19:59 INFO MemoryStore: ensureFreeSpace(1928) called with curMem=9334, maxMem=766075207
16/03/03 22:19:59 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 1928.0 B, free 730.6 MB)
16/03/03 22:19:59 INFO MemoryStore: ensureFreeSpace(1246) called with curMem=11262, maxMem=766075207
16/03/03 22:19:59 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 1246.0 B, free 730.6 MB)
16/03/03 22:19:59 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on localhost:50609 (size: 1246.0 B, free: 730.6 MB)
16/03/03 22:19:59 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:861
16/03/03 22:19:59 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 3 (UnionRDD[2] at union at Transformation1.scala:16)
16/03/03 22:19:59 INFO TaskSchedulerImpl: Adding task set 3.0 with 2 tasks
16/03/03 22:19:59 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 6, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:19:59 INFO Executor: Running task 0.0 in stage 3.0 (TID 6)
16/03/03 22:19:59 INFO Executor: Finished task 0.0 in stage 3.0 (TID 6). 1057 bytes result sent to driver
16/03/03 22:19:59 INFO TaskSetManager: Starting task 1.0 in stage 3.0 (TID 7, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:19:59 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 6) in 17 ms on localhost (1/2)
16/03/03 22:19:59 INFO Executor: Running task 1.0 in stage 3.0 (TID 7)
16/03/03 22:19:59 INFO Executor: Finished task 1.0 in stage 3.0 (TID 7). 1057 bytes result sent to driver
16/03/03 22:19:59 INFO TaskSetManager: Finished task 1.0 in stage 3.0 (TID 7) in 15 ms on localhost (2/2)
16/03/03 22:19:59 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have all completed, from pool 
16/03/03 22:19:59 INFO DAGScheduler: ResultStage 3 (collect at Transformation1.scala:19) finished in 0.026 s
16/03/03 22:19:59 INFO DAGScheduler: Job 3 finished: collect at Transformation1.scala:19, took 0.126468 s
16/03/03 22:19:59 INFO SparkContext: Starting job: collect at Transformation1.scala:19
16/03/03 22:19:59 INFO DAGScheduler: Got job 4 (collect at Transformation1.scala:19) with 2 output partitions
16/03/03 22:19:59 INFO DAGScheduler: Final stage: ResultStage 4(collect at Transformation1.scala:19)
16/03/03 22:19:59 INFO DAGScheduler: Parents of final stage: List()
16/03/03 22:19:59 INFO DAGScheduler: Missing parents: List()
16/03/03 22:19:59 INFO DAGScheduler: Submitting ResultStage 4 (UnionRDD[2] at union at Transformation1.scala:16), which has no missing parents
16/03/03 22:19:59 INFO MemoryStore: ensureFreeSpace(1928) called with curMem=12508, maxMem=766075207
16/03/03 22:19:59 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 1928.0 B, free 730.6 MB)
16/03/03 22:20:00 INFO MemoryStore: ensureFreeSpace(1246) called with curMem=14436, maxMem=766075207
16/03/03 22:20:00 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 1246.0 B, free 730.6 MB)
16/03/03 22:20:00 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on localhost:50609 (size: 1246.0 B, free: 730.6 MB)
16/03/03 22:20:00 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:861
16/03/03 22:20:00 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 4 (UnionRDD[2] at union at Transformation1.scala:16)
16/03/03 22:20:00 INFO TaskSchedulerImpl: Adding task set 4.0 with 2 tasks
16/03/03 22:20:00 INFO TaskSetManager: Starting task 0.0 in stage 4.0 (TID 8, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:20:00 INFO Executor: Running task 0.0 in stage 4.0 (TID 8)
16/03/03 22:20:00 INFO Executor: Finished task 0.0 in stage 4.0 (TID 8). 1057 bytes result sent to driver
16/03/03 22:20:00 INFO TaskSetManager: Starting task 1.0 in stage 4.0 (TID 9, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:20:00 INFO Executor: Running task 1.0 in stage 4.0 (TID 9)
16/03/03 22:20:00 INFO TaskSetManager: Finished task 0.0 in stage 4.0 (TID 8) in 13 ms on localhost (1/2)
16/03/03 22:20:00 INFO Executor: Finished task 1.0 in stage 4.0 (TID 9). 1057 bytes result sent to driver
16/03/03 22:20:00 INFO TaskSetManager: Finished task 1.0 in stage 4.0 (TID 9) in 7 ms on localhost (2/2)
16/03/03 22:20:00 INFO TaskSchedulerImpl: Removed TaskSet 4.0, whose tasks have all completed, from pool 
16/03/03 22:20:00 INFO DAGScheduler: ResultStage 4 (collect at Transformation1.scala:19) finished in 0.020 s
16/03/03 22:20:00 INFO DAGScheduler: Job 4 finished: collect at Transformation1.scala:19, took 0.193107 s
16/03/03 22:20:00 INFO SparkContext: Starting job: collect at Transformation1.scala:19
16/03/03 22:20:00 INFO DAGScheduler: Got job 5 (collect at Transformation1.scala:19) with 2 output partitions
16/03/03 22:20:00 INFO DAGScheduler: Final stage: ResultStage 5(collect at Transformation1.scala:19)
16/03/03 22:20:00 INFO DAGScheduler: Parents of final stage: List()
16/03/03 22:20:00 INFO DAGScheduler: Missing parents: List()
16/03/03 22:20:00 INFO DAGScheduler: Submitting ResultStage 5 (UnionRDD[2] at union at Transformation1.scala:16), which has no missing parents
16/03/03 22:20:00 INFO MemoryStore: ensureFreeSpace(1928) called with curMem=15682, maxMem=766075207
16/03/03 22:20:00 INFO MemoryStore: Block broadcast_5 stored as values in memory (estimated size 1928.0 B, free 730.6 MB)
16/03/03 22:20:00 INFO MemoryStore: ensureFreeSpace(1246) called with curMem=17610, maxMem=766075207
16/03/03 22:20:00 INFO MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 1246.0 B, free 730.6 MB)
16/03/03 22:20:00 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on localhost:50609 (size: 1246.0 B, free: 730.6 MB)
16/03/03 22:20:00 INFO SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:861
16/03/03 22:20:00 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 5 (UnionRDD[2] at union at Transformation1.scala:16)
16/03/03 22:20:00 INFO TaskSchedulerImpl: Adding task set 5.0 with 2 tasks
16/03/03 22:20:00 INFO TaskSetManager: Starting task 0.0 in stage 5.0 (TID 10, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:20:00 INFO Executor: Running task 0.0 in stage 5.0 (TID 10)
16/03/03 22:20:00 INFO Executor: Finished task 0.0 in stage 5.0 (TID 10). 1057 bytes result sent to driver
16/03/03 22:20:00 INFO TaskSetManager: Starting task 1.0 in stage 5.0 (TID 11, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:20:00 INFO Executor: Running task 1.0 in stage 5.0 (TID 11)
16/03/03 22:20:00 INFO TaskSetManager: Finished task 0.0 in stage 5.0 (TID 10) in 17 ms on localhost (1/2)
16/03/03 22:20:00 INFO Executor: Finished task 1.0 in stage 5.0 (TID 11). 1057 bytes result sent to driver
16/03/03 22:20:00 INFO TaskSetManager: Finished task 1.0 in stage 5.0 (TID 11) in 15 ms on localhost (2/2)
16/03/03 22:20:00 INFO DAGScheduler: ResultStage 5 (collect at Transformation1.scala:19) finished in 0.026 s
16/03/03 22:20:00 INFO TaskSchedulerImpl: Removed TaskSet 5.0, whose tasks have all completed, from pool 
16/03/03 22:20:00 INFO DAGScheduler: Job 5 finished: collect at Transformation1.scala:19, took 0.107037 s
16/03/03 22:20:00 INFO SparkContext: Starting job: collect at Transformation1.scala:19
16/03/03 22:20:00 INFO DAGScheduler: Got job 6 (collect at Transformation1.scala:19) with 2 output partitions
16/03/03 22:20:00 INFO DAGScheduler: Final stage: ResultStage 6(collect at Transformation1.scala:19)
16/03/03 22:20:00 INFO DAGScheduler: Parents of final stage: List()
16/03/03 22:20:00 INFO DAGScheduler: Missing parents: List()
16/03/03 22:20:00 INFO DAGScheduler: Submitting ResultStage 6 (UnionRDD[2] at union at Transformation1.scala:16), which has no missing parents
16/03/03 22:20:00 INFO MemoryStore: ensureFreeSpace(1928) called with curMem=18856, maxMem=766075207
16/03/03 22:20:00 INFO MemoryStore: Block broadcast_6 stored as values in memory (estimated size 1928.0 B, free 730.6 MB)
16/03/03 22:20:00 INFO MemoryStore: ensureFreeSpace(1246) called with curMem=20784, maxMem=766075207
16/03/03 22:20:00 INFO MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 1246.0 B, free 730.6 MB)
16/03/03 22:20:00 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on localhost:50609 (size: 1246.0 B, free: 730.6 MB)
16/03/03 22:20:00 INFO SparkContext: Created broadcast 6 from broadcast at DAGScheduler.scala:861
16/03/03 22:20:00 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 6 (UnionRDD[2] at union at Transformation1.scala:16)
16/03/03 22:20:00 INFO TaskSchedulerImpl: Adding task set 6.0 with 2 tasks
16/03/03 22:20:00 INFO TaskSetManager: Starting task 0.0 in stage 6.0 (TID 12, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:20:00 INFO Executor: Running task 0.0 in stage 6.0 (TID 12)
16/03/03 22:20:00 INFO Executor: Finished task 0.0 in stage 6.0 (TID 12). 1057 bytes result sent to driver
16/03/03 22:20:00 INFO TaskSetManager: Starting task 1.0 in stage 6.0 (TID 13, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:20:00 INFO Executor: Running task 1.0 in stage 6.0 (TID 13)
16/03/03 22:20:00 INFO TaskSetManager: Finished task 0.0 in stage 6.0 (TID 12) in 10 ms on localhost (1/2)
16/03/03 22:20:00 INFO Executor: Finished task 1.0 in stage 6.0 (TID 13). 1057 bytes result sent to driver
16/03/03 22:20:00 INFO TaskSetManager: Finished task 1.0 in stage 6.0 (TID 13) in 8 ms on localhost (2/2)
16/03/03 22:20:00 INFO DAGScheduler: ResultStage 6 (collect at Transformation1.scala:19) finished in 0.016 s
16/03/03 22:20:00 INFO TaskSchedulerImpl: Removed TaskSet 6.0, whose tasks have all completed, from pool 
16/03/03 22:20:00 INFO DAGScheduler: Job 6 finished: collect at Transformation1.scala:19, took 0.137714 s
16/03/03 22:20:00 INFO SparkContext: Starting job: collect at Transformation1.scala:20
16/03/03 22:20:00 INFO DAGScheduler: Got job 7 (collect at Transformation1.scala:20) with 2 output partitions
16/03/03 22:20:00 INFO DAGScheduler: Final stage: ResultStage 7(collect at Transformation1.scala:20)
16/03/03 22:20:00 INFO DAGScheduler: Parents of final stage: List()
16/03/03 22:20:00 INFO DAGScheduler: Missing parents: List()
16/03/03 22:20:00 INFO DAGScheduler: Submitting ResultStage 7 (UnionRDD[2] at union at Transformation1.scala:16), which has no missing parents
16/03/03 22:20:00 INFO MemoryStore: ensureFreeSpace(1928) called with curMem=22030, maxMem=766075207
16/03/03 22:20:00 INFO MemoryStore: Block broadcast_7 stored as values in memory (estimated size 1928.0 B, free 730.6 MB)
16/03/03 22:20:00 INFO MemoryStore: ensureFreeSpace(1246) called with curMem=23958, maxMem=766075207
16/03/03 22:20:00 INFO MemoryStore: Block broadcast_7_piece0 stored as bytes in memory (estimated size 1246.0 B, free 730.6 MB)
16/03/03 22:20:00 INFO BlockManagerInfo: Added broadcast_7_piece0 in memory on localhost:50609 (size: 1246.0 B, free: 730.6 MB)
16/03/03 22:20:00 INFO SparkContext: Created broadcast 7 from broadcast at DAGScheduler.scala:861
16/03/03 22:20:00 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 7 (UnionRDD[2] at union at Transformation1.scala:16)
16/03/03 22:20:00 INFO TaskSchedulerImpl: Adding task set 7.0 with 2 tasks
16/03/03 22:20:00 INFO TaskSetManager: Starting task 0.0 in stage 7.0 (TID 14, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:20:00 INFO Executor: Running task 0.0 in stage 7.0 (TID 14)
16/03/03 22:20:00 INFO Executor: Finished task 0.0 in stage 7.0 (TID 14). 1057 bytes result sent to driver
16/03/03 22:20:00 INFO TaskSetManager: Starting task 1.0 in stage 7.0 (TID 15, localhost, PROCESS_LOCAL, 2322 bytes)
16/03/03 22:20:00 INFO Executor: Running task 1.0 in stage 7.0 (TID 15)
16/03/03 22:20:00 INFO TaskSetManager: Finished task 0.0 in stage 7.0 (TID 14) in 9 ms on localhost (1/2)
16/03/03 22:20:00 INFO Executor: Finished task 1.0 in stage 7.0 (TID 15). 1057 bytes result sent to driver
16/03/03 22:20:00 INFO TaskSetManager: Finished task 1.0 in stage 7.0 (TID 15) in 13 ms on localhost (2/2)
16/03/03 22:20:00 INFO DAGScheduler: ResultStage 7 (collect at Transformation1.scala:20) finished in 0.020 s
16/03/03 22:20:00 INFO TaskSchedulerImpl: Removed TaskSet 7.0, whose tasks have all completed, from pool 
16/03/03 22:20:00 INFO DAGScheduler: Job 7 finished: collect at Transformation1.scala:20, took 0.032942 s
a:1
b:1
c:1
d:1
16/03/03 22:20:00 INFO SparkContext: Starting job: foreach at Transformation1.scala:26
16/03/03 22:20:00 INFO DAGScheduler: Got job 8 (foreach at Transformation1.scala:26) with 1 output partitions
16/03/03 22:20:00 INFO DAGScheduler: Final stage: ResultStage 8(foreach at Transformation1.scala:26)
16/03/03 22:20:00 INFO DAGScheduler: Parents of final stage: List()
16/03/03 22:20:00 INFO DAGScheduler: Missing parents: List()
16/03/03 22:20:00 INFO DAGScheduler: Submitting ResultStage 8 (MapPartitionsRDD[8] at filter at Transformation1.scala:26), which has no missing parents
16/03/03 22:20:00 INFO MemoryStore: ensureFreeSpace(1816) called with curMem=25204, maxMem=766075207
16/03/03 22:20:00 INFO MemoryStore: Block broadcast_8 stored as values in memory (estimated size 1816.0 B, free 730.6 MB)
16/03/03 22:20:00 INFO MemoryStore: ensureFreeSpace(1143) called with curMem=27020, maxMem=766075207
16/03/03 22:20:00 INFO MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 1143.0 B, free 730.6 MB)
16/03/03 22:20:00 INFO BlockManagerInfo: Added broadcast_8_piece0 in memory on localhost:50609 (size: 1143.0 B, free: 730.6 MB)
16/03/03 22:20:00 INFO SparkContext: Created broadcast 8 from broadcast at DAGScheduler.scala:861
16/03/03 22:20:00 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 8 (MapPartitionsRDD[8] at filter at Transformation1.scala:26)
16/03/03 22:20:00 INFO TaskSchedulerImpl: Adding task set 8.0 with 1 tasks
16/03/03 22:20:00 INFO TaskSetManager: Starting task 0.0 in stage 8.0 (TID 16, localhost, PROCESS_LOCAL, 2227 bytes)
16/03/03 22:20:00 INFO Executor: Running task 0.0 in stage 8.0 (TID 16)
16/03/03 22:20:00 INFO Executor: Finished task 0.0 in stage 8.0 (TID 16). 915 bytes result sent to driver
16/03/03 22:20:00 INFO TaskSetManager: Finished task 0.0 in stage 8.0 (TID 16) in 15 ms on localhost (1/1)
16/03/03 22:20:00 INFO TaskSchedulerImpl: Removed TaskSet 8.0, whose tasks have all completed, from pool 
16/03/03 22:20:00 INFO DAGScheduler: ResultStage 8 (foreach at Transformation1.scala:26) finished in 0.028 s
a3:a:1
a3:b:1
a3:a:1
16/03/03 22:20:00 INFO DAGScheduler: Job 8 finished: foreach at Transformation1.scala:26, took 0.097583 s
16/03/03 22:20:00 INFO SparkContext: Starting job: foreach at Transformation1.scala:27
16/03/03 22:20:00 INFO DAGScheduler: Registering RDD 5 (distinct at Transformation1.scala:25)
16/03/03 22:20:00 INFO DAGScheduler: Got job 9 (foreach at Transformation1.scala:27) with 1 output partitions
16/03/03 22:20:00 INFO DAGScheduler: Final stage: ResultStage 10(foreach at Transformation1.scala:27)
16/03/03 22:20:00 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 9)
16/03/03 22:20:00 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 9)
16/03/03 22:20:00 INFO DAGScheduler: Submitting ShuffleMapStage 9 (MapPartitionsRDD[5] at distinct at Transformation1.scala:25), which has no missing parents
16/03/03 22:20:00 INFO MemoryStore: ensureFreeSpace(2560) called with curMem=28163, maxMem=766075207
16/03/03 22:20:00 INFO MemoryStore: Block broadcast_9 stored as values in memory (estimated size 2.5 KB, free 730.6 MB)
16/03/03 22:20:00 INFO MemoryStore: ensureFreeSpace(1523) called with curMem=30723, maxMem=766075207
16/03/03 22:20:00 INFO MemoryStore: Block broadcast_9_piece0 stored as bytes in memory (estimated size 1523.0 B, free 730.6 MB)
16/03/03 22:20:00 INFO BlockManagerInfo: Added broadcast_9_piece0 in memory on localhost:50609 (size: 1523.0 B, free: 730.6 MB)
16/03/03 22:20:00 INFO SparkContext: Created broadcast 9 from broadcast at DAGScheduler.scala:861
16/03/03 22:20:00 INFO DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 9 (MapPartitionsRDD[5] at distinct at Transformation1.scala:25)
16/03/03 22:20:00 INFO TaskSchedulerImpl: Adding task set 9.0 with 1 tasks
16/03/03 22:20:00 INFO TaskSetManager: Starting task 0.0 in stage 9.0 (TID 17, localhost, PROCESS_LOCAL, 2216 bytes)
16/03/03 22:20:00 INFO Executor: Running task 0.0 in stage 9.0 (TID 17)
16/03/03 22:20:00 INFO Executor: Finished task 0.0 in stage 9.0 (TID 17). 1158 bytes result sent to driver
16/03/03 22:20:00 INFO TaskSetManager: Finished task 0.0 in stage 9.0 (TID 17) in 62 ms on localhost (1/1)
16/03/03 22:20:00 INFO TaskSchedulerImpl: Removed TaskSet 9.0, whose tasks have all completed, from pool 
16/03/03 22:20:00 INFO DAGScheduler: ShuffleMapStage 9 (distinct at Transformation1.scala:25) finished in 0.065 s
16/03/03 22:20:00 INFO DAGScheduler: looking for newly runnable stages
16/03/03 22:20:00 INFO DAGScheduler: running: Set()
16/03/03 22:20:00 INFO DAGScheduler: waiting: Set(ResultStage 10)
16/03/03 22:20:00 INFO DAGScheduler: failed: Set()
16/03/03 22:20:00 INFO DAGScheduler: Missing parents for ResultStage 10: List()
16/03/03 22:20:00 INFO DAGScheduler: Submitting ResultStage 10 (MapPartitionsRDD[9] at filter at Transformation1.scala:27), which is now runnable
16/03/03 22:20:00 INFO MemoryStore: ensureFreeSpace(2920) called with curMem=32246, maxMem=766075207
16/03/03 22:20:00 INFO MemoryStore: Block broadcast_10 stored as values in memory (estimated size 2.9 KB, free 730.6 MB)
16/03/03 22:20:00 INFO MemoryStore: ensureFreeSpace(1656) called with curMem=35166, maxMem=766075207
16/03/03 22:20:00 INFO MemoryStore: Block broadcast_10_piece0 stored as bytes in memory (estimated size 1656.0 B, free 730.6 MB)
16/03/03 22:20:00 INFO BlockManagerInfo: Added broadcast_10_piece0 in memory on localhost:50609 (size: 1656.0 B, free: 730.6 MB)
16/03/03 22:20:00 INFO SparkContext: Created broadcast 10 from broadcast at DAGScheduler.scala:861
16/03/03 22:20:00 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 10 (MapPartitionsRDD[9] at filter at Transformation1.scala:27)
16/03/03 22:20:00 INFO TaskSchedulerImpl: Adding task set 10.0 with 1 tasks
16/03/03 22:20:00 INFO TaskSetManager: Starting task 0.0 in stage 10.0 (TID 18, localhost, PROCESS_LOCAL, 1901 bytes)
16/03/03 22:20:00 INFO Executor: Running task 0.0 in stage 10.0 (TID 18)
16/03/03 22:20:00 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
16/03/03 22:20:00 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 9 ms
16/03/03 22:20:00 INFO Executor: Finished task 0.0 in stage 10.0 (TID 18). 1165 bytes result sent to driver
16/03/03 22:20:00 INFO TaskSetManager: Finished task 0.0 in stage 10.0 (TID 18) in 101 ms on localhost (1/1)
16/03/03 22:20:00 INFO DAGScheduler: ResultStage 10 (foreach at Transformation1.scala:27) finished in 0.102 s
16/03/03 22:20:00 INFO TaskSchedulerImpl: Removed TaskSet 10.0, whose tasks have all completed, from pool 
r2:b:1
r2:a:1
16/03/03 22:20:00 INFO DAGScheduler: Job 9 finished: foreach at Transformation1.scala:27, took 0.215763 s
16/03/03 22:20:01 INFO SparkContext: Starting job: foreach at Transformation1.scala:31
16/03/03 22:20:01 INFO DAGScheduler: Registering RDD 3 (parallelize at Transformation1.scala:23)
16/03/03 22:20:01 INFO DAGScheduler: Registering RDD 4 (parallelize at Transformation1.scala:24)
16/03/03 22:20:01 INFO DAGScheduler: Got job 10 (foreach at Transformation1.scala:31) with 1 output partitions
16/03/03 22:20:01 INFO DAGScheduler: Final stage: ResultStage 13(foreach at Transformation1.scala:31)
16/03/03 22:20:01 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 12, ShuffleMapStage 11)
16/03/03 22:20:01 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 12, ShuffleMapStage 11)
16/03/03 22:20:01 INFO DAGScheduler: Submitting ShuffleMapStage 11 (ParallelCollectionRDD[3] at parallelize at Transformation1.scala:23), which has no missing parents
16/03/03 22:20:01 INFO MemoryStore: ensureFreeSpace(1520) called with curMem=36822, maxMem=766075207
16/03/03 22:20:01 INFO MemoryStore: Block broadcast_11 stored as values in memory (estimated size 1520.0 B, free 730.5 MB)
16/03/03 22:20:01 INFO MemoryStore: ensureFreeSpace(980) called with curMem=38342, maxMem=766075207
16/03/03 22:20:01 INFO MemoryStore: Block broadcast_11_piece0 stored as bytes in memory (estimated size 980.0 B, free 730.5 MB)
16/03/03 22:20:01 INFO BlockManagerInfo: Added broadcast_11_piece0 in memory on localhost:50609 (size: 980.0 B, free: 730.6 MB)
16/03/03 22:20:01 INFO SparkContext: Created broadcast 11 from broadcast at DAGScheduler.scala:861
16/03/03 22:20:01 INFO DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 11 (ParallelCollectionRDD[3] at parallelize at Transformation1.scala:23)
16/03/03 22:20:01 INFO TaskSchedulerImpl: Adding task set 11.0 with 1 tasks
16/03/03 22:20:01 INFO DAGScheduler: Submitting ShuffleMapStage 12 (ParallelCollectionRDD[4] at parallelize at Transformation1.scala:24), which has no missing parents
16/03/03 22:20:01 INFO MemoryStore: ensureFreeSpace(1520) called with curMem=39322, maxMem=766075207
16/03/03 22:20:01 INFO MemoryStore: Block broadcast_12 stored as values in memory (estimated size 1520.0 B, free 730.5 MB)
16/03/03 22:20:01 INFO MemoryStore: ensureFreeSpace(984) called with curMem=40842, maxMem=766075207
16/03/03 22:20:01 INFO MemoryStore: Block broadcast_12_piece0 stored as bytes in memory (estimated size 984.0 B, free 730.5 MB)
16/03/03 22:20:01 INFO BlockManagerInfo: Added broadcast_12_piece0 in memory on localhost:50609 (size: 984.0 B, free: 730.6 MB)
16/03/03 22:20:01 INFO TaskSetManager: Starting task 0.0 in stage 11.0 (TID 19, localhost, PROCESS_LOCAL, 2216 bytes)
16/03/03 22:20:01 INFO Executor: Running task 0.0 in stage 11.0 (TID 19)
16/03/03 22:20:01 INFO SparkContext: Created broadcast 12 from broadcast at DAGScheduler.scala:861
16/03/03 22:20:01 INFO DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 12 (ParallelCollectionRDD[4] at parallelize at Transformation1.scala:24)
16/03/03 22:20:01 INFO TaskSchedulerImpl: Adding task set 12.0 with 1 tasks
16/03/03 22:20:01 INFO Executor: Finished task 0.0 in stage 11.0 (TID 19). 1158 bytes result sent to driver
16/03/03 22:20:01 INFO TaskSetManager: Starting task 0.0 in stage 12.0 (TID 20, localhost, PROCESS_LOCAL, 2272 bytes)
16/03/03 22:20:01 INFO Executor: Running task 0.0 in stage 12.0 (TID 20)
16/03/03 22:20:01 INFO TaskSetManager: Finished task 0.0 in stage 11.0 (TID 19) in 54 ms on localhost (1/1)
16/03/03 22:20:01 INFO TaskSchedulerImpl: Removed TaskSet 11.0, whose tasks have all completed, from pool 
16/03/03 22:20:01 INFO DAGScheduler: ShuffleMapStage 11 (parallelize at Transformation1.scala:23) finished in 0.060 s
16/03/03 22:20:01 INFO DAGScheduler: looking for newly runnable stages
16/03/03 22:20:01 INFO DAGScheduler: running: Set(ShuffleMapStage 12)
16/03/03 22:20:01 INFO DAGScheduler: waiting: Set(ResultStage 13)
16/03/03 22:20:01 INFO DAGScheduler: failed: Set()
16/03/03 22:20:01 INFO DAGScheduler: Missing parents for ResultStage 13: List(ShuffleMapStage 12)
16/03/03 22:20:01 INFO Executor: Finished task 0.0 in stage 12.0 (TID 20). 1158 bytes result sent to driver
16/03/03 22:20:01 INFO TaskSetManager: Finished task 0.0 in stage 12.0 (TID 20) in 24 ms on localhost (1/1)
16/03/03 22:20:01 INFO TaskSchedulerImpl: Removed TaskSet 12.0, whose tasks have all completed, from pool 
16/03/03 22:20:01 INFO DAGScheduler: ShuffleMapStage 12 (parallelize at Transformation1.scala:24) finished in 0.061 s
16/03/03 22:20:01 INFO DAGScheduler: looking for newly runnable stages
16/03/03 22:20:01 INFO DAGScheduler: running: Set()
16/03/03 22:20:01 INFO DAGScheduler: waiting: Set(ResultStage 13)
16/03/03 22:20:01 INFO DAGScheduler: failed: Set()
16/03/03 22:20:01 INFO DAGScheduler: Missing parents for ResultStage 13: List()
16/03/03 22:20:01 INFO DAGScheduler: Submitting ResultStage 13 (MapPartitionsRDD[13] at filter at Transformation1.scala:31), which is now runnable
16/03/03 22:20:01 INFO MemoryStore: ensureFreeSpace(2960) called with curMem=41826, maxMem=766075207
16/03/03 22:20:01 INFO MemoryStore: Block broadcast_13 stored as values in memory (estimated size 2.9 KB, free 730.5 MB)
16/03/03 22:20:01 INFO MemoryStore: ensureFreeSpace(1618) called with curMem=44786, maxMem=766075207
16/03/03 22:20:01 INFO MemoryStore: Block broadcast_13_piece0 stored as bytes in memory (estimated size 1618.0 B, free 730.5 MB)
16/03/03 22:20:01 INFO BlockManagerInfo: Added broadcast_13_piece0 in memory on localhost:50609 (size: 1618.0 B, free: 730.6 MB)
16/03/03 22:20:01 INFO SparkContext: Created broadcast 13 from broadcast at DAGScheduler.scala:861
16/03/03 22:20:01 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 13 (MapPartitionsRDD[13] at filter at Transformation1.scala:31)
16/03/03 22:20:01 INFO TaskSchedulerImpl: Adding task set 13.0 with 1 tasks
16/03/03 22:20:01 INFO TaskSetManager: Starting task 0.0 in stage 13.0 (TID 21, localhost, PROCESS_LOCAL, 1974 bytes)
16/03/03 22:20:01 INFO Executor: Running task 0.0 in stage 13.0 (TID 21)
16/03/03 22:20:01 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
16/03/03 22:20:01 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
16/03/03 22:20:01 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
16/03/03 22:20:01 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
j1:a:1:1
j1:a:1:2
j1:a:1:1
j1:a:1:2
j1:b:1:1
j1:b:1:2
j1:b:1:3
16/03/03 22:20:01 INFO Executor: Finished task 0.0 in stage 13.0 (TID 21). 1165 bytes result sent to driver
16/03/03 22:20:01 INFO TaskSetManager: Finished task 0.0 in stage 13.0 (TID 21) in 46 ms on localhost (1/1)
16/03/03 22:20:01 INFO DAGScheduler: ResultStage 13 (foreach at Transformation1.scala:31) finished in 0.046 s
16/03/03 22:20:01 INFO TaskSchedulerImpl: Removed TaskSet 13.0, whose tasks have all completed, from pool 
16/03/03 22:20:01 INFO DAGScheduler: Job 10 finished: foreach at Transformation1.scala:31, took 0.154299 s
16/03/03 22:20:01 INFO SparkContext: Invoking stop() from shutdown hook
16/03/03 22:20:01 INFO SparkUI: Stopped Spark web UI at http://202.38.84.241:4040
16/03/03 22:20:01 INFO DAGScheduler: Stopping DAGScheduler
16/03/03 22:20:01 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/03/03 22:20:01 INFO MemoryStore: MemoryStore cleared
16/03/03 22:20:01 INFO BlockManager: BlockManager stopped
16/03/03 22:20:01 INFO BlockManagerMaster: BlockManagerMaster stopped
16/03/03 22:20:01 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/03/03 22:20:01 INFO SparkContext: Successfully stopped SparkContext
16/03/03 22:20:01 INFO ShutdownHookManager: Shutdown hook called
16/03/03 22:20:01 INFO ShutdownHookManager: Deleting directory C:\Users\xubo\AppData\Local\Temp\spark-c2b62943-1802-4092-a678-704b95d428ca

Process finished with exit code 0


评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值