Spark Java: counting elements with count()

This one is very simple, so here is the code with no commentary:

public static void myCount() {
    SparkConf conf = new SparkConf()
            .setMaster("local")
            .setAppName("myCount");
    JavaSparkContext sc = new JavaSparkContext(conf);

    List<Integer> list = Arrays.asList(1, 2, 3, 4, 4);
    JavaRDD<Integer> listRdd = sc.parallelize(list, 2);

    // count() is an action: it triggers a job and returns the number of elements
    long counts = listRdd.count();
    System.out.println("count:" + counts);

    sc.close();
}

Result:

count:5
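Note that the list contains 4 twice and count() still returns 5: it counts elements, not distinct values. When you need per-value occurrences instead, Spark's JavaRDD also offers countByValue(). As a hedged plain-Java sketch of the difference (this class and its use of streams are my illustration, not Spark code from the post):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CountDemo {
    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 4);

        // count(): total number of elements, duplicates included
        long count = list.stream().count();

        // countByValue() in Spark returns a Map of value -> occurrences;
        // the plain-Java equivalent built with groupingBy + counting:
        Map<Integer, Long> byValue = list.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        System.out.println("count:" + count); // count:5
        System.out.println(byValue);          // 4 maps to 2, every other value to 1
    }
}
```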

I always find the printed logs important:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/05/03 22:54:42 INFO SparkContext: Running Spark version 1.6.1
16/05/03 22:55:02 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/05/03 22:55:02 INFO SecurityManager: Changing view acls to: admin
16/05/03 22:55:02 INFO SecurityManager: Changing modify acls to: admin
16/05/03 22:55:02 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); users with modify permissions: Set(admin)
16/05/03 22:55:04 INFO Utils: Successfully started service 'sparkDriver' on port 55095.
16/05/03 22:55:05 INFO Slf4jLogger: Slf4jLogger started
16/05/03 22:55:05 INFO Remoting: Starting remoting
16/05/03 22:55:05 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.213.1:55108]
16/05/03 22:55:06 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 55108.
16/05/03 22:55:06 INFO SparkEnv: Registering MapOutputTracker
16/05/03 22:55:06 INFO SparkEnv: Registering BlockManagerMaster
16/05/03 22:55:06 INFO DiskBlockManager: Created local directory at C:\Users\admin\AppData\Local\Temp\blockmgr-3aaa5046-0d05-4c75-8734-02b0121f3a1e
16/05/03 22:55:06 INFO MemoryStore: MemoryStore started with capacity 2.4 GB
16/05/03 22:55:06 INFO SparkEnv: Registering OutputCommitCoordinator
16/05/03 22:55:07 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/05/03 22:55:07 INFO SparkUI: Started SparkUI at http://192.168.213.1:4040
16/05/03 22:55:07 INFO Executor: Starting executor ID driver on host localhost
16/05/03 22:55:07 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55116.
16/05/03 22:55:07 INFO NettyBlockTransferService: Server created on 55116
16/05/03 22:55:07 INFO BlockManagerMaster: Trying to register BlockManager
16/05/03 22:55:07 INFO BlockManagerMasterEndpoint: Registering block manager localhost:55116 with 2.4 GB RAM, BlockManagerId(driver, localhost, 55116)
16/05/03 22:55:07 INFO BlockManagerMaster: Registered BlockManager
16/05/03 22:55:09 INFO SparkContext: Starting job: count at ActionOperation.java:84
16/05/03 22:55:09 INFO DAGScheduler: Got job 0 (count at ActionOperation.java:84) with 2 output partitions
16/05/03 22:55:09 INFO DAGScheduler: Final stage: ResultStage 0 (count at ActionOperation.java:84)
16/05/03 22:55:09 INFO DAGScheduler: Parents of final stage: List()
16/05/03 22:55:09 INFO DAGScheduler: Missing parents: List()
16/05/03 22:55:09 INFO DAGScheduler: Submitting ResultStage 0 (ParallelCollectionRDD[0] at parallelize at ActionOperation.java:83), which has no missing parents
16/05/03 22:55:10 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1320.0 B, free 1320.0 B)
16/05/03 22:55:10 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 927.0 B, free 2.2 KB)
16/05/03 22:55:10 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:55116 (size: 927.0 B, free: 2.4 GB)
16/05/03 22:55:10 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
16/05/03 22:55:10 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (ParallelCollectionRDD[0] at parallelize at ActionOperation.java:83)
16/05/03 22:55:10 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
16/05/03 22:55:10 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 2140 bytes)
16/05/03 22:55:10 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
16/05/03 22:55:10 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 953 bytes result sent to driver
16/05/03 22:55:10 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, partition 1,PROCESS_LOCAL, 2145 bytes)
16/05/03 22:55:10 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
16/05/03 22:55:10 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 257 ms on localhost (1/2)
16/05/03 22:55:10 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 953 bytes result sent to driver
16/05/03 22:55:10 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 55 ms on localhost (2/2)
16/05/03 22:55:10 INFO DAGScheduler: ResultStage 0 (count at ActionOperation.java:84) finished in 0.352 s
16/05/03 22:55:10 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/05/03 22:55:10 INFO DAGScheduler: Job 0 finished: count at ActionOperation.java:84, took 1.136850 s
count:5
16/05/03 22:55:10 INFO SparkUI: Stopped Spark web UI at http://192.168.213.1:4040
16/05/03 22:55:10 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/05/03 22:55:10 INFO MemoryStore: MemoryStore cleared
16/05/03 22:55:10 INFO BlockManager: BlockManager stopped
16/05/03 22:55:10 INFO BlockManagerMaster: BlockManagerMaster stopped
16/05/03 22:55:10 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/05/03 22:55:10 INFO SparkContext: Successfully stopped SparkContext
16/05/03 22:55:10 INFO ShutdownHookManager: Shutdown hook called
16/05/03 22:55:10 INFO ShutdownHookManager: Deleting directory C:\Users\admin\AppData\Local\Temp\spark-221cf0db-fb4a-4577-8785-ac392b53425e
16/05/03 22:55:10 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
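The logs really do reward a second look: the DAGScheduler reports "2 output partitions" and launches 2 tasks because we called parallelize(list, 2), and count() runs one task per partition whose partial counts are summed on the driver. A rough plain-Java sketch of that idea (my illustration of the concept, not Spark's actual implementation):

```java
import java.util.Arrays;
import java.util.List;

public class PartitionCountSketch {
    public static void main(String[] args) {
        // parallelize(list, 2) conceptually splits the data into 2 partitions;
        // the split shown here is hypothetical, for illustration only.
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2),      // partition 0 -> task 0.0 (TID 0)
                Arrays.asList(3, 4, 4));  // partition 1 -> task 1.0 (TID 1)

        // Each task counts its own partition; the driver sums the partial counts.
        long total = partitions.stream().mapToLong(List::size).sum();
        System.out.println("count:" + total); // count:5
    }
}
```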
