Troubleshooting a NullPointerException when reading/writing MySQL from Spark in cluster mode

Date: 2015-03-13

Symptom: the job runs fine when submitted in local mode, but throws a NullPointerException once switched to cluster mode. This invocation works:

./bin/spark-submit --master spark://jt-host-kvm-17:7077 --class parkMysql.ParkInCountMysql --executor-memory 300m /httx/work/work.jar local --driver-class-path /httx/work/mysql-connector-java-5.1.21.jar

while the following one fails:

./bin/spark-submit --master spark://jt-host-kvm-17:7077 --class parkMysql.ParkInCountMysql --executor-memory 300m /httx/work/work.jar spark://jt-host-kvm-17:7077 --driver-class-path /httx/work/mysql-connector-java-5.1.21.jar

(Note that spark-submit stops parsing its own options at the application jar, so everything placed after work.jar here, including --driver-class-path, is passed to the application's main() as an argument rather than consumed by spark-submit; the first app argument apparently selects the master inside the application, which is why the first command actually runs in local mode.)

Spark assembly has been built with Hive, including Datanucleus jars on classpath

15/03/13 09:19:19 INFO spark.SecurityManager: Changing view acls to: hduser,

15/03/13 09:19:19 INFO spark.SecurityManager: Changing modify acls to: hduser,

15/03/13 09:19:19 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hduser, ); users with modify permissions: Set(hduser, )

15/03/13 09:19:19 INFO slf4j.Slf4jLogger: Slf4jLogger started

15/03/13 09:19:19 INFO Remoting: Starting remoting

15/03/13 09:19:19 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@jt-host-kvm-17:41594]

15/03/13 09:19:19 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriver@jt-host-kvm-17:41594]

15/03/13 09:19:19 INFO util.Utils: Successfully started service 'sparkDriver' on port 41594.

15/03/13 09:19:19 INFO spark.SparkEnv: Registering MapOutputTracker

15/03/13 09:19:19 INFO spark.SparkEnv: Registering BlockManagerMaster

15/03/13 09:19:19 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20150313091919-279b

15/03/13 09:19:19 INFO util.Utils: Successfully started service 'Connection manager for block manager' on port 50750.

15/03/13 09:19:19 INFO network.ConnectionManager: Bound socket to port 50750 with id = ConnectionManagerId(jt-host-kvm-17,50750)

15/03/13 09:19:19 INFO storage.MemoryStore: MemoryStore started with capacity 265.4 MB

15/03/13 09:19:19 INFO storage.BlockManagerMaster: Trying to register BlockManager

15/03/13 09:19:19 INFO storage.BlockManagerMasterActor: Registering block manager jt-host-kvm-17:50750 with 265.4 MB RAM

15/03/13 09:19:19 INFO storage.BlockManagerMaster: Registered BlockManager

15/03/13 09:19:19 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-d81926b3-e51a-4c54-b2bb-da139a9a413d

15/03/13 09:19:19 INFO spark.HttpServer: Starting HTTP Server

15/03/13 09:19:19 INFO server.Server: jetty-8.y.z-SNAPSHOT

15/03/13 09:19:19 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:43888

15/03/13 09:19:19 INFO util.Utils: Successfully started service 'HTTP file server' on port 43888.

15/03/13 09:19:20 INFO server.Server: jetty-8.y.z-SNAPSHOT

15/03/13 09:19:20 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040

15/03/13 09:19:20 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.

15/03/13 09:19:20 INFO ui.SparkUI: Started SparkUI at http://jt-host-kvm-17:4040

15/03/13 09:19:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

15/03/13 09:19:20 INFO spark.SparkContext: Added JAR file:/httx/work/work.jar at http://10.7.12.117:43888/jars/work.jar with timestamp 1426209560636

15/03/13 09:19:20 INFO client.AppClient$ClientActor: Connecting to master spark://jt-host-kvm-17:7077...

15/03/13 09:19:20 INFO cluster.SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0

15/03/13 09:19:20 INFO spark.SparkContext: Starting job: collectAsMap at ParkInCountMysql.scala:32

15/03/13 09:19:20 INFO scheduler.DAGScheduler: Registering RDD 2 (map at ParkInCountMysql.scala:29)

15/03/13 09:19:20 INFO scheduler.DAGScheduler: Got job 0 (collectAsMap at ParkInCountMysql.scala:32) with 1 output partitions (allowLocal=false)

15/03/13 09:19:20 INFO scheduler.DAGScheduler: Final stage: Stage 0(collectAsMap at ParkInCountMysql.scala:32)

15/03/13 09:19:20 INFO scheduler.DAGScheduler: Parents of final stage: List(Stage 1)

15/03/13 09:19:20 INFO scheduler.DAGScheduler: Missing parents: List(Stage 1)

15/03/13 09:19:20 INFO scheduler.DAGScheduler: Submitting Stage 1 (MappedRDD[2] at map at ParkInCountMysql.scala:29), which has no missing parents

15/03/13 09:19:20 INFO cluster.SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20150313091920-0008

15/03/13 09:19:20 INFO client.AppClient$ClientActor: Executor added: app-20150313091920-0008/0 on worker-20150305092118-jt-host-kvm-18-59151 (jt-host-kvm-18:59151) with 16 cores

15/03/13 09:19:20 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20150313091920-0008/0 on hostPort jt-host-kvm-18:59151 with 16 cores, 300.0 MB RAM

15/03/13 09:19:20 INFO client.AppClient$ClientActor: Executor added: app-20150313091920-0008/1 on worker-20150305092118-jt-host-kvm-17-50115 (jt-host-kvm-17:50115) with 16 cores

15/03/13 09:19:20 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20150313091920-0008/1 on hostPort jt-host-kvm-17:50115 with 16 cores, 300.0 MB RAM

15/03/13 09:19:20 INFO client.AppClient$ClientActor: Executor added: app-20150313091920-0008/2 on worker-20150305092118-jt-host-kvm-19-48800 (jt-host-kvm-19:48800) with 16 cores

15/03/13 09:19:20 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20150313091920-0008/2 on hostPort jt-host-kvm-19:48800 with 16 cores, 300.0 MB RAM

15/03/13 09:19:20 INFO client.AppClient$ClientActor: Executor updated: app-20150313091920-0008/1 is now RUNNING

15/03/13 09:19:20 INFO client.AppClient$ClientActor: Executor updated: app-20150313091920-0008/0 is now RUNNING

15/03/13 09:19:20 INFO client.AppClient$ClientActor: Executor updated: app-20150313091920-0008/2 is now RUNNING

15/03/13 09:19:20 INFO storage.MemoryStore: ensureFreeSpace(2832) called with curMem=0, maxMem=278302556

15/03/13 09:19:20 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 2.8 KB, free 265.4 MB)

15/03/13 09:19:20 INFO storage.MemoryStore: ensureFreeSpace(1651) called with curMem=2832, maxMem=278302556

15/03/13 09:19:20 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1651.0 B, free 265.4 MB)

15/03/13 09:19:20 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on jt-host-kvm-17:50750 (size: 1651.0 B, free: 265.4 MB)

15/03/13 09:19:20 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0

15/03/13 09:19:20 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from Stage 1 (MappedRDD[2] at map at ParkInCountMysql.scala:29)

15/03/13 09:19:20 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 1 tasks

15/03/13 09:19:23 INFO cluster.SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@jt-host-kvm-19:41576/user/Executor#1766662750] with ID 2

15/03/13 09:19:23 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 0, jt-host-kvm-19, PROCESS_LOCAL, 998 bytes)

15/03/13 09:19:23 INFO cluster.SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@jt-host-kvm-17:45787/user/Executor#-210267352] with ID 1

15/03/13 09:19:23 INFO cluster.SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@jt-host-kvm-18:38188/user/Executor#-1134749150] with ID 0

15/03/13 09:19:23 INFO storage.BlockManagerMasterActor: Registering block manager jt-host-kvm-19:39020 with 155.3 MB RAM

15/03/13 09:19:23 INFO storage.BlockManagerMasterActor: Registering block manager jt-host-kvm-18:33649 with 155.3 MB RAM

15/03/13 09:19:23 INFO storage.BlockManagerMasterActor: Registering block manager jt-host-kvm-17:57066 with 155.3 MB RAM

15/03/13 09:19:23 INFO network.ConnectionManager: Accepted connection from [jt-host-kvm-19/10.7.12.119:42519]

15/03/13 09:19:23 INFO network.SendingConnection: Initiating connection to [jt-host-kvm-19/10.7.12.119:39020]

15/03/13 09:19:23 INFO network.SendingConnection: Connected to [jt-host-kvm-19/10.7.12.119:39020], 1 messages pending

15/03/13 09:19:24 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on jt-host-kvm-19:39020 (size: 1651.0 B, free: 155.2 MB)

15/03/13 09:19:24 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 1.0 (TID 0, jt-host-kvm-19): java.lang.NullPointerException:

       org.apache.spark.rdd.JdbcRDD$$anon$1.<init>(JdbcRDD.scala:74)

       org.apache.spark.rdd.JdbcRDD.compute(JdbcRDD.scala:70)

       org.apache.spark.rdd.JdbcRDD.compute(JdbcRDD.scala:50)

       org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)

       org.apache.spark.rdd.RDD.iterator(RDD.scala:229)

       org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:34)

       org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)

       org.apache.spark.rdd.RDD.iterator(RDD.scala:229)

       org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)

       org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)

       org.apache.spark.rdd.RDD.iterator(RDD.scala:229)

       org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)

       org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)

       org.apache.spark.scheduler.Task.run(Task.scala:54)

       org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)

       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

       java.lang.Thread.run(Thread.java:744)

15/03/13 09:19:24 INFO scheduler.TaskSetManager: Starting task 0.1 in stage 1.0 (TID 1, jt-host-kvm-17, PROCESS_LOCAL, 998 bytes)

15/03/13 09:19:24 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on jt-host-kvm-17:57066 (size: 1651.0 B, free: 155.2 MB)

15/03/13 09:19:24 INFO scheduler.TaskSetManager: Lost task 0.1 in stage 1.0 (TID 1) on executor jt-host-kvm-17: java.lang.NullPointerException (null) [duplicate 1]

15/03/13 09:19:24 INFO scheduler.TaskSetManager: Starting task 0.2 in stage 1.0 (TID 2, jt-host-kvm-19, PROCESS_LOCAL, 998 bytes)

15/03/13 09:19:24 INFO scheduler.TaskSetManager: Lost task 0.2 in stage 1.0 (TID 2) on executor jt-host-kvm-19: java.lang.NullPointerException (null) [duplicate 2]

15/03/13 09:19:24 INFO scheduler.TaskSetManager: Starting task 0.3 in stage 1.0 (TID 3, jt-host-kvm-17, PROCESS_LOCAL, 998 bytes)

15/03/13 09:19:24 INFO scheduler.TaskSetManager: Lost task 0.3 in stage 1.0 (TID 3) on executor jt-host-kvm-17: java.lang.NullPointerException (null) [duplicate 3]

15/03/13 09:19:24 ERROR scheduler.TaskSetManager: Task 0 in stage 1.0 failed 4 times; aborting job

15/03/13 09:19:24 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool

15/03/13 09:19:24 INFO scheduler.TaskSchedulerImpl: Cancelling stage 1

15/03/13 09:19:24 INFO scheduler.DAGScheduler: Failed to run collectAsMap at ParkInCountMysql.scala:32

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 3, jt-host-kvm-17): java.lang.NullPointerException:

       org.apache.spark.rdd.JdbcRDD$$anon$1.<init>(JdbcRDD.scala:74)

       org.apache.spark.rdd.JdbcRDD.compute(JdbcRDD.scala:70)

       org.apache.spark.rdd.JdbcRDD.compute(JdbcRDD.scala:50)

       org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)

       org.apache.spark.rdd.RDD.iterator(RDD.scala:229)

       org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:34)

       org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)

       org.apache.spark.rdd.RDD.iterator(RDD.scala:229)

       org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)

       org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)

       org.apache.spark.rdd.RDD.iterator(RDD.scala:229)

       org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)

       org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)

       org.apache.spark.scheduler.Task.run(Task.scala:54)

       org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)

       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

       java.lang.Thread.run(Thread.java:744)

Driver stacktrace:

        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)

        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)

        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)

        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)

        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)

        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)

        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)

        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)

        at scala.Option.foreach(Option.scala:236)

        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)

        at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)

        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)

        at akka.actor.ActorCell.invoke(ActorCell.scala:456)

        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)

        at akka.dispatch.Mailbox.run(Mailbox.scala:219)

        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)

        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)

        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)

        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)

        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)


Fix: pass the MySQL JDBC jar via the --jars option so that it is shipped to the executors. --driver-class-path only puts the jar on the driver's classpath, while JdbcRDD opens its connections inside the executors, which is why the tasks fail with a NullPointerException in cluster mode. The following submission works:

./bin/spark-submit --master spark://jt-host-kvm-17:7077 --class parkMysql.ParkInCountMysql --executor-memory 300m --jars /httx/work/mysql-connector-java-5.1.21.jar /httx/work/work.jar spark://jt-host-kvm-17:7077
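For context, a JdbcRDD job along these lines reproduces the failure mode. This is a minimal sketch, not the original ParkInCountMysql source; the database URL, credentials, table, and column names are illustrative. The point is that the connection-factory closure runs on the executors, so the MySQL driver class must be loadable there; when it is not, the connection ends up unusable and the task dies with the NullPointerException seen at JdbcRDD.scala:74 above.

```scala
import java.sql.{DriverManager, ResultSet}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.JdbcRDD

object ParkInCountSketch {
  def main(args: Array[String]): Unit = {
    // args(0) carries the master URL ("local" or spark://...), as in the submit commands above.
    val sc = new SparkContext(new SparkConf().setMaster(args(0)).setAppName("ParkInCount"))

    val rows = new JdbcRDD(
      sc,
      () => {
        // This factory closure is serialized to and executed on the executors.
        // In cluster mode the driver class is only found if the jar was shipped
        // with --jars (or placed on every worker's classpath); --driver-class-path
        // on the submitting machine does not help here.
        Class.forName("com.mysql.jdbc.Driver")
        DriverManager.getConnection("jdbc:mysql://db-host:3306/park", "user", "pass")
      },
      "SELECT id, lot FROM park_in WHERE id >= ? AND id <= ?",  // bounds fill the two '?'
      1, 1000000, 1,                                            // lowerBound, upperBound, numPartitions
      (rs: ResultSet) => rs.getString(2)                        // map each row to the lot column
    )

    // Aggregate counts per lot, mirroring the collectAsMap call in the log.
    val counts = rows.map((_, 1)).reduceByKey(_ + _).collectAsMap()
    println(counts)
    sc.stop()
  }
}
```

The same reasoning applies to any JDBC driver used from inside RDD closures: jars needed by executor-side code go in --jars, while --driver-class-path affects only the driver JVM.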
