Recently I have been experimenting with calling Spark on a cluster remotely from a local machine, using both Python and Java. For the details of how to call Spark remotely from Java, see my other post:
Setting up a Maven-based Java Spark environment in IDEA
Experiment Environment
| Client | Cluster |
| --- | --- |
| Single CentOS machine | 4-node Hadoop cluster; IPs 192.168.20.[61,62,63,64], with 61 as the master |
Problem Description
The code itself is fairly simple. The Java version is shown below; it reads a file from my HDFS and counts its lines.
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class JavaTest {
    public static final String master = "spark://192.168.20.61:7077";

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("demo").setMaster(master);
        JavaSparkContext sc = new JavaSparkContext(conf);
        System.out.println(sc.textFile("hdfs://192.168.20.61:9000/flume.exec.log").count());
        sc.stop();
    }
}
Running it produced a stream of errors, many of them repeating in a loop, so I have extracted the key messages below (the full log is in the appendix). Roughly, each executor is added successfully, then immediately dies, then a replacement is added and dies again, over and over.
18/10/31 15:38:07 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/3 on worker-20181029153030-192.168.20.63-48117 (192.168.20.63:48117) with 8 cores
18/10/31 15:38:07 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/3 on hostPort 192.168.20.63:48117 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:07 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/3 is now RUNNING
18/10/31 15:38:08 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/0 is now EXITED (Command exited with code 1)
18/10/31 15:38:08 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/0 removed: Command exited with code 1
18/10/31 15:38:08 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 0
18/10/31 15:38:08 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/4 on worker-20181029153028-192.168.20.62-36690 (192.168.20.62:36690) with 8 cores
18/10/31 15:38:08 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/4 on hostPort 192.168.20.62:36690 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:08 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/4 is now RUNNING
18/10/31 15:38:08 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/2 is now EXITED (Command exited with code 1)
18/10/31 15:38:08 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/2 removed: Command exited with code 1
18/10/31 15:38:08 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 2
My Solution
Since the local driver has to receive executor status updates from the cluster, I suspected a communication problem: the local machine may be unable to accept the connections coming back from the cluster. Try turning off the local firewall and see whether that resolves it. In my case, the problem was exactly that the local firewall was still running.
The commands to stop the firewall (this was CentOS 6, which uses the iptables service) are:
su root
service iptables stop
On CentOS 7 and later, firewalld replaces the iptables service, so the equivalent is `systemctl stop firewalld`.
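Before (or instead of) disabling the firewall outright, you can confirm that it really is a connectivity problem by checking from a worker node whether the driver's ports are reachable, e.g. the sparkDriver port shown in the log (43676 on host 192.168.20.210 in my run). A minimal sketch using plain sockets (the helper name is my own, not part of Spark):

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False

# Example: check the driver endpoint the executors must connect back to,
# e.g. port_reachable("192.168.20.210", 43676)
```

If this returns False from a worker while the driver process is running, packets back to the driver are being blocked, which matches the firewall symptom above.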
Afterword
I searched through a lot of material at the time and found all kinds of proposed fixes: some said Java could not be executed, others blamed the data volume being too large. What I hit was simply a communication problem, so give this a try first. The other solutions are linked below:
Appendix
- Full error log
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/10/31 15:38:04 INFO SparkContext: Running Spark version 1.6.1
18/10/31 15:38:04 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/10/31 15:38:05 INFO SecurityManager: Changing view acls to: hadoop
18/10/31 15:38:05 INFO SecurityManager: Changing modify acls to: hadoop
18/10/31 15:38:05 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
18/10/31 15:38:05 INFO Utils: Successfully started service 'sparkDriver' on port 43676.
18/10/31 15:38:05 INFO Slf4jLogger: Slf4jLogger started
18/10/31 15:38:05 INFO Remoting: Starting remoting
18/10/31 15:38:05 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.20.210:60649]
18/10/31 15:38:05 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 60649.
18/10/31 15:38:05 INFO SparkEnv: Registering MapOutputTracker
18/10/31 15:38:05 INFO SparkEnv: Registering BlockManagerMaster
18/10/31 15:38:05 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-5e7f6678-df22-4f4a-aeec-dd642cba5ad0
18/10/31 15:38:05 INFO MemoryStore: MemoryStore started with capacity 1077.8 MB
18/10/31 15:38:05 INFO SparkEnv: Registering OutputCommitCoordinator
18/10/31 15:38:05 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/10/31 15:38:05 INFO SparkUI: Started SparkUI at http://192.168.20.210:4040
18/10/31 15:38:06 INFO AppClient$ClientEndpoint: Connecting to master spark://192.168.20.61:7077...
18/10/31 15:38:06 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20181031153851-0007
18/10/31 15:38:06 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/0 on worker-20181029153028-192.168.20.62-36690 (192.168.20.62:36690) with 8 cores
18/10/31 15:38:06 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/0 on hostPort 192.168.20.62:36690 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:06 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/1 on worker-20181029153030-192.168.20.63-48117 (192.168.20.63:48117) with 8 cores
18/10/31 15:38:06 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/1 on hostPort 192.168.20.63:48117 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:06 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/2 on worker-20181029153036-192.168.20.64-52254 (192.168.20.64:52254) with 8 cores
18/10/31 15:38:06 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/2 on hostPort 192.168.20.64:52254 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:06 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 54421.
18/10/31 15:38:06 INFO NettyBlockTransferService: Server created on 54421
18/10/31 15:38:06 INFO BlockManagerMaster: Trying to register BlockManager
18/10/31 15:38:06 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.20.210:54421 with 1077.8 MB RAM, BlockManagerId(driver, 192.168.20.210, 54421)
18/10/31 15:38:06 INFO BlockManagerMaster: Registered BlockManager
18/10/31 15:38:06 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/1 is now RUNNING
18/10/31 15:38:06 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/0 is now RUNNING
18/10/31 15:38:06 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/2 is now RUNNING
18/10/31 15:38:06 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
18/10/31 15:38:06 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 127.0 KB, free 127.0 KB)
18/10/31 15:38:06 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 15.5 KB, free 142.5 KB)
18/10/31 15:38:06 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.20.210:54421 (size: 15.5 KB, free: 1077.7 MB)
18/10/31 15:38:06 INFO SparkContext: Created broadcast 0 from textFile at JavaTest.java:9
18/10/31 15:38:06 INFO FileInputFormat: Total input paths to process : 1
18/10/31 15:38:06 INFO SparkContext: Starting job: count at JavaTest.java:9
18/10/31 15:38:06 INFO DAGScheduler: Got job 0 (count at JavaTest.java:9) with 2 output partitions
18/10/31 15:38:06 INFO DAGScheduler: Final stage: ResultStage 0 (count at JavaTest.java:9)
18/10/31 15:38:06 INFO DAGScheduler: Parents of final stage: List()
18/10/31 15:38:06 INFO DAGScheduler: Missing parents: List()
18/10/31 15:38:06 INFO DAGScheduler: Submitting ResultStage 0 (hdfs://192.168.20.61:9000/flume.exec.log MapPartitionsRDD[1] at textFile at JavaTest.java:9), which has no missing parents
18/10/31 15:38:06 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 2.9 KB, free 145.4 KB)
18/10/31 15:38:06 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1798.0 B, free 147.2 KB)
18/10/31 15:38:06 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.20.210:54421 (size: 1798.0 B, free: 1077.7 MB)
18/10/31 15:38:06 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
18/10/31 15:38:06 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (hdfs://192.168.20.61:9000/flume.exec.log MapPartitionsRDD[1] at textFile at JavaTest.java:9)
18/10/31 15:38:06 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
18/10/31 15:38:07 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/1 is now EXITED (Command exited with code 1)
18/10/31 15:38:07 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/1 removed: Command exited with code 1
18/10/31 15:38:07 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 1
18/10/31 15:38:07 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/3 on worker-20181029153030-192.168.20.63-48117 (192.168.20.63:48117) with 8 cores
18/10/31 15:38:07 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/3 on hostPort 192.168.20.63:48117 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:07 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/3 is now RUNNING
18/10/31 15:38:08 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/0 is now EXITED (Command exited with code 1)
18/10/31 15:38:08 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/0 removed: Command exited with code 1
18/10/31 15:38:08 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 0
18/10/31 15:38:08 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/4 on worker-20181029153028-192.168.20.62-36690 (192.168.20.62:36690) with 8 cores
18/10/31 15:38:08 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/4 on hostPort 192.168.20.62:36690 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:08 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/4 is now RUNNING
18/10/31 15:38:08 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/2 is now EXITED (Command exited with code 1)
18/10/31 15:38:08 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/2 removed: Command exited with code 1
18/10/31 15:38:08 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 2
18/10/31 15:38:08 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/5 on worker-20181029153036-192.168.20.64-52254 (192.168.20.64:52254) with 8 cores
18/10/31 15:38:08 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/5 on hostPort 192.168.20.64:52254 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:08 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/5 is now RUNNING
18/10/31 15:38:09 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/3 is now EXITED (Command exited with code 1)
18/10/31 15:38:09 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/3 removed: Command exited with code 1
18/10/31 15:38:09 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 3
18/10/31 15:38:09 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/6 on worker-20181029153030-192.168.20.63-48117 (192.168.20.63:48117) with 8 cores
18/10/31 15:38:09 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/6 on hostPort 192.168.20.63:48117 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:09 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/6 is now RUNNING
18/10/31 15:38:09 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/4 is now EXITED (Command exited with code 1)
18/10/31 15:38:09 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/4 removed: Command exited with code 1
18/10/31 15:38:09 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 4
18/10/31 15:38:09 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/7 on worker-20181029153028-192.168.20.62-36690 (192.168.20.62:36690) with 8 cores
18/10/31 15:38:09 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/7 on hostPort 192.168.20.62:36690 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:09 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/7 is now RUNNING
18/10/31 15:38:09 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/5 is now EXITED (Command exited with code 1)
18/10/31 15:38:09 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/5 removed: Command exited with code 1
18/10/31 15:38:09 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 5
18/10/31 15:38:09 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/8 on worker-20181029153036-192.168.20.64-52254 (192.168.20.64:52254) with 8 cores
18/10/31 15:38:09 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/8 on hostPort 192.168.20.64:52254 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:09 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/8 is now RUNNING
18/10/31 15:38:11 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/7 is now EXITED (Command exited with code 1)
18/10/31 15:38:11 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/7 removed: Command exited with code 1
18/10/31 15:38:11 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 7
18/10/31 15:38:11 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/9 on worker-20181029153028-192.168.20.62-36690 (192.168.20.62:36690) with 8 cores
18/10/31 15:38:11 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/9 on hostPort 192.168.20.62:36690 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:11 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/9 is now RUNNING
18/10/31 15:38:14 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/6 is now EXITED (Command exited with code 1)
18/10/31 15:38:14 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/6 removed: Command exited with code 1
18/10/31 15:38:14 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 6
18/10/31 15:38:14 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/10 on worker-20181029153030-192.168.20.63-48117 (192.168.20.63:48117) with 8 cores
18/10/31 15:38:14 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/10 on hostPort 192.168.20.63:48117 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:14 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/10 is now RUNNING
18/10/31 15:38:14 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/8 is now EXITED (Command exited with code 1)
18/10/31 15:38:14 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/8 removed: Command exited with code 1
18/10/31 15:38:14 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 8
18/10/31 15:38:14 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/11 on worker-20181029153036-192.168.20.64-52254 (192.168.20.64:52254) with 8 cores
18/10/31 15:38:14 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/11 on hostPort 192.168.20.64:52254 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:14 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/11 is now RUNNING
18/10/31 15:38:16 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/9 is now EXITED (Command exited with code 1)
18/10/31 15:38:16 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/9 removed: Command exited with code 1
18/10/31 15:38:16 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 9
18/10/31 15:38:16 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/12 on worker-20181029153028-192.168.20.62-36690 (192.168.20.62:36690) with 8 cores
18/10/31 15:38:16 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/12 on hostPort 192.168.20.62:36690 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:16 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/12 is now RUNNING
18/10/31 15:38:19 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/10 is now EXITED (Command exited with code 1)
18/10/31 15:38:19 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/10 removed: Command exited with code 1
18/10/31 15:38:19 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 10
18/10/31 15:38:19 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/13 on worker-20181029153030-192.168.20.63-48117 (192.168.20.63:48117) with 8 cores
18/10/31 15:38:19 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/13 on hostPort 192.168.20.63:48117 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:19 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/13 is now RUNNING
18/10/31 15:38:19 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/11 is now EXITED (Command exited with code 1)
18/10/31 15:38:19 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/11 removed: Command exited with code 1
18/10/31 15:38:19 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 11
18/10/31 15:38:19 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/14 on worker-20181029153036-192.168.20.64-52254 (192.168.20.64:52254) with 8 cores
18/10/31 15:38:19 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/14 on hostPort 192.168.20.64:52254 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:19 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/14 is now RUNNING
18/10/31 15:38:21 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/12 is now EXITED (Command exited with code 1)
18/10/31 15:38:21 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/12 removed: Command exited with code 1
18/10/31 15:38:21 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 12
18/10/31 15:38:21 INFO AppClient$ClientEndpoint: Executor added: app-20181031153851-0007/15 on worker-20181029153028-192.168.20.62-36690 (192.168.20.62:36690) with 8 cores
18/10/31 15:38:21 INFO SparkDeploySchedulerBackend: Granted executor ID app-20181031153851-0007/15 on hostPort 192.168.20.62:36690 with 8 cores, 1024.0 MB RAM
18/10/31 15:38:21 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/15 is now RUNNING
18/10/31 15:38:21 INFO AppClient$ClientEndpoint: Executor updated: app-20181031153851-0007/13 is now EXITED (Command exited with code 1)
18/10/31 15:38:21 INFO SparkDeploySchedulerBackend: Executor app-20181031153851-0007/13 removed: Command exited with code 1
18/10/31 15:38:21 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 13