Troubleshooting a Spark development-environment exception (Connection timed out)
2020-04-02 22:48:47,973 [Executor task launch worker-0] WARN org.apache.hadoop.hdfs.DFSClient [DFSInputStream.java:571] - Failed to connect to /172.18.0.5:50010 for block, add to deadNodes and continue. java.net.ConnectException: Connection timed out: no further information
java.net.ConnectException: Connection timed out: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.hdfs.DFSInputStream.newTcpPeer(DFSInputStream.java:955)
at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:1107)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:533)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:266)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:211)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1760)
at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
My development environment: the local machine connects to server A, which forwards traffic on to the HDFS cluster.
The key part of the error to focus on is: Failed to connect to /172.18.0.5:50010 for block
First, run a quick test from the local machine:
ping 172.18.0.5
This will certainly fail to reach the host.
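Besides ping, you can probe TCP reachability of the DataNode port directly. A minimal sketch in Python (the host and port come from the error message above; `can_connect` is just an illustrative helper, not part of any Spark/Hadoop API):

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False

# Probe the DataNode address from the error message; before the route
# is added, this is expected to print False.
print(can_connect("172.18.0.5", 50010, timeout=2.0))
```

If this still prints False after the fixes below, the problem is on the routing/firewall side rather than in Spark itself.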
So we need to add a forwarding route on the local machine and disable the firewall on server A.
1. On local Windows (run CMD as Administrator):
route -p add 172.18.0.0 mask 255.255.255.0 192.168.1.107
Here 192.168.1.107 is server A's IP. After adding the route, ping succeeds.
2. Disable the firewall on the server:
Ubuntu: sudo ufw disable
Note:
You may still be unable to ping after adding the route. Try deleting and re-adding it a few times, and make sure the last octet of the destination is 0:
route delete 172.18.0.0 // delete
then add it again:
route -p add 172.18.0.0 mask 255.255.255.0 192.168.1.107 // add
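The reason the last octet must be 0 is that `route add` expects a network address (here 172.18.0.0/24), not a host address like 172.18.0.5. Python's standard `ipaddress` module can be used to sanity-check this; a small illustrative sketch:

```python
import ipaddress

# The route destination 172.18.0.0 with mask 255.255.255.0 is the /24 network.
net = ipaddress.ip_network("172.18.0.0/255.255.255.0")
print(net)  # 172.18.0.0/24

# The DataNode from the error message is covered by this route.
print(ipaddress.ip_address("172.18.0.5") in net)  # True

# A host address such as 172.18.0.5 is not a valid network address:
# ip_network rejects it under strict checking (the default).
try:
    ipaddress.ip_network("172.18.0.5/24")
except ValueError as e:
    print("invalid destination:", e)
```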
After rerunning the job, it completes without errors and computes the ratio correctly.