When port 50030 is already in use:
2011-05-1 14:30:43,931 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50030
2011-05-1 14:30:43,933 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Address already in use
Change the port number in mapred-default.xml (or, better, override the property in mapred-site.xml):
<property>
  <name>mapred.job.tracker.http.address</name>
  <value>0.0.0.0:50030</value>
  <description>The job tracker http server address and port
  the server will listen on. If the port is 0 then the server
  will start on a free port.
  </description>
</property>
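If you would rather find and stop whatever is holding the port instead of changing it, a quick check looks like this (assuming netstat and lsof are installed on the machine):

netstat -anp | grep 50030    # show the PID of the process listening on 50030
lsof -i :50030               # alternative: list the process that owns the port

Kill the stale process (for example a JobTracker that did not shut down cleanly) or move the web UI to another port as shown above.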
java.io.IOException: All datanodes xxx.xxx.xxx.xxx:xxx are bad. Aborting…
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2158)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
java.io.IOException: Could not get block locations. Aborting…
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
Investigation showed the cause was that the Linux machine had too many files open. The command ulimit -n shows that the default open-file limit on Linux is 1024. Edit /etc/security/limits.conf and add a nofile entry raising the limit for the hadoop user to 65535 (see the example below), then rerun the job. Ideally make the change on every datanode. That solved the problem.
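A minimal sketch of the limits.conf entries, assuming the datanode processes run as the user hadoop (the user name is an assumption; adjust it to your cluster):

# /etc/security/limits.conf
hadoop soft nofile 65535
hadoop hard nofile 65535

Log in again as that user and confirm the new limit with ulimit -n before restarting the datanodes.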
P.S.: It is said that HDFS cannot manage more than about 100 million files in total; this remains to be verified.
The following error is reported when starting Hadoop:
java.io.IOException: File /exapp/hadoop/hadooptmp/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
The cause was not apparent here, so I checked the log on the datanode.
That revealed the problem: because I had formatted the namenode again earlier, the namenode and datanode versions were inconsistent (their recorded namespaceIDs no longer matched).
Solution: on each datanode, find the data directory (dfs.data.dir; its location differs depending on how Hadoop was installed) and either remove it or make the namespaceID in current/VERSION match the namenode's.
Then start Hadoop again.
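A minimal sketch of that fix, assuming hadoop.tmp.dir is /exapp/hadoop/hadooptmp (as suggested by the error above) so the default name and data directories sit under it; adjust the paths to your installation:

# On the namenode: note the namespaceID written by the latest format
cat /exapp/hadoop/hadooptmp/dfs/name/current/VERSION

# On each datanode: edit the namespaceID so it matches the namenode's
vi /exapp/hadoop/hadooptmp/dfs/data/current/VERSION

# Alternative (the blocks on that datanode are lost): wipe its data directory
rm -rf /exapp/hadoop/hadooptmp/dfs/data

# Then restart the cluster
bin/start-all.sh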