1. Data migration between clusters of the same version, or within a single cluster (running the command below directly almost never produces errors)
hadoop distcp hdfs://namenodeip1:9000/user/root hdfs://namenodeip2:9000/user/root
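A minimal sketch of the same-version copy with two commonly used DistCp options, -update (skip files that already exist at the destination with the same size) and -m (cap the number of map tasks). The namenode addresses are placeholders; the command is echoed rather than executed so you can review it first:

```shell
# Placeholder namenode addresses for the source and destination clusters.
SRC="hdfs://namenodeip1:9000/user/root"
DST="hdfs://namenodeip2:9000/user/root"

# -update skips files already present at the destination with the same size;
# -m 20 limits the copy to 20 map tasks. Echoed here instead of executed.
CMD="hadoop distcp -update -m 20 $SRC $DST"
echo "$CMD"
```

Drop the echo and run the command directly once the addresses match your clusters.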
2. Data migration between clusters of different versions
hadoop distcp hftp://namenodeip1:50070/user/root hdfs://namenodeip2:9000/user/root
Unlike the same-version case, this command reads the source over the HTTP-based hftp filesystem; the default HTTP port is 50070, hence hftp://namenode1:50070.
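Because hftp is a read-only filesystem, the job must run on the destination (newer) cluster: the source is read over hftp and the destination written over hdfs. A sketch with placeholder addresses, echoing the command instead of executing it:

```shell
# Run DistCp on the destination cluster. hftp is read-only, so it can
# only appear on the source side of the copy.
SRC="hftp://namenodeip1:50070/user/root"   # old cluster, HTTP port
DST="hdfs://namenodeip2:9000/user/root"    # new cluster, RPC port
CMD="hadoop distcp $SRC $DST"
echo "$CMD"
```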
If password-less SSH login is not configured between every node of the new and old clusters, the job fails with an error like the following:
15/01/14 10:30:10 INFO mapreduce.Job: map 50% reduce 0%
15/01/14 10:30:10 INFO mapreduce.Job: Task Id : attempt_1421143152073_0009_m_000001_0, Status : FAILED
Error: java.io.IOException: File copy failed: hftp://192.168.80.31:50070/user/wp/test.txt --> hdfs://192.168.210.10:8020/user/wp1/wp/test.txt
        at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
        at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229)
        at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoo