When importing data from MySQL into HDFS with Sqoop, some map tasks would occasionally fail, leaving a random portion of the imported data missing. The missing rows differed from run to run, and the failure is easy to overlook unless you read the output log carefully. The errors looked like this:
2022-09-26 14:27:09,940 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:44697. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
Job failed as tasks failed. failedMaps:1 failedReduces:0 killedMaps:0 killedReduces: 0
2022-09-26 14:29:04,326 INFO mapreduce.ImportJobBase: The MapReduce job has already been retired. Performance
2022-09-26 14:29:04,326 INFO mapreduce.ImportJobBase: counters are unavailable. To get this information,
2022-09-26 14:29:04,326 INFO mapreduce.ImportJobBase: you will need to enable the completed job store on
2022-09-26 14:29:04,327 INFO mapreduce.ImportJobBase: the jobtracker with:
2022-09-26 14:29:04,327 INFO mapreduce.ImportJobBase: mapreduce.jobtracker.persist.jobstatus.active = true
2022-09-26 14:29:04,327 INFO mapreduce.ImportJobBase: mapreduce.jobtracker.persist.jobstatus.hours = 1
2022-09-26 14:29:04,327 INFO mapreduce.ImportJobBase: A jobtracker restart is required for these settings
2022-09-26 14:29:04,327 INFO mapreduce.ImportJobBase: to take effect.
2022-09-26 14:29:04,327 ERROR tool.ImportTool: Error during import: Import job failed!
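For reference, the two settings the log hints at belong in mapred-site.xml; a minimal sketch, using the values the log itself suggests (the exact file location depends on your Hadoop install):

    <!-- mapred-site.xml: enable the completed job store so job counters
         remain available after a job has been retired -->
    <property>
      <name>mapreduce.jobtracker.persist.jobstatus.active</name>
      <value>true</value>
    </property>
    <property>
      <name>mapreduce.jobtracker.persist.jobstatus.hours</name>
      <value>1</value>
    </property>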
After a long investigation, including testing those two configuration options, it turned out they were not the problem. The real cause was that one machine in the Hadoop cluster had its hostname set incorrectly when the cluster was built; correcting the hostname fixed the import. A small oversight during cluster setup cost hours of debugging afterwards, so I'm writing it down here as a hint for anyone who hits the same problem.
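The telltale sign in the log above is that a task keeps retrying against localhost/127.0.0.1 rather than another node's real address, which is consistent with a node whose hostname resolves to the loopback interface. A minimal sketch of how to check and fix this on each node (hadoop102 is a placeholder name, not from my cluster; adjust for yours):

    # What does this machine call itself?
    hostname

    # /etc/hosts must map that name to the node's real IP,
    # not to 127.0.0.1 / localhost
    cat /etc/hosts

    # On a systemd-based distro, correct a wrong hostname like so
    # (hadoop102 is a placeholder), then restart the Hadoop daemons
    # on that node so the change takes effect
    sudo hostnamectl set-hostname hadoop102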