I wrote a MapReduce program inside a Spring Boot project and submitted the job to the cluster while debugging locally. At first I built the jar with the `package` action in the Maven panel on the right side of IntelliJ IDEA 2019.3 x64, and the job failed with the following error:
Error: java.lang.RuntimeException: readObject can't find class
at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:136)
at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readFields(TaggedInputSplit.java:122)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:372)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:754)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class zut.edu.mapreduce.EmpMapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2171)
at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:134)
... 11 more
The cluster could not find the custom Mapper class. Building the jar with IDEA's Build (as an artifact) instead, and pointing the job at that jar's path, solved the problem. The root cause is that Maven's `package` and IDEA's Build produce jars with different internal structures: with the Spring Boot Maven plugin, `mvn package` repackages the jar and nests the application classes under `BOOT-INF/classes/`, where Hadoop's task classloader cannot find them, while the IDEA artifact places the `.class` files at the top level of the jar.
The IDEA Build packaging steps are as follows:
1. Click File -> Project Structure -> Artifacts in the top-left menu.
2. Click the + button and create an Empty artifact.
3. Add the project's Module Output to the artifact.
4. Click Apply, then OK, then run Build from the toolbar.
5. Once the build succeeds, hadoop.jar appears in the artifact's output directory.
6. In the driver code, set mapreduce.job.jar on the Configuration to the path of that jar.
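Step 6 can be sketched as a minimal job driver. This is an illustrative sketch, not the original project's code: the driver class name, the job name, the artifact output path, and the input/output arguments are all hypothetical placeholders; only `zut.edu.mapreduce.EmpMapper` comes from the stack trace above.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class EmpDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the job at the jar built by the IDEA artifact, so that
        // YARN containers can load the custom Mapper class. The path
        // below is a hypothetical local path to the built hadoop.jar.
        conf.set("mapreduce.job.jar", "out/artifacts/hadoop/hadoop.jar");

        // Job.getInstance copies the Configuration, so set the jar first.
        Job job = Job.getInstance(conf, "emp-job");
        job.setMapperClass(zut.edu.mapreduce.EmpMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        // Input and output paths are placeholders passed on the command line.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Equivalently, `job.setJar("out/artifacts/hadoop/hadoop.jar")` sets the same `mapreduce.job.jar` property after the Job is created.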