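Studying Hadoop means reading its source code. These notes record how I imported the source into Eclipse and the problems I ran into.
The Maven mirror used for the build (configured in settings.xml):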
<mirrors>
<mirror>
<id>nexus-osc</id>
<mirrorOf>*</mirrorOf>
<name>Nexus osc</name>
<url>http://maven.oschina.net/content/groups/public/</url>
</mirror>
</mirrors>
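The <mirrors> block above goes inside the top-level <settings> element of Maven's settings.xml. A minimal sketch of a complete file, assuming no other settings (proxies, profiles, local repository path) are needed:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal settings.xml sketch: only the mirror from these notes is configured -->
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
  <mirrors>
    <mirror>
      <id>nexus-osc</id>
      <!-- "*" routes requests for every repository through this mirror -->
      <mirrorOf>*</mirrorOf>
      <name>Nexus osc</name>
      <url>http://maven.oschina.net/content/groups/public/</url>
    </mirror>
  </mirrors>
</settings>
```

Because mirrorOf is *, every artifact download goes through this one mirror, so the build depends on that mirror being reachable.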
Importing into Eclipse
Download the source code and extract it (keep the extraction path short). Go to the hadoop-maven-plugins directory and run
mvn install
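Return to the source root directory and run mvn eclipse:eclipse -DskipTests
In Eclipse, open Window -> Preferences -> Maven -> User Settings
Select the settings file D:\apache-maven-3.3.3\conf\settings.xml, then choose File -> Import -> Existing Maven Projects and select the source directory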
Fixing errors
After the import, more than 100 problems were reported. They are handled by category below:
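1. org.apache.hadoop.io.serializer.avro.TestAvroSerialization
This test needs a class generated from avroRecord.avsc, so generate it by hand:
Download avro-tools-1.7.4.jar
Go to the directory \hadoop-common-project\hadoop-common\src\test\avro
Run java -jar <download directory>/avro-tools-1.7.4.jar compile schema avroRecord.avsc ..\java (the last argument is the output directory)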
2. protobuf errors
Go to the directory \hadoop-common-project\hadoop-common\src\test\proto and run protoc --java_out=../java *.
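3. Maven Project Build Lifecycle Mapping Problem: Plugin execution not covered by lifecycle configuration: org.apache.hadoop:hadoop-maven-plugins:2.6.0:protoc
Reference 1
This error comes from the m2e plugin: when it reads the pom it finds no lifecycle mapping for this plugin execution, and it expects such mappings to be declared in the pom in a specific format (under <pluginManagement>), which is rather tedious.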
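Select the problem in the Problems view and press Ctrl+1 to open Quick Fix
Choose Permanently mark goal generate in pom.xml as ignored in Eclipse build
In the dialog that opens, select the lowest-level folder and confirm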
Once all of these are handled, the Problems view reports
Project configuration is not up-to-date with pom.xml. Run project configuration update
Choose Quick Fix -> Update project configuration -> OK
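For reference, the "mark goal as ignored" Quick Fix from item 3 records the decision in the selected pom.xml as an m2e lifecycle-mapping entry. Below is a sketch of what that entry typically looks like for the protoc goal named in the error message. The version range is an assumption, and the text Eclipse generates may differ slightly:

```xml
<build>
  <pluginManagement>
    <plugins>
      <!-- Stores Eclipse m2e settings only; it does not affect the Maven build itself -->
      <plugin>
        <groupId>org.eclipse.m2e</groupId>
        <artifactId>lifecycle-mapping</artifactId>
        <version>1.0.0</version>
        <configuration>
          <lifecycleMappingMetadata>
            <pluginExecutions>
              <pluginExecution>
                <!-- Which plugin execution this mapping applies to -->
                <pluginExecutionFilter>
                  <groupId>org.apache.hadoop</groupId>
                  <artifactId>hadoop-maven-plugins</artifactId>
                  <!-- Assumed range: 2.6.0 and later -->
                  <versionRange>[2.6.0,)</versionRange>
                  <goals>
                    <goal>protoc</goal>
                  </goals>
                </pluginExecutionFilter>
                <!-- Tell m2e to skip this goal during Eclipse builds -->
                <action>
                  <ignore></ignore>
                </action>
              </pluginExecution>
            </pluginExecutions>
          </lifecycleMappingMetadata>
        </configuration>
      </plugin>
    </plugins>
  </pluginManagement>
</build>
```

With the ignore action, Eclipse no longer runs the goal, so generated sources such as the protobuf classes still have to come from a command-line build or a manual protoc run.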
The error disappears. At this point you can compare pom.xml before and after the change.
4. Build path error in hadoop-streaming: open Java Build Path -> Source:
Remove …hadoop-yarn-server-resourcemanager/conf
Link Source: <source root>/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf, and give it any folder name;
Next, inclusion patterns: capacity-scheduler.xml;
exclusion patterns: **/*.java
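After these steps, the source entry that Eclipse stores in the hadoop-streaming project's .classpath file should look roughly like the sketch below. The folder name conf is an assumption (any name works in the step above), and Eclipse may write the attributes in a different order:

```xml
<!-- Sketch of the linked source folder entry in .classpath; "conf" is a hypothetical folder name -->
<classpathentry kind="src" path="conf"
    including="capacity-scheduler.xml"
    excluding="**/*.java"/>
```

The link target itself (the resourcemanager conf directory) is stored separately, in the project's .project file.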