OS: Ubuntu 18.04
Nutch 2.4
Ant 1.10
JDK 1.8
- Could not load definitions from resource org/sonar/ant/antlib.xml. It could not be found.
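This error appears when building Nutch with Ant. It indicates that a required package (the Sonar Ant task) is missing.
1. Download sonar-ant-task-2.1.jar and copy it into the lib folder of the unpacked Nutch directory.
2. Edit build.xml in the Nutch folder so that it loads the jar above: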
<taskdef uri="antlib:org.sonar.ant" resource="org/sonar/ant/antlib.xml">
<classpath path="${ant.library.dir}" />
<classpath path="${mysql.library.dir}" />
<classpath><fileset dir="lib/" includes="sonar*.jar" /></classpath>
</taskdef>
- JDK version too high or too low
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by
org.apache.hadoop.security.authentication.util.KerberosUtil
(file:/C:/cygwin64/home/apache-nutch-1.15/lib/hadoop-auth-2.7.4.jar) to
method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of
org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal
reflective access operations
WARNING: All illegal access operations will be denied in a future
release
JDK 1.8 is recommended. The illegal reflective access warnings shown here are emitted by JDK 9 and later.
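One way to catch a mismatched JDK early is to have Ant stop before it compiles anything. The fragment below is a minimal sketch and is not part of the stock Nutch build.xml: it assumes you paste it directly under the `<project>` element, and it relies on Ant's built-in `ant.java.version` property, which reports `1.8` when Ant runs on JDK 8.

```xml
<!-- Hypothetical guard (not in the stock Nutch build.xml):
     abort the build unless Ant is running on JDK 1.8 -->
<fail message="Nutch build expects JDK 1.8, but Ant is running on Java ${ant.java.version}">
  <condition>
    <not>
      <equals arg1="${ant.java.version}" arg2="1.8"/>
    </not>
  </condition>
</fail>
```

With this in place, running ant on a newer JDK should fail immediately with the message above instead of surfacing reflective access warnings later.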
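- Loading from the Maven repositories fails and the build hangs at the statement below; use the Aliyun mirror repositories instead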
![0a16a7ec835f0af14b222ef7cea75a46.png](https://i-blog.csdnimg.cn/blog_migrate/e6822232053ef39d66d3c6af0a069ecf.png)
Edit ivy/ivysettings.xml and switch to faster Ivy repositories:
[devalone@nutch apache-nutch-1.14]$ vi ivy/ivysettings.xml
Nutch's ivysettings.xml configures 3 Maven module repositories:
<property name="oss.sonatype.org"
value="http://oss.sonatype.org/content/repositories/releases/"
override="false"/>
<property name="repo.maven.org"
value="http://repo1.maven.org/maven2/"
override="false"/>
<property name="repository.apache.org"
value="https://repository.apache.org/content/repositories/snapshots/"
override="false"/>
Replace these 3 repositories with faster ones, for example:
http://maven.aliyun.com/nexus/content/groups/public/
http://maven.aliyun.com/nexus/content/repositories/central/
http://maven.aliyun.com/nexus/content/repositories/apache-snapshots/
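As a rough sketch, the edited properties might look like the fragment below. Two things here are assumptions rather than something stated above: the property names are those found in a stock Nutch ivysettings.xml, and the three mirror URLs are matched to the three properties in the order they are listed. Check both against your own file before saving.

```xml
<!-- Assumed mapping: mirror URLs applied in the order listed above -->
<property name="oss.sonatype.org"
    value="http://maven.aliyun.com/nexus/content/groups/public/"
    override="false"/>
<property name="repo.maven.org"
    value="http://maven.aliyun.com/nexus/content/repositories/central/"
    override="false"/>
<property name="repository.apache.org"
    value="http://maven.aliyun.com/nexus/content/repositories/apache-snapshots/"
    override="false"/>
```

Only the value attributes change; keeping the property names intact means the resolver definitions further down in the file continue to reference them unchanged.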
- Nutch requires at least the http.agent.name property to be configured before it runs. Open conf/nutch-site.xml and add something like the following:
[devalone@nutch nutch]$ vi conf/nutch-site.xml
<property>
<name>http.agent.name</name>
<value>My Nutch Spider</value>
</property>
The value can be any reasonable string, as long as it is not empty or whitespace-only.