Nutch 1.1 throws the following exception:
Fetcher: No agents listed in 'http.agent.name' property.
Exception in thread "main" java.lang.IllegalArgumentException: Fetcher: No agents listed in 'http.agent.name' property.
    at org.apache.nutch.fetcher.Fetcher.checkConfiguration(Fetcher.java:1161)
    at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1067)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:133)
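After some searching, I learned that the cause is a property setting in nutch-default.xml: http.agent.name is empty.
The setting before the exception was thrown: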
<property>
  <name>http.agent.name</name>
  <value></value>
  <description>HTTP 'User-Agent' request header. MUST NOT be empty -
  please set this to a single word uniquely related to your organization.
  NOTE: You should also check other related properties:
    http.robots.agents
    http.agent.description
    http.agent.url
    http.agent.email
    http.agent.version
  and set their values appropriately.
  </description>
</property>
After setting a non-empty value as follows, the exception no longer appears:
<property>
  <name>http.agent.name</name>
  <value>HD nutch agent</value>
  <description>HTTP 'User-Agent' request header. MUST NOT be empty -
  please set this to a single word uniquely related to your organization.
  NOTE: You should also check other related properties:
    http.robots.agents
    http.agent.description
    http.agent.url
    http.agent.email
    http.agent.version
  and set their values appropriately.
  </description>
</property>
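The same fix can also be made without touching nutch-default.xml. Properties in conf/nutch-site.xml override the defaults, so the change survives an upgrade of the default file. The following is only a minimal sketch. The agent name HDNutchAgent is a hypothetical single-word value, chosen because the property description asks for a single word. The http.robots.agents entry is an assumption based on the related properties named in that description.

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: site-specific overrides of nutch-default.xml -->
<configuration>
  <property>
    <name>http.agent.name</name>
    <!-- Hypothetical value; replace with a single word tied to your organization -->
    <value>HDNutchAgent</value>
  </property>
  <property>
    <name>http.robots.agents</name>
    <!-- Assumed setting: the agent name first, with the default * kept last -->
    <value>HDNutchAgent,*</value>
  </property>
</configuration>
```

With an override like this in place, the Fetcher configuration check should find a non-empty http.agent.name and the IllegalArgumentException should not be raised.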