Use SVN to check out the project from sourceforge.net: https://archive-crawler.svn.sourceforge.net/svnroot/archive-crawler/trunk/heritrix3
Especially if you're customizing Heritrix (as seems to be the case, given
that you're setting up a dev environment), you should base your work on
Heritrix 3.0.0/heritrix3 trunk (aka 'H3').

H3 is the main focus of our development going forward, and its
Spring-based configuration offers easier opportunities for incremental
extension.

It's also best to work from an SVN checkout, as the working source tree
has the Eclipse project-support files (.project, .classpath) used by the
Heritrix core team.
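
As a quick sanity check on the point above, here is a minimal sketch that reports whether a checkout directory contains the two Eclipse project-support files. The function name and the default directory name `heritrix3` are assumptions made for illustration; only the file names come from the note above.

```python
from pathlib import Path
import sys

# File names taken from the note above.
ECLIPSE_FILES = (".project", ".classpath")


def missing_eclipse_files(checkout_dir):
    """Return the Eclipse project-support files that are absent from checkout_dir."""
    root = Path(checkout_dir)
    return [name for name in ECLIPSE_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # "heritrix3" is a hypothetical default; pass your own checkout path instead.
    target = sys.argv[1] if len(sys.argv) > 1 else "heritrix3"
    missing = missing_eclipse_files(target)
    print("missing:", ", ".join(missing) if missing else "none")
```

If either file is reported missing, the directory is probably not a full SVN working copy of the trunk.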

So my suggestions would be:
- discard any prior projects
- make sure your Eclipse install includes SVN and Maven support
- create a new project, SVN->"Checkout projects from SVN", using URL
  https://archive-crawler.svn.sourceforge.net/svnroot/archive-crawler/trunk/heritrix3
- attempt one Maven2 install build from that checkout, to trigger
  population of your local M2_REPO with all necessary 3rd-party libraries
- if Eclipse seems not to recognize paths it should, try one or all of:
  - 'refresh' menupick on project
  - restarting Eclipse
  - toggling the 'build automatically' or 'clean...' options
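
For anyone who prefers a terminal to the Eclipse wizard, the checkout and the first Maven build can be sketched roughly as below. This is a command-line alternative, not the route described in the steps: it assumes `svn` and `mvn` are on your PATH, and the directory name `heritrix3`, the function names, and the `--run` flag are all made up for this sketch. By default it only prints the commands.

```python
import shlex
import subprocess
import sys

# URL copied from the steps above.
H3_TRUNK_URL = (
    "https://archive-crawler.svn.sourceforge.net"
    "/svnroot/archive-crawler/trunk/heritrix3"
)


def build_commands(checkout_dir="heritrix3"):
    """Return (argv, cwd) pairs: the SVN checkout, then a Maven install build."""
    return [
        (["svn", "checkout", H3_TRUNK_URL, checkout_dir], None),
        # Run from inside the checkout; this downloads the 3rd-party
        # libraries into the local Maven repository.
        (["mvn", "install"], checkout_dir),
    ]


def main(args):
    execute = "--run" in args  # hypothetical flag: without it, nothing is executed
    for argv, cwd in build_commands():
        print(f"[{cwd or '.'}] {shlex.join(argv)}")
        if execute:
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Because `check=True` is set, a failed checkout stops the script before the Maven step is attempted. You would still import the resulting directory into Eclipse as an existing project afterwards.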

These Ubuntu-centric notes from my colleague Steve may be helpful,
though they explicitly cover only H1/H2:
https://webarchive.jira.com/wiki/display/~siznax/Heritrix+in+Eclipse

If anyone can verify/update these prior guides to work with H3, bringing
a developer from ground state to a working Eclipse H3 dev project,
that'd be greatly appreciated.