Original author unknown; archived for reference only.
http://wiki.apache.org/nutch/
FrontPage
Please contribute your knowledge about Nutch here!
General Information
Nutch Website
- Features
- PublicServers running Nutch
- Presentations on Nutch
- Press Articles
- Evaluations of Search Quality
- Help Wanted organizations hiring Nutch expertise
- Commercial Support and developers for hire
- Mailing Lists
- AcademicArticles that deal with Nutch
- DownloadingNutch
- HardwareRequirements
Tutorial -- A Step-by-Step guide to getting Nutch up and running.
- NutchTutorial on the wiki
- Nutch - The Java Search Engine (Builds on the basic tutorials. Includes index maintenance scripts)
- Nutch Hadoop Tutorial - How to set up Nutch and Hadoop over a cluster of machines
- Automating Fetches with Python - How to automate the Nutch fetching process using Python
- FAQ
- Commandline options for 0.7.x
- Commandline options for version 0.8
- OverviewDeploymentConfigs
- NutchConfigurationFiles
- GettingNutchRunningWithUtf8 - For support of non-ASCII characters (Chinese, Japanese and Korean).
- GettingNutchRunningWithResin - Resin is a JSP/Servlet/EJB application server (an alternative to Tomcat).
- GettingNutchRunningWithJetty
- GettingNutchRunningWithUbuntu
- GettingNutchRunningWithWindows
- GettingNutchRunningWithMacOsx
- GettingNutchRunningWithRedHatApplicationServer
- GettingNutchRunningWithDebian
- GettingNutchRunningWithSocksProxy
- ErrorMessages -- What they mean and suggestions for getting rid of them.
- SimpleMapReduceTutorial
- SetupProxyForNutch - using Tinyproxy on Ubuntu
- CreateNewFilter - for example to add a category metadata to your index and be able to search for it
- UpgradeFrom07To08
- Upgrading_from_0.8.x_to_0.9
- RunNutchInEclipse
- IntranetRecrawl - script to recrawl a crawl
- MergeCrawl - script to merge 2 (or more) crawls
- SearchOverMultipleIndexes - configuring Nutch to enable searching over multiple indexes
- CrossPlatformNutchScripts
- MonitoringNutchCrawls - techniques for keeping an eye on a Nutch crawl's progress.
- Becoming a Nutch Developer - Start developing and contributing to Nutch.
- PluginCentral -- How to write your own plugins and use other people's.
- InternalDocumentation -- How Nutch works.
JavaDocs -- The JavaDocs for Nutch.
Nutch Version Control
- MultiLingualSupport - In development.
- FixingOpicScoring - In planning.
- HowToContribute
- TaskList -- Tasks for Nutch developers.
- Development -- More tasks for Nutch developers.
- Committer's_Rules -- Guidelines committers should follow when deciding which branch to commit patches to and when to commit.
- Release_HOWTO
- Website Update HOWTO
- Image Search Design
- NutchOSGi
- StrategicGoals
- IndexStructure
- Getting Started
Doug's Weblog -- He's the one who originally wrote Lucene and Nutch.
Stefan's Nutch Documentation
Frutch Wiki -- French Nutch Wiki
- The Old Wiki
- Search_Theory -- Search Theory & White Papers
Tutorial Hadoop+Nutch 0.8 night build Roberto Navoni 24-07-06
FooFactory Nutch and Hadoop related posts