you would like to save the crawled files in a file/directory format instead of saving them in WARC files.
First, create a job with a single seed, http://foo.org/bar/. Configure the warcWriter bean so that its class is org.archive.modules.writer.MirrorWriterProcessor. This Processor will store files in a directory structure that matches the crawled URIs. The files will be stored in the crawl job's mirror directory.
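As a minimal sketch of the bean change described above, the fragment below assumes the job's crawler-beans.cxml already defines a bean with the id warcWriter and that only its class attribute needs to change. Any properties the original bean carried are omitted here and may not apply to the mirror writer.

```xml
<!-- Sketch only: swaps the writer class on the existing warcWriter bean.
     Assumes the bean id "warcWriter" is already referenced by the job's
     processor chain, so no other wiring needs to change. -->
<bean id="warcWriter" class="org.archive.modules.writer.MirrorWriterProcessor">
</bean>
```

Keeping the bean id unchanged means the rest of the configuration that refers to warcWriter continues to resolve. Only the writer implementation is swapped.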