The full (and lengthy) error output is as follows:
13:39:15.345 [main] WARN c.v.vscrawler.core.event.EventLoop - 程序已停止
13:39:15.376 [main] INFO c.v.v.core.config.DirectoryWatcher - 注册事件:ENTRY_MODIFY
13:39:15.376 [main] INFO c.v.v.core.config.DirectoryWatcher - 注册事件:ENTRY_DELETE
13:39:15.376 [main] INFO c.v.v.core.config.DirectoryWatcher - 监控目录:E:\ideaspace\spider\Demo\target\classes
13:39:15.376 [main] INFO c.v.v.c.seed.BerkeleyDBSeedManager - vsCrawler配置工作目录:classpath:work
13:39:15.392 [main] INFO c.v.v.c.seed.BerkeleyDBSeedManager - vsCrawler实际工作目录:E:\ideaspace\spider\Demo\target\classes\work
13:39:15.830 [main] INFO c.v.v.core.seed.LocalFileSeedSource - 没有配置初始种子
13:39:15.830 [main] INFO c.v.v.c.seed.BerkeleyDBSeedManager - import new init seeds:0
################################################
############## VSCrawler ##############
############## 0.0.1 ##############
############## 你有一个有意思的灵魂 ##############
################################################
############## virjar ##############
################################################
13:39:16.111 [VSCrawler-Dispatch] INFO com.virjar.vscrawler.core.VSCrawler - Spider started!
Exception in thread "VSCrawlerWorker-thread-1" Exception in thread "VSCrawlerWorker-thread-2" Exception in thread "VSCrawlerWorker-thread-6" Exception in thread "VSCrawlerWorker-thread-3" Exception in thread "createNewSession" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
at org.apache.http.impl.client.DefaultRedirectStrategy.<init>(DefaultRedirectStrategy.java:76)
at org.apache.http.impl.client.DefaultRedirectStrategy.<clinit>(DefaultRedirectStrategy.java:84)
at com.virjar.vscrawler.core.net.DefaultHttpClientGenerator.gen(DefaultHttpClientGenerator.java:22)
at com.virjar.vscrawler.core.net.session.CrawlerSession.<init>(CrawlerSession.java:69)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.createNewSession(CrawlerSessionPool.java:126)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.borrowOne(CrawlerSessionPool.java:157)
at com.virjar.vscrawler.core.VSCrawler$SeedProcessTask.processSeed(VSCrawler.java:234)
at com.virjar.vscrawler.core.VSCrawler$SeedProcessTask.run(VSCrawler.java:222)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 11 more
java.lang.NoClassDefFoundError: Could not initialize class org.apache.http.impl.client.LaxRedirectStrategy
at com.virjar.vscrawler.core.net.DefaultHttpClientGenerator.gen(DefaultHttpClientGenerator.java:22)
at com.virjar.vscrawler.core.net.session.CrawlerSession.<init>(CrawlerSession.java:69)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.createNewSession(CrawlerSessionPool.java:126)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.borrowOne(CrawlerSessionPool.java:157)
at com.virjar.vscrawler.core.VSCrawler$SeedProcessTask.processSeed(VSCrawler.java:234)
at com.virjar.vscrawler.core.VSCrawler$SeedProcessTask.run(VSCrawler.java:222)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "VSCrawlerWorker-thread-4" java.lang.NoClassDefFoundError: Could not initialize class org.apache.http.impl.client.LaxRedirectStrategy
at com.virjar.vscrawler.core.net.DefaultHttpClientGenerator.gen(DefaultHttpClientGenerator.java:22)
at com.virjar.vscrawler.core.net.session.CrawlerSession.<init>(CrawlerSession.java:69)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.createNewSession(CrawlerSessionPool.java:126)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.borrowOne(CrawlerSessionPool.java:157)
at com.virjar.vscrawler.core.VSCrawler$SeedProcessTask.processSeed(VSCrawler.java:234)
at com.virjar.vscrawler.core.VSCrawler$SeedProcessTask.run(VSCrawler.java:222)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
java.lang.NoClassDefFoundError: Could not initialize class org.apache.http.impl.client.LaxRedirectStrategy
at com.virjar.vscrawler.core.net.DefaultHttpClientGenerator.gen(DefaultHttpClientGenerator.java:22)
at com.virjar.vscrawler.core.net.session.CrawlerSession.<init>(CrawlerSession.java:69)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.createNewSession(CrawlerSessionPool.java:126)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.borrowOne(CrawlerSessionPool.java:157)
at com.virjar.vscrawler.core.VSCrawler$SeedProcessTask.processSeed(VSCrawler.java:234)
at com.virjar.vscrawler.core.VSCrawler$SeedProcessTask.run(VSCrawler.java:222)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "VSCrawlerWorker-thread-5" java.lang.NoClassDefFoundError: Could not initialize class org.apache.http.impl.client.LaxRedirectStrategy
at com.virjar.vscrawler.core.net.DefaultHttpClientGenerator.gen(DefaultHttpClientGenerator.java:22)
at com.virjar.vscrawler.core.net.session.CrawlerSession.<init>(CrawlerSession.java:69)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.createNewSession(CrawlerSessionPool.java:126)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.borrowOne(CrawlerSessionPool.java:157)
at com.virjar.vscrawler.core.VSCrawler$SeedProcessTask.processSeed(VSCrawler.java:234)
at com.virjar.vscrawler.core.VSCrawler$SeedProcessTask.run(VSCrawler.java:222)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
java.lang.NoClassDefFoundError: Could not initialize class org.apache.http.impl.client.LaxRedirectStrategy
at com.virjar.vscrawler.core.net.DefaultHttpClientGenerator.gen(DefaultHttpClientGenerator.java:22)
at com.virjar.vscrawler.core.net.session.CrawlerSession.<init>(CrawlerSession.java:69)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.createNewSession(CrawlerSessionPool.java:126)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.borrowOne(CrawlerSessionPool.java:157)
at com.virjar.vscrawler.core.VSCrawler$SeedProcessTask.processSeed(VSCrawler.java:234)
at com.virjar.vscrawler.core.VSCrawler$SeedProcessTask.run(VSCrawler.java:222)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
java.lang.NoClassDefFoundError: Could not initialize class org.apache.http.impl.client.LaxRedirectStrategy
at com.virjar.vscrawler.core.net.DefaultHttpClientGenerator.gen(DefaultHttpClientGenerator.java:22)
at com.virjar.vscrawler.core.net.session.CrawlerSession.<init>(CrawlerSession.java:69)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.createNewSession(CrawlerSessionPool.java:126)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool.access$600(CrawlerSessionPool.java:28)
at com.virjar.vscrawler.core.net.session.CrawlerSessionPool$CreateSessionThread.run(CrawlerSessionPool.java:287)
尝试停止爬虫
江城子 . 程序员之歌
13:39:31.191 [vsCrawlerEventLoop] INFO com.virjar.vscrawler.core.VSCrawler - 爬虫停止,发送爬虫停止事件消息:com.virjar.vscrawler.event.systemevent.CrawlerEndEvent
十年生死两茫茫,写程序,到天亮。
13:39:31.191 [VSCrawler-Dispatch] WARN com.virjar.vscrawler.core.VSCrawler - 爬虫线程休眠被打断
千行代码,Bug何处藏。
13:39:31.191 [vsCrawlerEventLoop] INFO c.v.vscrawler.core.event.EventLoop - 收到爬虫结束消息,停止事件循环,未处理将被忽略,当前待处理事件个数:2
13:39:31.191 [vsCrawlerEventLoop] INFO c.v.v.c.seed.BerkeleyDBSeedManager - 收到爬虫结束消息,开始关闭资源
13:39:31.191 [vsCrawlerEventLoop] INFO c.v.v.c.seed.BerkeleyDBSeedManager - 拒绝抓取结果入库...
纵使上线又怎样,朝令改,夕断肠。
13:39:31.191 [vsCrawlerEventLoop] INFO c.v.v.c.seed.BerkeleyDBSeedManager - 缓存中未分发数据重新入库,正在执行的爬虫任务,不等待结果,重新入库...
13:39:31.191 [VSCrawler-Dispatch] INFO com.virjar.vscrawler.core.VSCrawler - 爬虫已经停止,不需要发生爬虫停止事件消息
领导每天新想法,天天改,日日忙。
13:39:31.191 [VSCrawler-Dispatch] INFO com.virjar.vscrawler.core.VSCrawler - 爬虫结束
相顾无言,惟有泪千行。
13:39:31.191 [vsCrawler-resource-clean] WARN com.virjar.vscrawler.core.VSCrawler - 爬虫被外部中断,尝试进行资源关闭等收尾工作
每晚灯火阑珊处,夜难寐,加班狂。
13:39:31.191 [vsCrawler-resource-clean] INFO com.virjar.vscrawler.core.VSCrawler - 爬虫已经停止,不需要发生爬虫停止事件消息
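I hit the error above while trying to run the Demo on Windows 10 (Ultimate edition), IntelliJ IDEA 2017.2 preview, and JDK 1.8.0_131, after configuring the project according to the official tutorial at http://vscrawler.scumall.com/.
As for the cause of the bug: the commons-logging-1.0.4 jar had not been imported! The first stack trace says so directly: `java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory`, thrown while `DefaultRedirectStrategy` was being initialized. The later `Could not initialize class org.apache.http.impl.client.LaxRedirectStrategy` errors on the other worker threads are follow-on failures of that same missing class, not separate problems.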
For the fix, here are two links:
1. Adding a jar to IntelliJ IDEA
2. Download page for commons-logging-1.0.4.jar
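
Once the jar has been added, a quick way to confirm the fix is to check whether the class named in the stack trace can now be found. The snippet below is only a minimal sketch: the class name `ClasspathCheck` and the helper `isOnClasspath` are made up for illustration and are not part of VSCrawler. It has to be run with the same classpath as the Demo (for example, from the same IDEA module) for the result to mean anything.

```java
// Hypothetical helper, not part of VSCrawler: checks whether a class is visible on the classpath.
final class ClasspathCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize = false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class reported as missing in the stack trace above
        String name = "org.apache.commons.logging.LogFactory";
        if (isOnClasspath(name)) {
            System.out.println(name + " found");
        } else {
            System.out.println(name + " NOT found: add the commons-logging jar to the module's dependencies");
        }
    }
}
```

If this prints "NOT found" after the jar was added, the jar was most likely attached to a different module or scope than the one the Demo runs in.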