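While building a crawler, I needed a Timer to fetch web pages on a schedule. Each run fetches about 80 records and takes roughly 1.5 hours. After a full day of crawling, I found that the program was not pausing between runs the way I intended.
The crawl interval was set like this: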
Timer timer=new Timer();
timer.schedule(new UpdateGoogleTimetask(), starttime, period); // period was meant to be 30 min; note that 30 min = 30*60*1000 ms, whereas 30*60*60 ms is only 108 s
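Timer.schedule takes its period in milliseconds, so a 30-minute interval is easy to get wrong when multiplied out by hand. A minimal sketch that lets TimeUnit do the conversion instead; the class name PeriodDemo and the constant PERIOD_MS are made up for illustration and are not part of the crawler:

```java
import java.util.concurrent.TimeUnit;

public class PeriodDemo {
    // 30 minutes expressed in milliseconds, the unit Timer.schedule expects.
    static final long PERIOD_MS = TimeUnit.MINUTES.toMillis(30);

    public static void main(String[] args) {
        System.out.println(PERIOD_MS);      // 1800000
        System.out.println(30L * 60 * 60);  // 108000, which is only 108 seconds
    }
}
```

Passing PERIOD_MS as the third argument of timer.schedule would then give the intended 30-minute period.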
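Later I wrote a small example to find out why the runs were not spaced out. The main reason is that period was too short compared with how long one run takes. schedule decides its execution times as follows:
(1) If the current time tnow is already later than starttime, the first run starts immediately. Suppose one run takes twork, so it finishes at tnow+twork. If tnow+twork > tnow+period (that is, twork > period), the second run starts right away, with no pause. In other words, with schedule every run is due at the actual start time of the previous run plus period; if the previous run is still in progress at that moment, the next run begins as soon as it finishes.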
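The rule in (1) can be checked without waiting on a real Timer. The sketch below is only a simulation under two assumptions: a single task runs on the timer, and every run takes the same time. FixedDelaySim and startTimes are hypothetical names, not JDK APIs.

```java
import java.util.Arrays;

public class FixedDelaySim {
    /**
     * Start times (ms, relative to the first run) of the first n runs under
     * schedule-style fixed-delay timing, assuming every run takes workMs.
     */
    static long[] startTimes(long periodMs, long workMs, int n) {
        long[] starts = new long[n];
        long start = 0;
        for (int i = 0; i < n; i++) {
            starts[i] = start;
            long due = start + periodMs;  // previous actual start + period
            long free = start + workMs;   // the timer thread is busy until here
            start = Math.max(due, free);  // run when due, or as soon as the thread is free
        }
        return starts;
    }

    public static void main(String[] args) {
        // period longer than the task: runs are spaced by period
        System.out.println(Arrays.toString(startTimes(7000, 5000, 4))); // [0, 7000, 14000, 21000]
        // period shorter than the task: runs are back to back
        System.out.println(Arrays.toString(startTimes(3000, 5000, 3))); // [0, 5000, 10000]
    }
}
```

With a 5-second task, a 7-second period yields a start every 7 seconds, while a 3-second period yields a start every 5 seconds, meaning the task never pauses.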
package util;

import java.util.Timer;

public class TimerTest {
    public static void main(String[] args) {
        Timer t = new Timer();
        // Fixed-delay scheduling: first run at the given time, then a 7000 ms period.
        // DateTool is the author's own helper class (not shown here).
        t.schedule(new MyTimerTask1(), DateTool.parseDate("2014-1-9 20:50:30"), 7000);
        // t.scheduleAtFixedRate(new MyTimerTask1(),
        //         DateTool.parseDate("2014-1-9 20:27:30"), 6000);
    }
}

package util;

import java.util.TimerTask;

public class MyTimerTask1 extends TimerTask {
    @Override
    public void run() {
        System.out.println("timertask is begin" + DateTool.getNowTime());
        try {
            // Simulate a task that takes 5 seconds.
            Thread.sleep(5000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("timertask is over" + DateTool.getNowTime());
    }
}
Output:
timertask is beginThu Jan 09 20:50:00 CST 2014
timertask is overThu Jan 09 20:50:05 CST 2014
timertask is beginThu Jan 09 20:50:07 CST 2014
timertask is overThu Jan 09 20:50:12 CST 2014
timertask is beginThu Jan 09 20:50:14 CST 2014
timertask is overThu Jan 09 20:50:19 CST 2014
timertask is beginThu Jan 09 20:50:21 CST 2014
timertask is overThu Jan 09 20:50:26 CST 2014
After changing the period to 3000 ms:
t.schedule(new MyTimerTask1(), DateTool.parseDate("2014-1-9 20:50:30"),3000);
Output:
timertask is beginThu Jan 09 20:55:09 CST 2014
timertask is overThu Jan 09 20:55:14 CST 2014
timertask is beginThu Jan 09 20:55:14 CST 2014
timertask is overThu Jan 09 20:55:19 CST 2014
timertask is beginThu Jan 09 20:55:19 CST 2014
timertask is overThu Jan 09 20:55:24 CST 2014
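(2) With scheduleAtFixedRate, if period is shorter than the time one run takes, the timer is always behind schedule, so each run starts immediately after the previous one finishes.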
t.scheduleAtFixedRate(new MyTimerTask1(), DateTool.parseDate("2014-1-9 21:04:05"), 3000);
Output:
timertask is beginThu Jan 09 21:05:27 CST 2014
timertask is overThu Jan 09 21:05:32 CST 2014
timertask is beginThu Jan 09 21:05:32 CST 2014
timertask is overThu Jan 09 21:05:37 CST 2014
timertask is beginThu Jan 09 21:05:37 CST 2014
timertask is overThu Jan 09 21:05:42 CST 2014
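If period is longer than the time one run takes, the timer keeps a fixed rate: every run is due at startTime + k*period, measured from the scheduled time of the first run rather than from when the previous run actually started. If startTime is already in the past when the task is scheduled, the missed runs are first executed back to back to catch up.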
t.scheduleAtFixedRate(new MyTimerTask1(), DateTool.parseDate("2014-1-9 21:14:05"), 7000);
Output:
timertask is beginThu Jan 09 21:14:00 CST 2014
timertask is overThu Jan 09 21:14:05 CST 2014
timertask is beginThu Jan 09 21:14:07 CST 2014
timertask is overThu Jan 09 21:14:12 CST 2014
timertask is beginThu Jan 09 21:14:14 CST 2014
timertask is overThu Jan 09 21:14:19 CST 2014
timertask is beginThu Jan 09 21:14:21 CST 2014
timertask is overThu Jan 09 21:14:26 CST 2014
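When every run takes the same 5 seconds, schedule and scheduleAtFixedRate produce the same start times, so the tests above cannot tell them apart. The difference appears once a single run is slow. The sketch below is a simulation, not a real Timer: it assumes one task on the timer and a start time that is not in the past, and the names FixedRateSim, fixedRate and fixedDelay are invented for illustration.

```java
import java.util.Arrays;

public class FixedRateSim {
    /** Fixed-rate: run k is due at k * periodMs, but cannot start before the previous run ends. */
    static long[] fixedRate(long periodMs, long[] workMs) {
        long[] starts = new long[workMs.length];
        long free = 0; // time at which the single timer thread becomes idle
        for (int k = 0; k < workMs.length; k++) {
            starts[k] = Math.max(k * periodMs, free);
            free = starts[k] + workMs[k];
        }
        return starts;
    }

    /** Fixed-delay: run k is due periodMs after run k-1 actually started. */
    static long[] fixedDelay(long periodMs, long[] workMs) {
        long[] starts = new long[workMs.length];
        long due = 0;
        long free = 0;
        for (int k = 0; k < workMs.length; k++) {
            starts[k] = Math.max(due, free);
            due = starts[k] + periodMs;
            free = starts[k] + workMs[k];
        }
        return starts;
    }

    public static void main(String[] args) {
        long[] work = {5000, 9000, 5000, 5000}; // the second run is unusually slow
        System.out.println(Arrays.toString(fixedRate(7000, work)));  // [0, 7000, 16000, 21000]
        System.out.println(Arrays.toString(fixedDelay(7000, work))); // [0, 7000, 16000, 23000]
    }
}
```

After the slow second run, the fixed-rate schedule returns to the original 7-second grid (21000), while the fixed-delay schedule stays shifted by the delay (23000).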