This is the class diagram for Frontier. It shows several key classes:
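1. BdbMultipleWorkQueues:
A thin wrapper around Berkeley DB. Internally it holds a single Berkeley Database that stores all pending links.
2. BdbWorkQueue:
Represents one queue of links, all of which share the same key. It obtains a link by calling the get method of BdbMultipleWorkQueues, which reads from the pending-link database.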
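The relationship between the two classes can be sketched as follows. This is a minimal illustration, not Heritrix code: MultiQueueStore and WorkQueue are hypothetical names, a TreeMap stands in for the Berkeley Database, and the key layout (queue key, a NUL separator, then a zero-padded ordinal) is an assumption chosen only to show how many logical queues can share one sorted store. It also assumes queue keys never contain a NUL character.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for BdbMultipleWorkQueues: one sorted store for all queues.
class MultiQueueStore {
    private final TreeMap<String, String> pending = new TreeMap<>();
    private long nextOrdinal = 0;

    // All entries of one queue share this prefix, so they sort next to each other.
    private static String prefix(String queueKey) {
        return queueKey + '\u0000';
    }

    void put(String queueKey, String uri) {
        // The zero-padded ordinal keeps insertion order within a queue.
        pending.put(prefix(queueKey) + String.format("%019d", nextOrdinal++), uri);
    }

    // Removes and returns the oldest link of the given queue, or null if it is empty.
    String get(String queueKey) {
        String p = prefix(queueKey);
        Map.Entry<String, String> e = pending.ceilingEntry(p);
        if (e == null || !e.getKey().startsWith(p)) {
            return null;
        }
        pending.remove(e.getKey());
        return e.getValue();
    }
}

// Hypothetical stand-in for BdbWorkQueue: it holds only its key and delegates to the store.
class WorkQueue {
    private final String classKey;

    WorkQueue(String classKey) {
        this.classKey = classKey;
    }

    String next(MultiQueueStore store) {
        return store.get(classKey);
    }
}
```

In this sketch a WorkQueue stores no links itself. Everything lives in the shared store, and the queue's key selects its slice of it.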
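3. WorkQueueFrontier: implements the three core methods.
It also declares an abstract method, createAlreadyIncluded, which creates the filter that records URLs already visited:
protected abstract UriUniqFilter createAlreadyIncluded() throws IOException;
This abstract method is implemented in its subclass.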
4. BdbFrontier:
Extends WorkQueueFrontier and implements the createAlreadyIncluded method:
protected UriUniqFilter createAlreadyIncluded() throws IOException {
    UriUniqFilter uuf;
    String c = null;
    try {
        c = (String)getAttribute(null, ATTR_INCLUDED);
    } catch (AttributeNotFoundException e) {
        // Do default action if attribute not in order.
    }
    // TODO: avoid all this special-casing; enable some common
    // constructor interface usable for all alt implementations
    // This branch uses BloomUriUniqFilter.
    if (c != null && c.equals(BloomUriUniqFilter.class.getName())) {
        uuf = this.controller.isCheckpointRecover()?
            deserializeAlreadySeen(BloomUriUniqFilter.class,
                this.controller.getCheckpointRecover().getDirectory()):
            new BloomUriUniqFilter();
    } else if (c!=null && c.equals(MemFPMergeUriUniqFilter.class.getName())) {
        // TODO: add checkpointing for MemFPMergeUriUniqFilter
        uuf = new MemFPMergeUriUniqFilter();
    } else if (c!=null && c.equals(DiskFPMergeUriUniqFilter.class.getName())) {
        // TODO: add checkpointing for DiskFPMergeUriUniqFilter
        uuf = new DiskFPMergeUriUniqFilter(controller.getScratchDisk());
    } else {
        // Assume it's BdbUriUniqFilter: the fallback is BdbUriUniqFilter.
        uuf = this.controller.isCheckpointRecover()?
            deserializeAlreadySeen(BdbUriUniqFilter.class,
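5. BdbUriUniqFilter:
It checks whether a link about to enter the pending queue has already been crawled. Its key method, setAdd, is the core of the isUrlVisited logic we set out to find.
// Add a URL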
protected boolean setAdd(CharSequence uri) {
    DatabaseEntry key = new DatabaseEntry();
    LongBinding.longToEntry(createKey(uri), key);
    long started = 0;
    OperationStatus status = null;
    try {
        if (logger.isLoggable(Level.INFO)) {
            started = System.currentTimeMillis();
        }
        // Insert into the database only if the key is not already there.
        status = alreadySeen.putNoOverwrite(null, key, ZERO_LENGTH_ENTRY);
        if (logger.isLoggable(Level.INFO)) {
            aggregatedLookupTime +=
                (System.currentTimeMillis() - started);
        }
    } catch (DatabaseException e) {
        logger.severe(e.getMessage());
    }
    if (status == OperationStatus.SUCCESS) { // key was not present: count++
        count++;
        if (logger.isLoggable(Level.INFO)) {
            final int logAt = 10000;
            if (count > 0 && ((count % logAt) == 0)) {
                logger.info("Average lookup " +
                    (aggregatedLookupTime / logAt) + "ms.");
                aggregatedLookupTime = 0;
            }
        }
    }
    if (status == OperationStatus.KEYEXIST) { // key already exists: return false
        return false; // not added
    } else {
        return true;
    }
}
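The add-if-absent contract of setAdd can be illustrated without Berkeley DB. The sketch below is not the Heritrix implementation: InMemoryUriUniqFilter is a hypothetical name, a HashSet replaces the alreadySeen database, and the createKey shown here is an assumed FNV-1a-style hash over the URI's characters, used only to get a 64-bit key. It shows just the return-value behaviour: true when the URI is new, false when its key already exists.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory analogue of BdbUriUniqFilter; for illustration only.
class InMemoryUriUniqFilter {
    private final Set<Long> alreadySeen = new HashSet<>();
    private long count = 0;

    // Assumed fingerprint function: FNV-1a-style 64-bit hash over the characters.
    static long createKey(CharSequence uri) {
        long hash = 0xcbf29ce484222325L;
        for (int i = 0; i < uri.length(); i++) {
            hash ^= uri.charAt(i);
            hash *= 0x100000001b3L;
        }
        return hash;
    }

    // Set.add plays the role of putNoOverwrite: it refuses to overwrite an
    // existing key and reports whether the key was newly inserted.
    boolean setAdd(CharSequence uri) {
        boolean added = alreadySeen.add(createKey(uri));
        if (added) {
            count++;
        }
        return added;
    }

    long count() {
        return count;
    }
}
```

Because the check and the insert are a single operation, the caller needs no separate "is this URL visited?" lookup: a false return from setAdd means the URL has been seen before.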
At this point, both isUrlVisited and Politeness have been located.