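Introduction
The evaluation procedure used in the OWOD paper affects how the experimental results are processed. This post records the evaluation code shipped with the source, as a way to get familiar with how OWOD is evaluated.
Code
OWOD uses the VOC-format evaluator by default, so the focus here is on the VOC evaluator.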
1 detectron2\engine\defaults.py
In this file, find the test method of the DefaultTrainer class. The key call is inference_on_dataset, and the evaluator passed to it is VOCEvaluator.
    @classmethod
    def test(cls, cfg, model, evaluators=None):
        """
        Args:
            cfg (CfgNode):
            model (nn.Module):
            evaluators (list[DatasetEvaluator] or None): if None, will call
                :meth:`build_evaluator`. Otherwise, must have the same length as
                `cfg.DATASETS.TEST`.

        Returns:
            dict: a dict of result metrics
        """
        logger = logging.getLogger(__name__)
        if isinstance(evaluators, DatasetEvaluator):
            evaluators = [evaluators]
        if evaluators is not None:
            assert len(cfg.DATASETS.TEST) == len(evaluators), "{} != {}".format(
                len(cfg.DATASETS.TEST), len(evaluators)
            )

        results = OrderedDict()
        for idx, dataset_name in enumerate(cfg.DATASETS.TEST):
            data_loader = cls.build_test_loader(cfg, dataset_name)
            # When evaluators are passed in as arguments,
            # implicitly assume that evaluators can be created before data_loader.
            if evaluators is not None:
                evaluator = evaluators[idx]
            else:
                try:
                    evaluator = cls.build_evaluator(cfg, dataset_name)
                except NotImplementedError:
                    logger.warn(
                        "No evaluator found. Use `DefaultTrainer.test(evaluators=)`, "
                        "or implement its `build_evaluator` method."
                    )
                    results[dataset_name] = {}
                    continue
            results_i = inference_on_dataset(model, data_loader, evaluator)
            results[dataset_name] = results_i
            if comm.is_main_process():
                assert isinstance(
                    results_i, dict
                ), "Evaluator must return a dict on the main process. Got {} instead.".format(
                    results_i
                )
                logger.info("Evaluation results for {} in csv format:".format(dataset_name))
                print_csv_format(results_i)

        if len(results) == 1:
            results = list(results.values())[0]
        return results
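One detail of test that is easy to miss is the shape of its return value: with several test datasets it returns a dict keyed by dataset name, but with exactly one it returns that dataset's metric dict directly. The following is only a minimal sketch of that control flow, not the detectron2 code itself. The names run_tests and run_inference are hypothetical stand-ins (run_inference plays the role of inference_on_dataset), and the metric values are made up for illustration.

```python
from collections import OrderedDict


def run_tests(dataset_names, evaluators, run_inference):
    """Toy version of the per-dataset loop in DefaultTrainer.test.

    run_inference is a hypothetical stand-in for inference_on_dataset:
    it takes (dataset_name, evaluator) and returns a dict of metrics.
    """
    # Same precondition as the assert in test(): one evaluator per dataset.
    assert len(dataset_names) == len(evaluators), "{} != {}".format(
        len(dataset_names), len(evaluators)
    )

    results = OrderedDict()
    for name, evaluator in zip(dataset_names, evaluators):
        results[name] = run_inference(name, evaluator)

    # With a single test dataset, the outer dict is dropped and the
    # metric dict of that dataset is returned directly.
    if len(results) == 1:
        results = list(results.values())[0]
    return results


# Made-up metrics, only to show the two return shapes.
fake_inference = lambda name, evaluator: {"AP50": 50.0}
single = run_tests(["voc_test"], [None], fake_inference)
multi = run_tests(["voc_a", "voc_b"], [None, None], fake_inference)
```

Here single is the bare metric dict, while multi is keyed by dataset name, so code that reads the results has to know how many entries cfg.DATASETS.TEST contains.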
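2 Open detectron2\evaluation\evaluator.py
In this file, the DatasetEvaluator class is the base class for all dataset evaluators.
Overall, inference_on_dataset relies mainly on two evaluator methods, evaluator.process and evaluator.evaluate, so the next step is to locate these two methods in VOCEvaluator and read them in detail.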
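The division of labour between process and evaluate can be sketched with a toy evaluator. This is a simplified illustration under stated assumptions, not the detectron2 implementation: CountingEvaluator, toy_inference_on_dataset, fake_model, and the "boxes" and "n" keys are all hypothetical names. The real function additionally switches the model to eval mode and benchmarks inference speed, both of which are left out here.

```python
class CountingEvaluator:
    """Hypothetical evaluator following the reset/process/evaluate protocol."""

    def reset(self):
        # Clear any state left over from a previous evaluation run.
        self.num_images = 0
        self.num_boxes = 0

    def process(self, inputs, outputs):
        # Called once per batch: accumulate what evaluate() will need later.
        for _inp, out in zip(inputs, outputs):
            self.num_images += 1
            self.num_boxes += len(out["boxes"])

    def evaluate(self):
        # Called once after all batches: turn the accumulated state into metrics.
        return {
            "images": self.num_images,
            "boxes_per_image": self.num_boxes / max(self.num_images, 1),
        }


def toy_inference_on_dataset(model, data_loader, evaluator):
    """Simplified loop: reset, process every batch, then evaluate."""
    evaluator.reset()
    for inputs in data_loader:
        outputs = model(inputs)
        evaluator.process(inputs, outputs)
    return evaluator.evaluate()


def fake_model(batch):
    # Stand-in model: predicts item["n"] dummy boxes for each input item.
    return [{"boxes": [0] * item["n"]} for item in batch]


loader = [[{"n": 2}], [{"n": 4}]]  # two batches with one image each
metrics = toy_inference_on_dataset(fake_model, loader, CountingEvaluator())
```

The point of the split is that process only collects per-batch predictions, while all metric computation is deferred to evaluate, which is why those two methods are the ones worth reading in VOCEvaluator.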
def inference_on_dataset(model, data_loader, evaluator):
    """
    Run model on the data_loader and evaluate the metrics with evaluator.
    Also benchmark the inference speed of `model.forward` accurately.
    The model will be used in eval mode.

    Args:
        model (nn.Module): a module which accepts an object from
            `data_loader` and returns some outputs. It will be tempora