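When reading recent papers on detection and recognition, I keep running into the VOC 2007 and VOC 2012 datasets. To understand the dataset in detail, I read the accompanying documentation and summarized the key points below: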
The PASCAL Visual Object Classes Challenge (2012)
The goal of this challenge is to recognize objects from a number of visual object classes in realistic scenes. There are twenty object classes.
There are five main tasks. We only focus on three of them: classification, detection, and segmentation.
Classification: For each of the classes, predict the presence/absence of at least one object of that class in a test image.
Detection: For each of the classes, predict the bounding boxes of each object of that class in a test image (if any).
Segmentation: For each pixel in a test image, predict the class of the object containing that pixel or "background" if the pixel does not belong to one of the twenty specified classes.
The rest of this post uses image recognition (the classification task) as the example and describes it in detail.
Classification/Detection Image Sets
For the classification and detection tasks, there are four sets of images provided:
train: Training data
val: Validation data
trainval: The union of train and val
test: Test data
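As a rough illustration of how these sets can be consumed, the sketch below parses an image-set listing and checks that trainval is the union of train and val. It assumes each set is a plain text file with one image identifier per line (optionally followed by a label column), which is how I understand the development kit to ship them under ImageSets/Main; the function names and the example path are hypothetical.

```python
from pathlib import Path


def parse_image_set(text):
    """Return the image identifiers found in an image-set listing.

    Assumption: one identifier per line, optionally followed by
    whitespace and a label column, which is ignored here.
    """
    ids = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            ids.append(line.split()[0])
    return ids


def read_image_set(path):
    """Read an image-set file from disk (the path layout is an assumption)."""
    return parse_image_set(Path(path).read_text())


def is_union(trainval_ids, train_ids, val_ids):
    """Check that trainval contains exactly the identifiers of train and val."""
    return set(trainval_ids) == set(train_ids) | set(val_ids)
```

A call such as read_image_set("VOCdevkit/VOC2012/ImageSets/Main/train.txt") would then return the training identifiers, provided the data was unpacked to that location.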
Classification Task
For each of the twenty object classes, predict the presence/absence of at least one object of that class in a test image. The output from your system should be a real-valued confidence of the object's presence so that a precision/recall curve can be drawn. Participants may choose to tackle all, or any subset of object classes, for example “cars only” or “motorbikes and cars”.
Two competitions are defined according to the choice of training data: (i) taken from the VOC trainval data provided, or (ii) from any source excluding the VOC test data provided.
A separate text file of results should be generated for each competition (1 or 2) and each class e.g. ‘car’. Each line should contain a single identifier and the confidence output by the classifier, separated by a space, for example:
comp1_cls_test_car.txt:
...
2009_000001 0.056313
2009_000002 0.127031
2009_000009 0.287153
...
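A minimal sketch of producing such a file is shown below. The file name pattern is inferred from the single example above (comp1_cls_test_car.txt), and the six decimal places simply mirror the sample lines; the helper names are hypothetical and not part of the development kit.

```python
def results_filename(competition, class_name, image_set="test"):
    """Build a results file name following the pattern of the example above.

    Assumption: the pattern is comp<N>_cls_<image set>_<class>.txt.
    """
    return f"comp{competition}_cls_{image_set}_{class_name}.txt"


def format_result_lines(scores):
    """Turn {image identifier: confidence} into 'identifier confidence' lines."""
    return [f"{image_id} {confidence:.6f}" for image_id, confidence in scores.items()]


def write_results(path, scores):
    """Write one 'identifier confidence' line per test image."""
    with open(path, "w") as f:
        for line in format_result_lines(scores):
            f.write(line + "\n")
```

For example, write_results(results_filename(1, "car"), scores) would write the confidences of a car classifier trained only on the VOC trainval data.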
The classification task will be judged by the precision/recall curve. The principal quantitative measure used will be the average precision (AP).
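To get a feel for what AP measures, here is an illustrative sketch that ranks images by confidence, builds the precision/recall curve, and integrates the area under its monotonically decreasing envelope. This is not the official evaluation code. As far as I understand, the exact AP definition differs between VOC years (an 11-point interpolation in the earlier challenges, area under the envelope later), so treat the variant below as an assumption and check the development kit documentation for the year you use. Images marked "difficult" are not handled here.

```python
def average_precision(scores, labels):
    """Area under the interpolated precision/recall curve.

    scores: classifier confidences, one per image.
    labels: True where the image contains the class, False otherwise.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(1 for label in labels if label)
    if n_pos == 0:
        return 0.0

    # Precision and recall after each image in the ranked list.
    true_pos = 0
    precisions, recalls = [], []
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            true_pos += 1
        precisions.append(true_pos / rank)
        recalls.append(true_pos / n_pos)

    # Replace each precision by the maximum precision at any higher recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum the rectangles wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

A classifier that ranks every positive image above every negative one gets an AP of 1.0 under this definition; pushing positives down the ranking lowers it.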
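The challenge organizers provide functions for evaluating performance, so we only need to write the output text files in the required format; the evaluation itself can be done by calling the organizers' API directly.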