Pedestrian detection datasets: Caltech Pedestrians; TUD-Brussels; INRIAPerson
http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/
Caltech Pedestrians contains a vast number of pedestrians: the training set consists of 192k (192,000)
pedestrian bounding boxes and the testing set of 155k bounding boxes, with 2300 unique pedestrians across 350k
frames. Evaluation happens on every 30th frame. The dataset is difficult for several reasons. First, it
contains many small pedestrians and exhibits a realistic occlusion frequency. Second, the image quality is
lacking, including blur as well as visible JPEG artifacts (blocks, ringing, quantization) that induce phantom
gradients. These hurt the extraction of both gradient and flow features. For our evaluation we use the model trained
on TUD-MotionPairs [27] (see below), and test on the Caltech training set. Some results for this setting (train
on external data, test on the Caltech training set) have been published on the same website as the database, and
we obtained results for additional algorithms directly from Piotr Dollár for comparison. We will show that our enhanced
detector using HOG, motion, and CSS outperforms all previously evaluated algorithms by a large margin, often by
10% or more.
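The every-30th-frame evaluation protocol above can be sketched as follows. This is a minimal illustration, not the official evaluation code; the function name and the exact frame-indexing convention (0-based, last frame of each 30-frame block) are assumptions for illustration.

```python
def evaluation_frames(num_frames, step=30):
    """Return the 0-based frame indices scored under an every-`step`-th-frame
    protocol, taking the last frame of each block of `step` frames.
    The indexing convention here is an assumption; the benchmark's own
    evaluation code defines the authoritative one."""
    return list(range(step - 1, num_frames, step))

# A 120-frame sequence yields 4 evaluation frames under this convention.
print(evaluation_frames(120))
```

Detections on the remaining frames are simply ignored; only the sampled frames contribute to the reported miss rates.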
TUD-Brussels contains 1326 annotated pedestrians in 508 image pairs of 640×480 pixels, recorded from a car
moving through an inner-city district. It contains pedestrians at various scales and from various viewpoints. It
comes with a training set (TUD-MotionPairs) of 1776 annotated pedestrians seen from multiple viewpoints taken from
a handheld camera in a pedestrian zone, with a negative dataset of 192 images taken partly with the same camera and partly from a moving car. This training set is used for all experiments except those on INRIAPerson
(where the corresponding training set is used).