We start with a set of training images (here consecutive image pairs so that flow can be used) in which all of the positive training windows (ones containing people) have been manually marked. A fixed set of initial negative training windows was selected by randomly sampling the negative images. A preliminary classifier is trained on the marked positives and initial negatives, and this is used to search the complete set of negative images exhaustively for false alarms. As many of these "hard negatives" as will fit into the available RAM are selected randomly and added to the training set, and the final classifier is trained. Each classifier thus has its own set of hard negatives. This retraining procedure significantly increases the performance of every detector that we have tested. Additional rounds of search for hard negatives make little difference, so are not used. In most of the experiments below the RAM is limited to 1.5 GB, so the larger the descriptor vector, the smaller the number of hard examples that can be included. We think that this is fair as memory is typically the main resource limitation during training.
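The bootstrapping loop described above is easy to sketch in code. Below is a minimal illustration using synthetic stand-in descriptors and scikit-learn's LinearSVC as the linear classifier; the dimensions, sample counts, and regularization constant are placeholders, not values from the paper.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Synthetic stand-ins for descriptor vectors; in the paper these would be
# HOG/HOF descriptors extracted from training windows (illustrative only).
DIM = 500
pos = rng.normal(1.0, 1.0, size=(200, DIM))       # positive windows (people)
init_neg = rng.normal(0.0, 1.0, size=(400, DIM))  # random initial negatives
all_neg = rng.normal(0.0, 1.0, size=(5000, DIM))  # exhaustive negative windows

# 1. Preliminary classifier on marked positives + initial negatives.
X = np.vstack([pos, init_neg])
y = np.hstack([np.ones(len(pos)), np.zeros(len(init_neg))])
prelim = LinearSVC(C=0.01, dual=True).fit(X, y)

# 2. Exhaustive search of the negative set for false alarms ("hard negatives").
hard = all_neg[prelim.decision_function(all_neg) > 0]

# 3. Keep only as many hard negatives as fit the memory budget, chosen at random.
RAM_BYTES = int(1.5e9)                    # 1.5 GB, as in the experiments
budget = RAM_BYTES // (DIM * 8) - len(X)  # 8-byte floats; existing set counts too
if len(hard) > budget:
    hard = hard[rng.choice(len(hard), budget, replace=False)]

# 4. Final classifier on the augmented training set; no further rounds,
#    since additional hard-negative searches make little difference.
X2 = np.vstack([X, hard])
y2 = np.hstack([y, np.zeros(len(hard))])
final = LinearSVC(C=0.01, dual=True).fit(X2, y2)
```

Note how the memory budget directly caps the number of hard negatives: a larger descriptor dimension DIM shrinks the budget, which is exactly the trade-off the paragraph describes.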
Train on the positive and negative samples to get an initial classifier, then retrain with negatives (the "hard negatives") to lower the false-alarm rate.
Each classifier has its own set of hard negatives (a single round of retraining on hard negatives is enough to cut false alarms, because "additional rounds of search for hard negatives make little difference").
With a fixed amount of memory, there is a trade-off between the number of hard negatives and the descriptor dimensionality...
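To make the trade-off concrete (an illustrative back-of-the-envelope calculation, not a figure from the paper): storing descriptors as 8-byte floats in 1.5 GB allows roughly 1.5×10⁹ / (8D) examples for a D-dimensional descriptor, i.e. about 187,500 examples at D = 1,000 but only about 62,500 at D = 3,000.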
Overall approach:
1. Extract HOF features from the positive and negative samples (a feature-extraction sketch follows this list).
2. Feed them to an SVM classifier for training, yielding a model.
3. Generate a detector from the model.
4. Run the detector over the negative samples to collect hard examples.
5. Extract HOF features from the hard examples, combine them with the features from step 1, retrain, and obtain the final detector.
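For step 1, a simplified sketch of a histogram-of-flow descriptor is given below, assuming OpenCV's Farnebäck optical flow as a stand-in for the paper's flow estimator; the cell size, bin count, and magnitude-weighted orientation histogram follow the general HOG/HOF recipe rather than the paper's exact motion descriptors.

```python
import numpy as np
import cv2

def hof_descriptor(prev_gray, curr_gray, n_bins=9, cell=8):
    """Magnitude-weighted histogram of flow orientations over a grid of
    cells (simplified; not the paper's exact motion descriptor)."""
    # Dense optical flow between two consecutive grayscale frames.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])  # ang in radians
    h, w = mag.shape
    cells = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            a = ang[y:y + cell, x:x + cell].ravel()
            m = mag[y:y + cell, x:x + cell].ravel()
            hist, _ = np.histogram(a, bins=n_bins, range=(0, 2 * np.pi),
                                   weights=m)
            cells.append(hist / (np.linalg.norm(hist) + 1e-6))  # L2-normalize
    return np.concatenate(cells)

# Usage on a synthetic image pair (in practice: consecutive video frames).
prev = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
curr = np.roll(prev, 2, axis=1)  # simulate horizontal motion
desc = hof_descriptor(prev, curr)
print(desc.shape)  # (576,) = 8 x 8 cells x 9 bins for a 64x64 window
```

The resulting descriptor vectors are what get fed to the SVM in steps 2 and 5.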