Pedestrian Detection Paper: Integral Channel Features (Part 1)

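 Piotr Dollár is a leading figure in pedestrian detection research. Even now that deep learning dominates the field, his papers are still well worth studying.
 Before studying pedestrian detection, it is worth reading the authors' survey first:
 [1] P. Dollar, C. Wojek, B. Schiele, et al. Pedestrian detection: an evaluation of the state of the art [J]. IEEE Transactions.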

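A later refinement of Integral Channel Features is Aggregate Channel Features. The two are largely alike and are abbreviated below as ICF and ACF respectively.
The MATLAB version of the ACF code has been published by the author at the following address: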
http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/
The code above requires OpenCV and the author's vision toolbox to be set up first. The toolbox is available here:
P. Dollár. Piotr’s Computer Vision Matlab T
http://vision.ucsd.edu/~pdollar/toolbox/doc/
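The toolbox is organized into channels, classify, detector, filters, images, matlab, and videos. Studying it alongside this paper is well worth the effort.
Notes:
(1) The downloaded data must be placed in the appropriate location.
(2) To plot the ROC curve, if(0) must be changed to if(1).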

ICF is the precursor of ACF. So what are integral channel features? Three sentences from the original paper answer this:

(1)  multiple registered image channels are computed using linear and non-linear transformations of the input image

(2)features are extracted from each channel using sums over local rectangular regions. 

(3)We refer to such features as integral channel features

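In short, much as Haar-like features compute pixel differences between rectangular blocks, ICF first applies various linear or non-linear transformations to the image and then computes sums of pixels inside rectangular blocks, as shown in the figure below: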

[Figure: integral channel features]
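The rectangle sums can be made constant-time with an integral image. As a sketch in notation assumed here rather than taken from the paper (C is one channel, II its integral image):

```latex
II(x, y) = \sum_{x' < x} \sum_{y' < y} C(x', y')

\sum_{x_0 \le x' < x_1} \; \sum_{y_0 \le y' < y_1} C(x', y')
  = II(x_1, y_1) - II(x_0, y_1) - II(x_1, y_0) + II(x_0, y_0)
```

Under this definition each feature costs four lookups, whatever the size of the rectangle.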

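ICF uses 10 channels: 6 gradient histogram channels (one per orientation), 3 LUV color channels, and 1 gradient magnitude channel.
These channels can be computed efficiently and capture different information from the input image. Once the 10 channels are computed, rectangles of random position and size are chosen within them, and the sum of all pixel values inside each rectangle is taken. In the end, 30000 randomly selected rectangles form the integral channel feature pool.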
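How the rectangles are drawn is not spelled out above. The following is a minimal sketch, assuming the corners are sampled uniformly inside a detection window. `RectFeature` and `sampleFeaturePool` are hypothetical names for illustration, not the authors' code.

```cpp
#include <random>
#include <utility>
#include <vector>

// One pool entry: a channel index plus an inclusive rectangle inside the window.
struct RectFeature {
    int channel;
    int row0, col0;      // top-left corner (inclusive)
    int rowEnd, colEnd;  // bottom-right corner (inclusive)
};

// Assumption: channel and corners are drawn uniformly and independently.
std::vector<RectFeature> sampleFeaturePool(int nFeatures, int nChannels,
                                           int winRows, int winCols,
                                           unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> pickChannel(0, nChannels - 1);
    std::uniform_int_distribution<int> pickRow(0, winRows - 1);
    std::uniform_int_distribution<int> pickCol(0, winCols - 1);

    std::vector<RectFeature> pool;
    pool.reserve(nFeatures);
    for (int i = 0; i < nFeatures; ++i) {
        int r0 = pickRow(rng), r1 = pickRow(rng);
        int c0 = pickCol(rng), c1 = pickCol(rng);
        if (r0 > r1) std::swap(r0, r1);  // keep row0 <= rowEnd
        if (c0 > c1) std::swap(c0, c1);  // keep col0 <= colEnd
        const int ch = pickChannel(rng);
        pool.push_back(RectFeature{ch, r0, c0, r1, c1});
    }
    return pool;
}
```

For example, `sampleFeaturePool(30000, 10, 32, 16, 42u)` would draw a 30000-feature pool over 10 channels; the 32x16 window is only a placeholder size.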
In a grayscale image, the gradient magnitude and direction at pixel (x, y) are:

[Equations: gradient magnitude and gradient direction at pixel (x, y)]

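where H(x, y) is the value of pixel (x, y). The gradient histogram is a weighted histogram: its bin index is determined by the gradient direction, and its weight by the gradient magnitude. The gradient histogram channels are computed as: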

[Equation 3: gradient histogram channel]
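In Equation 3, G(x, y) and Q(x, y) denote the gradient magnitude and the quantized gradient direction at pixel (x, y), L is an indicator function, and θ is the quantization range of the gradient direction α(x, y). This article uses 6 gradient histogram channels, so θ takes the ranges 0-30°, 30-60°, 60-90°, 90-120°, 120-150°, and 150-180°.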
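The channel computation described above can be sketched in code. This is only an illustration under stated assumptions: central differences for the gradient, unsigned orientation folded into [0°, 180°), hard assignment to one bin (the indicator function), and border pixels left at zero. `GradientChannels` and `computeGradientChannels` are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct GradientChannels {
    std::vector<double> magnitude;               // G(x, y), row-major
    std::vector<std::vector<double>> histogram;  // one channel per orientation bin
};

// H is a row-major grayscale image of size rows x cols.
GradientChannels computeGradientChannels(const std::vector<double>& H,
                                         int rows, int cols, int nBins = 6) {
    const double kPi = 3.14159265358979323846;
    const std::size_t n = static_cast<std::size_t>(rows) * cols;
    GradientChannels out;
    out.magnitude.assign(n, 0.0);
    out.histogram.assign(nBins, std::vector<double>(n, 0.0));

    for (int y = 1; y + 1 < rows; ++y) {
        for (int x = 1; x + 1 < cols; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * cols + x;
            const double dx = H[i + 1] - H[i - 1];        // horizontal difference
            const double dy = H[i + cols] - H[i - cols];  // vertical difference
            const double g = std::sqrt(dx * dx + dy * dy);

            double angle = std::atan2(dy, dx);  // in [-pi, pi]
            if (angle < 0) angle += kPi;        // fold to an unsigned orientation
            int bin = static_cast<int>(angle / kPi * nBins);
            if (bin >= nBins) bin = 0;          // 180 degrees is the same as 0

            out.magnitude[i] = g;
            out.histogram[bin][i] = g;  // magnitude goes to exactly one bin
        }
    }
    return out;
}
```

With `nBins = 6` each bin spans 30°, matching the six ranges listed above; summing the six histogram channels at a pixel gives back the magnitude channel.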
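The 6-orientation gradient histograms resemble the Histogram of Oriented Gradients (HOG) used by Dalal. The HOG+SVM approach to pedestrian detection was proposed by the French researcher Dalal at CVPR 2005, and although many pedestrian detection algorithms have been proposed since, most still follow the HOG+SVM line of thinking. In ICF, too, the gradient histogram features contribute most to classification: when the gradient histograms, the LUV color channels, and the gradient magnitude are each used alone for detection, the gradient histograms reach the highest detection rate, 87.2%, as shown in the figure: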
[Figure: detection rate of each channel type used alone]

Once the features are extracted, they are concatenated and normalized, then classified with a soft cascade. Detection takes 4 steps:

(1) There are 1000 weak classifiers. Each evaluation reads 3 values (featureId, thresholdValue, directionValue), which correspond to a 3000-row matrix.

(2) Compute each rectangular integral channel feature, denoted featureValueSat:

    if( (featureValueSat - thresholdValueSat) * directionValueSat >= 0 )
        weakClass =  1;
    else
        weakClass = -1;
    confidence = confidence + weakClass*alpha(classifierId);

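(3) if(confidence < magicThreshold): once the accumulated confidence falls below -3, stop early instead of looping over all 1000 weak classifiers. magicThreshold = -3 is the soft cascade threshold.
(4) NMS (non-maximum suppression), to be covered in a later post: it removes duplicates among overlapping detection windows, keeping the window with the highest confidence.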
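Steps (2) to (4) can be sketched as follows. This is a simplified illustration, not the author's implementation: each weak classifier is assumed to be a single threshold test, the overlap measure is assumed to be intersection over union with a 0.5 cutoff, and all type and function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct WeakClassifier {
    int featureId;
    double threshold;
    double direction;  // +1 or -1
    double alpha;      // vote weight
};

struct CascadeResult {
    double confidence;
    std::size_t evaluated;  // number of weak classifiers actually run
    bool rejected;          // true if the window was dropped early
};

// Steps (2)-(3): accumulate weighted votes, stop once confidence < magicThreshold.
CascadeResult runSoftCascade(const std::vector<WeakClassifier>& classifiers,
                             const std::vector<double>& featureValues,
                             double magicThreshold = -3.0) {
    CascadeResult result{0.0, 0, false};
    for (const WeakClassifier& wc : classifiers) {
        const double value = featureValues[static_cast<std::size_t>(wc.featureId)];
        const double weakClass =
            ((value - wc.threshold) * wc.direction >= 0) ? 1.0 : -1.0;
        result.confidence += weakClass * wc.alpha;
        ++result.evaluated;
        if (result.confidence < magicThreshold) {
            result.rejected = true;
            break;
        }
    }
    return result;
}

struct Detection { double x, y, w, h, score; };

// Intersection over union of two axis-aligned windows.
double overlapRatio(const Detection& a, const Detection& b) {
    const double ix = std::max(0.0, std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x));
    const double iy = std::max(0.0, std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y));
    const double inter = ix * iy;
    const double uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0 ? inter / uni : 0.0;
}

// Step (4): greedy NMS, visiting windows from highest to lowest score.
std::vector<Detection> nonMaxSuppression(std::vector<Detection> dets,
                                         double maxOverlap = 0.5) {
    std::sort(dets.begin(), dets.end(),
              [](const Detection& a, const Detection& b) { return a.score > b.score; });
    std::vector<Detection> kept;
    for (const Detection& d : dets) {
        bool suppressed = false;
        for (const Detection& k : kept) {
            if (overlapRatio(d, k) > maxOverlap) { suppressed = true; break; }
        }
        if (!suppressed) kept.push_back(d);
    }
    return kept;
}
```

The early exit is what makes the cascade cheap: most background windows accumulate negative votes quickly and never reach the later classifiers.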

The full detection pipeline is shown in the figure below:
[Figure: detection pipeline]

 The C++ code that computes the integral features during detection is as follows:
//*************************************
    // 2. Run the soft cascade on the data
    //*************************************
    int windowCounter=0;
    double weakClass = 0;
    for (col = -1; col<(nCols - windowWidth); col++ ){    //322/((2^(1/8))^34) = 16.92  
        for (row = -1; row<(nRows - windowHeight); row++ ){  //248/((2^(1/8))^34) = 13.03  
        //Run the detector on this window
        double confidence = 0;
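        for (classifierId = 0; classifierId<nClassifiers; classifierId++){   //nClassifiers=1000; debug note: the first window stopped at the 7th
                                                                            //classifier, once confidence fell below -3 (the soft cascade threshold);
                                                                        //for row=0, col=-1 the loop ended at classifierId=9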
            // Compute the value of the 3 features associated with this
            // weak classifier
            // 1. Root
            int featureId = feature(classifierId);    //1949 2503 2795(2) 2213 173
            double thresholdValue = threshold(classifierId);//2.425 0.444 1.298 4.4 1.14
            double directionValue = direction(classifierId);//1 1 -1 1
            double featureValue;

            if(featureId<nBaseFeatures){ //if feature 0<=f<=2999
                // WARNING: wrong way of accessing the data: lots of
                // cache faults. The data is in consecutive columns,
                // but the memory is contiguous along the rows. I should
                // transpose the matrix to solve this problem.

                //-1 because of matlab/c++ conventions
                int channelId = rectangles[featureId*nRectCols + 4] -1;//last entry of row 1950, minus 1 = 4; 10-1=9 5 2 4 1  ...3

                int row0    = rectangles[featureId*nRectCols + 0]+row;  //23-1=22 8  8 18 12 5
                int col0    = rectangles[featureId*nRectCols + 1]+col;  //6-1=5   9  2 6  3  6
                int rowEnd  = rectangles[featureId*nRectCols + 2]+row;  //25-1=24 10 10 20 15 7
                int colEnd  = rectangles[featureId*nRectCols + 3]+col;  //9-1=8   9  5 8  4  7
                featureValue =                                         //should be the integral feature, corners 1+4-2-3 //IC[4*249*323+9*249+25]=323974
                    + IC[channelId*(nRows+1)*(nCols+1) + (colEnd+1) *(nRows+1) + (rowEnd+1)]
                    - IC[channelId*(nRows+1)*(nCols+1) + (col0)     *(nRows+1) + (rowEnd+1)]
                    - IC[channelId*(nRows+1)*(nCols+1) + (colEnd+1) *(nRows+1) + (row0)    ]  //2.5269-3.4851-7.65216+10.1123 = 1.50194 
                    + IC[channelId*(nRows+1)*(nCols+1) + (col0)     *(nRows+1) + (row0)    ];   //1.0576 .0999 0.1141 4.75 1.024 .. 2.16
                }else{
                //3000 - 3000  = channel 0
                int channelId = featureId - nBaseFeatures;
                double outside = 0;
                double inside = 0;
                int c= col +1; //this way it will go from 0 to...
                int r= row +1; //this way it will go from 0 to...
            //  IC=Integral Channels
                outside = + IC[channelId*(nRows+1)*(nCols+1) + (12+c) *(nRows+1) + (31 +r)] //A  ??
                      - IC[channelId*(nRows+1)*(nCols+1) + ( 8+c) *(nRows+1) + (27 +r)] //B
                      + IC[channelId*(nRows+1)*(nCols+1) + ( 8+c) *(nRows+1) + ( 3 +r)] //C
                      + IC[channelId*(nRows+1)*(nCols+1) + ( 3+c) *(nRows+1) + (27 +r)] //D
                      - IC[channelId*(nRows+1)*(nCols+1) + ( 3+c) *(nRows+1) + ( 3 +r)] //E
                      - IC[channelId*(nRows+1)*(nCols+1) + (12+c) *(nRows+1) + ( 0 +r)] //F
                      - IC[channelId*(nRows+1)*(nCols+1) + ( 0+c) *(nRows+1) + (31 +r)] //G
                      + IC[channelId*(nRows+1)*(nCols+1) + ( 0+c) *(nRows+1) + ( 0 +r)];//H

                inside  = + IC[channelId*(nRows+1)*(nCols+1) + ( 8+c) *(nRows+1) + (27 +r)] //B
                      - IC[channelId*(nRows+1)*(nCols+1) + ( 8+c) *(nRows+1) + ( 3 +r)] //C
                      - IC[channelId*(nRows+1)*(nCols+1) + ( 3+c) *(nRows+1) + (27 +r)] //D
                      + IC[channelId*(nRows+1)*(nCols+1) + ( 3+c) *(nRows+1) + ( 3 +r)];//E

                 featureValue = outside - inside; // was commented out (TODO by Fabio); without it featureValue is read uninitialized in the test below
                }

                if( (featureValue - thresholdValue) * directionValue >=0 ){
                // 2. Satisfy leaf
                // I removed the +1 because the id's are in C++ style,
                // they start from 0
                int featureIdSat         = featureSat(  classifierId);   //classifierId: index of the current weak classifier among the 1000, counting from 0
                double thresholdValueSat = thresholdSat(classifierId);
                double directionValueSat = directionSat(classifierId);
                double featureValueSat;

                if(featureIdSat<nBaseFeatures){ 
                    //if feature 0<=f<=2999
                    // WARNING: wrong way of accessing the data: lots of
                    // cache faults. The data is in consecutive columns,
                    // but the memory is contiguous along the rows.
                    // I should transpose the matrix to solve this problem.

                    //-1 because of matlab/c++ conventions
                    int channelId = rectangles[featureIdSat*nRectCols + 4] -1;
                    int row0    = rectangles[featureIdSat*nRectCols + 0]+row;
                    int col0    = rectangles[featureIdSat*nRectCols + 1]+col;
                    int rowEnd  = rectangles[featureIdSat*nRectCols + 2]+row;
                    int colEnd  = rectangles[featureIdSat*nRectCols + 3]+col;

                    featureValueSat = + IC[channelId*(nRows+1)*(nCols+1) + (colEnd+1) *(nRows+1) + (rowEnd+1)]
                              - IC[channelId*(nRows+1)*(nCols+1) + (col0)     *(nRows+1) + (rowEnd+1)]
                              - IC[channelId*(nRows+1)*(nCols+1) + (colEnd+1) *(nRows+1) + (row0)    ]
                              + IC[channelId*(nRows+1)*(nCols+1) + (col0)     *(nRows+1) + (row0)    ];
                }else{
                    //3000 - 3000  = channel 0
                    int channelId = featureIdSat - nBaseFeatures ;

                    if((channelId<0)||(channelId>9)){
                    cout <<"error: negative channel\n";
                    cout << "Negative channel." << endl
                          << "Source code line: " << __FILE__
                          << " @ " << __LINE__ << endl;
                    return NULL;
                    }

                    double outside = 0;
                    double inside = 0;
                    int c= col +1; //this way it will go from 0 to...
                    int r= row +1; //this way it will go from 0 to...

                    outside = + IC[channelId*(nRows+1)*(nCols+1) + (12+c) *(nRows+1) + (31 +r)] //A
                          - IC[channelId*(nRows+1)*(nCols+1) + ( 8+c) *(nRows+1) + (27 +r)] //B
                          + IC[channelId*(nRows+1)*(nCols+1) + ( 8+c) *(nRows+1) + ( 3 +r)] //C
                          + IC[channelId*(nRows+1)*(nCols+1) + ( 3+c) *(nRows+1) + (27 +r)] //D
                          - IC[channelId*(nRows+1)*(nCols+1) + ( 3+c) *(nRows+1) + ( 3 +r)] //E
                          - IC[channelId*(nRows+1)*(nCols+1) + (12+c) *(nRows+1) + ( 0 +r)] //F
                          - IC[channelId*(nRows+1)*(nCols+1) + ( 0+c) *(nRows+1) + (31 +r)] //G
                          + IC[channelId*(nRows+1)*(nCols+1) + ( 0+c) *(nRows+1) + ( 0 +r)];//H

                    inside  = + IC[channelId*(nRows+1)*(nCols+1) + ( 8+c) *(nRows+1) + (27 +r)] //B
                          - IC[channelId*(nRows+1)*(nCols+1) + ( 8+c) *(nRows+1) + ( 3 +r)] //C
                          - IC[channelId*(nRows+1)*(nCols+1) + ( 3+c) *(nRows+1) + (27 +r)] //D
                          + IC[channelId*(nRows+1)*(nCols+1) + ( 3+c) *(nRows+1) + ( 3 +r)];//E

                    featureValueSat = outside - inside;
                }

                if( (featureValueSat - thresholdValueSat) * directionValueSat >=0 )
                    weakClass =  1;
                else
                    weakClass = -1;
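  The IC feature computed in the code should be the sum of the pixels inside a rectangle, but I have not worked out the details. If any reader understands it, an explanation would be much appreciated.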
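One possible reading of the indexing in the listing: IC seems to hold, for each channel, a zero-padded integral image of size (nRows+1) x (nCols+1), stored column by column in MATLAB order. Under that reading, IC[channelId*(nRows+1)*(nCols+1) + col*(nRows+1) + row] is the sum of all pixels above and to the left of (row, col), the four-term expression is the sum over one rectangle, and the eight-term `outside` is the whole 12x31 window minus the inner rectangle. The sketch below checks this interpretation on small inputs. The layout is inferred from the index arithmetic rather than stated by the original code, and `IntegralChannels`, `buildIntegralChannels`, and `centerSurround` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

struct IntegralChannels {
    int nRows, nCols, nChannels;
    std::vector<double> IC;  // padded, column-major, one block per channel

    // Same offset as ch*(nRows+1)*(nCols+1) + col*(nRows+1) + row in the listing.
    std::size_t index(int ch, int col, int row) const {
        return (static_cast<std::size_t>(ch) * (nCols + 1) + col) * (nRows + 1) + row;
    }
    double at(int ch, int col, int row) const { return IC[index(ch, col, row)]; }

    // Sum over rows row0..rowEnd and columns col0..colEnd, both inclusive.
    double rectSum(int ch, int row0, int col0, int rowEnd, int colEnd) const {
        return at(ch, colEnd + 1, rowEnd + 1) - at(ch, col0, rowEnd + 1)
             - at(ch, colEnd + 1, row0)       + at(ch, col0, row0);
    }
};

// data is column-major: data[(ch*nCols + col)*nRows + row].
IntegralChannels buildIntegralChannels(const std::vector<double>& data,
                                       int nRows, int nCols, int nChannels) {
    IntegralChannels ic{nRows, nCols, nChannels,
        std::vector<double>(static_cast<std::size_t>(nChannels) * (nRows + 1) * (nCols + 1), 0.0)};
    for (int ch = 0; ch < nChannels; ++ch) {
        for (int col = 0; col < nCols; ++col) {
            double colSum = 0.0;  // running sum down the current column
            for (int row = 0; row < nRows; ++row) {
                colSum += data[(static_cast<std::size_t>(ch) * nCols + col) * nRows + row];
                ic.IC[ic.index(ch, col + 1, row + 1)] =
                    ic.IC[ic.index(ch, col, row + 1)] + colSum;
            }
        }
    }
    return ic;
}

// The else-branch of the listing, with c = col + 1 and r = row + 1:
// (window minus inner rectangle) minus inner rectangle.
double centerSurround(const IntegralChannels& ic, int ch, int c, int r) {
    const double window  = ic.rectSum(ch, r, c, r + 30, c + 11);         // corners A, F, G, H
    const double inside  = ic.rectSum(ch, r + 3, c + 3, r + 26, c + 7);  // corners B, C, D, E
    const double outside = window - inside;
    return outside - inside;
}
```

This would also explain why the loops start at -1: the rectangle coordinates are stored 1-based, so adding row = -1 gives the 0-based top-left position, and the `+1` on rowEnd and colEnd selects the padded corner just past the rectangle.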