OpenCV Source Code Analysis: A Classic

xiaoxixi1918

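1. Some online references

In the earlier post "Object Detection Study 1 (pedestrian detection with OpenCV's built-in HOG)", OpenCV's own detectMultiScale() function was used to detect pedestrians. That detector is based on the HOG algorithm, so how is HOG actually implemented? This post gives a brief analysis of the HOG source code that ships with OpenCV.

Quite a few people online have already analysed the HOG source in OpenCV. Their write-ups are good and well worth reading. For example:

http://blog.csdn.net/raocong2010/article/details/6239431

This post explains the concepts used in HOG, such as block and cell, with diagrams.

http://blog.csdn.net/pp5576155/article/details/7029699

This post is a repost; it contains some annotations of the OpenCV source, which are very helpful.

http://gz-ricky.blogbus.com/logs/85326280.html

This post covers how the length of the HOG descriptor is computed.

http://hi.baidu.com/susongzhi/item/3a3c758d7ff5cbdc5e0ec172

This post explains the trilinear interpolation used in the fast HOG algorithm in great detail.

http://blog.youtueye.com/work/opencv-hog-peopledetector-trainning.html

This post explains how HOG is trained and how detection is run.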

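2. Some brief notes on the source code

This post does not explain HOG theory, so some familiarity with the algorithm is assumed. For the theory, see the PhD thesis of HOG's originator, which covers it in detail.

Normally, HOG pedestrian detection consists of a training stage and a detection stage; the main purpose of training is to obtain the SVM coefficients. The OpenCV source uses pre-trained SVM coefficients directly, so it contains very little training code.

First, a brief rundown of the fixed (default) parameters in the HOG source:

The detection window is 128*64 (height*width);

The block size is 16*16;

The cell size is 8*8;

The block stride inside the detection window is 8*8;

The gradient histogram of one cell has 9 bins;

The sliding-window stride over the image being searched is 8*8;

In the code, one HOG descriptor refers to one detection window. A window therefore contains 105 = ((128-16)/8+1)*((64-16)/8+1) blocks; each block has 4 cells, and each cell contributes a histogram of length 9, so the HOG vector of a detection window has 105*4*9 = 3780 dimensions.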
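The arithmetic above can be checked with a few lines of code. The following is a minimal sketch, not OpenCV's implementation: `Size2` and `hogDescriptorSize` are hypothetical names introduced here, and the formula simply follows the block/cell counting described in the text.

```cpp
#include <cstddef>

// Hypothetical stand-in for cv::Size, used only in this sketch.
struct Size2 { int width; int height; };

// Descriptor length = bins per cell * cells per block * blocks per window.
std::size_t hogDescriptorSize(Size2 win, Size2 block, Size2 stride, Size2 cell, int nbins)
{
    const std::size_t cellsPerBlock =
        static_cast<std::size_t>(block.width / cell.width) * (block.height / cell.height);
    const std::size_t blocksPerWindow =
        static_cast<std::size_t>((win.width - block.width) / stride.width + 1) *
        ((win.height - block.height) / stride.height + 1);
    return static_cast<std::size_t>(nbins) * cellsPerBlock * blocksPerWindow;
}
```

Called with a 64-wide, 128-high window, 16*16 blocks, an 8*8 stride, 8*8 cells and 9 bins, this gives 7*15 = 105 blocks and a descriptor length of 3780, matching the figures above.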

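3. A rough picture of the HOG training flow

Although the HOG source contains little training code, knowing the training flow gives a better overall picture of the detection process.

During training, all positive samples are 128*64, which is the size of the detection window; a sample image may contain one or more pedestrians. The HOG feature extracted from such an image is exactly 3780-dimensional, and each feature vector is paired with a positive label for training. In practice we do not search Google for, or photograph, images that happen to be exactly 128*64 and contain a pedestrian. Instead we collect arbitrary images that contain pedestrians (preferably larger than 128*64) and annotate them by hand: draw a rectangle around each pedestrian, which amounts to storing the coordinates of 2 corner points, and save the rectangle. It is best to write a small program that, for each image read in, crops the rectangle and rescales it to the uniform size of 128*64. The HOG feature extracted from the processed image can then be used as a positive sample.

Negative samples do not need a uniform size; they only need to be larger than 128*64 and must not contain any pedestrian. Because a negative sample contains no target, no manual annotation is needed. The program can crop random 128*64 patches from such an image and use their HOG features as negative samples.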
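The random cropping of negative samples could look roughly like the sketch below. It only picks the crop rectangle; `CropRect` and `randomNegativeCrop` are hypothetical names, and the sketch assumes the source image is at least as large as the window.

```cpp
#include <random>

// Hypothetical rectangle type for this sketch (x, y = top-left corner).
struct CropRect { int x; int y; int width; int height; };

// Pick a random window-sized rectangle that lies fully inside the image.
// Assumes imgW >= winW and imgH >= winH.
CropRect randomNegativeCrop(int imgW, int imgH, std::mt19937& rng,
                            int winW = 64, int winH = 128)
{
    std::uniform_int_distribution<int> px(0, imgW - winW);
    std::uniform_int_distribution<int> py(0, imgH - winH);
    return CropRect{ px(rng), py(rng), winW, winH };
}
```

Each returned rectangle would then be cropped from the pedestrian-free image and passed to the HOG feature extractor with a negative label.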

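4. The HOG pedestrian detection process

Detection uses the sliding-window method. In this code, the sliding-window flow is as follows:

As the figure above shows, during detection the input image is rescaled (usually shrunk) into several levels. On each level a fixed-size sliding window (128*64) is moved across the image; a HOG feature is extracted at every window position and fed to the SVM classifier to decide whether the window contains a target. If it does, the target region is stored; if not, the window keeps sliding.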
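The two outer loops of this scheme (pyramid levels, then window positions on each level) can be sketched as follows. This is a simplified illustration under stated assumptions, not the exact logic of detectMultiScale(): `pyramidScales` and `windowsAlongAxis` are hypothetical helpers, and the stopping rule (stop once the shrunken image no longer fits one window) is the assumption made here.

```cpp
#include <cmath>
#include <vector>

// Scale factors at which the image is scanned: 1, step, step^2, ...
// Scanning stops once the shrunken image is smaller than the window.
std::vector<double> pyramidScales(int imgW, int imgH, int winW, int winH,
                                  double scaleStep, int maxLevels)
{
    std::vector<double> scales;
    double scale = 1.0;
    for (int level = 0; level < maxLevels; ++level) {
        const long w = std::lround(imgW / scale);
        const long h = std::lround(imgH / scale);
        if (w < winW || h < winH) break;   // window no longer fits
        scales.push_back(scale);
        if (scaleStep <= 1.0) break;       // no shrinking requested
        scale *= scaleStep;
    }
    return scales;
}

// Number of window positions along one axis for a given stride.
int windowsAlongAxis(int imageLen, int winLen, int stride)
{
    return (imageLen - winLen) / stride + 1;
}
```

For a 128*256 image, a 64*128 window and a scale step of 2, two levels are scanned; on the first level a stride of 8 gives 9 window positions in each direction.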

The function used for detection is detectMultiScale(); its parameters are laid out as in the following diagram:

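5. Computing the image gradient in the detection window

If gamma correction is enabled, it is applied before the gradient is computed. Gamma correction here means taking the square root of each channel's pixel values, which maps the range 0~255 to 0~15.97 (the square root of 255). According to the author, results are better on the corrected image, and no Gaussian smoothing is needed before the gradient is computed.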
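Since there are only 256 possible 8-bit pixel values, the square root can be precomputed once in a lookup table. The sketch below mirrors that idea with the standard library only; `buildGammaLut` is a hypothetical name.

```cpp
#include <array>
#include <cmath>

// 256-entry lookup table: sqrt(i) when gamma correction is on, i otherwise.
std::array<float, 256> buildGammaLut(bool gammaCorrection)
{
    std::array<float, 256> lut{};
    for (int i = 0; i < 256; ++i)
        lut[i] = gammaCorrection ? std::sqrt(static_cast<float>(i))
                                 : static_cast<float>(i);
    return lut;
}
```

With the table in place, correcting a pixel is a single array lookup, for example `lut[255]` is about 15.97.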

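The gradient is computed as separate horizontal and vertical gradient images, from which magnitude and phase are derived. Judging from the source (dx is the right neighbour minus the left neighbour), the horizontal gradient kernel is [-1, 0, 1],

and the vertical gradient kernel is its transpose, [-1, 0, 1]^T.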
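A minimal sketch of the horizontal derivative on one image row is shown below, assuming the central-difference kernel [-1, 0, 1] and mirrored borders of the BORDER_REFLECT_101 kind. `reflect101` and `horizontalGradient` are hypothetical helpers written for this illustration.

```cpp
#include <vector>

// Map an out-of-range index back into [0, len) by mirroring about the
// border pixels without repeating them (gfedcb|abcdefgh|gfedcba).
int reflect101(int p, int len)
{
    if (len == 1) return 0;
    while (p < 0 || p >= len) {
        if (p < 0) p = -p;
        else       p = 2 * (len - 1) - p;
    }
    return p;
}

// Central difference with kernel [-1, 0, 1]: dx[x] = row[x+1] - row[x-1].
std::vector<float> horizontalGradient(const std::vector<float>& row)
{
    const int n = static_cast<int>(row.size());
    std::vector<float> dx(n);
    for (int x = 0; x < n; ++x)
        dx[x] = row[reflect101(x + 1, n)] - row[reflect101(x - 1, n)];
    return dx;
}
```

The vertical derivative works the same way, using the rows above and below instead of the left and right neighbours.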

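When reading the source, pay particular attention to how the gradient magnitude and angle are stored. Since they are computed for the image inside one sliding window, one would expect each of them to be a 128*64 = 8192-element array. In practice both have 128*64*2 = 16384 elements. Why?

Because both are stored using linear interpolation between two bins. The gradient angle at a pixel can be any value between 0 and 180 degrees, and the program discretises it into 9 bins of 20 degrees each. When the angle of a pixel is assigned to these 9 bins, it generally falls between 2 neighbouring bins (if it lies exactly at the centre of a bin, that bin simply gets a weight of 1). The source shows that the gradient magnitude is the voting weight used when the gradient histogram is built, so the magnitude of each pixel is split between the 2 bins adjacent to its angle, and the nearer bin receives the larger weight. The magnitude image therefore has 2 channels, each holding one component of the pixel's magnitude. For the same reason the angle image also has 2 channels, each holding the index of one of the 2 neighbouring bins. The first channel holds the lower neighbouring bin and the second holds the next bin (wrapping around from bin 8 to bin 0).

The two-bin interpolation is illustrated below:

Here, suppose the 3 radii are the centres of the discretised bins, the red dashed line is the gradient direction at pixel O (located at the centre of the circle), and the gradient magnitude is A. The nearest neighbouring bin of this gradient direction is bin0, and a is the angle between the two, measured as a fraction of the bin width so that 0 <= a < 1. The first magnitude channel stored for pixel O is then A*(1-a) and the second is A*a; the first angle channel is 0 (bin index 0) and the second is 1 (bin index 1).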
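The split of one gradient into two weighted bin votes can be sketched as follows. This follows the scheme described above (bin centres at 10, 30, ... degrees, orientation folded to 180 degrees); `BinVote` and `splitOrientation` are hypothetical names and the code is an illustration rather than the library routine.

```cpp
#include <cmath>

// Two neighbouring bins and the share of the magnitude each one receives.
struct BinVote { int bin0; int bin1; float w0; float w1; };

BinVote splitOrientation(float dx, float dy, int nbins)
{
    const float pi = 3.14159265f;
    const float mag = std::sqrt(dx * dx + dy * dy);
    float angle = std::atan2(dy, dx);          // (-pi, pi]
    if (angle < 0.f) angle += 2.f * pi;        // [0, 2*pi)

    // Bin coordinate: integer values fall exactly on bin centres.
    const float t = angle * (nbins / pi) - 0.5f;
    int lower = static_cast<int>(std::floor(t));
    const float frac = t - lower;              // 0 <= frac < 1

    lower = ((lower % nbins) + nbins) % nbins; // wrap into [0, nbins)
    const int upper = (lower + 1) % nbins;     // next bin, wrapping to 0
    return BinVote{ lower, upper, mag * (1.0f - frac), mag * frac };
}
```

For a purely horizontal gradient (dx = 1, dy = 0) the direction lies exactly between the centres of bin 8 and bin 0, so each of the two bins receives half of the magnitude.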

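Also, when the gradient and phase images are computed for a 3-channel image, the gradient is computed for each channel separately, and the channel with the largest gradient magnitude supplies the gradient at that pixel.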
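Choosing the dominant channel only needs the squared magnitudes, so no square root is required at this stage. A small sketch, with the hypothetical names `Grad` and `strongestChannel`, checking the channels in the order 2, 1, 0:

```cpp
// Per-channel derivatives at one pixel.
struct Grad { float dx; float dy; };

// Return the derivatives of the channel whose squared magnitude is largest.
// Channels are visited in the order 2, 1, 0; ties keep the earlier one.
Grad strongestChannel(const Grad (&ch)[3])
{
    Grad best = ch[2];
    float bestMag = best.dx * best.dx + best.dy * best.dy;
    for (int c = 1; c >= 0; --c) {
        const float mag = ch[c].dx * ch[c].dx + ch[c].dy * ch[c].dy;
        if (bestMag < mag) { best = ch[c]; bestMag = mag; }
    }
    return best;
}
```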

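6. The HOG cache structure

The HOG cache is a memory optimisation that the program's author uses to speed up the HOG algorithm. Each input image has to be scanned at 4 nested levels: the image pyramid levels, the sliding windows on each level, the blocks sliding inside each window, and the cells in each block (and, below that, the pixels in each cell). With this many levels, each of them two-dimensional, a naive implementation is very slow. The author's idea is to cache: the data computed for each sliding window (in the end, the HOG descriptor vector of each block) is stored in an in-memory lookup table. Because many blocks overlap as the window slides, block data that has already been computed can be read straight from the table, which saves a great deal of time.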
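The caching idea can be illustrated with a plain memoisation table keyed by block position. This is a deliberately simplified sketch: the real HOGCache uses preallocated matrices (blockCache, blockCacheFlags, ymaxCached) rather than a map, and `BlockCacheSketch` is a hypothetical name.

```cpp
#include <map>
#include <utility>
#include <vector>

// Memoise block histograms by the block's top-left position in the image.
struct BlockCacheSketch {
    std::map<std::pair<int, int>, std::vector<float>> cache;
    int computed = 0;   // how many histograms were actually computed

    // 'compute' is any callable (x, y) -> std::vector<float>.
    template <class Compute>
    const std::vector<float>& get(int x, int y, Compute compute)
    {
        const std::pair<int, int> key(x, y);
        auto it = cache.find(key);
        if (it == cache.end()) {
            ++computed;
            it = cache.emplace(key, compute(x, y)).first;
        }
        return it->second;
    }
};
```

When two overlapping windows ask for the block at the same image position, the histogram is computed for the first request and simply looked up for the second.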

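Inside this HOG cache structure the HOG descriptor of a sliding window is computed, which involves the relationship between sliding window, block and cell. That relationship is shown in the diagram below:

The outermost rectangle is the image to be searched. A sliding window is moved across it to decide whether each window contains a target; each window contains many overlapping, sliding blocks; and each block contains non-overlapping cells. Because pixels at different positions in a block contribute differently to the cells, the author further divides each cell into 4 regions, which are the smallest boxes drawn with blue dashed lines in the figure.

So how do the different pixels of a block affect its cells (with the default parameters a block has 4 cells)? See the diagram below.

As shown, the black frame is 1 block, the solid red lines separate its 4 cells, and the green dashed lines divide each cell into what we call 4 regions, so the block has 16 regions in total: A, B, C, …, O, P.

The program divides these 16 regions into 4 groups:

Group 1: A, D, M, P. Pixels in this group contribute only to the gradient orientation histogram of the cell they lie in.

Group 2: B, C, N, O. Pixels in this group contribute to the histograms of the cells to their left and right.

Group 3: E, I, H, L. Pixels in this group contribute to the histograms of the cells above and below them.

Group 4: F, G, J, K. Pixels in this group contribute to the histograms of all four cells (above, below, left and right).

How exactly is the contribution made? As an example, pixels in region E contribute to cell0 and cell2. One block contributes a 36-dimensional vector to the sliding window, 9 dimensions per cell, in the order cell0, cell1, cell2, cell3. Because pixels in region E contribute to both cell0 and cell2, their gradient votes go not only to their own cell0 but also to cell2 below it. Each vote carries a weight that depends on the distance between the pixel and the centres of cell0 and cell2. See the source for the exact relation.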
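The grouping into pixels that vote for 1, 2 or 4 cells follows from where each pixel sits relative to the cell centres. The sketch below reproduces that classification and the one-dimensional interpolation weight; `CellCounts`, `countCellContributions` and `upperCellWeight` are hypothetical names, and the formulas are a simplified reading of the scheme described above.

```cpp
#include <cmath>

struct CellCounts { int one; int two; int four; };

// Count how many pixels of a block vote for 1, 2 or 4 cells.
CellCounts countCellContributions(int blockW, int blockH, int cellW, int cellH)
{
    const int ncx = blockW / cellW, ncy = blockH / cellH;
    CellCounts c{0, 0, 0};
    for (int j = 0; j < blockW; ++j)
        for (int i = 0; i < blockH; ++i) {
            // Position in units of cells, measured from the first cell centre.
            const int x0 = static_cast<int>(std::floor((j + 0.5f) / cellW - 0.5f));
            const int y0 = static_cast<int>(std::floor((i + 0.5f) / cellH - 0.5f));
            // A pixel lies between two cell centres along an axis only if
            // both neighbouring cell indices exist.
            const bool twoInX = x0 >= 0 && x0 + 1 < ncx;
            const bool twoInY = y0 >= 0 && y0 + 1 < ncy;
            if (twoInX && twoInY)      ++c.four;
            else if (twoInX || twoInY) ++c.two;
            else                       ++c.one;
        }
    return c;
}

// Share of a vote that goes to the higher-index cell along one axis;
// the lower-index cell receives one minus this value.
float upperCellWeight(int p, int cellLen)
{
    const float c = (p + 0.5f) / cellLen - 0.5f;
    return c - std::floor(c);
}
```

For the default 16*16 block with 8*8 cells this gives 64 pixels that vote for one cell, 128 that vote for two and 64 that vote for four, which corresponds to the four groups of regions listed above.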

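The memory layout of this structure's variables is shown below, which makes the code easier to follow:

When reading this part of the source, pay particular attention to the following points:

1) The BlockData structure has 2 members, histOfs and imgOffset, and one BlockData corresponds to one block. histOfs is the starting position, within the HOG descriptor of the whole sliding window, of the part of the vector contributed by this block; imgOffset is the position of the block in the sliding-window image (its top-left corner, of course).

2) The PixData structure has 5 members, and one PixData corresponds to one pixel of a block. gradOfs is the position of the pixel's gradient magnitude in the gradient-magnitude image of the sliding window; qangleOfs is the position of the pixel's gradient angle in the gradient-angle image of the sliding window; histOfs[] holds the starting positions of the HOG descriptor sub-vectors of the 1, 2 or 4 cells the pixel contributes to (this is rather abstract and only becomes clear from the source); histWeights[] holds the pixel's weights for those 1, 2 or 4 cells. gradWeight reflects the fact that a pixel's contribution to the gradient histogram also depends on its position in the block; it follows a two-dimensional Gaussian centred on the block centre.

3) count1, count2 and count4 in the program are the numbers of pixels in the block that contribute to 1 cell, 2 cells and 4 cells respectively.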
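The Gaussian table behind gradWeight can be sketched as below. The default sigma of (blockWidth + blockHeight)/8 and the offsets measured from the block centre are taken from the source; `blockGaussianWeights` itself is a hypothetical helper that stores the table row by row in a std::vector.

```cpp
#include <cmath>
#include <vector>

// Gaussian weight for every pixel of a block, centred on the block centre.
// A negative winSigma selects the default (blockW + blockH) / 8.
std::vector<float> blockGaussianWeights(int blockW, int blockH, double winSigma = -1.0)
{
    const float sigma =
        static_cast<float>(winSigma >= 0 ? winSigma : (blockW + blockH) / 8.0);
    const float scale = 1.f / (sigma * sigma * 2);
    std::vector<float> w(static_cast<std::size_t>(blockW) * blockH);
    for (int i = 0; i < blockH; ++i)
        for (int j = 0; j < blockW; ++j) {
            const float di = i - blockH * 0.5f;
            const float dj = j - blockW * 0.5f;
            w[i * blockW + j] = std::exp(-(di * di + dj * dj) * scale);
        }
    return w;
}
```

For a 16*16 block the default sigma is 4, so a corner pixel gets a weight of about exp(-4), roughly 0.018, while the pixel at offset (8, 8) gets 1.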

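7. Some other functions

The program contains a few other functions.

getBlock() takes the position of a block in the sliding window and the HOG cache of the image, and returns the information needed to compute the HOG feature of that block.

normalizeBlockHistogram() normalises the part of the HOG descriptor obtained for a block. The normalisation is in fact done in 2 passes; see the code for details.

windowsInImage() takes the test image size and the sliding-window stride and returns how many window positions there are horizontally and vertically on that level.

getWindow() returns the rectangle of one sliding window.

compute() is the function that actually computes the HOG descriptor; it is used in both testing and training.

detect() is the function used to detect targets; it is called inside detectMultiScale().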
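The two-pass normalisation mentioned for normalizeBlockHistogram() is commonly known as L2-Hys: normalise by the L2 norm, clip large values, then normalise again. The sketch below illustrates that idea only; the clipping threshold of 0.2 and the single epsilon are assumptions made here, and the library code may differ in such details.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// L2-Hys style normalisation of one block histogram, in place.
void normalizeL2Hys(std::vector<float>& hist, float clip = 0.2f)
{
    const float eps = 1e-3f;   // assumed small constant to avoid division by zero

    // Pass 1: L2-normalise, then clip every component at 'clip'.
    float sum = 0.f;
    for (float v : hist) sum += v * v;
    float scale = 1.f / (std::sqrt(sum) + eps);
    sum = 0.f;
    for (float& v : hist) {
        v = std::min(v * scale, clip);
        sum += v * v;
    }

    // Pass 2: L2-normalise the clipped histogram again.
    scale = 1.f / (std::sqrt(sum) + eps);
    for (float& v : hist) v *= scale;
}
```

The clipping step limits the influence of a single very strong gradient, such as a sharp edge, on the block descriptor.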

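8. Initialising HOG

A HOG descriptor can be initialised by assigning values directly, or by reading them from a file node (there is a defined format for this; it appears to be XML). We can read initial values from a file, or set the HOG parameters in the program and write them to a file; the read, write, load and save functions in the source do this.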

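9. Annotated HOG source

The source uses Intel's IPP library to speed up the algorithm. When you reach code after #ifdef HAVE_IPP, you can skip it and read the code after #else instead; this does not affect your understanding of the original HOG algorithm.

First, here is the directory diagram of the header files used by the HOG source:

Below are my annotations of the HOG source. I have little experience with C++, so I may have annotated some things that are common C++ knowledge; please bear with me. There are also details I have not understood, or may have annotated wrongly; feel free to discuss them. Many details only become clear in the source itself.

hog.cpp: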

 
   
 
/*M///
//
//  IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
//  By downloading, copying, installing or using the software you agree to this license.
//  If you do not agree to this license, do not download, install,
//  copy or use the software.
//
//
//                           License Agreement
//                For Open Source Computer Vision Library
//
// Copyright (C) 2000-2008, Intel Corporation, all rights reserved.
// Copyright (C) 2009, Willow Garage Inc., all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
//   * Redistribution's of source code must retain the above copyright notice,
//     this list of conditions and the following disclaimer.
//
//   * Redistribution's in binary form must reproduce the above copyright notice,
//     this list of conditions and the following disclaimer in the documentation
//     and/or other materials provided with the distribution.
//
//   * The name of the copyright holders may not be used to endorse or promote products
//     derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/

#include "precomp.hpp"
#include <iterator>
#ifdef HAVE_IPP
#include "ipp.h"
#endif
/****************************************************************************************\
      The code below is implementation of HOG (Histogram-of-Oriented Gradients)
      descriptor and object detection, introduced by Navneet Dalal and Bill Triggs.

      The computed feature vectors are compatible with the
      INRIA Object Detection and Localization Toolkit
      (http://pascal.inrialpes.fr/soft/olt/)
\****************************************************************************************/

namespace cv
{

size_t HOGDescriptor::getDescriptorSize() const
{
    // The two assertions below ensure that a block holds a whole number of cells
    // and that the block can be stepped a whole number of times across the window
    CV_Assert(blockSize.width % cellSize.width == 0 &&
        blockSize.height % cellSize.height == 0);
    CV_Assert((winSize.width - blockSize.width) % blockStride.width == 0 &&
        (winSize.height - blockSize.height) % blockStride.height == 0 );
    // The returned value is the dimension of the HOG vector of one window
    return (size_t)nbins*
        (blockSize.width/cellSize.width)*
        (blockSize.height/cellSize.height)*
        ((winSize.width - blockSize.width)/blockStride.width + 1)*
        ((winSize.height - blockSize.height)/blockStride.height + 1);
}

// What is winSigma for? Judging from HOGCache::init() below, it is the sigma of
// the Gaussian weighting applied over each block; a negative value selects the
// default (blockSize.width + blockSize.height)/8.
double HOGDescriptor::getWinSigma() const
{
    return winSigma >= 0 ? winSigma : (blockSize.width + blockSize.height)/8.;
}

// svmDetector is a member of HOGDescriptor, of type vector.
// It stores the coefficients used when the HOG feature is classified by the SVM.
// The function returns true when the detector is empty, has the same length as
// the HOG feature, or is longer by 1. Why is a difference of 1 allowed?
// Presumably the extra element is the bias term of the linear SVM.
bool HOGDescriptor::checkDetectorSize() const
{
    size_t detectorSize = svmDetector.size(), descriptorSize = getDescriptorSize();
    return detectorSize == 0 ||
        detectorSize == descriptorSize ||
        detectorSize == descriptorSize + 1;
}

void HOGDescriptor::setSVMDetector(InputArray _svmDetector)
{
    // convertTo only changes properties of the Mat, such as its depth.
    // Here it converts the whole input SVM coefficient matrix to floating point.
    _svmDetector.getMat().convertTo(svmDetector, CV_32F);
    CV_Assert( checkDetectorSize() );
}

#define CV_TYPE_NAME_HOG_DESCRIPTOR "opencv-object-detector-hog"

// FileNode is the file-storage node class in OpenCV's core module; a node holds
// one element read from a file.
// It is generally used for reading files in XML and YAML format.
// Because this function reads the node's contents into the class's member
// variables, it cannot be declared const.
bool HOGDescriptor::read(FileNode& obj)
{
    // isMap() tests whether the node is a mapping; in a mapping every child node
    // has a name. The if statement therefore requires the node being read to be
    // a mapping.
    if( !obj.isMap() )
        return false;
    // obj["winSize"] returns the node named winSize; this works because the
    // nodes are known to be of mapping type, i.e. each has a name.
    FileNodeIterator it = obj["winSize"].begin();
    // Operator >> reads data from a node; here the data that it points to is read
    // into winSize.width and winSize.height in turn.
    // The following statements work the same way.
    it >> winSize.width >> winSize.height;
    it = obj["blockSize"].begin();
    it >> blockSize.width >> blockSize.height;
    it = obj["blockStride"].begin();
    it >> blockStride.width >> blockStride.height;
    it = obj["cellSize"].begin();
    it >> cellSize.width >> cellSize.height;
    obj["nbins"] >> nbins;
    obj["derivAperture"] >> derivAperture;
    obj["winSigma"] >> winSigma;
    obj["histogramNormType"] >> histogramNormType;
    obj["L2HysThreshold"] >> L2HysThreshold;
    obj["gammaCorrection"] >> gammaCorrection;
    obj["nlevels"] >> nlevels;

    // isSeq() tests whether the node's content is a sequence
    FileNode vecNode = obj["SVMDetector"];
    if( vecNode.isSeq() )
    {
        vecNode >> svmDetector;
        CV_Assert(checkDetectorSize());
    }
    // Everything has been read; report success
    return true;
}

void HOGDescriptor::write(FileStorage& fs, const String& objName) const
{
    // Write the name objName to the file storage fs
    if( !objName.empty() )
        fs << objName;

    fs << "{" CV_TYPE_NAME_HOG_DESCRIPTOR
    // The following lines write the HOG descriptor's members to fs one by one,
    // each preceded by its name, so these nodes are of mapping type.
    << "winSize" << winSize
    << "blockSize" << blockSize
    << "blockStride" << blockStride
    << "cellSize" << cellSize
    << "nbins" << nbins
    << "derivAperture" << derivAperture
    << "winSigma" << getWinSigma()
    << "histogramNormType" << histogramNormType
    << "L2HysThreshold" << L2HysThreshold
    << "gammaCorrection" << gammaCorrection
    << "nlevels" << nlevels;
    if( !svmDetector.empty() )
        // svmDetector is written directly as a sequence, also under its own name.
        fs << "SVMDetector" << "[:" << svmDetector << "]";
    fs << "}";
}

// Read the parameters from the given file
bool HOGDescriptor::load(const String& filename, const String& objname)
{
    FileStorage fs(filename, FileStorage::READ);
    // A file node has many leaves and so holds a lot of content; here it holds
    // the various parameters that HOGDescriptor needs.
    FileNode obj = !objname.empty() ? fs[objname] : fs.getFirstTopLevelNode();
    return read(obj);
}

// Write the class's parameters to a file as a file node.
void HOGDescriptor::save(const String& filename, const String& objName) const
{
    FileStorage fs(filename, FileStorage::WRITE);
    write(fs, !objName.empty() ? objName : FileStorage::getDefaultObjectName(filename));
}

// Copy the HOG descriptor into c
void HOGDescriptor::copyTo(HOGDescriptor& c) const
{
    c.winSize = winSize;
    c.blockSize = blockSize;
    c.blockStride = blockStride;
    c.cellSize = cellSize;
    c.nbins = nbins;
    c.derivAperture = derivAperture;
    c.winSigma = winSigma;
    c.histogramNormType = histogramNormType;
    c.L2HysThreshold = L2HysThreshold;
    c.gammaCorrection = gammaCorrection;
    // A vector can also be copied with the assignment operator
    c.svmDetector = svmDetector;
    c.nlevels = nlevels;
}

// Compute the gradient magnitude image grad and the gradient orientation image
// qangle of the image img.
// paddingTL is the padding to add at the top-left of img; likewise paddingBR
// is the padding to add at the bottom-right of img.
void HOGDescriptor::computeGradient(
    const Mat& img, Mat& grad, Mat& qangle,
    Size paddingTL, Size paddingBR) const
{
    // This function only handles 8-bit single-channel or 3-channel images.
    CV_Assert( img.type() == CV_8U || img.type() == CV_8UC3 );

    // Enlarge the image according to the padding arguments. This is not the
    // extension needed for computing gradients at the border, which is handled
    // by the code further down, so the reason for this padding is not yet clear
    // to me.
    Size gradsize(img.cols + paddingTL.width + paddingBR.width,
        img.rows + paddingTL.height + paddingBR.height);
    grad.create(gradsize, CV_32FC2);  // <magnitude*(1-alpha), magnitude*alpha>
    qangle.create(gradsize, CV_8UC2); // [0..nbins-1] - quantized gradient orientation
    Size wholeSize;
    Point roiofs;
    // If img is a region taken from a larger parent image, locateROI returns the
    // parent's size in wholeSize and the position of img's top-left corner
    // relative to the parent in roiofs.
    // For a positive sample the parent image is img itself, so wholeSize equals
    // img.size(); for a negative sample the two differ. Because the relationship
    // is hard to follow, treat wholeSize as the size of img for now, so that
    // roiofs is Point(0, 0).
    img.locateROI(wholeSize, roiofs);

    int i, x, y;
    int cn = img.channels();

    // _lut is a row vector used as a lookup table of floating-point pixel values
    Mat_<float> _lut(1, 256);
    const float* lut = &_lut(0,0);

    // Gamma correction takes the square root of every pixel value in 0~255,
    // which compresses the range and makes the mapping non-linear
    if( gammaCorrection )
        for( i = 0; i < 256; i++ )
            _lut(0,i) = std::sqrt((float)i);
    else
        for( i = 0; i < 256; i++ )
            _lut(0,i) = (float)i;

    // Create an int buffer of length gradsize.width+gradsize.height+4
    AutoBuffer<int> mapbuf(gradsize.width + gradsize.height + 4);
    int* xmap = (int*)mapbuf + 1;
    int* ymap = xmap + gradsize.width + 2;

    // In other words borderType equals 4, because the OpenCV source defines:
    // #define IPL_BORDER_REFLECT_101 4
    // enum{...,BORDER_REFLECT_101=IPL_BORDER_REFLECT_101,...}
    // borderType is the way padded border pixels are filled in.
    /*
    Various border types, image boundaries are denoted with '|'

    * BORDER_REPLICATE:   aaaaaa|abcdefgh|hhhhhhh
    * BORDER_REFLECT:     fedcba|abcdefgh|hgfedcb
    * BORDER_REFLECT_101: gfedcb|abcdefgh|gfedcba
    * BORDER_WRAP:        cdefgh|abcdefgh|abcdefg
    * BORDER_CONSTANT:    iiiiii|abcdefgh|iiiiiii with some specified 'i'
    */
    const int borderType = (int)BORDER_REFLECT_101;

    for( x = -1; x < gradsize.width + 1; x++ )
        /* int borderInterpolate(int p, int len, int borderType)
           p is a coordinate in the padded image along one axis;
           len is the length of the source image along that axis;
           borderType is the padding type described above.
           The function therefore derives, from a coordinate in the padded
           image, the coordinate of the corresponding source pixel.
        */
        // What do xmap and ymap mean? xmap holds, for each pixel in the first row
        // of the padded image, the x-coordinate of the corresponding pixel in the
        // original image img. Some entries of xmap are equal, because a padded
        // pixel maps to some position in img, and the pixel of img at that
        // position maps there too.
        // Likewise, ymap holds, for each pixel in the first column of the padded
        // image, the y-coordinate of the corresponding pixel in img.
        xmap[x] = borderInterpolate(x - paddingTL.width + roiofs.x,
            wholeSize.width, borderType) - roiofs.x;
    for( y = -1; y < gradsize.height + 1; y++ )
        ymap[y] = borderInterpolate(y - paddingTL.height + roiofs.y,
            wholeSize.height, borderType) - roiofs.y;

    // x- & y- derivatives for the whole row
    int width =
        gradsize.width;
    AutoBuffer<float> _dbuf(width*4);
    float* dbuf = _dbuf;
    // Dx is the horizontal gradient, Dy the vertical gradient, Mag the gradient
    // magnitude and Angle the gradient angle (one image row each).
    // The 4th constructor argument is the location of the Mat's data in memory,
    // so these 4 rows are stored contiguously in dbuf.
    Mat Dx(1, width, CV_32F, dbuf);
    Mat Dy(1, width, CV_32F, dbuf + width);
    Mat Mag(1, width, CV_32F, dbuf + width*2);
    Mat Angle(1, width, CV_32F, dbuf + width*3);

    int _nbins = nbins;
    // angleScale == 9/pi with the default 9 bins
    float angleScale = (float)(_nbins/CV_PI);
#ifdef HAVE_IPP
    Mat lutimg(img.rows,img.cols,CV_MAKETYPE(CV_32F,cn));
    Mat hidxs(1, width, CV_32F);
    Ipp32f* pHidxs = (Ipp32f*)hidxs.data;
    Ipp32f* pAngles = (Ipp32f*)Angle.data;

    IppiSize roiSize;
    roiSize.width = img.cols;
    roiSize.height = img.rows;

    for( y = 0; y < roiSize.height; y++ )
    {
        const uchar* imgPtr = img.data + y*img.step;
        float* imglutPtr = (float*)(lutimg.data + y*lutimg.step);

        for( x = 0; x < roiSize.width*cn; x++ )
        {
            imglutPtr[x] = lut[imgPtr[x]];
        }
    }

#endif
    for( y = 0; y < gradsize.height; y++ )
    {
#ifdef HAVE_IPP
        const float* imgPtr = (float*)(lutimg.data + lutimg.step*ymap[y]);
        const float* prevPtr = (float*)(lutimg.data + lutimg.step*ymap[y-1]);
        const float* nextPtr = (float*)(lutimg.data + lutimg.step*ymap[y+1]);
#else
        // imgPtr points to the start of row y of img (after border mapping);
        // prevPtr points to the start of row y-1 and nextPtr to the start of
        // row y+1.
        const uchar* imgPtr = img.data + img.step*ymap[y];
        const uchar* prevPtr = img.data + img.step*ymap[y-1];
        const uchar* nextPtr = img.data + img.step*ymap[y+1];
#endif
        float* gradPtr = (float*)grad.ptr(y);
        uchar* qanglePtr = (uchar*)qangle.ptr(y);

        // Case where the input image img has a single channel
        if( cn == 1 )
        {
            for( x = 0; x < width; x++ )
            {
                int x1 = xmap[x];
#ifdef HAVE_IPP
                dbuf[x] = (float)(imgPtr[xmap[x+1]] - imgPtr[xmap[x-1]]);
                dbuf[width + x] = (float)(nextPtr[x1] - prevPtr[x1]);
#else
                // These 2 statements compute Dx and Dy, since both live in dbuf
                dbuf[x] = (float)(lut[imgPtr[xmap[x+1]]] - lut[imgPtr[xmap[x-1]]]);
                dbuf[width + x] = (float)(lut[nextPtr[x1]] - lut[prevPtr[x1]]);
#endif
            }
        }
        // cn==3, i.e. the input image has 3 channels
        else
        {
            for( x = 0; x < width; x++ )
            {
                // x1 is the offset of source column xmap[x] within a row
                // (3 values per pixel)
                int x1 = xmap[x]*3;
                float dx0, dy0, dx, dy, mag0, mag;
#ifdef HAVE_IPP
                const float* p2 = imgPtr + xmap[x+1]*3;
                const float* p0 = imgPtr + xmap[x-1]*3;

                dx0 = p2[2] - p0[2];
                dy0 = nextPtr[x1+2] - prevPtr[x1+2];
                mag0 = dx0*dx0 + dy0*dy0;

                dx = p2[1] - p0[1];
                dy = nextPtr[x1+1] - prevPtr[x1+1];
                mag = dx*dx + dy*dy;

                if( mag0 < mag )
                {
                    dx0 = dx;
                    dy0 = dy;
                    mag0 = mag;
                }

                dx = p2[0] - p0[0];
                dy = nextPtr[x1] - prevPtr[x1];
                mag = dx*dx + dy*dy;
#else
                // p2 is the address of row y, column x+1
                // p0 is the address of row y, column x-1
                const uchar* p2 = imgPtr + xmap[x+1]*3;
                const uchar* p0 = imgPtr + xmap[x-1]*3;

                // Squared magnitude of channel 2
                dx0 = lut[p2[2]] - lut[p0[2]];
                dy0 = lut[nextPtr[x1+2]] - lut[prevPtr[x1+2]];
                mag0 = dx0*dx0 + dy0*dy0;

                // Squared magnitude of channel 1
                dx = lut[p2[1]] - lut[p0[1]];
                dy = lut[nextPtr[x1+1]] - lut[prevPtr[x1+1]];
                mag = dx*dx + dy*dy;

                // Keep the channel with the larger magnitude
                if( mag0 < mag )
                {
                    dx0 = dx;
                    dy0 = dy;
                    mag0 = mag;
                }

                // Squared magnitude of channel 0
                dx = lut[p2[0]] - lut[p0[0]];
                dy = lut[nextPtr[x1]] - lut[prevPtr[x1]];
                mag = dx*dx + dy*dy;
#endif
                // Keep the channel with the larger magnitude
                if( mag0 < mag )
                {
                    dx0 = dx;
                    dy0 = dy;
                    mag0 = mag;
                }

                // Finally store the horizontal and vertical gradients
                dbuf[x] =
                    dx0;
                dbuf[x+width] = dy0;
            }
        }
#ifdef HAVE_IPP
        ippsCartToPolar_32f((const Ipp32f*)Dx.data, (const Ipp32f*)Dy.data, (Ipp32f*)Mag.data, pAngles, width);
        for( x = 0; x < width; x++ )
        {
            if(pAngles[x] < 0.f)
                pAngles[x] += (Ipp32f)(CV_PI*2.);
        }

        ippsNormalize_32f(pAngles, pAngles, width, 0.5f/angleScale, 1.f/angleScale);
        ippsFloor_32f(pAngles,(Ipp32f*)hidxs.data,width);
        ippsSub_32f_I((Ipp32f*)hidxs.data,pAngles,width);
        ippsMul_32f_I((Ipp32f*)Mag.data,pAngles,width);

        ippsSub_32f_I(pAngles,(Ipp32f*)Mag.data,width);
        ippsRealToCplx_32f((Ipp32f*)Mag.data,pAngles,(Ipp32fc*)gradPtr,width);
#else
        // cartToPolar() computes the magnitude and angle of corresponding elements
        // of 2 matrices. The last argument selects degrees; false here means the
        // angle is returned in radians.
        // If only the magnitude is needed, magnitude() can be used instead.
        // The returned angle lies in the range 0 <= Angle < 2*pi.
        cartToPolar( Dx, Dy, Mag, Angle, false );
#endif
        for( x = 0; x < width; x++ )
        {
#ifdef HAVE_IPP
            int hidx = (int)pHidxs[x];
#else
            // With 9 bins: -0.5 <= angle < 17.5 after scaling
            float mag = dbuf[x+width*2], angle = dbuf[x+width*3]*angleScale - 0.5f;
            // cvFloor() returns the largest integer not greater than its argument,
            // so hidx is in {-1, 0, ..., 17} here and is wrapped into 0..8 below
            int hidx = cvFloor(angle);
            // Now 0<=angle<1: the distance from the lower neighbouring bin,
            // in units of one bin width
            angle -= hidx;
            // gradPtr is the pointer into the grad image
            // gradPtr[x*2] is the magnitude share of the lower neighbouring bin
            // of the gradient direction at x;
            // gradPtr[x*2+1] is the magnitude share of the upper neighbouring bin
            gradPtr[x*2] = mag*(1.f - angle);
            gradPtr[x*2+1] = mag*angle;
#endif
            if( hidx < 0 )
                hidx += _nbins;
            else if( hidx >= _nbins )
                hidx -= _nbins;
            assert( (unsigned)hidx < (unsigned)_nbins );

            qanglePtr[x*2] = (uchar)hidx;
            hidx++;
            // In two's complement -1 is all ones, so ANDing with -1 leaves the
            // value unchanged;
            // 0 is all zeros, so ANDing with 0 gives 0 (the index wraps to bin 0).
            hidx &= hidx < _nbins ? -1 : 0;
            qanglePtr[x*2+1] =
                (uchar)hidx;
        }
    }
}


struct HOGCache
{
    struct BlockData
    {
        BlockData() : histOfs(0), imgOffset() {}
        int histOfs;
        Point imgOffset;
    };

    struct PixData
    {
        size_t gradOfs, qangleOfs;
        int histOfs[4];
        float histWeights[4];
        float gradWeight;
    };

    HOGCache();
    HOGCache(const HOGDescriptor* descriptor,
        const Mat& img, Size paddingTL, Size paddingBR,
        bool useCache, Size cacheStride);
    virtual ~HOGCache() {};
    virtual void init(const HOGDescriptor* descriptor,
        const Mat& img, Size paddingTL, Size paddingBR,
        bool useCache, Size cacheStride);

    Size windowsInImage(Size imageSize, Size winStride) const;
    Rect getWindow(Size imageSize, Size winStride, int idx) const;

    const float* getBlock(Point pt, float* buf);
    virtual void normalizeBlockHistogram(float* histogram) const;

    vector<PixData> pixData;
    vector<BlockData> blockData;

    bool useCache;
    vector<int> ymaxCached;
    Size winSize, cacheStride;
    Size nblocks, ncells;
    int blockHistogramSize;
    int count1, count2, count4;
    Point imgoffset;
    Mat_<float> blockCache;
    Mat_<uchar> blockCacheFlags;

    Mat grad, qangle;
    const HOGDescriptor* descriptor;
};

// Default constructor: no cache, block histogram size 0, and so on
HOGCache::HOGCache()
{
    useCache = false;
    blockHistogramSize = count1 = count2 = count4 = 0;
    descriptor =
        0;
}

// Constructor with parameters; it delegates to the init() member function
HOGCache::HOGCache(const HOGDescriptor* _descriptor,
    const Mat& _img, Size _paddingTL, Size _paddingBR,
    bool _useCache, Size _cacheStride)
{
    init(_descriptor, _img, _paddingTL, _paddingBR, _useCache, _cacheStride);
}

// Initialisation function of the HOGCache structure
void HOGCache::init(const HOGDescriptor* _descriptor,
    const Mat& _img, Size _paddingTL, Size _paddingBR,
    bool _useCache, Size _cacheStride)
{
    descriptor = _descriptor;
    cacheStride = _cacheStride;
    useCache = _useCache;

    // First call computeGradient() to compute the weighted gradient magnitude
    // image and the quantised angle image of the input image
    descriptor->computeGradient(_img, grad, qangle, _paddingTL, _paddingBR);
    // imgoffset is a Point while _paddingTL is a Size. The types differ, but both
    // are two-dimensional coordinates, so OpenCV allows the direct assignment.
    imgoffset = _paddingTL;

    winSize = descriptor->winSize;
    Size blockSize = descriptor->blockSize;
    Size blockStride = descriptor->blockStride;
    Size cellSize = descriptor->cellSize;
    int i, j, nbins = descriptor->nbins;
    // rawBlockSize is the number of pixels in a block
    int rawBlockSize = blockSize.width*blockSize.height;

    // nblocks is a Size whose width and height are the numbers of blocks in a
    // window horizontally and vertically (taking the block stride into account)
    nblocks = Size((winSize.width - blockSize.width)/blockStride.width + 1,
        (winSize.height - blockSize.height)/blockStride.height + 1);
    // ncells is also a Size; its width and height are the numbers of cells that
    // fit in a block horizontally and vertically
    ncells = Size(blockSize.width/cellSize.width, blockSize.height/cellSize.height);
    // blockHistogramSize is the length of the part of the HOG descriptor
    // contributed by one block
    blockHistogramSize = ncells.width*ncells.height*nbins;

    if( useCache )
    {
        // cacheStride = _cacheStride, i.e. it is passed in as an argument and is
        // the step by which the window moves.
        // The width and height of cacheSize are the numbers of block positions
        // held in the cache horizontally and vertically for the padded image
        Size cacheSize((grad.cols - blockSize.width)/cacheStride.width+1,
            (winSize.height/cacheStride.height)+1);
        // blockCache is a Mat of float; note the value of its column count
        blockCache.create(cacheSize.height, cacheSize.width*blockHistogramSize);
        // blockCacheFlags is a Mat of uchar
        blockCacheFlags.create(cacheSize);
        size_t cacheRows = blockCache.rows;
        // ymaxCached is a vector<int>; resize() here is std::vector::resize,
        // which sets the number of elements
        ymaxCached.resize(cacheRows);
        // Initialise every element of ymaxCached to -1
        for(size_t ii = 0; ii < cacheRows; ii++ )
            ymaxCached[ii] = -1;
    }

    // weights is a two-dimensional Gaussian table of size blockSize; the code
    // below computes the Gaussian coefficients
    Mat_<float> weights(blockSize);
    float sigma = (float)descriptor->getWinSigma();
    float scale = 1.f/(sigma*sigma*2);

    for(i = 0; i < blockSize.height; i++)
        for(j = 0; j < blockSize.width; j++)
        {
            float di = i - blockSize.height*0.5f;
            float dj = j - blockSize.width*0.5f;
            weights(i,j) = std::exp(-(di*di + dj*dj)*scale);
        }

    // vector<BlockData> blockData; BlockData is a structure nested in HOGCache.
    // nblocks.width*nblocks.height is the number of blocks in one detection
    // window, whereas cacheSize.width*cacheSize.height is the number of block
    // positions held in the cache for the padded image
    blockData.resize(nblocks.width*nblocks.height);
    // vector<PixData> pixData; likewise PixData is a structure nested in HOGCache.
    // rawBlockSize is the number of pixels in each block;
    // resize() sets the vector's length to 3 times that number
    pixData.resize(rawBlockSize*3);

    // Initialize 2 lookup tables, pixData & blockData.
    // Here is why:
    //
    // The detection algorithm runs in 4 nested loops (at each pyramid layer):
    //  loop over the windows within the input image
    //    loop over the blocks within each window
    //      loop over the cells within each block
    //        loop over the pixels in each cell
    //
    // As each of the loops runs over a 2-dimensional array,
    // we could get 8(!) nested loops in total, which is very-very slow.
    //
    // To speed the things up, we do the following:
    //   1. loop over windows is unrolled in the HOGDescriptor::{compute|detect} methods;
    //         inside we compute the current search window using getWindow() method.
    //         Yes, it involves some overhead (function call + couple of divisions),
    //         but it's tiny in fact.
    //   2. loop over the blocks is also unrolled. Inside we use pre-computed blockData[j]
    //         to set up gradient and histogram pointers.
    //   3. loops over cells and pixels in each cell are merged
    //       (since there is no overlap between cells, each pixel in the block is processed once)
    //      and also unrolled. Inside we use PixData[k] to access the gradient values and
    //      update the histogram
    //
    // count1, count2 and count4 are the numbers of pixels in the block that
    // contribute to 1 cell, 2 cells and 4 cells respectively.
    count1 = count2 = count4 =
        0;
    for( j = 0; j < blockSize.width; j++ )
        for( i = 0; i < blockSize.height; i++ )
        {
            PixData* data = 0;
            // cellX and cellY are the fractional horizontal and vertical cell
            // indices of this pixel within the block.
            float cellX = (j+0.5f)/cellSize.width - 0.5f;
            float cellY = (i+0.5f)/cellSize.height - 0.5f;
            // cvRound returns the nearest integer; cvFloor the largest integer
            // not greater than its argument; cvCeil the smallest integer not
            // less than it.
            // icellX0 and icellY0 are the integer index of the lower of the two
            // cells adjacent to the pixel.
            // With the default parameters icellX0 and icellY0 can only be -1, 0
            // or 1: -1 when j (or i) is at most 3, 1 when it is at least 12,
            // and 0 otherwise (i and j run from 0 to 15)
            int icellX0 = cvFloor(cellX);
            int icellY0 = cvFloor(cellY);
            int icellX1 = icellX0 + 1, icellY1 = icellY0 + 1;
            // cellX and cellY now become the difference between the true index
            // and the index of the lower neighbouring cell; they are used below
            // to compute the pixel's histogram weights for the different cells.
            cellX -= icellX0;
            cellY -= icellY0;

            // If this condition holds, icellX0 can only be 0, i.e. the pixel's
            // x-coordinate in the block lies in (3.5, 11.5)
            if( (unsigned)icellX0 < (unsigned)ncells.width &&
                (unsigned)icellX1 < (unsigned)ncells.width )
            {
                // If this condition holds, icellY0 can only be 0, i.e. the
                // pixel's y-coordinate in the block lies in (3.5, 11.5)
                if( (unsigned)icellY0 < (unsigned)ncells.height &&
                    (unsigned)icellY1 < (unsigned)ncells.height )
                {
                    // Pixels satisfying both conditions contribute to all 4 cells.
                    // rawBlockSize is the number of pixels in 1 block, and
                    // pixData is 3 times that size, as allocated above:
                    // pixData.resize(rawBlockSize*3);
                    // The first rawBlockSize entries of pixData hold pixels that
                    // contribute to only one cell of the block; the middle
                    // rawBlockSize entries hold pixels that contribute to 2
                    // cells; the last ones hold pixels that contribute to all
                    // 4 cells
                    data = &pixData[rawBlockSize*2 + (count4++)];
                    // The result below is 0
                    data->histOfs[0] = (icellX0*ncells.height + icellY0)*nbins;
                    // The pixel's weight for cell0
                    data->histWeights[0] = (1.f - cellX)*(1.f - cellY);
                    // The result below is 18
                    data->histOfs[1] = (icellX1*ncells.height + icellY0)*nbins;
                    data->histWeights[1] = cellX*(1.f - cellY);
                    // The result below is 9
                    data->histOfs[2] = (icellX0*ncells.height + icellY1)*nbins;
                    data->histWeights[2] = (1.f - cellX)*cellY;
                    // The result below is 27
                    data->histOfs[3] = (icellX1*ncells.height + icellY1)*nbins;
                    data->histWeights[3] = cellX*cellY;
                }
                else
                // This else branch means icellY0 is -1 or 1, i.e. the pixel's
                // y-coordinate in the block lies in (0, 3.5) or (11.5, 15).
                // Such a pixel contributes to 2 adjacent cells
                {
                    data = &pixData[rawBlockSize + (count2++)];
                    if( (unsigned)icellY0 < (unsigned)ncells.height )
                    {
                        // (unsigned)-1 is a very large value, far greater than 2,
                        // so this condition can only hold when icellY0==1;
                        // icellY1 is then set to 1 as well
                        icellY1 = icellY0;
                        cellY =
                            1.f - cellY;
                    }
                    // When the if condition does not hold, icellY0==-1 and icellY1==0.
                    // In both cases icellX0==0 and icellX1==1
                    data->histOfs[0] = (icellX0*ncells.height + icellY1)*nbins;
                    data->histWeights[0] = (1.f - cellX)*cellY;
                    data->histOfs[1] = (icellX1*ncells.height + icellY1)*nbins;
                    data->histWeights[1] = cellX*cellY;
                    data->histOfs[2] = data->histOfs[3] = 0;
                    data->histWeights[2] = data->histWeights[3] = 0;
                }
            }
            // The pixel's x-coordinate in the block lies in (0, 3.5) or
            // (11.5, 15), i.e. icellX0==-1 or icellX0==1
            else
            {

                if( (unsigned)icellX0 < (unsigned)ncells.width )
                {
                    // icellX1 = icellX0 = 1
                    icellX1 = icellX0;
                    cellX = 1.f - cellX;
                }
                // When icellY0==0 the pixel contributes to 2 cells
                if( (unsigned)icellY0 < (unsigned)ncells.height &&
                    (unsigned)icellY1 < (unsigned)ncells.height )
                {
                    data = &pixData[rawBlockSize + (count2++)];
                    data->histOfs[0] = (icellX1*ncells.height + icellY0)*nbins;
                    data->histWeights[0] = cellX*(1.f - cellY);
                    data->histOfs[1] = (icellX1*ncells.height + icellY1)*nbins;
                    data->histWeights[1] = cellX*cellY;
                    data->histOfs[2] = data->histOfs[3] = 0;
                    data->histWeights[2] = data->histWeights[3] = 0;
                }
                else
                // Here the pixel contributes only to its own cell
                {
                    data = &pixData[count1++];
                    if( (unsigned)icellY0 < (unsigned)ncells.height )
                    {
                        icellY1 = icellY0;
                        cellY = 1.f - cellY;
                    }
                    data->histOfs[0] = (icellX1*ncells.height + icellY1)*nbins;
                    data->histWeights[0] = cellX*cellY;
                    data->histOfs[1] = data->histOfs[2] = data->histOfs[3] = 0;
                    data->histWeights[1] = data->histWeights[2] = data->histWeights[3] = 0;
                }
            }
            // Why are gradOfs and qangleOfs at position (i, j) the same for every
            // block, and computed as below? These offsets are relative to the
            // block's top-left pixel (row stride grad.cols, 2 values per pixel),
            // so they do not depend on which block is being processed;
            // presumably the block's own position is added when they are used.
            data->gradOfs = (grad.cols*i + j)*2;
            data->qangleOfs = (qangle.cols*i + j)*2;
            // The weight at position (i, j) is the same in every block
            data->gradWeight =
                weights(i,j);
        }

    // Make sure every pixel has been visited exactly once
    assert( count1 + count2 + count4 == rawBlockSize );
    // defragment pixData
    // Pack the entries of pixData contiguously, so that only the first third of
    // the allocated array is actually used
    for( j = 0; j < count2; j++ )
        pixData[j + count1] = pixData[j + rawBlockSize];
    for( j = 0; j < count4; j++ )
        pixData[j + count1 + count2] = pixData[j + rawBlockSize*2];
    // count2 is now the number of pixels that contribute to at most 2 cells
    count2 += count1;
    // count4 is now the number of pixels that contribute to at most 4 cells
    count4 += count2;

    // pixData has been initialised above; blockData is initialised next
    // initialize blockData
    for( j = 0; j < nblocks.width; j++ )
        for( i = 0; i < nblocks.height; i++ )
        {
            BlockData& data = blockData[j*nblocks.height + i];
            // histOfs is the starting position, within the whole descriptor, of
            // the part of the detection window's HOG descriptor contributed by
            // this block
            data.histOfs = (j*nblocks.height + i)*blockHistogramSize;
            // imgOffset is the position of the block's top-left corner in the
            // detection window
            data.imgOffset = Point(j*blockStride.width,i*blockStride.height);
        }
    // One detection window corresponds to one blockData array, and one block
    // corresponds to one pixData array.
}


// pt is the position of the block's top-left corner in the sliding window; buf
// points to the block's data within the detection window.
// The function returns a pointer to the descriptor of one block
const float* HOGCache::getBlock(Point pt,
                                float* buf)
{
    float* blockHist = buf;
    assert(descriptor != 0);

    Size blockSize = descriptor->blockSize;
    pt += imgoffset;

    CV_Assert( (unsigned)pt.x <= (unsigned)(grad.cols - blockSize.width) &&
               (unsigned)pt.y <= (unsigned)(grad.rows - blockSize.height) );

    if( useCache )
    {
        // cacheStride can be regarded as equal to blockStride.
        // Check that the requested position is one the cache covers,
        // i.e. one that actually occurs as the block is moved
        CV_Assert( pt.x % cacheStride.width == 0 &&
                   pt.y % cacheStride.height == 0 );
        // cacheIdx is the position counted in block steps
        Point cacheIdx(pt.x/cacheStride.width,
                       (pt.y/cacheStride.height) % blockCache.rows);
        // ymaxCached has one entry per block row that fits vertically in a detection window
        if( pt.y != ymaxCached[cacheIdx.y] )
        {
            // Take row cacheIdx.y of blockCacheFlags and reset it to 0
            Mat_<uchar> cacheRow = blockCacheFlags.row(cacheIdx.y);
            cacheRow = (uchar)0;
            ymaxCached[cacheIdx.y] = pt.y;
        }

        // blockHist points to the part of the HOG descriptor contributed by the block
        // at this position; it has not been computed yet at this point
        blockHist = &blockCache[cacheIdx.y][cacheIdx.x*blockHistogramSize];
        uchar& computedFlag = blockCacheFlags(cacheIdx.y, cacheIdx.x);
        if( computedFlag != 0 )
            return blockHist;
        computedFlag = (uchar)1; // set it at once, before actual computing
    }

    int k, C1 = count1, C2 = count2, C4 = count4;
    const float* gradPtr = (const float*)(grad.data + grad.step*pt.y) + pt.x*2;
    const uchar* qanglePtr = qangle.data + qangle.step*pt.y + pt.x*2;

    CV_Assert( blockHist != 0 );
#ifdef HAVE_IPP
    ippsZero_32f(blockHist,blockHistogramSize);
#else
    for( k = 0; k < blockHistogramSize; k++ )
        blockHist[k] = 0.f;
#endif

    const PixData* _pixData = &pixData[0];

    // C1 is the number of pixels that contribute only to their own cell
    for( k = 0; k < C1; k++ )
    {
        const PixData& pk = _pixData[k];
        // a points to the gradient magnitudes
        const float* a = gradPtr + pk.gradOfs;
        float w = pk.gradWeight*pk.histWeights[0];
        // h points to the quantized orientations
        const uchar* h = qanglePtr + pk.qangleOfs;

        // The magnitude has 2 channels because each pixel's magnitude is split between
        // its two neighbouring bins; the orientation has 2 channels because it stores
        // the indices of those two bins
        int h0 = h[0], h1 = h[1];
        float* hist = blockHist + pk.histOfs[0];
        float t0 = hist[h0] + a[0]*w;
        float t1 = hist[h1] + a[1]*w;
        // hist accumulates the weighted gradient magnitudes
        hist[h0] = t0; hist[h1] = t1;
    }

    for( ; k < C2; k++ )
    {
        const PixData& pk = _pixData[k];
        const float* a = gradPtr + pk.gradOfs;
        float w, t0, t1, a0 = a[0], a1 = a[1];
        const uchar* h = qanglePtr + pk.qangleOfs;
        int h0 = h[0], h1 = h[1];

        // The pixel contributes to 2 cells; this is the contribution to the first one
        float* hist = blockHist + pk.histOfs[0];
        w = pk.gradWeight*pk.histWeights[0];
        t0 = hist[h0] + a0*w;
        t1 = hist[h1] + a1*w;
        hist[h0] = t0; hist[h1] = t1;

        // Contribution to the other cell
        hist = blockHist + pk.histOfs[1];
        w = pk.gradWeight*pk.histWeights[1];
        t0 = hist[h0] + a0*w;
        t1 = hist[h1] + a1*w;
        hist[h0] = t0; hist[h1] = t1;
    }

    // Same as above, for pixels that contribute to 4 cells
    for( ; k < C4; k++ )
    {
        const PixData& pk = _pixData[k];
        const float* a = gradPtr + pk.gradOfs;
        float w, t0, t1, a0 = a[0], a1 = a[1];
        const uchar* h = qanglePtr + pk.qangleOfs;
        int h0 = h[0], h1 = h[1];

        float* hist = blockHist + pk.histOfs[0];
        w = pk.gradWeight*pk.histWeights[0];
        t0 = hist[h0] + a0*w;
        t1 = hist[h1] + a1*w;
        hist[h0] = t0; hist[h1] = t1;

        hist = blockHist + pk.histOfs[1];
        w = pk.gradWeight*pk.histWeights[1];
        t0 = hist[h0] + a0*w;
        t1 = hist[h1] + a1*w;
        hist[h0] = t0; hist[h1] = t1;

        hist = blockHist + pk.histOfs[2];
        w = pk.gradWeight*pk.histWeights[2];
        t0 = hist[h0] + a0*w;
        t1 = hist[h1] + a1*w;
        hist[h0] = t0; hist[h1] = t1;

        hist = blockHist + pk.histOfs[3];
        w = pk.gradWeight*pk.histWeights[3];
        t0 = hist[h0] + a0*w;
        t1 = hist[h1] + a1*w;
        hist[h0] = t0; hist[h1] = t1;
    }

    normalizeBlockHistogram(blockHist);

    return blockHist;
}


void HOGCache::normalizeBlockHistogram(
                                       float* _hist) const
{
    float* hist = &_hist[0];
#ifdef HAVE_IPP
    size_t sz = blockHistogramSize;
#else
    size_t i, sz = blockHistogramSize;
#endif

    float sum = 0;
#ifdef HAVE_IPP
    ippsDotProd_32f(hist,hist,sz,&sum);
#else
    // First normalization pass: compute the sum of squares
    for( i = 0; i < sz; i++ )
        sum += hist[i]*hist[i];
#endif
    // The denominator is the square root of the sum of squares plus sz*0.1
    float scale = 1.f/(std::sqrt(sum)+sz*0.1f), thresh = (float)descriptor->L2HysThreshold;
#ifdef HAVE_IPP
    ippsMulC_32f_I(scale,hist,sz);
    ippsThreshold_32f_I( hist, sz, thresh, ippCmpGreater );
    ippsDotProd_32f(hist,hist,sz,&sum);
#else
    for( i = 0, sum = 0; i < sz; i++ )
    {
        // Second pass: scale, clip at thresh, and recompute the sum of squares
        hist[i] = std::min(hist[i]*scale, thresh);
        sum += hist[i]*hist[i];
    }
#endif

    scale = 1.f/(std::sqrt(sum)+1e-3f);
#ifdef HAVE_IPP
    ippsMulC_32f_I(scale,hist,sz);
#else
    // Final normalized result
    for( i = 0; i < sz; i++ )
        hist[i] *= scale;
#endif
}


// Returns how many detection windows fit in the test image, horizontally and vertically
Size HOGCache::windowsInImage(Size imageSize, Size winStride) const
{
    return Size((imageSize.width - winSize.width)/winStride.width + 1,
                (imageSize.height - winSize.height)/winStride.height + 1);
}


// Given the image size, the window stride and the index of a detection window in the
// test image, returns the rectangle (position and size) of the window at that index
Rect HOGCache::getWindow(Size imageSize, Size winStride, int idx) const
{
    int nwindowsX = (imageSize.width - winSize.width)/winStride.width + 1;
    int y = idx / nwindowsX; // quotient
    int x = idx - nwindowsX*y; // remainder
    return Rect( x*winStride.width, y*winStride.height, winSize.width, winSize.height );
}


void HOGDescriptor::compute(
                            const Mat& img, vector<float>& descriptors,
                            Size winStride, Size padding,
                            const vector<Point>& locations) const
{
    // Size() has width and height both equal to 0
    if( winStride == Size() )
        winStride = cellSize;
    // gcd is the greatest common divisor; with the default values both arguments are equal
    Size cacheStride(gcd(winStride.width, blockStride.width),
                     gcd(winStride.height, blockStride.height));
    size_t nwindows = locations.size();
    // alignSize(m, n) returns the smallest multiple of n that is >= m
    padding.width = (int)alignSize(std::max(padding.width, 0), cacheStride.width);
    padding.height = (int)alignSize(std::max(padding.height, 0), cacheStride.height);
    Size paddedImgSize(img.cols + padding.width*2, img.rows + padding.height*2);

    HOGCache cache(this, img, padding, padding, nwindows == 0, cacheStride);

    if( !nwindows )
        // Size::area() is width*height, here the total number of windows
        nwindows = cache.windowsInImage(paddedImgSize, winStride).area();

    const HOGCache::BlockData* blockData = &cache.blockData[0];

    int nblocks = cache.nblocks.area();
    int blockHistogramSize = cache.blockHistogramSize;
    size_t dsize = getDescriptorSize(); // length of one HOG descriptor
    // resize() changes the length of the vector: shrinking keeps only the leading
    // elements, growing keeps all existing ones.
    // Here the buffer is sized to hold the descriptors of every window in the image
    descriptors.resize(dsize*nwindows);

    for( size_t i = 0; i < nwindows; i++ )
    {
        // descriptor points to the start of the i-th window's descriptor
        float* descriptor = &descriptors[i*dsize];

        Point pt0;
        // locations is not empty
        if( !locations.empty() )
        {
            pt0 = locations[i];
            // Skip invalid points (the window would leave the padded image)
            if( pt0.x < -padding.width || pt0.x > img.cols + padding.width - winSize.width ||
                pt0.y < -padding.height || pt0.y > img.rows + padding.height - winSize.height )
                continue;
        }
        // locations is empty
        else
        {
            // pt0 is the top-left corner of the i-th detection window, in unpadded image coordinates
            pt0 = cache.getWindow(paddedImgSize, winStride, (int)i).tl() - Point(padding);
            CV_Assert(pt0.x % cacheStride.width == 0 && pt0.y % cacheStride.height == 0);
        }

        for( int j = 0; j < nblocks; j++ )
        {
            const HOGCache::BlockData& bj = blockData[j];
            // pt is the block's top-left corner relative to the image
            Point pt = pt0 + bj.imgOffset;

            // dst is where this block's histogram is stored inside the window's descriptor
            float* dst = descriptor + bj.histOfs;
            const float* src = cache.getBlock(pt, dst);
            if( src != dst )
#ifdef HAVE_IPP
                ippsCopy_32f(src,dst,blockHistogramSize);
#else
                for( int k = 0; k < blockHistogramSize; k++ )
                    dst[k] = src[k];
#endif
        }
    }
}


void HOGDescriptor::detect(
                           const Mat& img,
    vector<Point>& hits, vector<double>& weights, double hitThreshold,
    Size winStride, Size padding, const vector<Point>& locations) const
{
    // hits stores the top-left corners of the windows in which a target was detected
    hits.clear();
    if( svmDetector.empty() )
        return;

    if( winStride == Size() )
        winStride = cellSize;
    Size cacheStride(gcd(winStride.width, blockStride.width),
                     gcd(winStride.height, blockStride.height));
    size_t nwindows = locations.size();
    padding.width = (int)alignSize(std::max(padding.width, 0), cacheStride.width);
    padding.height = (int)alignSize(std::max(padding.height, 0), cacheStride.height);
    Size paddedImgSize(img.cols + padding.width*2, img.rows + padding.height*2);

    HOGCache cache(this, img, padding, padding, nwindows == 0, cacheStride);

    if( !nwindows )
        nwindows = cache.windowsInImage(paddedImgSize, winStride).area();

    const HOGCache::BlockData* blockData = &cache.blockData[0];

    int nblocks = cache.nblocks.area();
    int blockHistogramSize = cache.blockHistogramSize;
    size_t dsize = getDescriptorSize();

    double rho = svmDetector.size() > dsize ? svmDetector[dsize] : 0;
    vector<float> blockHist(blockHistogramSize);

    for( size_t i = 0; i < nwindows; i++ )
    {
        Point pt0;
        if( !locations.empty() )
        {
            pt0 = locations[i];
            if( pt0.x < -padding.width || pt0.x > img.cols + padding.width - winSize.width ||
                pt0.y < -padding.height || pt0.y > img.rows + padding.height - winSize.height )
                continue;
        }
        else
        {
            pt0 = cache.getWindow(paddedImgSize, winStride, (int)i).tl() - Point(padding);
            CV_Assert(pt0.x % cacheStride.width == 0 && pt0.y % cacheStride.height == 0);
        }
        double s = rho;
        // svmVec points to the first element of svmDetector
        const float* svmVec = &svmDetector[0];
#ifdef HAVE_IPP
        int j;
#else
        int j, k;
#endif
        for( j = 0; j < nblocks; j++, svmVec += blockHistogramSize )
        {
            const HOGCache::BlockData& bj = blockData[j];
            Point pt = pt0 + bj.imgOffset;

            // vec points to the descriptor contributed by the block at pt in the test image
            const float* vec = cache.getBlock(pt, &blockHist[0]);
#ifdef HAVE_IPP
            Ipp32f partSum;
            ippsDotProd_32f(vec,svmVec,blockHistogramSize,&partSum);
            s += (double)partSum;
#else
            for( k = 0; k <= blockHistogramSize - 4; k += 4 )
                // const float* svmVec = &svmDetector[0];
                s += vec[k]*svmVec[k] + vec[k+1]*svmVec[k+1] +
                    vec[k+2]*svmVec[k+2] + vec[k+3]*svmVec[k+3];
            for( ; k < blockHistogramSize; k++ )
                s += vec[k]*svmVec[k];
#endif
        }
        if( s >= hitThreshold )
        {
            hits.push_back(pt0);
            weights.push_back(s);
        }
    }
}

// Overload that does not keep the confidence (weight) of each detection
void HOGDescriptor::detect(
                           const Mat& img, vector<Point>& hits, double hitThreshold,
                           Size winStride, Size padding, const vector<Point>& locations) const
{
    vector<double> weightsV;
    detect(img, hits, weightsV, hitThreshold, winStride, padding, locations);
}

struct HOGInvoker
{
    HOGInvoker( const HOGDescriptor* _hog, const Mat& _img,
                double _hitThreshold, Size _winStride, Size _padding,
                const double* _levelScale, ConcurrentRectVector* _vec,
                ConcurrentDoubleVector* _weights=0, ConcurrentDoubleVector* _scales=0 )
    {
        hog = _hog;
        img = _img;
        hitThreshold = _hitThreshold;
        winStride = _winStride;
        padding = _padding;
        levelScale = _levelScale;
        vec = _vec;
        weights = _weights;
        scales = _scales;
    }

    void operator()(
                     const BlockedRange& range ) const
    {
        int i, i1 = range.begin(), i2 = range.end();
        double minScale = i1 > 0 ? levelScale[i1] : i2 > 1 ? levelScale[i1+1] : std::max(img.cols, img.rows);
        // Size of the buffer that holds the rescaled copies of the original image
        Size maxSz(cvCeil(img.cols/minScale), cvCeil(img.rows/minScale));
        Mat smallerImgBuf(maxSz, img.type());
        vector<Point> locations;
        vector<double> hitsWeights;

        for( i = i1; i < i2; i++ )
        {
            double scale = levelScale[i];
            Size sz(cvRound(img.cols/scale), cvRound(img.rows/scale));
            // smallerImg is only a header over smallerImgBuf's data; no pixels are copied
            Mat smallerImg(sz, img.type(), smallerImgBuf.data);
            // No rescaling needed
            if( sz == img.size() )
                smallerImg = Mat(sz, img.type(), img.data, img.step);
            // Rescaling needed
            else
                resize(img, smallerImg, sz);
            // detect() returns its results in locations and hitsWeights;
            // locations holds the top-left corners of the target regions
            hog->detect(smallerImg, locations, hitsWeights, hitThreshold, winStride, padding);
            Size scaledWinSize = Size(cvRound(hog->winSize.width*scale), cvRound(hog->winSize.height*scale));
            for( size_t j = 0; j < locations.size(); j++ )
            {
                // Store the target region, mapped back to the original image scale
                vec->push_back(Rect(cvRound(locations[j].x*scale),
                                    cvRound(locations[j].y*scale),
                                    scaledWinSize.width, scaledWinSize.height));
                // Store the scale factor
                if (scales) {
                    scales->push_back(scale);
                }
            }
            // Store the values computed by the SVM
            if (weights && (!hitsWeights.empty()))
            {
                for (size_t j = 0; j < locations.size(); j++)
                {
                    weights->push_back(hitsWeights[j]);
                }
            }
        }
    }

    const HOGDescriptor* hog;
    Mat img;
    double hitThreshold;
    Size winStride;
    Size padding;
    const double* levelScale;
    // typedef tbb::concurrent_vector<Rect> ConcurrentRectVector;
    ConcurrentRectVector* vec;
    // typedef tbb::concurrent_vector<double> ConcurrentDoubleVector;
    ConcurrentDoubleVector* weights;
    ConcurrentDoubleVector* scales;
};


void HOGDescriptor::detectMultiScale(
    const Mat& img, vector<Rect>& foundLocations, vector<double>& foundWeights,
    double hitThreshold, Size winStride, Size padding,
    double scale0,
    double finalThreshold, bool useMeanshiftGrouping) const
{
    double scale = 1.;
    int levels = 0;

    vector<double> levelScale;

    // nlevels defaults to 64 levels
    for( levels = 0; levels < nlevels; levels++ )
    {
        levelScale.push_back(scale);
        if( cvRound(img.cols/scale) < winSize.width ||
            cvRound(img.rows/scale) < winSize.height ||
            scale0 <= 1 )
            break;
        // Only scales at which the test image is still at least as large
        // as the detection window are considered
        scale *= scale0;
    }
    levels = std::max(levels, 1);
    levelScale.resize(levels);

    ConcurrentRectVector allCandidates;
    ConcurrentDoubleVector tempScales;
    ConcurrentDoubleVector tempWeights;
    vector<double> foundScales;

    // Parallel computation with TBB
    parallel_for(BlockedRange(0, (int)levelScale.size()),
                 HOGInvoker(this, img, hitThreshold, winStride, padding, &levelScale[0], &allCandidates, &tempWeights, &tempScales));
    // Copy tempScales into foundScales; back_inserter appends at the end of the given container
    std::copy(tempScales.begin(), tempScales.end(), back_inserter(foundScales));
    // clear() removes all elements from the container
    foundLocations.clear();
    // Store the candidate target windows in foundLocations
    std::copy(allCandidates.begin(), allCandidates.end(), back_inserter(foundLocations));
    foundWeights.clear();
    // Store the candidate confidences in foundWeights
    std::copy(tempWeights.begin(), tempWeights.end(), back_inserter(foundWeights));

    if ( useMeanshiftGrouping )
    {
        groupRectangles_meanshift(foundLocations, foundWeights, foundScales, finalThreshold, winSize);
    }
    else
    {
        // Cluster the rectangles
        groupRectangles(foundLocations, (int)finalThreshold, 0.2);
    }
}

// Overload that ignores the confidence of the targets
void HOGDescriptor::detectMultiScale(const Mat& img, vector<Rect>& foundLocations,
                                     double hitThreshold, Size winStride, Size padding,
                                     double scale0, double finalThreshold, bool useMeanshiftGrouping) const
{
    vector<double> foundWeights;
    detectMultiScale(img, foundLocations, foundWeights, hitThreshold, winStride,
                     padding, scale0, finalThreshold, useMeanshiftGrouping);
}

typedef RTTIImpl<HOGDescriptor> HOGRTTI;

CvType hog_type( CV_TYPE_NAME_HOG_DESCRIPTOR, HOGRTTI::isInstance,
                 HOGRTTI::release, HOGRTTI::read, HOGRTTI::write, HOGRTTI::clone);

vector<float> HOGDescriptor::getDefaultPeopleDetector()
{
    static const float detector[] = {
        0.05359386f, -0.14721455f, -0.05532170f, 0.05077307f,
        0.11547081f, -0.04268804f, 0.04635834f, ........
    };
    // Return a vector built from the whole detector array, first to last element
    return vector<float>(detector, detector + sizeof(detector)/sizeof(detector[0]));
}
// This function returns 1981 SVM coeffs obtained from daimler's base.
// To use these coeffs the detection window size should be (48,96)
vector<float> HOGDescriptor::getDaimlerPeopleDetector()
{
    static const float detector[] = {
        0.294350f, -0.098796f, -0.129522f, 0.078753f,
        0.387527f, 0.261529f, 0.145939f, 0.061520f,
        ........
    };
    // Return a vector built from the whole detector array, first to last element
    return vector<float>(detector, detector + sizeof(detector)/sizeof(detector[0]));
}

}
  
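The normalizeBlockHistogram() listing above mixes the IPP and plain C++ branches, which makes the three passes hard to see. The following is a minimal standalone sketch of the non-IPP branch only. The function name l2HysNormalize and the use of std::vector are assumptions made for illustration and are not part of OpenCV; the default clipping threshold of 0.2 is the L2HysThreshold default of HOGDescriptor.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not an OpenCV function): L2-Hys normalization of one
// block histogram, following the non-IPP branch of normalizeBlockHistogram.
void l2HysNormalize(std::vector<float>& hist, float clipThreshold = 0.2f)
{
    assert(!hist.empty());
    const float n = static_cast<float>(hist.size());

    // Pass 1: squared L2 norm of the raw histogram.
    float sum = 0.f;
    for (float v : hist) sum += v * v;
    float scale = 1.f / (std::sqrt(sum) + n * 0.1f);

    // Pass 2: scale, clip at the threshold, accumulate the new squared norm.
    sum = 0.f;
    for (float& v : hist) {
        v = std::min(v * scale, clipThreshold);
        sum += v * v;
    }

    // Pass 3: renormalize the clipped histogram.
    scale = 1.f / (std::sqrt(sum) + 1e-3f);
    for (float& v : hist) v *= scale;
}
```

Clipping before the second normalization limits how much a single very strong gradient can dominate the block, which is the point of the "Hys" step.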
   
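windowsInImage() and getWindow() in the listing above map between a linear window index and a window rectangle. A small sketch of the same arithmetic, using stand-in types instead of cv::Size and cv::Rect (Size2 and Rect2 are hypothetical names introduced here, and the window size is passed explicitly because there is no HOGCache object):

```cpp
#include <cassert>

// Hypothetical minimal types standing in for cv::Size and cv::Rect.
struct Size2 { int width, height; };
struct Rect2 { int x, y, width, height; };

// Number of window positions along each axis (same arithmetic as windowsInImage).
Size2 windowsInImage(Size2 image, Size2 win, Size2 stride)
{
    assert(stride.width > 0 && stride.height > 0);
    return { (image.width  - win.width)  / stride.width  + 1,
             (image.height - win.height) / stride.height + 1 };
}

// Row-major window index -> window rectangle (same arithmetic as getWindow).
Rect2 getWindow(Size2 image, Size2 win, Size2 stride, int idx)
{
    int nwindowsX = (image.width - win.width) / stride.width + 1;
    int y = idx / nwindowsX;       // window row (quotient)
    int x = idx - nwindowsX * y;   // window column (remainder)
    return { x * stride.width, y * stride.height, win.width, win.height };
}
```

For a 128x256 padded image, a 64x128 window and an 8x8 stride this gives 9 x 17 window positions, and index 10 is the second window of the second row, at (8, 8).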
 
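Inside detect(), each window is classified by a linear SVM: the score starts at rho and accumulates the dot product between the window descriptor and the coefficients, block by block. The sketch below shows the same computation on a descriptor that has already been assembled. svmScore is a hypothetical helper, not an OpenCV function, and it accumulates in double, which may differ from the listing in the last few bits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: decision value of a linear SVM for one window descriptor.
// If svm has one more entry than the descriptor, that extra entry is the bias rho,
// mirroring "svmDetector.size() > dsize ? svmDetector[dsize] : 0" in detect().
double svmScore(const std::vector<float>& descriptor, const std::vector<float>& svm)
{
    assert(svm.size() >= descriptor.size());
    const std::size_t dsize = descriptor.size();
    double s = svm.size() > dsize ? svm[dsize] : 0.0;   // start from the bias
    for (std::size_t k = 0; k < dsize; k++)
        s += static_cast<double>(descriptor[k]) * svm[k];
    return s;
}
```

A window counts as a hit when this score is at least hitThreshold, which is 0 by default in the header.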

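The first loop of detectMultiScale() decides which scale factors are searched. As a rough sketch, the same loop can be isolated as follows. pyramidScales is a hypothetical name, and std::lround is used in place of cvRound, which is an assumption: the two may round exact .5 ties differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: the list of scale factors visited by detectMultiScale.
std::vector<double> pyramidScales(int cols, int rows, int winW, int winH,
                                  double scale0, int nlevels = 64)
{
    assert(cols > 0 && rows > 0 && nlevels > 0);
    std::vector<double> levelScale;
    double scale = 1.0;
    int levels = 0;
    for (levels = 0; levels < nlevels; levels++) {
        levelScale.push_back(scale);
        // Stop once the shrunken image can no longer hold one window,
        // or when scale0 <= 1 would never shrink the image.
        if (std::lround(cols / scale) < winW ||
            std::lround(rows / scale) < winH ||
            scale0 <= 1)
            break;
        scale *= scale0;
    }
    // When the loop stopped early, the last pushed scale is dropped,
    // but at least one level (the original image) is always kept.
    levels = std::max(levels, 1);
    levelScale.resize(levels);
    return levelScale;
}
```

With a 128x256 image, a 64x128 window and scale0 = 2.0, the result is the two scales 1 and 2. With scale0 <= 1 only the original scale is searched.
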
The HOG-related part of objdetect.hpp:

 
   
 
//// HOG (Histogram-of-Oriented-Gradients) Descriptor and Object Detector //

struct CV_EXPORTS_W HOGDescriptor
{
public:
    enum { L2Hys=0 };
    enum { DEFAULT_NLEVELS=64 };

    CV_WRAP HOGDescriptor() : winSize(64,128), blockSize(16,16), blockStride(8,8),
        cellSize(8,8), nbins(9), derivAperture(1), winSigma(-1),
        histogramNormType(HOGDescriptor::L2Hys), L2HysThreshold(0.2), gammaCorrection(true),
        nlevels(HOGDescriptor::DEFAULT_NLEVELS)
    {}

    // The constructor arguments are passed to the members through the initializer list
    // after the colon, so each member is initialized directly when it is constructed
    // instead of being default-constructed first and assigned afterwards.
    CV_WRAP HOGDescriptor(Size _winSize, Size _blockSize, Size _blockStride,
                  Size _cellSize, int _nbins, int _derivAperture=1, double _winSigma=-1,
                  int _histogramNormType=HOGDescriptor::L2Hys,
                  double _L2HysThreshold=0.2, bool _gammaCorrection=false,
                  int _nlevels=HOGDescriptor::DEFAULT_NLEVELS)
    : winSize(_winSize), blockSize(_blockSize), blockStride(_blockStride), cellSize(_cellSize),
    nbins(_nbins), derivAperture(_derivAperture), winSigma(_winSigma),
    histogramNormType(_histogramNormType), L2HysThreshold(_L2HysThreshold),
    gammaCorrection(_gammaCorrection), nlevels(_nlevels)
    {}

    // The descriptor can also be initialized by loading a file
    CV_WRAP HOGDescriptor(const String& filename)
    {
        load(filename);
    }

    HOGDescriptor(const HOGDescriptor& d)
    {
        d.copyTo(*this);
    }

    virtual ~HOGDescriptor() {}

    // size_t is an unsigned integer type (unsigned long on many platforms)
    CV_WRAP size_t getDescriptorSize() const;
    CV_WRAP bool checkDetectorSize() const;
    CV_WRAP double getWinSigma() const;

    // virtual functions are dispatched polymorphically when called through a pointer or reference
    CV_WRAP virtual void setSVMDetector(InputArray _svmdetector);

    virtual bool read(FileNode& fn);
    virtual void write(FileStorage& fs, const String& objname) const;

    CV_WRAP virtual bool load(const String& filename, const String& objname=String());
    CV_WRAP virtual void save(const String& filename, const String& objname=String()) const;
    virtual void copyTo(HOGDescriptor& c) const;

    CV_WRAP virtual void compute(const Mat& img,
                         CV_OUT vector<float>& descriptors,
                         Size winStride=Size(), Size padding=Size(),
                         const vector<Point>& locations=vector<Point>()) const;
    //with found weights output
    CV_WRAP virtual void detect(const Mat& img, CV_OUT vector<Point>& foundLocations,
                        CV_OUT vector<double>& weights,
                        double hitThreshold=0, Size winStride=Size(),
                        Size padding=Size(),
                        const vector<Point>& searchLocations=vector<Point>()) const;
    //without found weights output
    virtual void detect(const Mat& img, CV_OUT vector<Point>& foundLocations,
                        double hitThreshold=0, Size winStride=Size(),
                        Size padding=Size(),
                        const vector<Point>& searchLocations=vector<Point>()) const;
    //with result weights output
    CV_WRAP virtual void detectMultiScale(const Mat& img, CV_OUT vector<Rect>& foundLocations,
                                  CV_OUT vector<double>& foundWeights, double hitThreshold=0,
                                  Size winStride=Size(), Size padding=Size(), double scale=1.05,
                                  double finalThreshold=2.0, bool useMeanshiftGrouping = false) const;
    //without found weights output
    virtual void detectMultiScale(const Mat& img, CV_OUT vector<Rect>& foundLocations,
                                  double hitThreshold=0, Size winStride=Size(),
                                  Size padding=Size(), double scale=1.05,
                                  double finalThreshold=2.0, bool useMeanshiftGrouping = false) const;

    CV_WRAP virtual void computeGradient(const Mat& img, CV_OUT Mat& grad, CV_OUT Mat& angleOfs,
                                 Size paddingTL=Size(), Size paddingBR=Size()) const;

    CV_WRAP static vector<float> getDefaultPeopleDetector();
    CV_WRAP static vector<float> getDaimlerPeopleDetector();

    CV_PROP Size winSize;
    CV_PROP Size blockSize;
    CV_PROP Size blockStride;
    CV_PROP Size cellSize;
    CV_PROP int nbins;
    CV_PROP int derivAperture;
    CV_PROP double winSigma;
    CV_PROP int histogramNormType;
    CV_PROP double L2HysThreshold;
    CV_PROP bool gammaCorrection;
    CV_PROP vector<float> svmDetector;
    CV_PROP int nlevels;
};
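
The default constructor above fixes winSize, blockSize, blockStride, cellSize and nbins, and these values determine the descriptor length discussed in section 2. The sketch below recomputes that length from the parameters. hogDescriptorSize is a hypothetical helper written for this article; it follows the arithmetic given in the text and is not the body of getDescriptorSize().

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: descriptor length = bins per cell * cells per block * blocks per window.
std::size_t hogDescriptorSize(int winW, int winH, int blockW, int blockH,
                              int strideW, int strideH, int cellW, int cellH, int nbins)
{
    // The block must be tiled exactly by cells, and the window exactly by block steps.
    assert(blockW % cellW == 0 && blockH % cellH == 0);
    assert((winW - blockW) % strideW == 0 && (winH - blockH) % strideH == 0);
    std::size_t cellsPerBlock =
        static_cast<std::size_t>((blockW / cellW) * (blockH / cellH));
    std::size_t blocksPerWindow =
        static_cast<std::size_t>(((winW - blockW) / strideW + 1) *
                                 ((winH - blockH) / strideH + 1));
    return static_cast<std::size_t>(nbins) * cellsPerBlock * blocksPerWindow;
}
```

With the default values (64x128 window, 16x16 block, 8x8 stride, 8x8 cell, 9 bins) this gives 3780. For a 48x96 window with the same block, stride, cell and bin settings (an assumption about the Daimler detector) it gives 1980, which is consistent with the 1981 coefficients mentioned in the listing if the last one is the bias rho.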