转自:http://www.cnblogs.com/ryanlaw/archive/2012/02/05/2339250.html
HOG descriptors 是应用在计算机视觉和图像处理领域,用于目标检测的特征描述器。这项技术是用来计算局部图像梯度的方向信息的统计值。这种方法跟边缘方向直方图(edge orientation histograms)、尺度不变特征变换(scale-invariant feature transform descriptors) 以及形状上下文方法( shape contexts)有很多相似之处,但与它们的不同点是:HOG描述器是在一个网格密集的大小统一的细胞单元(dense grid of uniformly spaced cells)上计算,而且为了提高性能,还采用了重叠的局部对比度归一化(overlapping local contrast normalization)技术。
这篇文章的作者Navneet Dalal和Bill Triggs是法国国家计算机技术和控制研究所French National Institute for Research in Computer Science and Control (INRIA)的研究员。他们在这篇文章中首次提出了HOG方法。这篇文章被发表在2005年的CVPR上。他们主要是将这种方法应用在静态图像中的行人 检测上,但在后来,他们也将其应用在电影和视频中的行人检测,以及静态图像中的车辆和常见动物的检测。
HOG描述器最重要的思想是:在一副 图像中,局部目标的表象和形状(appearance and shape)能够被梯度或边缘的方向密度分布很好地描述。具体的实现方法是:首先将图像分成小的连通区域,我们把它叫细胞单元。然后采集细胞单元中各像素 点的梯度的或边缘的方向直方图。最后把这些直方图组合起来就可以构成特征描述器。为了提高性能,我们还可以把这些局部直方图在图像的更大的范围内(我们把 它叫区间或block)进行对比度归一化(contrast-normalized),所采用的方法是:先计算各直方图在这个区间(block)中的密 度,然后根据这个密度对区间中的各个细胞单元做归一化。通过这个归一化后,能对光照变化和阴影获得更好的效果。
与其他的特征描述方法相 比,HOG描述器后很多优点。首先,由于HOG方法是在图像的局部细胞单元上操作,所以它对图像几何的(geometric)和光学的 (photometric)形变都能保持很好的不变性,这两种形变只会出现在更大的空间领域上。其次,作者通过实验发现,在粗的空域抽样(coarse spatial sampling)、精细的方向抽样(fine orientation sampling)以及较强的局部光学归一化(strong local photometric normalization)等条件下,只要行人大体上能够保持直立的姿势,就容许行人有一些细微的肢体动作,这些细微的动作可以被忽略而不影响检测效 果。综上所述,HOG方法是特别适合于做图像中的行人检测的。
上图是作者做的行人检测试验,其中(a)表示所有训练图像集 的平均梯度(average gradient across their training images);(b)和(c)分别表示:图像中每一个区间(block)上的最大最大正、负SVM权值;(d)表示一副测试图像;(e)计算完R- HOG后的测试图像;(f)和(g)分别表示被正、负SVM权值加权后的R-HOG图像。
算法的实现:
色彩和伽马归一化 (color and gamma normalization)
作者分别在灰度空间、RGB色彩空间和LAB色彩空间上对图像进行色彩和 伽马归一化,但实验结果显示,这个归一化的预处理工作对最后的结果没有影响,原因可能是:在后续步骤中也有归一化的过程,那些过程可以取代这个预处理的归 一化。所以,在实际应用中,这一步可以省略。
梯度的计算(Gradient computation)
最常用的方法是:简单 地使用一个一维的离散微分模板(1-D centered point discrete derivative mask)在一个方向上或者同时在水平和垂直两个方向上对图像进行处理,更确切地说,这个方法需要使用下面的滤波器核滤除图像中的色彩或变化剧烈的数据 (color or intensity data)
作者也尝试了其他一些更复杂的模板,如3×3 Sobel 模板,或对角线模板(diagonal masks),但是在这个行人检测的实验中,这些复杂模板的表现都较差,所以作者的结论是:模板越简单,效果反而越好。作者也尝试了在使用微分模板前加入 一个高斯平滑滤波,但是这个高斯平滑滤波的加入使得检测效果更差,原因是:许多有用的图像信息是来自变化剧烈的边缘,而在计算梯度之前加入高斯滤波会把这 些边缘滤除掉。
构建方向的直方图(creating the orientation histograms)
第三步就是为 图像的每个细胞单元构建梯度方向直方图。细胞单元中的每一个像素点都为某个基于方向的直方图通道(orientation-based histogram channel)投票。投票是采取加权投票(weighted voting)的方式,即每一票都是带权值的,这个权值是根据该像素点的梯度幅度计算出来。可以采用幅值本身或者它的函数来表示这个权值,实际测试表明: 使用幅值来表示权值能获得最佳的效果,当然,也可以选择幅值的函数来表示,比如幅值的平方根(square root)、幅值的平方(square of the gradient magnitude)、幅值的截断形式(clipped version of the magnitude)等。细胞单元可以是矩形的(rectangular),也可以是星形的(radial)。直方图通道是平均分布在0-1800(无 向)或0-3600(有向)范围内。作者发现,采用无向的梯度和9个直方图通道,能在行人检测试验中取得最佳的效果。
把细胞单元组 合成大的区间(grouping the cells together into larger blocks)
由于局部光照的变化 (variations of illumination)以及前景-背景对比度(foreground-background contrast)的变化,使得梯度强度(gradient strengths)的变化范围非常大。这就需要对梯度强度做归一化,作者采取的办法是:把各个细胞单元组合成大的、空间上连通的区间(blocks)。 这样以来,HOG描述器就变成了由各区间所有细胞单元的直方图成分所组成的一个向量。这些区间是互有重叠的,这就意味着:每一个细胞单元的输出都多次作用 于最终的描述器。区间有两个主要的几何形状——矩形区间(R-HOG)和环形区间(C-HOG)。R-HOG区间大体上是一些方形的格子,它可以有三个参 数来表征:每个区间中细胞单元的数目、每个细胞单元中像素点的数目、每个细胞的直方图通道数目。作者通过实验表明,行人检测的最佳参数设置是:3×3细胞 /区间、6×6像素/细胞、9个直方图通道。作者还发现,在对直方图做处理之前,给每个区间(block)加一个高斯空域窗口(Gaussian spatial window)是非常必要的,因为这样可以降低边缘的周围像素点(pixels around the edge)的权重。
R- HOG跟SIFT描述器看起来很相似,但他们的不同之处是:R-HOG是在单一尺度下、密集的网格内、没有对方向排序的情况下被计算出来(are computed in dense grids at some single scale without orientation alignment);而SIFT描述器是在多尺度下、稀疏的图像关键点上、对方向排序的情况下被计算出来(are computed at sparse scale-invariant key image points and are rotated to align orientation)。补充一点,R-HOG是各区间被组合起来用于对空域信息进行编码(are used in conjunction to encode spatial form information),而SIFT的各描述器是单独使用的(are used singly)。
C- HOG区间(blocks)有两种不同的形式,它们的区别在于:一个的中心细胞是完整的,一个的中心细胞是被分割的。如右图所示:
作者发现 C-HOG的这两种形式都能取得相同的效果。C-HOG区间(blocks)可以用四个参数来表征:角度盒子的个数(number of angular bins)、半径盒子个数(number of radial bins)、中心盒子的半径(radius of the center bin)、半径的伸展因子(expansion factor for the radius)。通过实验,对于行人检测,最佳的参数设置为:4个角度盒子、2个半径盒子、中心盒子半径为4个像素、伸展因子为2。前面提到过,对于R- HOG,中间加一个高斯空域窗口是非常有必要的,但对于C-HOG,这显得没有必要。C-HOG看起来很像基于形状上下文(Shape Contexts)的方法,但不同之处是:C-HOG的区间中包含的细胞单元有多个方向通道(orientation channels),而基于形状上下文的方法仅仅只用到了一个单一的边缘存在数(edge presence count)。
区间归一化 (Block normalization)
作者采用了四中不同的方法对区间进行归一化,并对结果进行了比较。引入v表示一个还没有被归一 化的向量,它包含了给定区间(block)的所有直方图信息。| | vk | |表示v的k阶范数,这里的k去1、2。用e表示一个很小的常数。这时,归一化因子可以表示如下:
L2-norm:
L1-norm:
L1-sqrt:
还 有第四种归一化方式:L2-Hys,它可以通过先进行L2-norm,对结果进行截短(clipping),然后再重新归一化得到。作者发现:采用L2- Hys L2-norm 和 L1-sqrt方式所取得的效果是一样的,L1-norm稍微表现出一点点不可靠性。但是对于没有被归一化的数据来说,这四种方法都表现出来显着的改进。
SVM 分类器(SVM classifier)
最后一步就是把提取的HOG特征输入到SVM分类器中,寻找一个最优超平面作为决策函数。作者采用 的方法是:使用免费的SVMLight软件包加上HOG分类器来寻找测试图像中的行人。
代码及代码解释来自 http://hi.baidu.com/nokltkmtsfbnsyq/item/f4b73d06f066cd193a53eec3
- I=imread('05598.jpg');
- c=imresize(I,[64 64],'bilinear');
- f=hogcalculator(c,8,8,2,2,9,0.5,'localinterpolate','unsigned','l2hys');
I=imread('05598.jpg');
c=imresize(I,[64 64],'bilinear');
f=hogcalculator(c,8,8,2,2,9,0.5,'localinterpolate','unsigned','l2hys');
函数举例,将读取图像变成64*64,然后block=(8*2)*(8*2)=16*16 ,cell=8*8 bins=9
- function F = hogcalculator(img, cellpw, cellph, nblockw, nblockh,...
- nthet, overlap, isglobalinterpolate, issigned, normmethod)
- % HOGCALCULATOR calculate R-HOG feature vector of an input image using the
- % procedure presented in Dalal and Triggs's paper in CVPR 2005.
- %
- % Author: timeHandle
- % Time: March 24, 2010
- % May 12,2010 update.
- %
- % this copy of code is written for my personal interest, which is an
- % original and inornate realization of [Dalal CVPR2005]'s algorithm
- % without any optimization. I just want to check whether I understand
- % the algorithm really or not, and also do some practices for knowing
- % matlab programming more well because I could be called as 'novice'.
- % OpenCV 2.0 has realized Dalal's HOG algorithm which runs faster
- % than mine without any doubt, ╮(╯▽╰)╭ . Ronan pointed a error in
- % the code,thanks for his correction. Note that at the end of this
- % code, there are some demonstration code,please remove in your work.
- %
- % F = hogcalculator(img, cellpw, cellph, nblockw, nblockh,
- % nthet, overlap, isglobalinterpolate, issigned, normmethod)
- %
- % IMG:
- % IMG is the input image.
- %
- % CELLPW, CELLPH:
- % CELLPW and CELLPH are cell's pixel width and height respectively.
- %
- % NBLOCKW, NBLCOKH:
- % NBLOCKW and NBLCOKH are block size counted by cells number in x and
- % y directions respectively.
- %
- % NTHET, ISSIGNED:
- % NTHET is the number of the bins of the histogram of oriented
- % gradient. The histogram of oriented gradient ranges from 0 to pi in
- % 'unsigned' condition while to 2*pi in 'signed' condition, which can
- % be specified through setting the value of the variable ISSIGNED by
- % the string 'unsigned' or 'signed'.
- %
- % OVERLAP:
- % OVERLAP is the overlap proportion of two neighboring block.
- %
- % ISGLOBALINTERPOLATE:
- % ISGLOBALINTERPOLATE specifies whether the trilinear interpolation(三线性插值)
- % is done in a single global 3d histogram of the whole detecting
- % window by the string 'globalinterpolate' or in each local 3d
- % histogram corresponding to respective blocks by the string
- % 'localinterpolate' which is in strict accordance with the procedure
- % proposed in Dalal's paper. Interpolating in the whole detecting
- % window requires the block's sliding step to be an integral multiple
- % of cell's width and height because the histogram is fixing before
- % interpolate. In fact here the so called 'global interpolation' is
- % a notation given by myself. at first the spatial interpolation is
- % done without any relevant to block's slide position, but when I was
- % doing calculation while OVERLAP is 0.75, something occurred and
- % confused me o__O"… . This let me find that the operation I firstly
- % did is different from which mentioned in Dalal's paper. But this
- % does not mean it is incorrect ^◎^, so I reserve this. As for name,
- % besides 'global interpolate', any others would be all ok, like 'Lady GaGa'
- % or what else, :-).
- %
- % NORMMETHOD:
- % NORMMETHOD is the block histogram normalized method which can be
- % set as one of the following strings:
- % 'none', which means non-normalization;
- % 'l1', which means L1-norm normalization;
- % 'l2', which means L2-norm normalization;
- % 'l1sqrt', which means L1-sqrt-norm normalization;
- % 'l2hys', which means L2-hys-norm normalization.
- % F:
- % F is a row vector storing the final histogram of all of the blocks
- % one by one in a top-left to bottom-right image scan manner, the
- % cells histogram are stored in the same manner in each block's
- % section of F.
- %
- % Note that CELLPW*NBLOCKW and CELLPH*NBLOCKH should be equal to IMG's
- % width and height respectively.
- %
- % Here is a demonstration, which all of parameters are set as the
- % best value mentioned in Dalal's paper when the window detected is 128*64
- % size(128 rows, 64 columns):
- % F = hogcalculator(window, 8, 8, 2, 2, 9, 0.5,
- % 'localinterpolate', 'unsigned', 'l2hys');
- % Also the function can be called like:
- % F = hogcalculator(window);
- % the other parameters are all set by using the above-mentioned "dalal's
- % best value" as default.
- %
- if nargin < 2
- % set default parameters value.
- cellpw = 2;
- cellph = 2;
- nblockw = 2;
- nblockh = 2;
- nthet = 9;
- overlap = 0.5;
- isglobalinterpolate = 'localinterpolate';
- issigned = 'unsigned';
- normmethod = 'l2hys';
- else
- if nargin < 10
- error('Input parameters are not enough.');
- end
- end
- % check parameters's validity.
- [M, N, K] = size(img);
- if mod(M,cellph*nblockh) ~= 0
- error('IMG''s height should be an integral multiple of CELLPH*NBLOCKH.');
- end
- if mod(N,cellpw*nblockw) ~= 0
- error('IMG''s width should be an integral multiple of CELLPW*NBLOCKW.');
- end
- if mod((1-overlap)*cellpw*nblockw, cellpw) ~= 0 ||...
- mod((1-overlap)*cellph*nblockh, cellph) ~= 0
- str1 = 'Incorrect OVERLAP or ISGLOBALINTERPOLATE parameter';
- str2 = ', slide step should be an intergral multiple of cell size';
- error([str1, str2]);
- end
- % set the standard deviation of gaussian spatial weight window.
- delta = cellpw*nblockw * 0.5;
- % calculate gradient scale matrix.
- hx = [-1,0,1];
- hy = -hx';
- gradscalx = imfilter(double(img),hx);
- gradscaly = imfilter(double(img),hy);
- if K > 1
- gradscalx = max(max(gradscalx(:,:,1),gradscalx(:,:,2)), gradscalx(:,:,3));
- gradscaly = max(max(gradscaly(:,:,1),gradscaly(:,:,2)), gradscaly(:,:,3));
- end
- gradscal = sqrt(double(gradscalx.*gradscalx + gradscaly.*gradscaly));
- % calculate gradient orientation matrix.
- % plus small number for avoiding dividing zero.
- gradscalxplus = gradscalx+ones(size(gradscalx))*0.0001;
- gradorient = zeros(M,N);
- % unsigned situation: orientation region is 0 to pi.
- if strcmp(issigned, 'unsigned') == 1
- gradorient =...
- atan(gradscaly./gradscalxplus) + pi/2;
- or = 1;
- else
- % signed situation: orientation region is 0 to 2*pi.
- if strcmp(issigned, 'signed') == 1
- idx = find(gradscalx >= 0 & gradscaly >= 0);
- gradorient(idx) = atan(gradscaly(idx)./gradscalxplus(idx));
- idx = find(gradscalx < 0);
- gradorient(idx) = atan(gradscaly(idx)./gradscalxplus(idx)) + pi;
- idx = find(gradscalx >= 0 & gradscaly < 0);
- gradorient(idx) = atan(gradscaly(idx)./gradscalxplus(idx)) + 2*pi;
- or = 2;
- else
- error('Incorrect ISSIGNED parameter.');
- end
- end
- % calculate block slide step.
- xbstride = cellpw*nblockw*(1-overlap);
- ybstride = cellph*nblockh*(1-overlap);
- xbstridend = N - cellpw*nblockw + 1;
- ybstridend = M - cellph*nblockh + 1;
- % calculate the total blocks number in the window detected, which is
- % ntotalbh*ntotalbw.
- ntotalbh = ((M-cellph*nblockh)/ybstride)+1;
- ntotalbw = ((N-cellpw*nblockw)/xbstride)+1;
- % generate the matrix hist3dbig for storing the 3-dimensions histogram. the
- % matrix covers the whole image in the 'globalinterpolate' condition or
- % covers the local block in the 'localinterpolate' condition. The matrix is
- % bigger than the area where it covers by adding additional elements
- % (corresponding to the cells) to the surround for calculation convenience.
- if strcmp(isglobalinterpolate, 'globalinterpolate') == 1
- ncellx = N / cellpw;
- ncelly = M / cellph;
- hist3dbig = zeros(ncelly+2, ncellx+2, nthet+2);
- F = zeros(1, (M/cellph-1)*(N/cellpw-1)*nblockw*nblockh*nthet);
- glbalinter = 1;
- else
- if strcmp(isglobalinterpolate, 'localinterpolate') == 1
- hist3dbig = zeros(nblockh+2, nblockw+2, nthet+2);
- F = zeros(1, ntotalbh*ntotalbw*nblockw*nblockh*nthet);
- glbalinter = 0;
- else
- error('Incorrect ISGLOBALINTERPOLATE parameter.')
- end
- end
- % generate the matrix for storing histogram of one block;
- sF = zeros(1, nblockw*nblockh*nthet);
- % generate gaussian spatial weight.
- [gaussx, gaussy] = meshgrid(0:(cellpw*nblockw-1), 0:(cellph*nblockh-1));
- weight = exp(-((gaussx-(cellpw*nblockw-1)/2)...
- .*(gaussx-(cellpw*nblockw-1)/2)+(gaussy-(cellph*nblockh-1)/2)...
- .*(gaussy-(cellph*nblockh-1)/2))/(delta*delta));
- % vote for histogram. there are two situations according to the interpolate
- % condition('global' interpolate or local interpolate). The hist3d which is
- % generated from the 'bigger' matrix hist3dbig is the final histogram.
- if glbalinter == 1
- xbstep = nblockw*cellpw;
- ybstep = nblockh*cellph;
- else
- xbstep = xbstride;
- ybstep = ybstride;
- end
- % block slide loop
- for btly = 1:ybstep:ybstridend
- for btlx = 1:xbstep:xbstridend
- for bi = 1:(cellph*nblockh)
- for bj = 1:(cellpw*nblockw)
- i = btly + bi - 1;
- j = btlx + bj - 1;
- gaussweight = weight(bi,bj);
- gs = gradscal(i,j);
- go = gradorient(i,j);
- if glbalinter == 1
- jorbj = j;
- iorbi = i;
- else
- jorbj = bj;
- iorbi = bi;
- end
- % calculate bin index of hist3dbig
- binx1 = floor((jorbj-1+cellpw/2)/cellpw) + 1;
- biny1 = floor((iorbi-1+cellph/2)/cellph) + 1;
- binz1 = floor((go+(or*pi/nthet)/2)/(or*pi/nthet)) + 1;
- if gs == 0
- continue;
- end
- binx2 = binx1 + 1;
- biny2 = biny1 + 1;
- binz2 = binz1 + 1;
- x1 = (binx1-1.5)*cellpw + 0.5;
- y1 = (biny1-1.5)*cellph + 0.5;
- z1 = (binz1-1.5)*(or*pi/nthet);
- % trilinear interpolation.
- hist3dbig(biny1,binx1,binz1) =...
- hist3dbig(biny1,binx1,binz1) + gs*gaussweight...
- * (1-(jorbj-x1)/cellpw)*(1-(iorbi-y1)/cellph)...
- *(1-(go-z1)/(or*pi/nthet));
- hist3dbig(biny1,binx1,binz2) =...
- hist3dbig(biny1,binx1,binz2) + gs*gaussweight...
- * (1-(jorbj-x1)/cellpw)*(1-(iorbi-y1)/cellph)...
- *((go-z1)/(or*pi/nthet));
- hist3dbig(biny2,binx1,binz1) =...
- hist3dbig(biny2,binx1,binz1) + gs*gaussweight...
- * (1-(jorbj-x1)/cellpw)*((iorbi-y1)/cellph)...
- *(1-(go-z1)/(or*pi/nthet));
- hist3dbig(biny2,binx1,binz2) =...
- hist3dbig(biny2,binx1,binz2) + gs*gaussweight...
- * (1-(jorbj-x1)/cellpw)*((iorbi-y1)/cellph)...
- *((go-z1)/(or*pi/nthet));
- hist3dbig(biny1,binx2,binz1) =...
- hist3dbig(biny1,binx2,binz1) + gs*gaussweight...
- * ((jorbj-x1)/cellpw)*(1-(iorbi-y1)/cellph)...
- *(1-(go-z1)/(or*pi/nthet));
- hist3dbig(biny1,binx2,binz2) =...
- hist3dbig(biny1,binx2,binz2) + gs*gaussweight...
- * ((jorbj-x1)/cellpw)*(1-(iorbi-y1)/cellph)...
- *((go-z1)/(or*pi/nthet));
- hist3dbig(biny2,binx2,binz1) =...
- hist3dbig(biny2,binx2,binz1) + gs*gaussweight...
- * ((jorbj-x1)/cellpw)*((iorbi-y1)/cellph)...
- *(1-(go-z1)/(or*pi/nthet));
- hist3dbig(biny2,binx2,binz2) =...
- hist3dbig(biny2,binx2,binz2) + gs*gaussweight...
- * ((jorbj-x1)/cellpw)*((iorbi-y1)/cellph)...
- *((go-z1)/(or*pi/nthet));
- end
- end
- % In the local interpolate condition. F is generated in this block
- % slide loop. hist3dbig should be cleared in each loop.
- if glbalinter == 0
- if or == 2
- hist3dbig(:,:,2) = hist3dbig(:,:,2)...
- + hist3dbig(:,:,nthet+2);
- hist3dbig(:,:,(nthet+1)) =...
- hist3dbig(:,:,(nthet+1)) + hist3dbig(:,:,1);
- end
- hist3d = hist3dbig(2:(nblockh+1), 2:(nblockw+1), 2:(nthet+1));
- for ibin = 1:nblockh
- for jbin = 1:nblockw
- idsF = nthet*((ibin-1)*nblockw+jbin-1)+1;
- idsF = idsF:(idsF+nthet-1);
- sF(idsF) = hist3d(ibin,jbin,:);
- end
- end
- iblock = ((btly-1)/ybstride)*ntotalbw +...
- ((btlx-1)/xbstride) + 1;
- idF = (iblock-1)*nblockw*nblockh*nthet+1;
- idF = idF:(idF+nblockw*nblockh*nthet-1);
- F(idF) = sF;
- hist3dbig(:,:,:) = 0;
- end
- end
- end
- % In the global interpolate condition. F is generated here outside the
- % block slide loop
- if glbalinter == 1
- ncellx = N / cellpw;
- ncelly = M / cellph;
- if or == 2
- hist3dbig(:,:,2) = hist3dbig(:,:,2) + hist3dbig(:,:,nthet+2);
- hist3dbig(:,:,(nthet+1)) = hist3dbig(:,:,(nthet+1)) + hist3dbig(:,:,1);
- end
- hist3d = hist3dbig(2:(ncelly+1), 2:(ncellx+1), 2:(nthet+1));
- iblock = 1;
- for btly = 1:ybstride:ybstridend
- for btlx = 1:xbstride:xbstridend
- binidx = floor((btlx-1)/cellpw)+1;
- binidy = floor((btly-1)/cellph)+1;
- idF = (iblock-1)*nblockw*nblockh*nthet+1;
- idF = idF:(idF+nblockw*nblockh*nthet-1);
- for ibin = 1:nblockh
- for jbin = 1:nblockw
- idsF = nthet*((ibin-1)*nblockw+jbin-1)+1;
- idsF = idsF:(idsF+nthet-1);
- sF(idsF) = hist3d(binidy+ibin-1, binidx+jbin-1, :);
- end
- end
- F(idF) = sF;
- iblock = iblock + 1;
- end
- end
- end
- % adjust the negative value caused by accuracy of floating-point
- % operations.these value's scale is very small, usually at E-03 magnitude
- % while others will be E+02 or E+03 before normalization.
- F(F<0) = 0;
- % block normalization.
- e = 0.001;
- l2hysthreshold = 0.2;
- fslidestep = nblockw*nblockh*nthet;
- switch normmethod
- case 'none'
- case 'l1'
- for fi = 1:fslidestep:size(F,2)
- div = sum(F(fi:(fi+fslidestep-1)));
- F(fi:(fi+fslidestep-1)) = F(fi:(fi+fslidestep-1))/(div+e);
- end
- case 'l1sqrt'
- for fi = 1:fslidestep:size(F,2)
- div = sum(F(fi:(fi+fslidestep-1)));
- F(fi:(fi+fslidestep-1)) = sqrt(F(fi:(fi+fslidestep-1))/(div+e));
- end
- case 'l2'
- for fi = 1:fslidestep:size(F,2)
- sF = F(fi:(fi+fslidestep-1)).*F(fi:(fi+fslidestep-1));
- div = sqrt(sum(sF)+e*e);
- F(fi:(fi+fslidestep-1)) = F(fi:(fi+fslidestep-1))/div;
- end
- case 'l2hys'
- for fi = 1:fslidestep:size(F,2)
- sF = F(fi:(fi+fslidestep-1)).*F(fi:(fi+fslidestep-1));
- div = sqrt(sum(sF)+e*e);
- sF = F(fi:(fi+fslidestep-1))/div;
- sF(sF>l2hysthreshold) = l2hysthreshold;
- div = sqrt(sum(sF.*sF)+e*e);
- F(fi:(fi+fslidestep-1)) = sF/div;
- end
- otherwise
- error('Incorrect NORMMETHOD parameter.');
- end
- % the following code, which can be removed because of having no
- % contributions to HOG feature calculation, are just for result
- % demonstration when the trilinear interpolation is 'global' for this
- % condition is easier to give a simple and intuitive illustration. so in
- % 'local' condition it will produce error.
- % figure;
- % hold on;
- % axis equal;
- % xlim([0, N]);
- % ylim([0, M]);
- % for u = 1:(M/cellph)
- % for v = 1:(N/cellpw)
- % cx = (v-1)*cellpw + cellpw/2 + 0.5;
- % cy = (u-1)*cellph + cellph/2 + 0.5;
- % hist3d(u,v,:)=0.9*min(cellpw,cellph)*hist3d(u,v,:)/max(hist3d(u,v,:));
- % for t = 1:nthet
- % s = hist3d(u,v,t);
- % thet = (t-1)*pi/nthet + pi*0.5/nthet;
- % x1 = cx - s*0.5*cos(thet);
- % x2 = cx + s*0.5*cos(thet);
- % y1 = cy - s*0.5*sin(thet);
- % y2 = cy + s*0.5*sin(thet);
- % plot([x1,x2],[M-y1+1,M-y2+1]);
- % end
- % end
- % end
function F = hogcalculator(img, cellpw, cellph, nblockw, nblockh,...
nthet, overlap, isglobalinterpolate, issigned, normmethod)
% HOGCALCULATOR calculate R-HOG feature vector of an input image using the
% procedure presented in Dalal and Triggs's paper in CVPR 2005.
%
% Author: timeHandle
% Time: March 24, 2010
% May 12,2010 update.
%
% this copy of code is written for my personal interest, which is an
% original and inornate realization of [Dalal CVPR2005]'s algorithm
% without any optimization. I just want to check whether I understand
% the algorithm really or not, and also do some practices for knowing
% matlab programming more well because I could be called as 'novice'.
% OpenCV 2.0 has realized Dalal's HOG algorithm which runs faster
% than mine without any doubt, ╮(╯▽╰)╭ . Ronan pointed a error in
% the code,thanks for his correction. Note that at the end of this
% code, there are some demonstration code,please remove in your work.
%
% F = hogcalculator(img, cellpw, cellph, nblockw, nblockh,
% nthet, overlap, isglobalinterpolate, issigned, normmethod)
%
% IMG:
% IMG is the input image.
%
% CELLPW, CELLPH:
% CELLPW and CELLPH are cell's pixel width and height respectively.
%
% NBLOCKW, NBLCOKH:
% NBLOCKW and NBLCOKH are block size counted by cells number in x and
% y directions respectively.
%
% NTHET, ISSIGNED:
% NTHET is the number of the bins of the histogram of oriented
% gradient. The histogram of oriented gradient ranges from 0 to pi in
% 'unsigned' condition while to 2*pi in 'signed' condition, which can
% be specified through setting the value of the variable ISSIGNED by
% the string 'unsigned' or 'signed'.
%
% OVERLAP:
% OVERLAP is the overlap proportion of two neighboring block.
%
% ISGLOBALINTERPOLATE:
% ISGLOBALINTERPOLATE specifies whether the trilinear interpolation(三线性插值)
% is done in a single global 3d histogram of the whole detecting
% window by the string 'globalinterpolate' or in each local 3d
% histogram corresponding to respective blocks by the string
% 'localinterpolate' which is in strict accordance with the procedure
% proposed in Dalal's paper. Interpolating in the whole detecting
% window requires the block's sliding step to be an integral multiple
% of cell's width and height because the histogram is fixing before
% interpolate. In fact here the so called 'global interpolation' is
% a notation given by myself. at first the spatial interpolation is
% done without any relevant to block's slide position, but when I was
% doing calculation while OVERLAP is 0.75, something occurred and
% confused me o__O"… . This let me find that the operation I firstly
% did is different from which mentioned in Dalal's paper. But this
% does not mean it is incorrect ^◎^, so I reserve this. As for name,
% besides 'global interpolate', any others would be all ok, like 'Lady GaGa'
% or what else, :-).
%
% NORMMETHOD:
% NORMMETHOD is the block histogram normalized method which can be
% set as one of the following strings:
% 'none', which means non-normalization;
% 'l1', which means L1-norm normalization;
% 'l2', which means L2-norm normalization;
% 'l1sqrt', which means L1-sqrt-norm normalization;
% 'l2hys', which means L2-hys-norm normalization.
% F:
% F is a row vector storing the final histogram of all of the blocks
% one by one in a top-left to bottom-right image scan manner, the
% cells histogram are stored in the same manner in each block's
% section of F.
%
% Note that CELLPW*NBLOCKW and CELLPH*NBLOCKH should be equal to IMG's
% width and height respectively.
%
% Here is a demonstration, which all of parameters are set as the
% best value mentioned in Dalal's paper when the window detected is 128*64
% size(128 rows, 64 columns):
% F = hogcalculator(window, 8, 8, 2, 2, 9, 0.5,
% 'localinterpolate', 'unsigned', 'l2hys');
% Also the function can be called like:
% F = hogcalculator(window);
% the other parameters are all set by using the above-mentioned "dalal's
% best value" as default.
%
if nargin < 2
% set default parameters value.
cellpw = 2;
cellph = 2;
nblockw = 2;
nblockh = 2;
nthet = 9;
overlap = 0.5;
isglobalinterpolate = 'localinterpolate';
issigned = 'unsigned';
normmethod = 'l2hys';
else
if nargin < 10
error('Input parameters are not enough.');
end
end
% check parameters's validity.
[M, N, K] = size(img);
if mod(M,cellph*nblockh) ~= 0
error('IMG''s height should be an integral multiple of CELLPH*NBLOCKH.');
end
if mod(N,cellpw*nblockw) ~= 0
error('IMG''s width should be an integral multiple of CELLPW*NBLOCKW.');
end
if mod((1-overlap)*cellpw*nblockw, cellpw) ~= 0 ||...
mod((1-overlap)*cellph*nblockh, cellph) ~= 0
str1 = 'Incorrect OVERLAP or ISGLOBALINTERPOLATE parameter';
str2 = ', slide step should be an intergral multiple of cell size';
error([str1, str2]);
end
% set the standard deviation of gaussian spatial weight window.
delta = cellpw*nblockw * 0.5;
% calculate gradient scale matrix.
hx = [-1,0,1];
hy = -hx';
gradscalx = imfilter(double(img),hx);
gradscaly = imfilter(double(img),hy);
if K > 1
gradscalx = max(max(gradscalx(:,:,1),gradscalx(:,:,2)), gradscalx(:,:,3));
gradscaly = max(max(gradscaly(:,:,1),gradscaly(:,:,2)), gradscaly(:,:,3));
end
gradscal = sqrt(double(gradscalx.*gradscalx + gradscaly.*gradscaly));
% calculate gradient orientation matrix.
% plus small number for avoiding dividing zero.
gradscalxplus = gradscalx+ones(size(gradscalx))*0.0001;
gradorient = zeros(M,N);
% unsigned situation: orientation region is 0 to pi.
if strcmp(issigned, 'unsigned') == 1
gradorient =...
atan(gradscaly./gradscalxplus) + pi/2;
or = 1;
else
% signed situation: orientation region is 0 to 2*pi.
if strcmp(issigned, 'signed') == 1
idx = find(gradscalx >= 0 & gradscaly >= 0);
gradorient(idx) = atan(gradscaly(idx)./gradscalxplus(idx));
idx = find(gradscalx < 0);
gradorient(idx) = atan(gradscaly(idx)./gradscalxplus(idx)) + pi;
idx = find(gradscalx >= 0 & gradscaly < 0);
gradorient(idx) = atan(gradscaly(idx)./gradscalxplus(idx)) + 2*pi;
or = 2;
else
error('Incorrect ISSIGNED parameter.');
end
end
% calculate block slide step.
xbstride = cellpw*nblockw*(1-overlap);
ybstride = cellph*nblockh*(1-overlap);
xbstridend = N - cellpw*nblockw + 1;
ybstridend = M - cellph*nblockh + 1;
% calculate the total blocks number in the window detected, which is
% ntotalbh*ntotalbw.
ntotalbh = ((M-cellph*nblockh)/ybstride)+1;
ntotalbw = ((N-cellpw*nblockw)/xbstride)+1;
% generate the matrix hist3dbig for storing the 3-dimensions histogram. the
% matrix covers the whole image in the 'globalinterpolate' condition or
% covers the local block in the 'localinterpolate' condition. The matrix is
% bigger than the area where it covers by adding additional elements
% (corresponding to the cells) to the surround for calculation convenience.
if strcmp(isglobalinterpolate, 'globalinterpolate') == 1
ncellx = N / cellpw;
ncelly = M / cellph;
hist3dbig = zeros(ncelly+2, ncellx+2, nthet+2);
F = zeros(1, (M/cellph-1)*(N/cellpw-1)*nblockw*nblockh*nthet);
glbalinter = 1;
else
if strcmp(isglobalinterpolate, 'localinterpolate') == 1
hist3dbig = zeros(nblockh+2, nblockw+2, nthet+2);
F = zeros(1, ntotalbh*ntotalbw*nblockw*nblockh*nthet);
glbalinter = 0;
else
error('Incorrect ISGLOBALINTERPOLATE parameter.')
end
end
% generate the matrix for storing histogram of one block;
sF = zeros(1, nblockw*nblockh*nthet);
% generate gaussian spatial weight.
[gaussx, gaussy] = meshgrid(0:(cellpw*nblockw-1), 0:(cellph*nblockh-1));
weight = exp(-((gaussx-(cellpw*nblockw-1)/2)...
.*(gaussx-(cellpw*nblockw-1)/2)+(gaussy-(cellph*nblockh-1)/2)...
.*(gaussy-(cellph*nblockh-1)/2))/(delta*delta));
% vote for histogram. there are two situations according to the interpolate
% condition('global' interpolate or local interpolate). The hist3d which is
% generated from the 'bigger' matrix hist3dbig is the final histogram.
if glbalinter == 1
xbstep = nblockw*cellpw;
ybstep = nblockh*cellph;
else
xbstep = xbstride;
ybstep = ybstride;
end
% block slide loop
for btly = 1:ybstep:ybstridend
for btlx = 1:xbstep:xbstridend
for bi = 1:(cellph*nblockh)
for bj = 1:(cellpw*nblockw)
i = btly + bi - 1;
j = btlx + bj - 1;
gaussweight = weight(bi,bj);
gs = gradscal(i,j);
go = gradorient(i,j);
if glbalinter == 1
jorbj = j;
iorbi = i;
else
jorbj = bj;
iorbi = bi;
end
% calculate bin index of hist3dbig
binx1 = floor((jorbj-1+cellpw/2)/cellpw) + 1;
biny1 = floor((iorbi-1+cellph/2)/cellph) + 1;
binz1 = floor((go+(or*pi/nthet)/2)/(or*pi/nthet)) + 1;
if gs == 0
continue;
end
binx2 = binx1 + 1;
biny2 = biny1 + 1;
binz2 = binz1 + 1;
x1 = (binx1-1.5)*cellpw + 0.5;
y1 = (biny1-1.5)*cellph + 0.5;
z1 = (binz1-1.5)*(or*pi/nthet);
% trilinear interpolation.
hist3dbig(biny1,binx1,binz1) =...
hist3dbig(biny1,binx1,binz1) + gs*gaussweight...
* (1-(jorbj-x1)/cellpw)*(1-(iorbi-y1)/cellph)...
*(1-(go-z1)/(or*pi/nthet));
hist3dbig(biny1,binx1,binz2) =...
hist3dbig(biny1,binx1,binz2) + gs*gaussweight...
* (1-(jorbj-x1)/cellpw)*(1-(iorbi-y1)/cellph)...
*((go-z1)/(or*pi/nthet));
hist3dbig(biny2,binx1,binz1) =...
hist3dbig(biny2,binx1,binz1) + gs*gaussweight...
* (1-(jorbj-x1)/cellpw)*((iorbi-y1)/cellph)...
*(1-(go-z1)/(or*pi/nthet));
hist3dbig(biny2,binx1,binz2) =...
hist3dbig(biny2,binx1,binz2) + gs*gaussweight...
* (1-(jorbj-x1)/cellpw)*((iorbi-y1)/cellph)...
*((go-z1)/(or*pi/nthet));
hist3dbig(biny1,binx2,binz1) =...
hist3dbig(biny1,binx2,binz1) + gs*gaussweight...
* ((jorbj-x1)/cellpw)*(1-(iorbi-y1)/cellph)...
*(1-(go-z1)/(or*pi/nthet));
hist3dbig(biny1,binx2,binz2) =...
hist3dbig(biny1,binx2,binz2) + gs*gaussweight...
* ((jorbj-x1)/cellpw)*(1-(iorbi-y1)/cellph)...
*((go-z1)/(or*pi/nthet));
hist3dbig(biny2,binx2,binz1) =...
hist3dbig(biny2,binx2,binz1) + gs*gaussweight...
* ((jorbj-x1)/cellpw)*((iorbi-y1)/cellph)...
*(1-(go-z1)/(or*pi/nthet));
hist3dbig(biny2,binx2,binz2) =...
hist3dbig(biny2,binx2,binz2) + gs*gaussweight...
* ((jorbj-x1)/cellpw)*((iorbi-y1)/cellph)...
*((go-z1)/(or*pi/nthet));
end
end
% In the local interpolate condition. F is generated in this block
% slide loop. hist3dbig should be cleared in each loop.
if glbalinter == 0
if or == 2
hist3dbig(:,:,2) = hist3dbig(:,:,2)...
+ hist3dbig(:,:,nthet+2);
hist3dbig(:,:,(nthet+1)) =...
hist3dbig(:,:,(nthet+1)) + hist3dbig(:,:,1);
end
hist3d = hist3dbig(2:(nblockh+1), 2:(nblockw+1), 2:(nthet+1));
for ibin = 1:nblockh
for jbin = 1:nblockw
idsF = nthet*((ibin-1)*nblockw+jbin-1)+1;
idsF = idsF:(idsF+nthet-1);
sF(idsF) = hist3d(ibin,jbin,:);
end
end
iblock = ((btly-1)/ybstride)*ntotalbw +...
((btlx-1)/xbstride) + 1;
idF = (iblock-1)*nblockw*nblockh*nthet+1;
idF = idF:(idF+nblockw*nblockh*nthet-1);
F(idF) = sF;
hist3dbig(:,:,:) = 0;
end
end
end
% In the global interpolate condition. F is generated here outside the
% block slide loop
if glbalinter == 1
ncellx = N / cellpw;
ncelly = M / cellph;
if or == 2
hist3dbig(:,:,2) = hist3dbig(:,:,2) + hist3dbig(:,:,nthet+2);
hist3dbig(:,:,(nthet+1)) = hist3dbig(:,:,(nthet+1)) + hist3dbig(:,:,1);
end
hist3d = hist3dbig(2:(ncelly+1), 2:(ncellx+1), 2:(nthet+1));
iblock = 1;
for btly = 1:ybstride:ybstridend
for btlx = 1:xbstride:xbstridend
binidx = floor((btlx-1)/cellpw)+1;
binidy = floor((btly-1)/cellph)+1;
idF = (iblock-1)*nblockw*nblockh*nthet+1;
idF = idF:(idF+nblockw*nblockh*nthet-1);
for ibin = 1:nblockh
for jbin = 1:nblockw
idsF = nthet*((ibin-1)*nblockw+jbin-1)+1;
idsF = idsF:(idsF+nthet-1);
sF(idsF) = hist3d(binidy+ibin-1, binidx+jbin-1, :);
end
end
F(idF) = sF;
iblock = iblock + 1;
end
end
end
% adjust the negative value caused by accuracy of floating-point
% operations.these value's scale is very small, usually at E-03 magnitude
% while others will be E+02 or E+03 before normalization.
F(F<0) = 0;
% block normalization.
e = 0.001;
l2hysthreshold = 0.2;
fslidestep = nblockw*nblockh*nthet;
switch normmethod
case 'none'
case 'l1'
for fi = 1:fslidestep:size(F,2)
div = sum(F(fi:(fi+fslidestep-1)));
F(fi:(fi+fslidestep-1)) = F(fi:(fi+fslidestep-1))/(div+e);
end
case 'l1sqrt'
for fi = 1:fslidestep:size(F,2)
div = sum(F(fi:(fi+fslidestep-1)));
F(fi:(fi+fslidestep-1)) = sqrt(F(fi:(fi+fslidestep-1))/(div+e));
end
case 'l2'
for fi = 1:fslidestep:size(F,2)
sF = F(fi:(fi+fslidestep-1)).*F(fi:(fi+fslidestep-1));
div = sqrt(sum(sF)+e*e);
F(fi:(fi+fslidestep-1)) = F(fi:(fi+fslidestep-1))/div;
end
case 'l2hys'
for fi = 1:fslidestep:size(F,2)
sF = F(fi:(fi+fslidestep-1)).*F(fi:(fi+fslidestep-1));
div = sqrt(sum(sF)+e*e);
sF = F(fi:(fi+fslidestep-1))/div;
sF(sF>l2hysthreshold) = l2hysthreshold;
div = sqrt(sum(sF.*sF)+e*e);
F(fi:(fi+fslidestep-1)) = sF/div;
end
otherwise
error('Incorrect NORMMETHOD parameter.');
end
% the following code, which can be removed because of having no
% contributions to HOG feature calculation, are just for result
% demonstration when the trilinear interpolation is 'global' for this
% condition is easier to give a simple and intuitive illustration. so in
% 'local' condition it will produce error.
% figure;
% hold on;
% axis equal;
% xlim([0, N]);
% ylim([0, M]);
% for u = 1:(M/cellph)
% for v = 1:(N/cellpw)
% cx = (v-1)*cellpw + cellpw/2 + 0.5;
% cy = (u-1)*cellph + cellph/2 + 0.5;
% hist3d(u,v,:)=0.9*min(cellpw,cellph)*hist3d(u,v,:)/max(hist3d(u,v,:));
% for t = 1:nthet
% s = hist3d(u,v,t);
% thet = (t-1)*pi/nthet + pi*0.5/nthet;
% x1 = cx - s*0.5*cos(thet);
% x2 = cx + s*0.5*cos(thet);
% y1 = cy - s*0.5*sin(thet);
% y2 = cy + s*0.5*sin(thet);
% plot([x1,x2],[M-y1+1,M-y2+1]);
% end
% end
% end