Histograms of Oriented Gradients (HOG): Explanation and Source Code
June 1, 2010, by 丕子
HOG descriptors are feature descriptors used in computer vision and image processing for object detection. The technique accumulates statistics of gradient orientations in local regions of an image. It resembles edge orientation histograms, scale-invariant feature transform (SIFT) descriptors, and shape contexts, but differs from them in that it is computed on a dense grid of uniformly spaced cells and uses overlapping local contrast normalization to improve performance.
The method was first proposed by Navneet Dalal and Bill Triggs, researchers at the French National Institute for Research in Computer Science and Control (INRIA), in a paper published at CVPR 2005. They mainly applied it to pedestrian detection in static images, and later also to pedestrian detection in film and video and to the detection of vehicles and common animals in static images.
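The central idea of the HOG descriptor is that the appearance and shape of a local object in an image can be described well by the distribution of gradient or edge orientations. In practice, the image is first divided into small connected regions called cells. For each cell, a histogram of the gradient or edge orientations of its pixels is collected. The combination of these histograms forms the descriptor. To improve performance, the local histograms can be contrast-normalized over a larger region of the image, called a block: a measure of the histogram intensity over the block is computed, and every cell in the block is normalized by that value. After this normalization, the descriptor is more robust to changes in illumination and to shadowing.
Compared with other feature descriptors, HOG has several advantages. First, because it operates on local cells, it is largely invariant to geometric and photometric transformations, since such changes only show up over larger spatial regions. Second, the authors found experimentally that with coarse spatial sampling, fine orientation sampling, and strong local photometric normalization, small limb movements of a pedestrian can be ignored without hurting detection, as long as the pedestrian stays roughly upright. For these reasons HOG is particularly well suited to pedestrian detection in images.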
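Figure (not reproduced here): the authors' pedestrian detection experiment. (a) the average gradient across their training images; (b) and (c) the maximum positive and maximum negative SVM weight in each block of the image; (d) a test image; (e) the R-HOG descriptor computed for the test image; (f) and (g) the R-HOG descriptor weighted by the positive and the negative SVM weights, respectively.
Algorithm implementation: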
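Color and gamma normalization
The authors evaluated color and gamma normalization of the input in grayscale, RGB, and LAB color spaces. The experiments showed that this preprocessing has little effect on the final result, probably because the normalization performed in later steps achieves much the same thing. In practice this step can therefore be omitted.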
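Gradient computation
The most common method is to apply a 1-D centered point discrete derivative mask in one direction, or in both the horizontal and the vertical direction. More precisely, the color or intensity data of the image is filtered with the kernels [-1, 0, 1] and [-1, 0, 1]^T.
The authors also tried more complex masks, such as the 3×3 Sobel mask and diagonal masks, but these performed worse in the pedestrian detection experiments, so they concluded that the simplest mask works best. They also tried Gaussian smoothing before applying the derivative mask, which made detection worse: much of the useful image information comes from abrupt edges, and smoothing before computing gradients suppresses exactly those edges.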
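The gradient step can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function names are made up for this example, the image is a plain 2-D list of intensities, and border pixels replicate the nearest edge value (the MATLAB listing further down uses `imfilter`, which zero-pads by default).

```python
import math


def centered_gradients(img):
    """Return (gx, gy) for a 2-D list of intensities using the [-1, 0, 1] mask.

    Assumption of this sketch: borders replicate the nearest edge value.
    """
    h, w = len(img), len(img[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left = img[y][max(x - 1, 0)]
            right = img[y][min(x + 1, w - 1)]
            up = img[max(y - 1, 0)][x]
            down = img[min(y + 1, h - 1)][x]
            gx[y][x] = float(right - left)
            # y axis points up, matching hy = -hx' in the MATLAB listing
            gy[y][x] = float(up - down)
    return gx, gy


def magnitude_and_unsigned_angle(gx, gy):
    """Gradient magnitude and orientation folded into [0, pi)."""
    mag = math.hypot(gx, gy)
    angle = math.atan2(gy, gx) % math.pi
    return mag, angle
```

For a vertical step edge such as `[[0, 0, 10, 10]] * 3`, `gx` is nonzero only in the two columns next to the step and `gy` is zero everywhere.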
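Creating the orientation histograms
The third step builds a gradient orientation histogram for every cell of the image. Each pixel in a cell casts a weighted vote for an orientation-based histogram channel, with the weight derived from the gradient magnitude at that pixel. The weight can be the magnitude itself or a function of it, such as the square root of the magnitude, the square of the gradient magnitude, or a clipped version of the magnitude. In tests, the magnitude itself gave the best results. Cells can be rectangular or radial. The histogram channels are spread evenly over 0 to 180 degrees (unsigned gradient) or 0 to 360 degrees (signed gradient). The authors found that unsigned gradients with 9 histogram channels gave the best results in their pedestrian detection experiments.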
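The voting step for a single cell can be illustrated as follows. This is a minimal sketch, not the authors' code: the function name is hypothetical, inputs are flat lists of per-pixel magnitudes and unsigned angles in radians, and each vote is split linearly between the two nearest bin centers with wrap-around at pi. The MATLAB listing further down additionally interpolates in x and y (trilinear interpolation).

```python
import math


def cell_histogram(mags, angles, nbins=9):
    """Magnitude-weighted unsigned orientation histogram for one cell.

    mags and angles are flat lists of equal length; angles lie in [0, pi).
    """
    bin_width = math.pi / nbins
    hist = [0.0] * nbins
    for m, a in zip(mags, angles):
        # position measured in bins, relative to the bin centers
        pos = a / bin_width - 0.5
        lo = math.floor(pos)
        frac = pos - lo
        # split the vote between the two neighboring bins, wrapping at pi
        hist[lo % nbins] += m * (1.0 - frac)
        hist[(lo + 1) % nbins] += m * frac
    return hist
```

A pixel whose angle sits exactly on a bin center contributes its whole magnitude to that bin. A pixel at angle 0 lies halfway between the last and the first bin center, so its vote is shared equally between them.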
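Grouping the cells together into larger blocks
Gradient strengths vary over a wide range because of local variations of illumination and of foreground-background contrast, so they need to be normalized. The authors do this by grouping cells into larger, spatially connected blocks. The HOG descriptor is then the vector of the histogram components of all cells in all blocks. The blocks overlap, which means that each cell contributes more than once to the final descriptor. There are two main block geometries: rectangular blocks (R-HOG) and circular blocks (C-HOG). R-HOG blocks are generally square grids characterized by three parameters: the number of cells per block, the number of pixels per cell, and the number of channels per cell histogram. In the authors' pedestrian detection experiments the best setting was 3×3 cells per block, 6×6 pixels per cell, and 9 histogram channels. They also found it useful to apply a Gaussian spatial window to each block before accumulating the histograms, which reduces the weight of pixels near the edge of the block.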
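How overlapping blocks determine the descriptor length can be checked with a little arithmetic. The sketch below uses hypothetical helper names and assumes square cells and square blocks. The example values (64×128 window, 8×8-pixel cells, 2×2-cell blocks, overlap 0.5, 9 bins) are the ones used in the example call of the MATLAB listing further down.

```python
def block_origins(win_w, win_h, cell_px, block_cells, overlap):
    """Top-left pixel coordinates (0-based) of all blocks in the window."""
    block_px = cell_px * block_cells
    stride_px = int(round(block_px * (1.0 - overlap)))
    if stride_px <= 0:
        raise ValueError("overlap must be smaller than 1")
    xs = range(0, win_w - block_px + 1, stride_px)
    ys = range(0, win_h - block_px + 1, stride_px)
    # scan blocks row by row, left to right
    return [(x, y) for y in ys for x in xs]


def descriptor_length(win_w, win_h, cell_px, block_cells, overlap, nbins):
    """Number of values in the concatenated block histograms."""
    n_blocks = len(block_origins(win_w, win_h, cell_px, block_cells, overlap))
    return n_blocks * block_cells * block_cells * nbins
```

With those values there are 7 × 15 = 105 blocks, each holding 2 × 2 × 9 = 36 values, so the descriptor has 3780 entries.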
R-HOG blocks look similar to SIFT descriptors, but they differ: R-HOG blocks are computed in dense grids at a single scale without orientation alignment, whereas SIFT descriptors are computed at sparse, scale-invariant key image points and are rotated to align orientation. In addition, R-HOG blocks are used in conjunction to encode spatial form information, while SIFT descriptors are used singly.
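C-HOG blocks come in two variants: one with a single, complete central cell and one with an angularly divided central cell (the illustrating figure is not reproduced here).
The authors found that both variants perform equally well. C-HOG blocks are described by four parameters: the number of angular bins, the number of radial bins, the radius of the center bin, and the expansion factor for the radius. In their pedestrian detection experiments the best setting was 4 angular bins, 2 radial bins, a center bin radius of 4 pixels, and an expansion factor of 2. As noted above, a Gaussian spatial window helps for R-HOG, but for C-HOG it turned out to be unnecessary. C-HOG looks much like the shape context method, but differs in that the cells of a C-HOG block have several orientation channels, whereas shape contexts use only a single edge presence count.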
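Block normalization
The authors compared four block normalization schemes. Let $v$ be the unnormalized vector containing all histograms of a given block, let $\|v\|_k$ be its $k$-norm for $k = 1, 2$, and let $e$ be a small constant. The normalized vector $f$ is then:
L2-norm: $f = \dfrac{v}{\sqrt{\|v\|_2^2 + e^2}}$
L1-norm: $f = \dfrac{v}{\|v\|_1 + e}$
L1-sqrt: $f = \sqrt{\dfrac{v}{\|v\|_1 + e}}$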
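There is also a fourth scheme, L2-Hys: apply the L2-norm, clip the result, and then renormalize. The authors found that L2-Hys, L2-norm, and L1-sqrt perform about equally well, while L1-norm is slightly less reliable. All four schemes improve significantly on unnormalized data.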
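The four schemes can be sketched in standard-library Python as below. The function name and argument names are made up for this illustration. The defaults `eps=1e-3` and `clip=0.2` mirror the values `e` and `l2hysthreshold` in the MATLAB listing further down, and the histogram entries are assumed to be non-negative.

```python
import math


def normalize_block(v, method="l2hys", eps=1e-3, clip=0.2):
    """Normalize one block vector v (a list of non-negative floats)."""
    if method == "l1":
        s = sum(abs(x) for x in v) + eps
        return [x / s for x in v]
    if method == "l1sqrt":
        s = sum(abs(x) for x in v) + eps
        return [math.sqrt(x / s) for x in v]
    if method in ("l2", "l2hys"):
        n = math.sqrt(sum(x * x for x in v) + eps * eps)
        out = [x / n for x in v]
        if method == "l2hys":
            # clip large components, then renormalize
            out = [min(x, clip) for x in out]
            n = math.sqrt(sum(x * x for x in out) + eps * eps)
            out = [x / n for x in out]
        return out
    raise ValueError("unknown normalization method: %r" % (method,))
```

For `v = [3.0, 4.0]` and `eps=0.0`, the L2-norm gives `[0.6, 0.8]`. L2-Hys clips both components to 0.2 and renormalizes, so both end up equal to 1/sqrt(2).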
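SVM classifier
The final step feeds the extracted HOG features into an SVM classifier, which finds an optimal hyperplane to use as the decision function. The authors used the freely available SVMLight package together with the HOG descriptors to find pedestrians in test images.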
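Once a hyperplane has been trained, classifying a window amounts to evaluating the decision function on its HOG vector. The sketch below assumes a linear kernel. The function names, weights, and bias are hypothetical placeholders, not values produced by SVMLight.

```python
def linear_svm_score(weights, bias, features):
    """Signed distance-like score w . x + b of a feature vector."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * x for w, x in zip(weights, features)) + bias


def is_pedestrian(weights, bias, features, threshold=0.0):
    """Classify a window as positive when its score exceeds the threshold."""
    return linear_svm_score(weights, bias, features) > threshold
```

In a sliding-window detector the same weights would be applied to the descriptor of every window position and scale. Raising `threshold` trades missed detections for fewer false positives.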
MATLAB source code:
function F = hogcalculator(img, cellpw, cellph, nblockw, nblockh,...
    nthet, overlap, isglobalinterpolate, issigned, normmethod)
% HOGCALCULATOR calculate R-HOG feature vector of an input image using the
% procedure presented in Dalal and Triggs's paper in CVPR 2005.
%

% Author:   timeHandle
% Time:     March 24, 2010
%           May 12, 2010 update.
%
%       This code was written out of personal interest. It is a plain,
%       unoptimized realization of the [Dalal CVPR2005] algorithm, meant
%       to check whether I really understand the algorithm and to practise
%       MATLAB programming. OpenCV 2.0 has implemented Dalal's HOG
%       algorithm, which no doubt runs faster than mine ╮(╯▽╰)╭ . Ronan
%       pointed out an error in the code; thanks for the correction. Note
%       that the end of this file contains demonstration code; please
%       remove it in your own work.

%
% F = hogcalculator(img, cellpw, cellph, nblockw, nblockh,
%    nthet, overlap, isglobalinterpolate, issigned, normmethod)
%
% IMG:
%       IMG is the input image.
%
% CELLPW, CELLPH:
%       CELLPW and CELLPH are the cell's width and height in pixels.
%
% NBLOCKW, NBLOCKH:
%       NBLOCKW and NBLOCKH are the block size counted in cells in the x
%       and y directions respectively.
%
% NTHET, ISSIGNED:
%       NTHET is the number of bins of the histogram of oriented
%       gradients. The histogram ranges from 0 to pi in the 'unsigned'
%       condition and from 0 to 2*pi in the 'signed' condition, which is
%       selected by setting ISSIGNED to the string 'unsigned' or 'signed'.
%
% OVERLAP:
%       OVERLAP is the overlap proportion of two neighboring blocks.
%
% ISGLOBALINTERPOLATE:
%       ISGLOBALINTERPOLATE specifies whether the trilinear interpolation
%       is done in a single global 3-D histogram of the whole detection
%       window (string 'globalinterpolate') or in a local 3-D histogram
%       for each block (string 'localinterpolate'), the latter being in
%       strict accordance with the procedure proposed in Dalal's paper.
%       Interpolating over the whole detection window requires the block
%       sliding step to be an integral multiple of the cell width and
%       height, because the histogram is fixed before interpolation.
%       'Global interpolation' is a name I made up. At first I did the
%       spatial interpolation without regard to the block's position, but
%       the results with OVERLAP = 0.75 confused me o__O", and I found
%       that this differs from what Dalal's paper describes. That does
%       not mean it is incorrect ^◎^, so I kept it. Any other name would
%       do as well, like 'Lady GaGa' or whatever :-).
%
% NORMMETHOD:
%       NORMMETHOD is the block histogram normalization method, which can
%       be set to one of the following strings:
%               'none', which means no normalization;
%               'l1', which means L1-norm normalization;
%               'l2', which means L2-norm normalization;
%               'l1sqrt', which means L1-sqrt-norm normalization;
%               'l2hys', which means L2-hys-norm normalization.
% F:
%       F is a row vector storing the final histograms of all blocks one
%       by one in a top-left to bottom-right image scan order. The cell
%       histograms are stored in the same order within each block's
%       section of F.
%
% Note that IMG's width and height should be integral multiples of
% CELLPW*NBLOCKW and CELLPH*NBLOCKH respectively.
%
% Here is a demonstration in which all parameters are set to the best
% values mentioned in Dalal's paper for a 128*64 detection window
% (128 rows, 64 columns):
%       F = hogcalculator(window, 8, 8, 2, 2, 9, 0.5,...
%                               'localinterpolate', 'unsigned', 'l2hys');
% The function can also be called as:
%       F = hogcalculator(window);
% in which case the other parameters default to the above-mentioned
% "Dalal's best values".
%

if nargin < 2
    % set default parameter values.
    cellpw = 8;
    cellph = 8;
    nblockw = 2;
    nblockh = 2;
    nthet = 9;
    overlap = 0.5;
    isglobalinterpolate = 'localinterpolate';
    issigned = 'unsigned';
    normmethod = 'l2hys';
elseif nargin < 10
    error('Input parameters are not enough.');
end

% check parameter validity.
[M, N, K] = size(img);
if mod(M,cellph*nblockh) ~= 0
    error('IMG''s height should be an integral multiple of CELLPH*NBLOCKH.');
end
if mod(N,cellpw*nblockw) ~= 0
    error('IMG''s width should be an integral multiple of CELLPW*NBLOCKW.');
end
if mod((1-overlap)*cellpw*nblockw, cellpw) ~= 0 ||...
        mod((1-overlap)*cellph*nblockh, cellph) ~= 0
    str1 = 'Incorrect OVERLAP or ISGLOBALINTERPOLATE parameter';
    str2 = ', slide step should be an integral multiple of cell size';
    error([str1, str2]);
end

% set the standard deviation of the gaussian spatial weight window.
delta = cellpw*nblockw * 0.5;

% calculate the gradient with the centered [-1,0,1] mask.
hx = [-1,0,1];
hy = -hx';
gradscalx = imfilter(double(img),hx);
gradscaly = imfilter(double(img),hy);
if K > 1
    % for color images keep, at each pixel, the gradient of the channel
    % with the largest magnitude, as described in Dalal's paper.
    [~, kmax] = max(gradscalx.^2 + gradscaly.^2, [], 3);
    [rows, cols] = ndgrid(1:M, 1:N);
    idx = sub2ind([M, N, K], rows, cols, kmax);
    gradscalx = gradscalx(idx);
    gradscaly = gradscaly(idx);
end
gradscal = sqrt(gradscalx.*gradscalx + gradscaly.*gradscaly);

% calculate the gradient orientation matrix.
% add a small number to avoid dividing by zero.
gradscalxplus = gradscalx + 0.0001;
gradorient = zeros(M,N);
if strcmp(issigned, 'unsigned')
    % unsigned situation: orientation range is 0 to pi.
    gradorient = atan(gradscaly./gradscalxplus) + pi/2;
    orscale = 1;
elseif strcmp(issigned, 'signed')
    % signed situation: orientation range is 0 to 2*pi.
    idx = gradscalx >= 0 & gradscaly >= 0;
    gradorient(idx) = atan(gradscaly(idx)./gradscalxplus(idx));
    idx = gradscalx < 0;
    gradorient(idx) = atan(gradscaly(idx)./gradscalxplus(idx)) + pi;
    idx = gradscalx >= 0 & gradscaly < 0;
    gradorient(idx) = atan(gradscaly(idx)./gradscalxplus(idx)) + 2*pi;
    orscale = 2;
else
    error('Incorrect ISSIGNED parameter.');
end
% angular width of one orientation bin.
thetstep = orscale*pi/nthet;

% calculate block slide step.
xbstride = cellpw*nblockw*(1-overlap);
ybstride = cellph*nblockh*(1-overlap);
xbstridend = N - cellpw*nblockw + 1;
ybstridend = M - cellph*nblockh + 1;

% calculate the total number of blocks in the detection window, which is
% ntotalbh*ntotalbw.
ntotalbh = floor((M-cellph*nblockh)/ybstride)+1;
ntotalbw = floor((N-cellpw*nblockw)/xbstride)+1;

% generate the matrix hist3dbig for storing the 3-D histogram. The matrix
% covers the whole image in the 'globalinterpolate' condition or one local
% block in the 'localinterpolate' condition. It is larger than the area it
% covers: one extra layer of elements (corresponding to cells) is added
% all around for calculation convenience.
if strcmp(isglobalinterpolate, 'globalinterpolate')
    ncellx = N / cellpw;
    ncelly = M / cellph;
    hist3dbig = zeros(ncelly+2, ncellx+2, nthet+2);
    glbalinter = 1;
elseif strcmp(isglobalinterpolate, 'localinterpolate')
    hist3dbig = zeros(nblockh+2, nblockw+2, nthet+2);
    glbalinter = 0;
else
    error('Incorrect ISGLOBALINTERPOLATE parameter.');
end
F = zeros(1, ntotalbh*ntotalbw*nblockw*nblockh*nthet);

% generate the matrix for storing the histogram of one block.
sF = zeros(1, nblockw*nblockh*nthet);

% generate gaussian spatial weight.
[gaussx, gaussy] = meshgrid(0:(cellpw*nblockw-1), 0:(cellph*nblockh-1));
weight = exp(-((gaussx-(cellpw*nblockw-1)/2)...
    .*(gaussx-(cellpw*nblockw-1)/2)+(gaussy-(cellph*nblockh-1)/2)...
    .*(gaussy-(cellph*nblockh-1)/2))/(delta*delta));

% vote for the histogram. There are two situations according to the
% interpolation condition ('global' or local). The hist3d extracted from
% the 'bigger' matrix hist3dbig is the final histogram.
if glbalinter == 1
    xbstep = nblockw*cellpw;
    ybstep = nblockh*cellph;
else
    xbstep = xbstride;
    ybstep = ybstride;
end
% block slide loop
for btly = 1:ybstep:ybstridend
    for btlx = 1:xbstep:xbstridend
        for bi = 1:(cellph*nblockh)
            for bj = 1:(cellpw*nblockw)

                i = btly + bi - 1;
                j = btlx + bj - 1;
                gaussweight = weight(bi,bj);

                gs = gradscal(i,j);
                go = gradorient(i,j);

                if gs == 0
                    continue;
                end

                % pixel position: image coordinates in the global case,
                % block coordinates in the local case.
                if glbalinter == 1
                    jorbj = j;
                    iorbi = i;
                else
                    jorbj = bj;
                    iorbi = bi;
                end

                % lower bin indices in hist3dbig
                binx1 = floor((jorbj-1+cellpw/2)/cellpw) + 1;
                biny1 = floor((iorbi-1+cellph/2)/cellph) + 1;
                binz1 = floor((go+thetstep/2)/thetstep) + 1;

                binx2 = binx1 + 1;
                biny2 = biny1 + 1;
                binz2 = binz1 + 1;

                % centers of the lower bins
                x1 = (binx1-1.5)*cellpw + 0.5;
                y1 = (biny1-1.5)*cellph + 0.5;
                z1 = (binz1-1.5)*thetstep;

                % fractional distances from the lower bin centers
                wx = (jorbj-x1)/cellpw;
                wy = (iorbi-y1)/cellph;
                wz = (go-z1)/thetstep;
                vote = gs*gaussweight;

                % trilinear interpolation.
                hist3dbig(biny1,binx1,binz1) = hist3dbig(biny1,binx1,binz1)...
                    + vote*(1-wx)*(1-wy)*(1-wz);
                hist3dbig(biny1,binx1,binz2) = hist3dbig(biny1,binx1,binz2)...
                    + vote*(1-wx)*(1-wy)*wz;
                hist3dbig(biny2,binx1,binz1) = hist3dbig(biny2,binx1,binz1)...
                    + vote*(1-wx)*wy*(1-wz);
                hist3dbig(biny2,binx1,binz2) = hist3dbig(biny2,binx1,binz2)...
                    + vote*(1-wx)*wy*wz;
                hist3dbig(biny1,binx2,binz1) = hist3dbig(biny1,binx2,binz1)...
                    + vote*wx*(1-wy)*(1-wz);
                hist3dbig(biny1,binx2,binz2) = hist3dbig(biny1,binx2,binz2)...
                    + vote*wx*(1-wy)*wz;
                hist3dbig(biny2,binx2,binz1) = hist3dbig(biny2,binx2,binz1)...
                    + vote*wx*wy*(1-wz);
                hist3dbig(biny2,binx2,binz2) = hist3dbig(biny2,binx2,binz2)...
                    + vote*wx*wy*wz;
            end
        end

        % In the local interpolation condition F is generated inside this
        % block slide loop, and hist3dbig is cleared after each block.
        if glbalinter == 0
            if orscale == 2
                % signed orientations wrap around at 2*pi.
                hist3dbig(:,:,2) = hist3dbig(:,:,2)...
                    + hist3dbig(:,:,nthet+2);
                hist3dbig(:,:,(nthet+1)) =...
                    hist3dbig(:,:,(nthet+1)) + hist3dbig(:,:,1);
            end
            hist3d = hist3dbig(2:(nblockh+1), 2:(nblockw+1), 2:(nthet+1));
            for ibin = 1:nblockh
                for jbin = 1:nblockw
                    idsF = nthet*((ibin-1)*nblockw+jbin-1)+1;
                    idsF = idsF:(idsF+nthet-1);
                    sF(idsF) = hist3d(ibin,jbin,:);
                end
            end
            iblock = ((btly-1)/ybstride)*ntotalbw +...
                ((btlx-1)/xbstride) + 1;
            idF = (iblock-1)*nblockw*nblockh*nthet+1;
            idF = idF:(idF+nblockw*nblockh*nthet-1);
            F(idF) = sF;
            hist3dbig(:,:,:) = 0;
        end
    end
end

% In the global interpolation condition F is generated here, outside the
% block slide loop.
if glbalinter == 1
    ncellx = N / cellpw;
    ncelly = M / cellph;
    if orscale == 2
        hist3dbig(:,:,2) = hist3dbig(:,:,2) + hist3dbig(:,:,nthet+2);
        hist3dbig(:,:,(nthet+1)) = hist3dbig(:,:,(nthet+1)) + hist3dbig(:,:,1);
    end
    hist3d = hist3dbig(2:(ncelly+1), 2:(ncellx+1), 2:(nthet+1));

    iblock = 1;
    for btly = 1:ybstride:ybstridend
        for btlx = 1:xbstride:xbstridend
            binidx = floor((btlx-1)/cellpw)+1;
            binidy = floor((btly-1)/cellph)+1;
            idF = (iblock-1)*nblockw*nblockh*nthet+1;
            idF = idF:(idF+nblockw*nblockh*nthet-1);
            for ibin = 1:nblockh
                for jbin = 1:nblockw
                    idsF = nthet*((ibin-1)*nblockw+jbin-1)+1;
                    idsF = idsF:(idsF+nthet-1);
                    sF(idsF) = hist3d(binidy+ibin-1, binidx+jbin-1,: );
                end
            end
            F(idF) = sF;
            iblock = iblock + 1;
        end
    end
end

% remove negative values caused by floating-point inaccuracy. Their scale
% is very small, usually around E-03, while the other values are around
% E+02 or E+03 before normalization.
F(F<0) = 0;

% block normalization.
e = 0.001;
l2hysthreshold = 0.2;
fslidestep = nblockw*nblockh*nthet;
switch normmethod
    case 'none'
    case 'l1'
        for fi = 1:fslidestep:size(F,2)
            div = sum(F(fi:(fi+fslidestep-1)));
            F(fi:(fi+fslidestep-1)) = F(fi:(fi+fslidestep-1))/(div+e);
        end
    case 'l1sqrt'
        for fi = 1:fslidestep:size(F,2)
            div = sum(F(fi:(fi+fslidestep-1)));
            F(fi:(fi+fslidestep-1)) = sqrt(F(fi:(fi+fslidestep-1))/(div+e));
        end
    case 'l2'
        for fi = 1:fslidestep:size(F,2)
            sF = F(fi:(fi+fslidestep-1)).*F(fi:(fi+fslidestep-1));
            div = sqrt(sum(sF)+e*e);
            F(fi:(fi+fslidestep-1)) = F(fi:(fi+fslidestep-1))/div;
        end
    case 'l2hys'
        for fi = 1:fslidestep:size(F,2)
            sF = F(fi:(fi+fslidestep-1)).*F(fi:(fi+fslidestep-1));
            div = sqrt(sum(sF)+e*e);
            sF = F(fi:(fi+fslidestep-1))/div;
            sF(sF>l2hysthreshold) = l2hysthreshold;
            div = sqrt(sum(sF.*sF)+e*e);
            F(fi:(fi+fslidestep-1)) = sF/div;
        end
    otherwise
        error('Incorrect NORMMETHOD parameter.');
end

% The following code contributes nothing to the HOG feature calculation
% and can be removed. It only visualizes the result, and it runs only in
% the 'global' interpolation condition, where hist3d covers every cell of
% the window and a simple, intuitive illustration is possible.
if glbalinter == 1
    figure;
    hold on;
    axis equal;
    xlim([0, N]);
    ylim([0, M]);
    for u = 1:(M/cellph)
        for v = 1:(N/cellpw)
            cx = (v-1)*cellpw + cellpw/2 + 0.5;
            cy = (u-1)*cellph + cellph/2 + 0.5;
            peak = max(hist3d(u,v,:));
            if peak > 0
                % scale the strongest bin to 0.9 of the cell size
                hist3d(u,v,:) = 0.9*min(cellpw,cellph)*hist3d(u,v,:)/peak;
            end
            for t = 1:nthet
                s = hist3d(u,v,t);
                thet = (t-1)*pi/nthet + pi*0.5/nthet;
                x1 = cx - s*0.5*cos(thet);
                x2 = cx + s*0.5*cos(thet);
                y1 = cy - s*0.5*sin(thet);
                y2 = cy + s*0.5*sin(thet);
                plot([x1,x2],[M-y1+1,M-y2+1]);
            end
        end
    end
end
                