Histograms of Oriented Gradients (HOG): Explanation and Source Code



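The HOG feature is an effective descriptor for object detection, and it is particularly well suited to pedestrian detection. It captures local object appearance and shape by computing histograms of gradient orientations over small regions of an image. This article describes the principle behind HOG features, how they are computed, and where they are applied.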

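June 1, 2010, by 丕子

HOG descriptors are feature descriptors used in computer vision and image processing for object detection. The technique counts occurrences of gradient orientations in localized portions of an image. It resembles edge orientation histograms, scale-invariant feature transform (SIFT) descriptors, and shape contexts, but differs from them in that HOG descriptors are computed on a dense grid of uniformly spaced cells and use overlapping local contrast normalization to improve performance.

The method was first proposed by Navneet Dalal and Bill Triggs, researchers at the French National Institute for Research in Computer Science and Control (INRIA), in a paper published at CVPR 2005. They mainly applied it to pedestrian detection in static images, and later extended it to pedestrian detection in film and video, as well as to the detection of vehicles and common animals in static images.

The essential idea behind the HOG descriptor is that local object appearance and shape within an image can be described well by the distribution of intensity gradients or edge directions. In practice, the image is first divided into small connected regions called cells. For each cell, a histogram of the gradient (or edge) directions of its pixels is compiled. The combination of these histograms then forms the descriptor. To improve performance, the local histograms can be contrast-normalized over a larger region of the image, called a block: a measure of the intensity across the block is computed, and this value is then used to normalize all cells within the block. This normalization yields better invariance to changes in illumination and shadowing.

Compared with other descriptors, HOG has several advantages. First, because it operates on local cells, it is largely invariant to geometric and photometric transformations of the image, since such transformations only become apparent over larger spatial regions. Second, the authors found experimentally that with coarse spatial sampling, fine orientation sampling, and strong local photometric normalization, the small body movements of pedestrians can be ignored without hurting detection, as long as the pedestrians keep a roughly upright posture. HOG is therefore particularly well suited to pedestrian detection in images.

The figure from the authors' pedestrian detection experiment (not reproduced here) shows: (a) the average gradient across the training images; (b) and (c) the maximum positive and negative SVM weights in each block, respectively; (d) a test image; (e) the R-HOG descriptor computed for the test image; (f) and (g) the R-HOG descriptor weighted by the positive and negative SVM weights, respectively.

Implementation of the algorithm

Color and gamma normalization

The authors evaluated color and gamma normalization in grayscale, RGB, and LAB color spaces. The experiments showed that this preprocessing has little effect on the final result, probably because the normalization performed in later steps achieves the same thing. In practice, this step can therefore be omitted.

Gradient computation

The most common method is to apply a 1-D centered point discrete derivative mask in the horizontal direction, the vertical direction, or both. More precisely, the color or intensity data of the image is filtered with the kernels $[-1, 0, 1]$ and $[-1, 0, 1]^T$. The authors also tried more complex masks, such as the 3×3 Sobel mask and diagonal masks, but these performed worse in the pedestrian detection experiments, so they concluded that the simplest mask works best. They also tried Gaussian smoothing before applying the derivative mask, which made detection worse: much of the useful image information comes from abrupt edges, and smoothing before computing the gradients suppresses them.

Creating the orientation histograms

The third step builds a gradient orientation histogram for each cell. Every pixel in the cell casts a weighted vote for an orientation-based histogram channel, with the weight derived from the gradient magnitude at that pixel. The weight can be the magnitude itself or some function of it, such as the square root, the square, or a clipped version of the magnitude. In tests, the magnitude itself gave the best results. Cells can be rectangular or radial. The histogram channels are evenly spread over 0° to 180° (unsigned gradient) or 0° to 360° (signed gradient). The authors found that unsigned gradients with 9 histogram channels performed best in the pedestrian detection experiments.

Grouping the cells together into larger blocks

Because of local variations in illumination and in foreground-background contrast, gradient strengths vary over a very wide range, so they must be normalized. The authors do this by grouping cells into larger, spatially connected blocks. The HOG descriptor is then the vector of the components of the cell histograms from all blocks. The blocks overlap, so each cell contributes more than once to the final descriptor.

There are two main block geometries: rectangular blocks (R-HOG) and circular blocks (C-HOG). R-HOG blocks are generally square grids characterized by three parameters: the number of cells per block, the number of pixels per cell, and the number of channels per cell histogram. The authors' experiments showed that the best parameters for pedestrian detection are 3×3 cells per block, 6×6 pixels per cell, and 9 histogram channels. They also found that applying a Gaussian spatial window to each block before accumulating the histogram votes improves performance, because it reduces the weight of pixels near the edge of the block.

R-HOG looks similar to the SIFT descriptor, but the two differ: R-HOG blocks are computed in dense grids at a single scale without orientation alignment, whereas SIFT descriptors are computed at sparse, scale-invariant key image points and are rotated to align orientation. In addition, R-HOG blocks are used in conjunction to encode spatial form information, while SIFT descriptors are used singly.

C-HOG blocks come in two variants: one with a single, complete central cell and one with a divided central cell (the figure is not reproduced here). The authors found that both variants perform equally well. C-HOG blocks are described by four parameters: the number of angular bins, the number of radial bins, the radius of the center bin, and the expansion factor for the radius. For pedestrian detection, the best settings were 4 angular bins, 2 radial bins, a center bin radius of 4 pixels, and an expansion factor of 2. As mentioned above, a Gaussian spatial window helps R-HOG, but it is unnecessary for C-HOG. C-HOG looks similar to shape contexts, but differs in that its cells contain several orientation channels, whereas shape contexts use only a single edge presence count.

Block normalization

The authors compared four block normalization methods. Let $v$ be the unnormalized vector containing all histograms of a given block, $\|v\|_k$ its $k$-norm for $k = 1, 2$, and $e$ a small constant. The normalized vector $f$ is then:

L2-norm: $f = v / \sqrt{\|v\|_2^2 + e^2}$
L1-norm: $f = v / (\|v\|_1 + e)$
L1-sqrt: $f = \sqrt{v / (\|v\|_1 + e)}$

The fourth scheme, L2-Hys, is obtained by applying the L2-norm, clipping the result, and then renormalizing. The authors found that L2-Hys, L2-norm, and L1-sqrt perform about equally well, while L1-norm is slightly less reliable. All four methods give a significant improvement over unnormalized data.

SVM classifier

The final step feeds the extracted HOG features to an SVM classifier, which finds an optimal hyperplane to use as the decision function. The authors used the freely available SVMLight package together with their HOG descriptors to find pedestrians in test images.

Matlab source code:

function F = hogcalculator(img, cellpw, cellph, nblockw, nblockh,...
    nthet, overlap, isglobalinterpolate, issigned, normmethod)
% HOGCALCULATOR calculate R-HOG feature vector of an input image using the
% procedure presented in Dalal and Triggs's paper in CVPR 2005.
%

% Author: timeHandle
% Time: March 24, 2010
%       May 12, 2010 update.
%
% this copy of code is written for my personal interest, which is an
% original and inornate realization of [Dalal CVPR2005]'s algorithm
% without any optimization. I just want to check whether I understand
% the algorithm really or not, and also do some practices for knowing
% matlab programming more well because I could be called as 'novice'.
% OpenCV 2.0 has realized Dalal's HOG algorithm which runs faster
% than mine without any doubt, ╮(╯▽╰)╭ . Ronan pointed a error in
% the code, thanks for his correction. Note that at the end of this
% code, there are some demonstration code, please remove in your work.

%
% F = hogcalculator(img, cellpw, cellph, nblockw, nblockh,
%     nthet, overlap, isglobalinterpolate, issigned, normmethod)
%
% IMG:
%   IMG is the input image.
%
% CELLPW, CELLPH:
%   CELLPW and CELLPH are cell's pixel width and height respectively.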
%
% NBLOCKW, NBLOCKH:
%   NBLOCKW and NBLOCKH are block size counted by cells number in x and
%   y directions respectively.
%
% NTHET, ISSIGNED:
%   NTHET is the number of the bins of the histogram of oriented
%   gradient. The histogram of oriented gradient ranges from 0 to pi in
%   'unsigned' condition while to 2*pi in 'signed' condition, which can
%   be specified through setting the value of the variable ISSIGNED by
%   the string 'unsigned' or 'signed'.
%
% OVERLAP:
%   OVERLAP is the overlap proportion of two neighboring block.
%
% ISGLOBALINTERPOLATE:
%   ISGLOBALINTERPOLATE specifies whether the trilinear interpolation
%   is done in a single global 3d histogram of the whole detecting
%   window by the string 'globalinterpolate' or in each local 3d
%   histogram corresponding to respective blocks by the string
%   'localinterpolate' which is in strict accordance with the procedure
%   proposed in Dalal's paper. Interpolating in the whole detecting
%   window requires the block's sliding step to be an integral multiple
%   of cell's width and height because the histogram is fixing before
%   interpolate. In fact here the so called 'global interpolation' is
%   a notation given by myself. at first the spatial interpolation is
%   done without any relevant to block's slide position, but when I was
%   doing calculation while OVERLAP is 0.75, something occurred and
%   confused me o__O"… . This let me find that the operation I firstly
%   did is different from which mentioned in Dalal's paper. But this
%   does not mean it is incorrect ^◎^, so I reserve this. As for name,
%   besides 'global interpolate', any others would be all ok, like 'Lady GaGa'
%   or what else, :-).
%
% NORMMETHOD:
%   NORMMETHOD is the block histogram normalized method which can be
%   set as one of the following strings:
%     'none', which means non-normalization;
%     'l1', which means L1-norm normalization;
%     'l2', which means L2-norm normalization;
%     'l1sqrt', which means L1-sqrt-norm normalization;
%     'l2hys', which means L2-hys-norm normalization.
% F:
%   F is a row vector storing the final histogram of all of the blocks
%   one by one in a top-left to bottom-right image scan manner, the
%   cells histogram are stored in the same manner in each block's
%   section of F.
%
% Note that IMG's width and height should be integral multiples of
% CELLPW*NBLOCKW and CELLPH*NBLOCKH respectively.
%
% Here is a demonstration, which all of parameters are set as the
% best value mentioned in Dalal's paper when the window detected is 128*64
% size(128 rows, 64 columns):
%   F = hogcalculator(window, 8, 8, 2, 2, 9, 0.5,
%       'localinterpolate', 'unsigned', 'l2hys');
% Also the function can be called like:
%   F = hogcalculator(window);
% the other parameters are all set by using the above-mentioned "dalal's
% best value" as default.
%

if nargin < 2
    % set default parameters value.
    cellpw = 8;
    cellph = 8;
    nblockw = 2;
    nblockh = 2;
    nthet = 9;
    overlap = 0.5;
    isglobalinterpolate = 'localinterpolate';
    issigned = 'unsigned';
    normmethod = 'l2hys';
else
    if nargin < 10
        error('Input parameters are not enough.');
    end
end

% check parameters's validity.
[M, N, K] = size(img);
if mod(M,cellph*nblockh) ~= 0
    error('IMG''s height should be an integral multiple of CELLPH*NBLOCKH.');
end
if mod(N,cellpw*nblockw) ~= 0
    error('IMG''s width should be an integral multiple of CELLPW*NBLOCKW.');
end
if mod((1-overlap)*cellpw*nblockw, cellpw) ~= 0 ||...
        mod((1-overlap)*cellph*nblockh, cellph) ~= 0
    str1 = 'Incorrect OVERLAP or ISGLOBALINTERPOLATE parameter';
    str2 = ', slide step should be an integral multiple of cell size';
    error([str1, str2]);
end

% set the standard deviation of gaussian spatial weight window.
delta = cellpw*nblockw * 0.5;

% calculate gradient scale matrix.
hx = [-1,0,1];
hy = -hx';
gradscalx = imfilter(double(img),hx);
gradscaly = imfilter(double(img),hy);
if K > 1
    % for a color image, keep at each pixel the gradient of the channel
    % with the largest magnitude, as described in Dalal's paper. (taking
    % the max of the x and y components separately would drop strong
    % negative gradients.)
    [~, cidx] = max(gradscalx.^2 + gradscaly.^2, [], 3);
    [rr, cc] = ndgrid(1:M, 1:N);
    lin = sub2ind([M, N, K], rr, cc, cidx);
    gradscalx = gradscalx(lin);
    gradscaly = gradscaly(lin);
end
gradscal = sqrt(double(gradscalx.*gradscalx + gradscaly.*gradscaly));

% calculate gradient orientation matrix.
% plus small number for avoiding dividing zero.
gradscalxplus = gradscalx+ones(size(gradscalx))*0.0001;
gradorient = zeros(M,N);
% unsigned situation: orientation region is 0 to pi.
if strcmp(issigned, 'unsigned') == 1
    gradorient =...
        atan(gradscaly./gradscalxplus) + pi/2;
    or = 1;
else
    % signed situation: orientation region is 0 to 2*pi.
    if strcmp(issigned, 'signed') == 1
        idx = find(gradscalx >= 0 & gradscaly >= 0);
        gradorient(idx) = atan(gradscaly(idx)./gradscalxplus(idx));
        idx = find(gradscalx < 0);
        gradorient(idx) = atan(gradscaly(idx)./gradscalxplus(idx)) + pi;
        idx = find(gradscalx >= 0 & gradscaly < 0);
        gradorient(idx) = atan(gradscaly(idx)./gradscalxplus(idx)) + 2*pi;
        or = 2;
    else
        error('Incorrect ISSIGNED parameter.');
    end
end

% calculate block slide step.
xbstride = cellpw*nblockw*(1-overlap);
ybstride = cellph*nblockh*(1-overlap);
xbstridend = N - cellpw*nblockw + 1;
ybstridend = M - cellph*nblockh + 1;

% calculate the total blocks number in the window detected, which is
% ntotalbh*ntotalbw.
% floor keeps the counts integral and equal to the number of iterations
% of the block slide loops below.
ntotalbh = floor((M-cellph*nblockh)/ybstride)+1;
ntotalbw = floor((N-cellpw*nblockw)/xbstride)+1;

% generate the matrix hist3dbig for storing the 3-dimensions histogram. the
% matrix covers the whole image in the 'globalinterpolate' condition or
% covers the local block in the 'localinterpolate' condition. The matrix is
% bigger than the area where it covers by adding additional elements
% (corresponding to the cells) to the surround for calculation convenience.
if strcmp(isglobalinterpolate, 'globalinterpolate') == 1
    ncellx = N / cellpw;
    ncelly = M / cellph;
    hist3dbig = zeros(ncelly+2, ncellx+2, nthet+2);
    % one section per block, the same size as in the local condition.
    F = zeros(1, ntotalbh*ntotalbw*nblockw*nblockh*nthet);
    glbalinter = 1;
else
    if strcmp(isglobalinterpolate, 'localinterpolate') == 1
        hist3dbig = zeros(nblockh+2, nblockw+2, nthet+2);
        F = zeros(1, ntotalbh*ntotalbw*nblockw*nblockh*nthet);
        glbalinter = 0;
    else
        error('Incorrect ISGLOBALINTERPOLATE parameter.')
    end
end

% generate the matrix for storing histogram of one block;
sF = zeros(1, nblockw*nblockh*nthet);

% generate gaussian spatial weight.
[gaussx, gaussy] = meshgrid(0:(cellpw*nblockw-1), 0:(cellph*nblockh-1));
weight = exp(-((gaussx-(cellpw*nblockw-1)/2)...
    .*(gaussx-(cellpw*nblockw-1)/2)+(gaussy-(cellph*nblockh-1)/2)...
    .*(gaussy-(cellph*nblockh-1)/2))/(delta*delta));

% vote for histogram. there are two situations according to the interpolate
% condition('global' interpolate or local interpolate). The hist3d which is
% generated from the 'bigger' matrix hist3dbig is the final histogram.
if glbalinter == 1
    xbstep = nblockw*cellpw;
    ybstep = nblockh*cellph;
else
    xbstep = xbstride;
    ybstep = ybstride;
end
% block slide loop
for btly = 1:ybstep:ybstridend
    for btlx = 1:xbstep:xbstridend
        for bi = 1:(cellph*nblockh)
            for bj = 1:(cellpw*nblockw)

                i = btly + bi - 1;
                j = btlx + bj - 1;
                gaussweight = weight(bi,bj);

                gs = gradscal(i,j);
                go = gradorient(i,j);

                if glbalinter == 1
                    jorbj = j;
                    iorbi = i;
                else
                    jorbj = bj;
                    iorbi = bi;
                end

                % calculate bin index of hist3dbig
                binx1 = floor((jorbj-1+cellpw/2)/cellpw) + 1;
                biny1 = floor((iorbi-1+cellph/2)/cellph) + 1;
                binz1 = floor((go+(or*pi/nthet)/2)/(or*pi/nthet)) + 1;

                if gs == 0
                    continue;
                end

                binx2 = binx1 + 1;
                biny2 = biny1 + 1;
                binz2 = binz1 + 1;

                x1 = (binx1-1.5)*cellpw + 0.5;
                y1 = (biny1-1.5)*cellph + 0.5;
                z1 = (binz1-1.5)*(or*pi/nthet);

                % trilinear interpolation.
                hist3dbig(biny1,binx1,binz1) =...
                    hist3dbig(biny1,binx1,binz1) + gs*gaussweight...
                    * (1-(jorbj-x1)/cellpw)*(1-(iorbi-y1)/cellph)...
                    *(1-(go-z1)/(or*pi/nthet));
                hist3dbig(biny1,binx1,binz2) =...
                    hist3dbig(biny1,binx1,binz2) + gs*gaussweight...
                    * (1-(jorbj-x1)/cellpw)*(1-(iorbi-y1)/cellph)...
                    *((go-z1)/(or*pi/nthet));
                hist3dbig(biny2,binx1,binz1) =...
                    hist3dbig(biny2,binx1,binz1) + gs*gaussweight...
                    * (1-(jorbj-x1)/cellpw)*((iorbi-y1)/cellph)...
                    *(1-(go-z1)/(or*pi/nthet));
                hist3dbig(biny2,binx1,binz2) =...
                    hist3dbig(biny2,binx1,binz2) + gs*gaussweight...
                    * (1-(jorbj-x1)/cellpw)*((iorbi-y1)/cellph)...
                    *((go-z1)/(or*pi/nthet));
                hist3dbig(biny1,binx2,binz1) =...
                    hist3dbig(biny1,binx2,binz1) + gs*gaussweight...
                    * ((jorbj-x1)/cellpw)*(1-(iorbi-y1)/cellph)...
                    *(1-(go-z1)/(or*pi/nthet));
                hist3dbig(biny1,binx2,binz2) =...
                    hist3dbig(biny1,binx2,binz2) + gs*gaussweight...
                    * ((jorbj-x1)/cellpw)*(1-(iorbi-y1)/cellph)...
                    *((go-z1)/(or*pi/nthet));
                hist3dbig(biny2,binx2,binz1) =...
                    hist3dbig(biny2,binx2,binz1) + gs*gaussweight...
                    * ((jorbj-x1)/cellpw)*((iorbi-y1)/cellph)...
                    *(1-(go-z1)/(or*pi/nthet));
                hist3dbig(biny2,binx2,binz2) =...
                    hist3dbig(biny2,binx2,binz2) + gs*gaussweight...
                    * ((jorbj-x1)/cellpw)*((iorbi-y1)/cellph)...
                    *((go-z1)/(or*pi/nthet));
            end
        end

        % In the local interpolate condition. F is generated in this block
        % slide loop. hist3dbig should be cleared in each loop.
        if glbalinter == 0
            if or == 2
                hist3dbig(:,:,2) = hist3dbig(:,:,2)...
                    + hist3dbig(:,:,nthet+2);
                hist3dbig(:,:,(nthet+1)) =...
                    hist3dbig(:,:,(nthet+1)) + hist3dbig(:,:,1);
            end
            hist3d = hist3dbig(2:(nblockh+1), 2:(nblockw+1), 2:(nthet+1));
            for ibin = 1:nblockh
                for jbin = 1:nblockw
                    idsF = nthet*((ibin-1)*nblockw+jbin-1)+1;
                    idsF = idsF:(idsF+nthet-1);
                    sF(idsF) = hist3d(ibin,jbin,:);
                end
            end
            iblock = ((btly-1)/ybstride)*ntotalbw +...
                ((btlx-1)/xbstride) + 1;
            idF = (iblock-1)*nblockw*nblockh*nthet+1;
            idF = idF:(idF+nblockw*nblockh*nthet-1);
            F(idF) = sF;
            hist3dbig(:,:,:) = 0;
        end
    end
end

% In the global interpolate condition. F is generated here outside the
% block slide loop
if glbalinter == 1
    ncellx = N / cellpw;
    ncelly = M / cellph;
    if or == 2
        hist3dbig(:,:,2) = hist3dbig(:,:,2) + hist3dbig(:,:,nthet+2);
        hist3dbig(:,:,(nthet+1)) = hist3dbig(:,:,(nthet+1)) + hist3dbig(:,:,1);
    end
    hist3d = hist3dbig(2:(ncelly+1), 2:(ncellx+1), 2:(nthet+1));

    iblock = 1;
    for btly = 1:ybstride:ybstridend
        for btlx = 1:xbstride:xbstridend
            binidx = floor((btlx-1)/cellpw)+1;
            binidy = floor((btly-1)/cellph)+1;
            idF = (iblock-1)*nblockw*nblockh*nthet+1;
            idF = idF:(idF+nblockw*nblockh*nthet-1);
            for ibin = 1:nblockh
                for jbin = 1:nblockw
                    idsF = nthet*((ibin-1)*nblockw+jbin-1)+1;
                    idsF = idsF:(idsF+nthet-1);
                    sF(idsF) = hist3d(binidy+ibin-1, binidx+jbin-1,: );
                end
            end
            F(idF) = sF;
            iblock = iblock + 1;
        end
    end
end

% adjust the negative value caused by accuracy of floating-point
% operations. these values' scale is very small, usually at E-03 magnitude
% while others will be E+02 or E+03 before normalization.
F(F<0) = 0;

% block normalization.
e = 0.001;
l2hysthreshold = 0.2;
fslidestep = nblockw*nblockh*nthet;
switch normmethod
    case 'none'
    case 'l1'
        for fi = 1:fslidestep:size(F,2)
            div = sum(F(fi:(fi+fslidestep-1)));
            F(fi:(fi+fslidestep-1)) = F(fi:(fi+fslidestep-1))/(div+e);
        end
    case 'l1sqrt'
        for fi = 1:fslidestep:size(F,2)
            div = sum(F(fi:(fi+fslidestep-1)));
            F(fi:(fi+fslidestep-1)) = sqrt(F(fi:(fi+fslidestep-1))/(div+e));
        end
    case 'l2'
        for fi = 1:fslidestep:size(F,2)
            sF = F(fi:(fi+fslidestep-1)).*F(fi:(fi+fslidestep-1));
            div = sqrt(sum(sF)+e*e);
            F(fi:(fi+fslidestep-1)) = F(fi:(fi+fslidestep-1))/div;
        end
    case 'l2hys'
        for fi = 1:fslidestep:size(F,2)
            sF = F(fi:(fi+fslidestep-1)).*F(fi:(fi+fslidestep-1));
            div = sqrt(sum(sF)+e*e);
            sF = F(fi:(fi+fslidestep-1))/div;
            sF(sF>l2hysthreshold) = l2hysthreshold;
            div = sqrt(sum(sF.*sF)+e*e);
            F(fi:(fi+fslidestep-1)) = sF/div;
        end
    otherwise
        error('Incorrect NORMMETHOD parameter.');
end

% the following code, which can be removed because of having no
% contributions to HOG feature calculation, are just for result
% demonstration when the trilinear interpolation is 'global' for this
% condition is easier to give a simple and intuitive illustration. in
% 'local' condition hist3d only covers one block, so the demonstration
% is skipped there instead of producing an error.
if glbalinter == 1
    figure;
    hold on;
    axis equal;
    xlim([0, N]);
    ylim([0, M]);
    for u = 1:(M/cellph)
        for v = 1:(N/cellpw)
            cx = (v-1)*cellpw + cellpw/2 + 0.5;
            cy = (u-1)*cellph + cellph/2 + 0.5;
            hist3d(u,v,:)=0.9*min(cellpw,cellph)*hist3d(u,v,:)/max(hist3d(u,v,:));
            for t = 1:nthet
                s = hist3d(u,v,t);
                thet = (t-1)*pi/nthet + pi*0.5/nthet;
                x1 = cx - s*0.5*cos(thet);
                x2 = cx + s*0.5*cos(thet);
                y1 = cy - s*0.5*sin(thet);
                y2 = cy + s*0.5*sin(thet);
                plot([x1,x2],[M-y1+1,M-y2+1]);
            end
        end
    end
end
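
The length of the feature vector F follows directly from the block layout. As a quick check, the short script below repeats the block-count arithmetic of the listing for the 128×64 window and the default parameters named in its header comment. It is a standalone sketch: the variable names mirror the listing, and `descriptorLength` is a name introduced here for illustration.

```matlab
% Descriptor length for a 128x64 detection window, using the same
% block-count formulas as the hogcalculator listing.
M = 128; N = 64;            % window height and width in pixels
cellph = 8; cellpw = 8;     % cell size in pixels
nblockh = 2; nblockw = 2;   % block size in cells
nthet = 9;                  % orientation bins per cell
overlap = 0.5;              % overlap between neighboring blocks

ybstride = cellph*nblockh*(1-overlap);   % vertical block step in pixels
xbstride = cellpw*nblockw*(1-overlap);   % horizontal block step in pixels
ntotalbh = floor((M - cellph*nblockh)/ybstride) + 1;   % blocks per column
ntotalbw = floor((N - cellpw*nblockw)/xbstride) + 1;   % blocks per row

% every block contributes nblockh*nblockw cell histograms of nthet bins.
descriptorLength = ntotalbh*ntotalbw*nblockh*nblockw*nthet;

fprintf('%d x %d blocks, descriptor length %d\n', ...
    ntotalbh, ntotalbw, descriptorLength);
% prints: 15 x 7 blocks, descriptor length 3780
```

Because neighboring blocks overlap by half, each interior cell appears in four blocks, which is why the vector is much longer than the 16×8 cells times 9 bins would suggest.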
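
The gradient computation step described in the article can be tried on its own with a few lines. The sketch below applies the centered masks [-1, 0, 1] and its transpose to a tiny made-up image. It uses `conv2` from base MATLAB rather than the `imfilter` call of the listing, so that no toolbox is needed; that substitution, the test image, and the use of degrees are assumptions of this example, not part of the original code.

```matlab
% Centered 1-D derivative masks applied to a small grayscale test image.
img = repmat([0 0 0 10 10 10], 5, 1);   % 5x6 vertical step edge (made-up data)

hx = [-1 0 1];      % horizontal derivative mask
hy = hx';           % vertical derivative mask

% conv2 flips its kernel, so the masks are flipped first to get correlation:
%   gx(r,c) = img(r,c+1) - img(r,c-1)
%   gy(r,c) = img(r+1,c) - img(r-1,c)
% 'same' pads with zeros, which produces artificial gradients on the border.
gx = conv2(img, fliplr(hx), 'same');
gy = conv2(img, flipud(hy), 'same');

mag = sqrt(gx.^2 + gy.^2);                 % gradient magnitude
theta = mod(atan2(gy, gx)*180/pi, 180);    % unsigned orientation in degrees

disp(gx(3,:));      % middle row: 0 0 10 10 0 -10
disp(mag(3,4));     % 10
disp(theta(3,3));   % 0, the gradient points along the x axis
```

The last entry of the middle row (-10) comes from the zero padding, not from the image content. In a detector this is usually avoided by computing gradients on a window slightly larger than the one being described.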
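
The weighted voting described in the section on orientation histograms can be illustrated for a single cell. The sketch below uses 9 unsigned bins and splits each vote linearly between the two nearest bin centers, which is the orientation part of the trilinear interpolation in the listing. The pixel data and the names `cellHist`, `binWidth`, and `pos` are made up for this example. Wrapping votes around the ends of the 0° to 180° range is an assumption of the sketch; the listing only wraps in the 'signed' case.

```matlab
% Magnitude-weighted orientation voting for one cell (unsigned, 9 bins).
nbins = 9;
binWidth = 180/nbins;             % 20 degrees per bin, centers at 10, 30, ..., 170
theta = [10 20 45 170 179];       % pixel orientations in degrees (made-up data)
mag   = [ 1  2  4   3   1];       % pixel gradient magnitudes (made-up data)

cellHist = zeros(1, nbins);
for p = 1:numel(theta)
    pos  = theta(p)/binWidth - 0.5;   % position measured in bins from the first center
    lo   = floor(pos);                % zero-based index of the lower neighboring bin
    frac = pos - lo;                  % share of the vote given to the upper bin
    k1 = mod(lo, nbins) + 1;          % lower bin, wrapped into 1..nbins
    k2 = mod(lo + 1, nbins) + 1;      % upper bin, wrapped into 1..nbins
    cellHist(k1) = cellHist(k1) + mag(p)*(1 - frac);
    cellHist(k2) = cellHist(k2) + mag(p)*frac;
end

disp(cellHist);
% the votes only get redistributed, so the total weight is preserved.
disp(sum(cellHist) - sum(mag));
```

A pixel at 20° lies exactly between the centers of the first two bins and gives half of its magnitude to each, while the pixel at 179° shares its vote between the last bin and the first one.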
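
The four block normalization schemes are easier to compare when applied to the same vector. The sketch below does this for one made-up, non-negative block vector `v`, using the constants e = 0.001 and a clipping threshold of 0.2 taken from the listing. The names `l2`, `l1`, `l1sqrt`, `l2hys`, and `clipValue` are introduced here for illustration.

```matlab
% Four block normalization schemes applied to one unnormalized block vector.
v = [3 4 0 0];          % made-up block histogram vector (non-negative)
e = 0.001;              % small constant, same value as in the listing
clipValue = 0.2;        % L2-Hys clipping threshold, same value as in the listing

l2     = v / sqrt(sum(v.^2) + e^2);        % L2-norm
l1     = v / (sum(abs(v)) + e);            % L1-norm
l1sqrt = sqrt(v / (sum(abs(v)) + e));      % L1-sqrt

% L2-Hys: L2-norm, clip large components, then renormalize.
l2hys = min(l2, clipValue);
l2hys = l2hys / sqrt(sum(l2hys.^2) + e^2);

disp([l2; l1; l1sqrt; l2hys]);
```

With this input the L2-norm keeps the 3:4 ratio between the two nonzero components, whereas L2-Hys clips both to the same value, so they end up equal after renormalization. That flattening of dominant gradients is the purpose of the clipping step.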