OpenCV Study Notes, Getting Started (22): A Fast HOG Implementation with Reduced Sampling (Paper Notes)

wobuaishangdiao

Article link: https://blog.csdn.net/wobuaishangdiao/article/details/7751598

Reading notes on [Pang, 2011]: Y. Pang, Y. Yuan, X. Li, et al. Efficient HOG human detection [J]. Signal Processing, 2011, 91: 773-781.

We call this method cell-based trilinear interpolation.

To reduce the computational complexity, we develop a sub-cell-based trilinear interpolation.

The proposed method avoids unimportant interpolation by omitting the gradients whose locations are far from the cell (sub-cell) concerned. Specifically, each cell (8×8 pixels) in a block (16×16 pixels) is divided into 4 sub-cells of 4×4 pixels each. We classify the sub-cells into 3 types: corner sub-cells, inner sub-cells, and semi-inner sub-cells. As illustrated in Fig. 4, a block contains 4 cells, C₁, C₂, C₃, and C₄ (see Fig. 4(a)), and each cell C_i contains 4 sub-cells, C_i1, C_i2, C_i3, and C_i4 (see Fig. 4(b)). Note from Fig. 4(c) that C₁₁, C₂₂, C₃₃, and C₄₄ are called corner sub-cells; C₁₄, C₂₃, C₄₁, and C₃₂ are called inner sub-cells; and C₁₂, C₂₁, C₂₄, C₄₂, C₄₃, C₃₄, C₃₁, and C₁₃ are called semi-inner sub-cells. In our algorithm the 3 types of sub-cells play different roles in computing the HOG features:

(1) Because the corner sub-cells are at the corners of a block and are far from the other 3 cells, the gradients in the corner sub-cells are used only to compute the histograms of their own cells. That is, the gradients in C₁₁, C₂₂, C₃₃, and C₄₄ contribute only to the histograms of C₁, C₂, C₃, and C₄, respectively.

(2) Because the inner sub-cells are near all 4 cells, the gradients in the inner sub-cells contribute to the histograms of all 4 cells.

(3) Each semi-inner sub-cell in a cell is a neighbor of exactly one other cell. Therefore, the gradients in each semi-inner sub-cell are involved in computing the histograms of its own cell and its neighbor cell. Take the semi-inner sub-cell C₁₃ for example: C₁₃ is contained in C₁ and is a neighbor of C₃, so the gradients in C₁₃ are used for computing the histograms of C₁ and C₃ but do not affect the histograms of C₂ and C₄.

Fig. 4. Sub-cell based representation: (a) a 16×16 block consisting of four 8×8 cells: C₁, C₂, C₃, and C₄. (b) Each 8×8 cell C_i is decomposed into 4 sub-cells: C_i1, C_i2, C_i3, and C_i4, which are of 3 types: corner sub-cell, inner sub-cell, and semi-inner sub-cell.
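The three rules can be sketched in code. This is only an illustration under an assumed numbering: cells and sub-cells are taken to be numbered 1 to 4 in row-major order (1 = top-left, 2 = top-right, 3 = bottom-left, 4 = bottom-right), which agrees with the sub-cell lists given in the text. The names `subCellPos`, `classify`, and `contributesTo` are hypothetical and do not come from the paper.

```cpp
#include <cassert>
#include <vector>

enum class SubCellType { Corner, SemiInner, Inner };

// Position of sub-cell C_{cell,sub} in the 4x4 grid of sub-cells of one block.
// Assumption: cells and sub-cells are numbered 1..4 in row-major order.
struct GridPos { int row; int col; };

GridPos subCellPos(int cell, int sub) {
    assert(cell >= 1 && cell <= 4 && sub >= 1 && sub <= 4);
    int row = 2 * ((cell - 1) / 2) + (sub - 1) / 2;
    int col = 2 * ((cell - 1) % 2) + (sub - 1) % 2;
    return {row, col};
}

// Rows/columns 1 and 2 of the 4x4 grid touch the block's centre lines.
SubCellType classify(int cell, int sub) {
    GridPos p = subCellPos(cell, sub);
    bool rowInner = (p.row == 1 || p.row == 2);
    bool colInner = (p.col == 1 || p.col == 2);
    if (rowInner && colInner) return SubCellType::Inner;
    if (!rowInner && !colInner) return SubCellType::Corner;
    return SubCellType::SemiInner;
}

// Cells (1..4) whose histograms receive the gradients of sub-cell C_{cell,sub}.
std::vector<int> contributesTo(int cell, int sub) {
    GridPos p = subCellPos(cell, sub);
    int cellRow = (cell - 1) / 2;
    int cellCol = (cell - 1) % 2;
    bool rowInner = (p.row == 1 || p.row == 2);
    bool colInner = (p.col == 1 || p.col == 2);
    std::vector<int> out;
    for (int r = 0; r < 2; ++r) {
        for (int c = 0; c < 2; ++c) {
            // A cell is reached if it is the sub-cell's own cell along this
            // axis, or if the sub-cell borders the other cell along this axis.
            bool rowOk = (r == cellRow) || rowInner;
            bool colOk = (c == cellCol) || colInner;
            if (rowOk && colOk) out.push_back(2 * r + c + 1);
        }
    }
    return out;
}
```

For instance, `contributesTo(1, 3)` yields cells 1 and 3, matching the C₁₃ example in rule (3), while `contributesTo(1, 4)` yields all four cells because C₁₄ is an inner sub-cell.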

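Let h₁ denote the HOG features of cell C₁. Loosely speaking, h₁ is computed from the terms hist(C_ij) of the sub-cells that rules (1)-(3) assign to C₁: its own four sub-cells C₁₁, C₁₂, C₁₃, and C₁₄, the semi-inner neighbors C₂₁ and C₃₁, and the inner sub-cells C₂₃, C₃₂, and C₄₁.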

3.3. Efficient implementation of pixel based interpolation

Now the question is how to compute the term hist(C_ij) in Eq. (6). A straightforward way is to use Eqs. (1) and (2) to calculate each bin of hist(C_ij). According to Eqs. (1) and (2), the contribution of the gradient at location (x,y) to orientation θ₂ is given by

[Eq. (8)]

where the coefficients a, b, and c are given by

[Eqs. (9)-(11)]

Generally, it is time-consuming to compute Eq. (8) because the coefficients have to be calculated for every gradient.
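For orientation, the usual trilinear weighting used in HOG has the shape below. This is a sketch under the assumption that Eqs. (8)-(11) follow that standard form; the symbols x₁, y₁, θ₁ (lower cell center and lower bin center) and d_x, d_y, d_θ (cell spacing and bin width) are introduced here for illustration and are not taken from these notes.

```latex
\begin{align*}
h(x_1, y_1, \theta_2) &\leftarrow h(x_1, y_1, \theta_2) + a\, b\, c\, \lvert \nabla f(x,y) \rvert, \\
a &= 1 - \frac{x - x_1}{d_x}, \qquad
b = 1 - \frac{y - y_1}{d_y}, \qquad
c = \frac{\theta(x,y) - \theta_1}{d_\theta}.
\end{align*}
```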

To speed up the computation, we propose to compute the coefficients a and b offline and save them in a look-up table. Given (x,y), the coefficients a and b are then read from the look-up table. Note that the coefficient c cannot be obtained with the look-up-table trick because the value of θ(x,y) is continuous rather than discrete. Using the look-up table, computing Eq. (8) requires only 4 multiplications and 1 addition.
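The look-up-table idea can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes the spatial coefficients take the usual linear form 1 − d/cellSize for a pixel at integer offset d from the lower cell center, and the names `buildSpatialLut` and `upperBinContribution` are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Precompute the spatial coefficient for every integer offset d in
// [0, cellSize) from the lower cell centre. Assumed form: 1 - d / cellSize.
std::vector<float> buildSpatialLut(int cellSize) {
    assert(cellSize > 0);
    std::vector<float> lut(cellSize);
    for (int d = 0; d < cellSize; ++d) {
        lut[d] = 1.0f - static_cast<float>(d) / cellSize;
    }
    return lut;
}

// Contribution of one gradient to the upper orientation bin.
// a and b are read from the table; the orientation coefficient
// (theta - theta1) * invBinWidth is computed per pixel because theta is
// continuous. Cost: 4 multiplications and 1 subtraction.
float upperBinContribution(float magnitude, float a, float b,
                           float theta, float theta1, float invBinWidth) {
    return magnitude * a * b * (theta - theta1) * invBinWidth;
}
```

With an 8-pixel cell, `buildSpatialLut(8)[4]` is 0.5, so a pixel halfway between two cell centers splits its vote evenly along that axis.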

It is worth noting that Wang et al. [27] proposed to estimate Eq. (8) by convolving the gradients at (x,y) with the following 7×7 kernel:

[Eq. (12)]

Throughout this paper we call this method the “convolution method” [27]. The convolution takes 50 multiplications and 49 additions. Even with the fast Fourier transform (FFT), the process takes more operations than our proposed look-up-table method. However, the convolution method can be combined with the integral image trick [16], so it is faster than the traditional HOG method [10].
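The integral image trick mentioned above can be illustrated with a summed-area table. The sketch below is generic and is not the implementation of [16] or [27]: it only shows that, once the table is built, the sum over any rectangle takes four look-ups. For HOG one would presumably keep one such table per orientation bin; that detail and the function names are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<double>>;

// Summed-area table with one extra row and column of zeros:
// sat[r][c] = sum of img over rows [0, r) and columns [0, c).
Grid buildIntegral(const Grid& img) {
    const std::size_t rows = img.size();
    const std::size_t cols = rows ? img[0].size() : 0;
    Grid sat(rows + 1, std::vector<double>(cols + 1, 0.0));
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            sat[r + 1][c + 1] = img[r][c] + sat[r][c + 1] + sat[r + 1][c] - sat[r][c];
        }
    }
    return sat;
}

// Sum of img over rows [r0, r1) and columns [c0, c1), using four look-ups.
double rectSum(const Grid& sat, std::size_t r0, std::size_t c0,
               std::size_t r1, std::size_t c1) {
    assert(r0 <= r1 && c0 <= c1);
    return sat[r1][c1] - sat[r0][c1] - sat[r1][c0] + sat[r0][c0];
}
```

The cost of `rectSum` does not depend on the rectangle size, which is what makes per-cell histogram sums cheap once the tables exist.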

Trilinear interpolation implementation code:

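// Trilinear interpolation, written for the fast implementation
// btl: top-left coordinate of the block; votes: block histogram, accumulated with +=
void calHogBlock(Mat& grad_ang, Point btl, float *votes)
{
	static double *gaussmask = getGaussMask();
	
	int x = btl.x;   // used below as the row offset
	int y = btl.y;   // used below as the column offset
	int index_x, index_y, index_z;
	for (int i = 0; i < parm.block * parm.cell; i++)  // side length of a block in pixels
	{
		for (index_x = 0; index_x < 2; index_x++)     // cell index along x: 0, 1 (between the two cell centers) or 2
		{
			if ((i - parm.mid[index_x]) < 0)          // stop at the first cell center that lies beyond i
				break;
		}
		float *gpd = grad_ang.ptr<float>(x+i);        // row x+i of the gradient image
		double *spd = gaussmask + i*16; // i*parm.block*parm.cell: first element of row i of the Gaussian mask
		for (int j = 0; j < parm.block * parm.cell; j++)  
		{
			float temp = gpd[(j + y) * 2];  // gradient magnitude
			float angle = gpd[(j + y) * 2 + 1];   // gradient angle; *2 because the image has two channels
			for (index_y = 0; index_y < 2; index_y++)  // cell index along y, found the same way as index_x
			{
				if ((j - parm.mid[index_y]) < 0)
					break;
			}
			for (index_z = 0; index_z < DIR; index_z++)  // orientation bin
			{
				if ((angle - dirs[index_z]) < 0)    // find the interval that contains angle, then vote into its two ends
					break;   // index_z is the first entry of dirs larger than angle, i.e. the upper end of the interval
			}

			float weight = float(spd[j] * temp);  // Gaussian-weighted magnitude
			// Share of the lower cell: 1 at the lower cell center, falling to 0 at the next center
			float x_interpolate = 1.0f - (float)(i-parm.mid[0]) / parm.cell;
			float y_interpolate = 1.0f - (float)(j-parm.mid[0]) / parm.cell;
			// Share of the upper bin index_z. Not evaluated when index_z == DIR,
			// because dirs is assumed to hold only DIR entries.
			float z_interpolate = (index_z < DIR) ? float(1.0 - (dirs[index_z] - angle) * DIR / PI) : 0.0f;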

			if (index_x == 0)   
			{
				if (index_y == 0)
				{
					if (index_z == 0)
					{
						votes[0] += weight;     //
					}
					else if(index_z == 9)
					{
						votes[8] += weight;
					}
					else
					{	
						votes[index_z-1] += weight * (1-z_interpolate); 
						votes[index_z] += weight * z_interpolate;
					}
				}
				else if (index_y == 1)
				{
					if (index_z == 0)
					{
						votes[0] += weight * y_interpolate;
						votes[DIR] += weight * (1 - y_interpolate);
					}
					else if(index_z == 9)
					{
						votes[8] += weight * y_interpolate;
						votes[DIR+8] += weight * (1 - y_interpolate);
					}
					else
					{
						votes[index_z-1] += weight * y_interpolate * (1-z_interpolate);
						votes[index_z] += weight * y_interpolate * z_interpolate;
						votes[DIR+index_z-1] += weight * (1 - y_interpolate) * (1 - z_interpolate);
						votes[DIR+index_z] += weight * (1 - y_interpolate) * z_interpolate;
					}
				}
				else   //index_y = 2
				{
					if (index_z == 0)
					{
						votes[DIR] += weight;
					}
					else if(index_z == 9)
					{
						votes[DIR+8] += weight;
					}
					else
					{
						votes[DIR+index_z-1] += weight * (1 - z_interpolate);
						votes[DIR+index_z] += weight * z_interpolate;
					}
				}
			}
			else if(index_x == 1)
			{
				if (index_y == 0)
				{
					if (index_z == 0)
					{
						votes[0] += weight * x_interpolate;
						votes[parm.block * DIR] += weight * (1 - x_interpolate);
					}
					else if(index_z == 9)
					{
						votes[8] += weight * x_interpolate;
						votes[parm.block * DIR + 8] += weight * (1 - x_interpolate);
					}
					else
					{
						votes[index_z-1] += weight * x_interpolate * (1-z_interpolate);
						votes[index_z] += weight * x_interpolate * z_interpolate;
						votes[parm.block * DIR + index_z-1] += weight * (1-x_interpolate) * (1-z_interpolate);
						votes[parm.block * DIR + index_z] += weight * (1-x_interpolate) * z_interpolate;
					}
				}
				else if (index_y == 1)
				{
					if (index_z == 0)
					{
						votes[0] += weight * x_interpolate * y_interpolate;
						votes[DIR] += weight * x_interpolate * (1-y_interpolate);
						votes[parm.block * DIR] += weight * (1-x_interpolate) * y_interpolate;
						votes[parm.block * DIR + DIR] += weight * (1-x_interpolate) * (1-y_interpolate);
					}
					else if(index_z == 9)
					{
						votes[8] += weight * x_interpolate * y_interpolate;
						votes[DIR + 8] += weight * x_interpolate * (1-y_interpolate);
						votes[parm.block * DIR + 8] += weight * (1-x_interpolate) * y_interpolate;
						votes[parm.block * DIR + DIR + 8] += weight * (1-x_interpolate) * (1-y_interpolate);
					}
					else
					{
						votes[index_z-1] += weight * x_interpolate * y_interpolate * (1-z_interpolate);
						votes[index_z] += weight * x_interpolate * y_interpolate * z_interpolate;
						votes[DIR+index_z-1] += weight * x_interpolate * (1-y_interpolate) * (1-z_interpolate);
						votes[DIR+index_z] += weight * x_interpolate * (1-y_interpolate) * z_interpolate;
						votes[parm.block * DIR+index_z-1] += weight * (1-x_interpolate) * y_interpolate * (1-z_interpolate);
						votes[parm.block * DIR+index_z] += weight * (1-x_interpolate) * y_interpolate * z_interpolate;
						votes[parm.block * DIR+DIR+index_z-1] += weight * (1-x_interpolate) * (1-y_interpolate) * (1-z_interpolate);
						votes[parm.block * DIR+DIR+index_z] += weight * (1-x_interpolate) * (1-y_interpolate) * z_interpolate;
					}
				}
				else
				{
					if (index_z == 0)
					{
						votes[DIR] += weight * x_interpolate;
						votes[parm.block * DIR + DIR] += weight * (1-x_interpolate);
					}
					else if(index_z == 9)
					{
						votes[8+DIR] += weight * x_interpolate;
						votes[parm.block * DIR+ DIR+8] += weight * (1-x_interpolate);
					}
					else
					{
						votes[DIR+index_z-1] += weight * x_interpolate * (1-z_interpolate);
						votes[DIR+index_z] += weight * x_interpolate * z_interpolate;
						votes[parm.block * DIR+DIR+index_z-1] += weight * (1-x_interpolate) * (1-z_interpolate);
						votes[parm.block * DIR+DIR+index_z] += weight * (1-x_interpolate) * z_interpolate;
					}
				}
			}
			else
			{
				if (index_y == 0)
				{
					if (index_z == 0)
					{
						votes[parm.block * DIR] += weight;
					}
					else if(index_z == 9)
					{
						votes[parm.block * DIR+8] += weight;
					}
					else
					{
						votes[parm.block * DIR+index_z-1] += weight * (1-z_interpolate);
						votes[parm.block * DIR+index_z] += weight * z_interpolate;
					}
				}
				else if (index_y == 1)
				{
					if (index_z == 0)
					{
						votes[parm.block * DIR] += weight * y_interpolate;
						votes[parm.block * DIR + DIR] += weight * (1-y_interpolate);
					}
					else if(index_z == 9)
					{
						votes[parm.block*DIR+8] += weight * y_interpolate;
						votes[parm.block*DIR+DIR+8] += weight * (1-y_interpolate);
					}
					else
					{
						votes[parm.block*DIR+index_z-1] += weight * y_interpolate * (1-z_interpolate);
						votes[parm.block*DIR+index_z] += weight * y_interpolate * z_interpolate;
						votes[parm.block*DIR+DIR+index_z-1] += weight * (1-y_interpolate) * (1-z_interpolate);
						votes[parm.block*DIR+DIR+index_z] += weight * (1-y_interpolate) * z_interpolate;
					}
				}
				else
				{
					if (index_z == 0)
					{
						votes[parm.block*DIR+DIR] += weight;
					}
					else if(index_z == 9)
					{
						votes[parm.block*DIR+DIR+8] += weight;
					}
					else
					{
						votes[parm.block*DIR+DIR+index_z-1] += weight * (1-z_interpolate);
						votes[parm.block*DIR+DIR+index_z] += weight * z_interpolate;
					}
				}
			}
		}
	}
	
	
	normL2(votes);
	
}
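The 27-way branch cascade can be expressed more compactly by splitting each axis once and looping over the resulting 2×2×2 targets. The sketch below is not the author's code. It assumes 8-pixel cells, 2×2 cells per block, and 9 bins over [0, π), with cell centers at 4 and 12 and bin centers at (k + 0.5)·π/9, and it gives the nearer center the larger share. Gaussian weighting and L2 normalization are left out, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kCell = 8;    // pixels per cell side (assumed)
constexpr int kBlock = 2;   // cells per block side (assumed)
constexpr int kBins = 9;    // orientation bins over [0, pi) (assumed)
constexpr float kPi = 3.14159265f;

// Two neighbouring indices along one axis and the share given to the lower one.
struct AxisSplit { int lo; int hi; float wLo; };

// Linear split between equally spaced centres, clamped at both ends so that
// positions outside the outermost centres vote for a single index.
AxisSplit splitAxis(float pos, float firstCenter, float spacing, int count) {
    float t = (pos - firstCenter) / spacing;
    if (t < 0.0f) return {0, 0, 1.0f};
    if (t >= static_cast<float>(count - 1)) return {count - 1, count - 1, 1.0f};
    int lo = static_cast<int>(t);
    return {lo, lo + 1, 1.0f - (t - static_cast<float>(lo))};
}

// Distribute one weighted gradient over up to 2 x 2 x 2 histogram entries.
// Layout of votes: (cellX * kBlock + cellY) * kBins + bin.
void voteTrilinear(int i, int j, float angle, float weight, std::vector<float>& votes) {
    assert(votes.size() == static_cast<std::size_t>(kBlock * kBlock * kBins));
    AxisSplit sx = splitAxis(static_cast<float>(i), kCell / 2.0f, kCell, kBlock);
    AxisSplit sy = splitAxis(static_cast<float>(j), kCell / 2.0f, kCell, kBlock);
    AxisSplit sz = splitAxis(angle, kPi / (2.0f * kBins), kPi / kBins, kBins);
    for (int a = 0; a < 2; ++a) {
        for (int b = 0; b < 2; ++b) {
            for (int c = 0; c < 2; ++c) {
                int cx = a ? sx.hi : sx.lo;
                int cy = b ? sy.hi : sy.lo;
                int bz = c ? sz.hi : sz.lo;
                float w = (a ? 1.0f - sx.wLo : sx.wLo)
                        * (b ? 1.0f - sy.wLo : sy.wLo)
                        * (c ? 1.0f - sz.wLo : sz.wLo);
                votes[(cx * kBlock + cy) * kBins + bz] += weight * w;
            }
        }
    }
}
```

Because the shares along each axis add up to 1, the votes written for one pixel always sum to its weight, which is a convenient property to check when testing such code.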