OpenCV Study Notes, Getting Started (22): Fast HOG via Reduced Sampling (Paper Notes)

[Pang, 2011] Y. Pang, Y. Yuan, X. Li, et al. Efficient HOG human detection [J]. Signal Processing, 2011, 91: 773-781. Reading notes

We call this method cell-based trilinear interpolation.


To decrease the computational complexity, we develop a sub-cell-based trilinear interpolation.

The proposed method avoids unimportant interpolation by omitting the gradients whose locations are far from the cell (sub-cell) concerned. Specifically, each cell (8×8 pixels) in a block (16×16 pixels) is divided into 4 sub-cells (each sub-cell has 4×4 pixels). We classify the sub-cells into 3 types: corner sub-cells, inner sub-cells, and semi-inner sub-cells. As illustrated in Fig. 4, a block contains 4 cells: C1, C2, C3, and C4 (see Fig. 4(a)), and each Ci contains 4 sub-cells: Ci1, Ci2, Ci3, and Ci4 (see Fig. 4(b)). Note from Fig. 4(c) that C11, C22, C33, and C44 are called corner sub-cells; C14, C23, C41, and C32 are called inner sub-cells; and C12, C21, C24, C42, C43, C34, C31, and C13 are called semi-inner sub-cells. In our algorithm the 3 types of sub-cells play different roles in computing the HOG features:

(1) Because the corner sub-cells are at the corners of a block and are far from the other 3 cells, the gradients in the corner sub-cells are used only to compute the histograms of their own cells. That is, the gradients in C11, C22, C33, and C44 contribute only to the histograms of C1, C2, C3, and C4, respectively.

(2) Because the inner sub-cells are near all 4 cells, we let the gradients in the inner sub-cells contribute to the histograms of all 4 cells.

(3) Each semi-inner sub-cell in a cell is a neighbor of exactly one other cell. Therefore, the gradients in each semi-inner sub-cell are involved in computing the histograms of its own cell and its neighbor cell. Take the semi-inner sub-cell C13 for example: C13 is contained in C1 and is a neighbor of C3. So the gradients in C13 are used for computing the histograms of C1 and C3, but they do not affect the histograms of C2 and C4.
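The three rules above can be read as a per-axis test on the 4×4 grid of sub-cells inside a block. The following is a minimal sketch of that reading, not code from the paper: the helper name `contributingCells` and the index convention (0 to 3 for C1 to C4, with C1 top-left, C2 top-right, C3 bottom-left, C4 bottom-right) are assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: for a pixel at (row, col) inside a 16x16 block, return
// the indices (0..3 for C1..C4) of the cells whose histograms receive its vote.
std::vector<int> contributingCells(int row, int col)
{
    assert(row >= 0 && row < 16 && col >= 0 && col < 16);
    const int subRow = row / 4;  // row in the 4x4 grid of sub-cells, 0..3
    const int subCol = col / 4;  // column in the 4x4 grid of sub-cells, 0..3

    // Along one axis, an outer sub-cell (0 or 3) touches only its own cell,
    // while a middle sub-cell (1 or 2) touches both cells on that axis.
    auto cellsAlongAxis = [](int sub) -> std::vector<int> {
        if (sub == 0) return {0};
        if (sub == 3) return {1};
        return {0, 1};
    };

    std::vector<int> cells;
    for (int cellRow : cellsAlongAxis(subRow))
        for (int cellCol : cellsAlongAxis(subCol))
            cells.push_back(cellRow * 2 + cellCol);  // 0=C1, 1=C2, 2=C3, 3=C4
    return cells;
}
```

Under this convention a corner sub-cell yields one cell, a semi-inner sub-cell yields two, and an inner sub-cell yields all four, which matches rules (1) to (3).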

Fig. 4. Sub-cell based representation: (a) a 16×16 block consisting of four 8×8 cells: C1, C2, C3, and C4. (b) Each 8×8 cell Ci is decomposed into 4 sub-cells: Ci1, Ci2, Ci3, and Ci4, which have 3 types: corner sub-cell, inner sub-cell, and semi-inner sub-cell.



Let h1 denote the HOG features of cell C1. Loosely speaking, the computation of h1 is given by
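Applying rules (1) to (3) literally, h1 collects votes from all four sub-cells of C1, from the inner sub-cells of the other three cells (C23, C32, C41), and from the two semi-inner sub-cells that border C1 (C21 and C31). The expression below is a reconstruction from those rules only, not the paper's typeset equation, and it leaves out the interpolation weights:

```latex
h_1 = \mathrm{hist}(C_{11}) + \mathrm{hist}(C_{12}) + \mathrm{hist}(C_{13}) + \mathrm{hist}(C_{14})
    + \mathrm{hist}(C_{21}) + \mathrm{hist}(C_{23})
    + \mathrm{hist}(C_{31}) + \mathrm{hist}(C_{32})
    + \mathrm{hist}(C_{41})
```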




3.3. Efficient implementation of pixel based interpolation

Now the question is how to compute the term hist(Cij) of Eq. (6). A straightforward way is to use Eqs. (1) and (2) to calculate each bin of hist(Cij). According to Eqs. (1) and (2), the contribution of the gradient at location (x,y) to orientation θ2 is given by

(8)
where the coefficients are
(9)
(10)
(11)

Generally, it is time-consuming to compute Eq. (8) because it has to calculate the coefficients.

To speed up the computation, we propose to compute the coefficients a and b offline and save them in a look-up table. Given (x,y), the coefficients a and b can then be read from the look-up table. Note that the coefficient c cannot be obtained with the look-up-table trick because the value of θ(x,y) is continuous rather than discrete. Using the look-up table, computing Eq. (8) requires only 4 multiplications and 1 addition.
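The look-up-table idea can be sketched as follows. This is only an illustration under stated assumptions: a and b are taken to be ordinary linear-interpolation weights along each axis, the first cell centre is assumed to sit at offset 3.5 within the block, and the names `buildAxisWeights` and `voteForFirstCell` are hypothetical. The paper's own definitions of the coefficients are not reproduced in these notes, so the exact values may differ.

```cpp
#include <array>
#include <cassert>

constexpr int kBlock = 16;             // block side in pixels (from the text)
constexpr int kCell = 8;               // cell side in pixels (from the text)
constexpr float kFirstCentre = 3.5f;   // ASSUMED centre of the first cell on an axis

// Built once, offline: weight of the first cell along one axis for each
// pixel offset 0..15. The same table serves both x (a) and y (b).
std::array<float, kBlock> buildAxisWeights()
{
    std::array<float, kBlock> lut{};
    for (int p = 0; p < kBlock; ++p) {
        float w = 1.0f - (static_cast<float>(p) - kFirstCentre) / kCell;
        if (w > 1.0f) w = 1.0f;  // before the first centre: full weight
        if (w < 0.0f) w = 0.0f;  // past the second centre: no weight
        lut[p] = w;
    }
    return lut;
}

// Vote of the pixel at (x, y) for the first (top-left) cell. The orientation
// coefficient c depends on the continuous angle, so it is passed in rather
// than tabulated.
float voteForFirstCell(const std::array<float, kBlock>& lut,
                       int x, int y, float magnitude, float c)
{
    assert(x >= 0 && x < kBlock && y >= 0 && y < kBlock);
    const float a = lut[x];  // table look-up replaces the per-pixel arithmetic
    const float b = lut[y];
    return a * b * c * magnitude;
}
```

In use, the table would be built once before scanning the image and then shared by every block, so the per-pixel cost reduces to two reads and a few multiplications.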

It is worth noting that Wang et al. [27] proposed to estimate Eq. (8) by convolving the gradients at (x,y) with the following 7 by 7 kernel:

(12)

Throughout this paper we call this method the “convolution method” [27]. The convolution takes 50 multiplications and 49 additions. Even with the fast Fourier transform (FFT), the process takes more operations than our proposed look-up-table method. But the convolution method can be combined with the integral image trick [16], so it is faster than the traditional HOG method [10].


Trilinear interpolation implementation code:

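// Trilinear interpolation, written for the fast implementation.
// btl is the top-left corner of the block; this code uses btl.x as the row and btl.y as the column.
void calHogBlock(Mat& grad_ang, Point btl, float *votes)
{
	static double *gaussmask = getGaussMask();
	
	int x = btl.x;
	int y = btl.y;
	int index_x, index_y, index_z;
	for (int i = 0; i < parm.block * parm.cell; i++)  // side length of one block in pixels
	{
		for (index_x = 0; index_x < 2; index_x++)     // position of row i relative to parm.mid
		{
			if ((i - parm.mid[index_x]) < 0)          // 0: before mid[0], 1: between mid[0] and mid[1], 2: after mid[1]
				break;
		}
		float *gpd = grad_ang.ptr<float>(x+i);        // row i of the block in the 2-channel gradient image
		double *spd = gaussmask + i*16; // i*parm.block*parm.cell; // first Gaussian weight of this row
		for (int j = 0; j < parm.block * parm.cell; j++)  
		{
			float temp = gpd[(j + y) * 2];  // gradient magnitude (channel 0)
			float angle = gpd[(j + y) * 2 + 1];   // gradient angle (channel 1); *2 because there are 2 channels
			for (index_y = 0; index_y < 2; index_y++)  // position of column j relative to parm.mid
			{
				if ((j - parm.mid[index_y]) < 0)
					break;
			}
			for (index_z = 0; index_z < DIR; index_z++)  // orientation bin
			{
				if ((angle - dirs[index_z]) < 0)    // find the interval that contains the angle
					break;   // dirs[index_z] is the first entry larger than angle, i.e. the upper end of that interval
			}

			float weight = float(spd[j] * temp);  // Gaussian-weighted gradient magnitude
			// x_interpolate / y_interpolate: share given to the first cell row / column.
			// It is 1 at parm.mid[0] and falls to 0 at parm.mid[1], so the vote stays continuous
			// with the index == 0 branches (assumes parm.mid[1] - parm.mid[0] == parm.cell).
			float x_interpolate = 1.0f - (float)(i-parm.mid[0]) / parm.cell;
			float y_interpolate = 1.0f - (float)(j-parm.mid[0]) / parm.cell;
			// z_interpolate: share given to bin index_z (assumes dirs is spaced by PI/DIR).
			// The guard avoids reading dirs[DIR] when the angle lies beyond the last entry.
			float z_interpolate = (index_z < DIR) ? 1.0f - (float)(dirs[index_z] - angle) * DIR/PI : 0.0f;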

			if (index_x == 0)   
			{
				if (index_y == 0)
				{
					if (index_z == 0)
					{
						votes[0] += weight;
					}
					else if(index_z == 9)
					{
						votes[8] += weight;
					}
					else
					{	
						votes[index_z-1] += weight * (1-z_interpolate); 
						votes[index_z] += weight * z_interpolate;
					}
				}
				else if (index_y == 1)
				{
					if (index_z == 0)
					{
						votes[0] += weight * y_interpolate;
						votes[DIR] += weight * (1 - y_interpolate);
					}
					else if(index_z == 9)
					{
						votes[8] += weight * y_interpolate;
						votes[DIR+8] += weight * (1 - y_interpolate);
					}
					else
					{
						votes[index_z-1] += weight * y_interpolate * (1-z_interpolate);
						votes[index_z] += weight * y_interpolate * z_interpolate;
						votes[DIR+index_z-1] += weight * (1 - y_interpolate) * (1 - z_interpolate);
						votes[DIR+index_z] += weight * (1 - y_interpolate) * z_interpolate;
					}
				}
				else   // index_y == 2
				{
					if (index_z == 0)
					{
						votes[DIR] += weight;
					}
					else if(index_z == 9)
					{
						votes[DIR+8] += weight;
					}
					else
					{
						votes[DIR+index_z-1] += weight * (1 - z_interpolate);
						votes[DIR+index_z] += weight * z_interpolate;
					}
				}
			}
			else if(index_x == 1)
			{
				if (index_y == 0)
				{
					if (index_z == 0)
					{
						votes[0] += weight * x_interpolate;
						votes[parm.block * DIR] += weight * (1 - x_interpolate);
					}
					else if(index_z == 9)
					{
						votes[8] += weight * x_interpolate;
						votes[parm.block * DIR + 8] += weight * (1 - x_interpolate);
					}
					else
					{
						votes[index_z-1] += weight * x_interpolate * (1-z_interpolate);
						votes[index_z] += weight * x_interpolate * z_interpolate;
						votes[parm.block * DIR + index_z-1] += weight * (1-x_interpolate) * (1-z_interpolate);
						votes[parm.block * DIR + index_z] += weight * (1-x_interpolate) * z_interpolate;
					}
				}
				else if (index_y == 1)
				{
					if (index_z == 0)
					{
						votes[0] += weight * x_interpolate * y_interpolate;
						votes[DIR] += weight * x_interpolate * (1-y_interpolate);
						votes[parm.block * DIR] += weight * (1-x_interpolate) * y_interpolate;
						votes[parm.block * DIR + DIR] += weight * (1-x_interpolate) * (1-y_interpolate);
					}
					else if(index_z == 9)
					{
						votes[8] += weight * x_interpolate * y_interpolate;
						votes[DIR + 8] += weight * x_interpolate * (1-y_interpolate);
						votes[parm.block * DIR + 8] += weight * (1-x_interpolate) * y_interpolate;
						votes[parm.block * DIR + DIR + 8] += weight * (1-x_interpolate) * (1-y_interpolate);
					}
					else
					{
						votes[index_z-1] += weight * x_interpolate * y_interpolate * (1-z_interpolate);
						votes[index_z] += weight * x_interpolate * y_interpolate * z_interpolate;
						votes[DIR+index_z-1] += weight * x_interpolate * (1-y_interpolate) * (1-z_interpolate);
						votes[DIR+index_z] += weight * x_interpolate * (1-y_interpolate) * z_interpolate;
						votes[parm.block * DIR+index_z-1] += weight * (1-x_interpolate) * y_interpolate * (1-z_interpolate);
						votes[parm.block * DIR+index_z] += weight * (1-x_interpolate) * y_interpolate * z_interpolate;
						votes[parm.block * DIR+DIR+index_z-1] += weight * (1-x_interpolate) * (1-y_interpolate) * (1-z_interpolate);
						votes[parm.block * DIR+DIR+index_z] += weight * (1-x_interpolate) * (1-y_interpolate) * z_interpolate;
					}
				}
				else
				{
					if (index_z == 0)
					{
						votes[DIR] += weight * x_interpolate;
						votes[parm.block * DIR + DIR] += weight * (1-x_interpolate);
					}
					else if(index_z == 9)
					{
						votes[8+DIR] += weight * x_interpolate;
						votes[parm.block * DIR+ DIR+8] += weight * (1-x_interpolate);
					}
					else
					{
						votes[DIR+index_z-1] += weight * x_interpolate * (1-z_interpolate);
						votes[DIR+index_z] += weight * x_interpolate * z_interpolate;
						votes[parm.block * DIR+DIR+index_z-1] += weight * (1-x_interpolate) * (1-z_interpolate);
						votes[parm.block * DIR+DIR+index_z] += weight * (1-x_interpolate) * z_interpolate;
					}
				}
			}
			else
			{
				if (index_y == 0)
				{
					if (index_z == 0)
					{
						votes[parm.block * DIR] += weight;
					}
					else if(index_z == 9)
					{
						votes[parm.block * DIR+8] += weight;
					}
					else
					{
						votes[parm.block * DIR+index_z-1] += weight * (1-z_interpolate);
						votes[parm.block * DIR+index_z] += weight * z_interpolate;
					}
				}
				else if (index_y == 1)
				{
					if (index_z == 0)
					{
						votes[parm.block * DIR] += weight * y_interpolate;
						votes[parm.block * DIR + DIR] += weight * (1-y_interpolate);
					}
					else if(index_z == 9)
					{
						votes[parm.block*DIR+8] += weight * y_interpolate;
						votes[parm.block*DIR+DIR+8] += weight * (1-y_interpolate);
					}
					else
					{
						votes[parm.block*DIR+index_z-1] += weight * y_interpolate * (1-z_interpolate);
						votes[parm.block*DIR+index_z] += weight * y_interpolate * z_interpolate;
						votes[parm.block*DIR+DIR+index_z-1] += weight * (1-y_interpolate) * (1-z_interpolate);
						votes[parm.block*DIR+DIR+index_z] += weight * (1-y_interpolate) * z_interpolate;
					}
				}
				else
				{
					if (index_z == 0)
					{
						votes[parm.block*DIR+DIR] += weight;
					}
					else if(index_z == 9)
					{
						votes[parm.block*DIR+DIR+8] += weight;  // last bin of the fourth cell is index 8, not 9
					}
					else
					{
						votes[parm.block*DIR+DIR+index_z-1] += weight * (1-z_interpolate);
						votes[parm.block*DIR+DIR+index_z] += weight * z_interpolate;
					}
				}
			}
		}
	}
	
	
	normL2(votes);
	
}
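The 27 branches above all follow one pattern: on each of the three axes a sample is split between its two nearest centres, or given entirely to the end centre when it lies outside them. A compact sketch of that pattern is shown below. It is not the author's code: `splitAxis` and `voteTrilinear` are hypothetical names, cell centres are assumed at offsets 4 and 12, orientation bin centres are assumed at (k + 0.5)·π/9 with no wrap-around, and the Gaussian mask and final normalization are left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct AxisVote { int lo; int hi; float wLo; };  // the weight of hi is 1 - wLo

// Split a coordinate between its two nearest centres, clamped at both ends.
// Centres are assumed at first + k * step for k = 0..count-1.
AxisVote splitAxis(float value, float first, float step, int count)
{
    assert(count >= 1 && step > 0.0f);
    const float t = (value - first) / step;
    if (t <= 0.0f) return {0, 0, 1.0f};
    if (t >= count - 1) return {count - 1, count - 1, 1.0f};
    const int lo = static_cast<int>(std::floor(t));
    return {lo, lo + 1, 1.0f - (t - lo)};
}

// Add one pixel's vote to a 2x2-cell, 9-bin block histogram laid out as
// votes[(cellRow * 2 + cellCol) * 9 + bin], the same layout as the code above.
void voteTrilinear(std::vector<float>& votes, int row, int col,
                   float angle, float weight)
{
    const int cells = 2, bins = 9;
    const float pi = 3.14159265f;
    assert(votes.size() == static_cast<std::size_t>(cells * cells * bins));

    const AxisVote r = splitAxis(static_cast<float>(row), 4.0f, 8.0f, cells);
    const AxisVote c = splitAxis(static_cast<float>(col), 4.0f, 8.0f, cells);
    const AxisVote z = splitAxis(angle, pi / 18.0f, pi / 9.0f, bins);

    const int ri[2] = {r.lo, r.hi}; const float rw[2] = {r.wLo, 1.0f - r.wLo};
    const int ci[2] = {c.lo, c.hi}; const float cw[2] = {c.wLo, 1.0f - c.wLo};
    const int zi[2] = {z.lo, z.hi}; const float zw[2] = {z.wLo, 1.0f - z.wLo};

    // Up to 8 destinations; clamped axes contribute a zero-weight duplicate.
    for (int a = 0; a < 2; ++a)
        for (int b = 0; b < 2; ++b)
            for (int k = 0; k < 2; ++k)
                votes[(ri[a] * cells + ci[b]) * bins + zi[k]] +=
                    weight * rw[a] * cw[b] * zw[k];
}
```

Because the two weights on each axis sum to 1, every pixel distributes exactly its own weight across the histogram, which is a convenient property to check when comparing against the branch-by-branch version.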


