Given an image I, how can we efficiently compute, at every pixel, the gradients Ix and Iy, the gradient magnitude M, and the orientation angle Theta:
Ix(x,y) = I(x+1,y) - I(x-1,y)
Iy(x,y) = I(x,y+1) - I(x,y-1)
M(x,y) = sqrt( Ix(x,y)*Ix(x,y) + Iy(x,y)*Iy(x,y) )
Theta(x,y) = atan2( Iy(x,y), Ix(x,y) )
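As a minimal sketch, the four formulas translate directly into C++ (the names `Grad` and `gradAt` are mine for illustration, not from the code below; the image is a single-channel float array in row-major order, and border pixels are skipped):

```cpp
#include <cmath>
#include <vector>

// Gradient quantities at one pixel.
struct Grad { float ix, iy, mag, theta; };

// Central-difference gradient at interior pixel (x,y) of a w-wide image.
Grad gradAt(const std::vector<float>& I, int w, int x, int y)
{
    Grad g;
    g.ix = I[y*w + (x+1)] - I[y*w + (x-1)];     // Ix(x,y)
    g.iy = I[(y+1)*w + x] - I[(y-1)*w + x];     // Iy(x,y)
    g.mag = std::sqrt(g.ix*g.ix + g.iy*g.iy);   // M(x,y)
    g.theta = std::atan2(g.iy, g.ix);           // Theta(x,y), in (-pi, pi]
    return g;
}
```

Note that atan2 returns an angle in (-pi, pi]; mapping it into [0, 2*pi) or [0, pi) is a separate step, which the code below handles via binning.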
Judging from the formulas above, the computation itself is not hard; OpenCV even provides a Sobel function that computes Ix and Iy directly. But this post is about efficient programming, so we will solve the problem with our own code.
Virtually every CPU Intel makes supports the SSE instruction set, which operates on four single-precision floats simultaneously: for example, within one CPU clock it can multiply four floats element-wise by another four floats. In an earlier post I released my DPM object-detection code, which uses SSE; that was my first time accelerating computation with SSE, and the results were quite good. Later I had the good fortune to study Piotr Dollar's public pedestrian-detection code (the paper is "The Fastest Pedestrian Detector in the West"), whose mex files use SSE instructions extensively, which further deepened my understanding of SSE programming.
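As a toy illustration of that parallelism (a sketch only, not the DPM code itself), one SSE multiply produces four float products per instruction:

```cpp
#include <xmmintrin.h>  // SSE intrinsics

// Multiply two float arrays element-wise, four lanes at a time.
// For brevity this sketch assumes n is a multiple of 4; a real routine
// would handle the tail with scalar code.
void mul4(const float* a, const float* b, float* out, int n)
{
    for (int i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);             // load 4 floats (unaligned ok)
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_mul_ps(va, vb));  // 4 products in one instruction
    }
}
```

With aligned buffers, `_mm_load_ps`/`_mm_store_ps` are faster than the unaligned variants used here.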
I used to compute Ix and Iy with OpenCV's Sobel function and then derive M and Theta from them. The efficiency was unsatisfactory, and the program was hard to control: inserting one or two simple extra operations into the computation was inconvenient. So I wrote my own code; after several rounds of revision it has grown into a few different versions, which are listed and briefly explained below.
------------------------------------
Code 1: oriented_gradient.cpp
Notes: this cpp defines two versions of the yuOrientedGradient function. The function takes a grayscale or color image and computes the gradient magnitude and orientation angle at every pixel. For a multi-channel input, the magnitude is the maximum of the per-channel gradient magnitudes. The output orientation is not a real number but a discrete integer: for example, with orientation_bins=9 and sensitive=true, an orientation angle in [0,2*pi) is assigned to one of the 18 bins [0,20), [20,40), [40,60), ..., [340,360) (in degrees). This quantization really serves the subsequent computation of HOG features. If you change the data type of orientation to int and set orientation_bins to 360, you get each pixel's orientation angle in degrees with 1° precision; raising orientation_bins further increases the angular precision.
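A minimal sketch of this quantization (the function name and layout are mine; the listed code uses a faster dot-product trick explained in its header comment, but the bin assignment is the same):

```cpp
#include <cmath>

static const float PI = 3.14159265358979f;

// Map a gradient vector (dx,dy) to a discrete orientation bin.
// sensitive=true:  angles in [0,360) split into 2*bins bins.
// sensitive=false: angles in [0,180) split into bins bins, so a vector
//                  and its negation land in the same bin.
int orientationBin(float dx, float dy, int bins, bool sensitive)
{
    float theta = std::atan2(dy, dx);        // (-pi, pi]
    if (theta < 0.f) theta += 2.f * PI;      // map to [0, 2*pi)
    float range = sensitive ? 2.f * PI : PI;
    int nbins = sensitive ? 2 * bins : bins;
    if (!sensitive && theta >= PI) theta -= PI;  // fold to [0, pi)
    int b = (int)(theta / range * nbins);
    return b < nbins ? b : nbins - 1;        // guard float rounding at the top
}
```

With bins=9 and sensitive=true, a 90° gradient falls in bin 4, i.e. [80,100).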
#include "cv.h"
using namespace cv;
// The two versions of the function defined here share the same functionality, but V2 is faster.
void yuOrientedGradient( const Mat &img, Mat &orientation, Mat &gradient, int orientation_bins, bool sensitive );
void yuOrientedGradient_V2( const Mat &img, Mat &orientation, Mat &gradient, int orientation_bins, bool sensitive );
/* test:
Mat im = imread("0001.jpg");
Mat imF; im.convertTo(imF,CV_32F);
Mat O1, O2, G1, G2;
yuOrientedGradient( imF, O1, G1, 9, true );
yuOrientedGradient_V2( imF, O2, G2, 9, true );
absdiff( O1, O2, O1 );
absdiff( G1, G2, G1 );
double a, b;
minMaxLoc( G1, 0, &a );
minMaxLoc( O1, 0, &b );
cout<<a<<endl<<b<<endl; // a==0, b==0
*/
/*
Calculate the orientation at each pixel.
The calculated orientations are snapped to one of N equally spaced bins:
if sensitive==true, angles cover [0,360) and orientation values lie in [0, 2*orientation_bins);
if sensitive==false, angles cover [0,180) and orientation values lie in [0, orientation_bins).
The output orientation (CV_8UC1) & gradient (CV_32FC1) are 2 pixels smaller both in rows
and in cols than input img (multi-channel,float type).
theta = angle( OP ), where O = (0,0), P = (dx,dy)
theta is then snapped to one of nine orientations [0,20), [20,40), ... , [160,180)
How to snap:
e.g. we set the bins as [0,20), [20,40), ..., [160,180),
then for any theta in [0,180), cos(theta-i*20) achieves max when theta is in [i*20,i*20+20)
as: cos(a-b) = cos(a)*cos(b) + sin(a)*sin(b)
so: cos(theta-i*20) = cos(theta)*cos(i*20) + sin(theta)*sin(i*20)
now that: cos(theta) = x/sqrt(x^2+y^2), sin(theta) = y/sqrt(x^2+y^2)
so: x*cos(i*20)+y*sin(i*20) will achieve max when the orientation of (x,y) is in [i*20,i*20+20)
make: uu = [cos(0) cos(20) ... cos(160)]; vv = [sin(0) sin(20) ... sin(160)];
then: x*uu(i)+y*vv(i) achieves max when theta is in [i*20,i*20+20), namely the i-th orientation bin.
by: YU Xianguo, 2015/06/24
*/
void yuOrientedGradient( const Mat &img, Mat &orientation, Mat &gradient, int orientation_bins, bool sensitive )
{
    assert( img.depth()==CV_32F );
    assert( img.rows>2 && img.cols>2 );
    assert( orientation_bins>1 && orientation_bins<256 );
    // create output: only calc for img(Rect(1,1,cols-2,rows-2)).
    int rows = img.rows, cols = img.cols, chans = img.channels();
    orientation.create( rows-2, cols-2, CV_8UC1 );
    gradient.create( rows-2, cols-2, CV_32FC1 );
    // calculate gradient for img(Rect(1,1,cols-2,rows-2))
    // multi-channel operation
    Mat Left = img( Rect(0,1,cols-2,rows-2) );
    Mat Right = img( Rect(2,1,cols-2,rows-2) );
    Mat Up = img( Rect(1,0,cols-2,rows-2) );
    Mat Down = img( Rect(1,2,cols-2,rows-2) );
    Mat _Dx = Right - Left;
    Mat _Dy = Down - Up;
    Mat Dx, Dy;
    if( chans==1 ){
        Dx = _Dx;
        Dy = _Dy;
gradient