CamShift is short for the Continuously Adaptive Mean Shift algorithm. It is an improved algorithm built on MeanShift, first proposed and applied to face tracking by Gary R. Bradski et al., with good results. Because it tracks using color probability information, it runs quite efficiently. The CamShift algorithm consists of the following steps:
(1) Determine the initial target and its region;
(2) Compute the histogram of the target's hue (Hue) component;
(3) Use the histogram to compute the back-projection image of the input frame (explained further below);
(4) Run MeanShift on the back-projection image, iterating until it converges or reaches the maximum iteration count, and save the zeroth moment;
(5) Take the center of the converged search window from step (4) and the newly computed window size as the parameters for tracking the target in the next frame (i.e., jump back to step (2)).
A few remarks:
1. Before computing the back-projection, the input image is thresholded in HSV space to filter out some noise.
2. The back-projection image is a probability map: the value at each pixel is the probability that the pixel fits the target's distribution, or put directly, how target-like that pixel is. It is computed by looking up the pixel's value in the target histogram; the probability stored for that value becomes the pixel's value in the back-projection image.
3. How exactly does CamShift adapt the window size? Growing: before computing the new size, CamShift enlarges the window MeanShift produced by TOLERANCE in each of the four directions, i.e., both width and height grow by 2*TOLERANCE (a value you can tune yourself), which is what makes it possible for the window to get bigger. Shrinking: within the enlarged window it recomputes the zeroth-, first-, and second-order moments and derives a new width and height from them. CamShift therefore amounts to an adjustment applied on top of the MeanShift result, which lets the tracking window follow the target as its size changes.
Advantages: the algorithm is quite efficient; if the statistical histogram were built from more features, I suspect the experimental results would be even better.
Drawbacks: (1) Tracking on color statistics alone goes wrong when the background contains similar colors. (2) It cannot track multiple targets. (3) Because it iterates only from the initial position (rather than from every pixel), a bad initial position can leave it converged right where it started (that is, once the target is lost, it may never be found again).
Question: the paper describes adjusting the window size by iterating on the histogram, which I don't understand; I don't see it implemented in the code. Pointers from readers are welcome!
Below is the code of the CamShift demo:
Code:
#ifdef _CH_
#pragma package <opencv>
#endif

#define CV_NO_BACKWARD_COMPATIBILITY

#ifndef _EiC
#include "cv.h"
#include "highgui.h"
#include <stdio.h>
#include <ctype.h>
#endif

IplImage *image = 0, *hsv = 0, *hue = 0, *mask = 0, *backproject = 0, *histimg = 0;
CvHistogram *hist = 0;

int backproject_mode = 0;
int select_object = 0;
int track_object = 0;
int show_hist = 1;
CvPoint origin;
CvRect selection;
CvRect track_window;
CvBox2D track_box;
CvConnectedComp track_comp;
int hdims = 16;
float hranges_arr[] = {0, 180};
float *hranges = hranges_arr;
int vmin = 10, vmax = 256, smin = 30;

void on_mouse( int event, int x, int y, int flags, void *param )
{
    if( !image )
        return;

    if( image->origin )
        y = image->height - y;

    if( select_object )  // the target is still being selected
    {
        selection.x = MIN(x, origin.x);
        selection.y = MIN(y, origin.y);
        selection.width = selection.x + CV_IABS(x - origin.x);
        selection.height = selection.y + CV_IABS(y - origin.y);

        // keep the selection inside the image
        selection.x = MAX( selection.x, 0 );
        selection.y = MAX( selection.y, 0 );
        selection.width = MIN( selection.width, image->width );
        selection.height = MIN( selection.height, image->height );
        selection.width -= selection.x;
        selection.height -= selection.y;
    }

    switch( event )
    {
    case CV_EVENT_LBUTTONDOWN:  // start selecting the target
        origin = cvPoint(x, y);
        selection = cvRect(x, y, 0, 0);
        select_object = 1;
        break;
    case CV_EVENT_LBUTTONUP:    // selection finished
        select_object = 0;
        if( selection.width > 0 && selection.height > 0 )
            track_object = -1;
        break;
    }
}


CvScalar hsv2rgb( float hue )
{
    int rgb[3], p, sector;
    static const int sector_data[][3] =
        {{0,2,1}, {1,2,0}, {1,0,2}, {2,0,1}, {2,1,0}, {0,1,2}};
    hue *= 0.033333333333333333333333333333333f;
    sector = cvFloor(hue);
    p = cvRound(255*(hue - sector));
    p ^= sector & 1 ? 255 : 0;

    rgb[sector_data[sector][0]] = 255;
    rgb[sector_data[sector][1]] = 0;
    rgb[sector_data[sector][2]] = p;

    return cvScalar(rgb[2], rgb[1], rgb[0], 0);
}

int main( int argc, char **argv )
{
    CvCapture *capture = 0;

    if( argc == 1 || (argc == 2 && strlen(argv[1]) == 1 && isdigit(argv[1][0])))
        capture = cvCaptureFromCAM( argc == 2 ? argv[1][0] - '0' : 0 );
    else if( argc == 2 )
        capture = cvCaptureFromAVI( argv[1] );

    if( !capture )
    {
        fprintf(stderr, "Could not initialize capturing...\n");
        return -1;
    }

    printf( "Hot keys: \n"
        "\tESC - quit the program\n"
        "\tc - stop the tracking\n"
        "\tb - switch to/from backprojection view\n"
        "\th - show/hide object histogram\n"
        "To initialize tracking, select the object with mouse\n" );

    cvNamedWindow( "Histogram", 1 );
    cvNamedWindow( "CamShiftDemo", 1 );
    cvSetMouseCallback( "CamShiftDemo", on_mouse, 0 );
    cvCreateTrackbar( "Vmin", "CamShiftDemo", &vmin, 256, 0 );
    cvCreateTrackbar( "Vmax", "CamShiftDemo", &vmax, 256, 0 );
    cvCreateTrackbar( "Smin", "CamShiftDemo", &smin, 256, 0 );

    for(;;)
    {
        IplImage *frame = 0;
        int i, bin_w, c;

        frame = cvQueryFrame( capture );
        if( !frame )
            break;

        if( !image )
        {
            /* allocate all the buffers */
            image = cvCreateImage( cvGetSize(frame), 8, 3 );
            image->origin = frame->origin;
            hsv = cvCreateImage( cvGetSize(frame), 8, 3 );
            hue = cvCreateImage( cvGetSize(frame), 8, 1 );
            mask = cvCreateImage( cvGetSize(frame), 8, 1 );
            backproject = cvCreateImage( cvGetSize(frame), 8, 1 );
            hist = cvCreateHist( 1, &hdims, CV_HIST_ARRAY, &hranges, 1 );
            histimg = cvCreateImage( cvSize(320, 200), 8, 3 );
            cvZero( histimg );
        }

        cvCopy( frame, image, 0 );
        cvCvtColor( image, hsv, CV_BGR2HSV );

        if( track_object )
        {
            int _vmin = vmin, _vmax = vmax;

            cvInRangeS( hsv, cvScalar(0, smin, MIN(_vmin,_vmax), 0),
                        cvScalar(180, 256, MAX(_vmin,_vmax), 0), mask );
                        // suppress noise: mask is set only for values inside this range
            cvSplit( hsv, hue, 0, 0, 0 );  // extract the hue plane used for backprojection

            if( track_object < 0 )
            {
                float max_val = 0.f;
                cvSetImageROI( hue, selection );
                cvSetImageROI( mask, selection );
                cvCalcHist( &hue, hist, 0, mask );  // histogram of the selected region
                cvGetMinMaxHistValue( hist, 0, &max_val, 0, 0 );
                cvConvertScale( hist->bins, hist->bins, max_val ? 255. / max_val : 0., 0 );
                cvResetImageROI( hue );
                cvResetImageROI( mask );
                track_window = selection;
                track_object = 1;

                cvZero( histimg );
                bin_w = histimg->width / hdims;
                for( i = 0; i < hdims; i++ )
                {
                    int val = cvRound( cvGetReal1D(hist->bins, i)*histimg->height/255 );  // height of each bar
                    CvScalar color = hsv2rgb(i*180.f/hdims);  // each bar's color follows its bin index
                    cvRectangle( histimg, cvPoint(i*bin_w, histimg->height),  // draw the histogram
                                 cvPoint((i + 1)*bin_w, histimg->height - val),
                                 color, -1, 8, 0 );
                }
            }

            cvCalcBackProject( &hue, backproject, hist );  // compute the backprojection image
            cvAnd( backproject, mask, backproject, 0 );    // zero out points outside the thresholds
            cvCamShift( backproject, track_window,         // run CamShift on the 8-bit probability image
                        cvTermCriteria( CV_TERMCRIT_EPS | CV_TERMCRIT_ITER, 10, 1 ),
                        &track_comp, &track_box );
            track_window = track_comp.rect;                // the new tracking window

            if( backproject_mode )
                cvCvtColor( backproject, image, CV_GRAY2BGR );

            if( !image->origin )  // if false, the ellipse angle must be flipped
                track_box.angle = -track_box.angle;
            cvEllipseBox( image, track_box, CV_RGB(255,0,0), 3, CV_AA, 0 );  // draw the tracking ellipse
        }

        if( select_object && selection.width > 0 && selection.height > 0 )  // invert the region while selecting
        {
            cvSetImageROI( image, selection );
            cvXorS( image, cvScalarAll(255), image, 0 );
            cvResetImageROI( image );
        }

        cvShowImage( "CamShiftDemo", image );
        cvShowImage( "Histogram", histimg );

        c = cvWaitKey(10);
        if( (char) c == 27 )
            break;
        switch( (char) c )
        {
        case 'b':
            backproject_mode ^= 1;
            break;
        case 'c':
            track_object = 0;
            cvZero( histimg );
            break;
        case 'h':
            show_hist ^= 1;
            if( !show_hist )
                cvDestroyWindow( "Histogram" );
            else
                cvNamedWindow( "Histogram", 1 );
            break;
        default:
            ;
        }
    }

    cvReleaseCapture( &capture );
    cvDestroyWindow( "CamShiftDemo" );

    return 0;
}

#ifdef _EiC
main(1, "camshiftdemo.c");
#endif
Here I mainly walk through the MeanShift iteration, since CamShift is built around it. MeanShift is a method for finding a local extremum. Intuitively, it climbs step by step toward the peak, like a hill-climbing algorithm. How does it climb? The computed center of mass becomes the center of the next window, repeated until the window's position no longer changes. When first studying MeanShift, it helps to ignore the kernel function (which weights the statistics by distance) and the weight function (e.g., subjective, hand-chosen influence).
In CamShift, MeanShift computes the center of mass as the first-order moments divided by the zeroth-order moment. The implementation:
CV_IMPL int
cvMeanShift( const void* imgProb, CvRect windowIn,
             CvTermCriteria criteria, CvConnectedComp* comp )
{
    CvMoments moments;
    int i = 0, eps;
    CvMat stub, *mat = (CvMat*)imgProb;  // the whole input image
    CvMat cur_win;
    CvRect cur_rect = windowIn;          // the current window starts as the input window
    CV_FUNCNAME( "cvMeanShift" );
    if( comp )
        comp->rect = windowIn;           // initialize the connected component
    moments.m00 = moments.m10 = moments.m01 = 0;  // initialize the zeroth- and first-order moments
    __BEGIN__;
    CV_CALL( mat = cvGetMat( mat, &stub ));
    if( CV_MAT_CN( mat->type ) > 1 )
        CV_ERROR( CV_BadNumChannels, cvUnsupportedFormat );
    if( windowIn.height <= 0 || windowIn.width <= 0 )
        CV_ERROR( CV_StsBadArg, "Input window has non-positive sizes" );
    if( windowIn.x < 0 || windowIn.x + windowIn.width > mat->cols ||  // x,y are the corner, not the center
        windowIn.y < 0 || windowIn.y + windowIn.height > mat->rows )
        CV_ERROR( CV_StsBadArg, "Initial window is not inside the image ROI" );
    CV_CALL( criteria = cvCheckTermCriteria( criteria, 1., 100 ));  // termination criteria for the iteration
    eps = cvRound( criteria.epsilon * criteria.epsilon );
    for( i = 0; i < criteria.max_iter; i++ )
    {
        int dx, dy, nx, ny;
        double inv_m00;
        CV_CALL( cvGetSubRect( mat, &cur_win, cur_rect ));  // cur_win points at the data inside the window
        CV_CALL( cvMoments( &cur_win, &moments ));          // compute the moments inside the window
        /* Calculating center of mass */
        if( fabs(moments.m00) < DBL_EPSILON )
            break;
        inv_m00 = moments.inv_sqrt_m00 * moments.inv_sqrt_m00;
        dx = cvRound( moments.m10 * inv_m00 - windowIn.width*0.5 );   // centroid x minus half the width
        dy = cvRound( moments.m01 * inv_m00 - windowIn.height*0.5 );  // centroid y minus half the height
        nx = cur_rect.x + dx;  // the new x coordinate
        ny = cur_rect.y + dy;  // the new y coordinate
        if( nx < 0 )
            nx = 0;
        else if( nx + cur_rect.width > mat->cols )
            nx = mat->cols - cur_rect.width;
        if( ny < 0 )
            ny = 0;
        else if( ny + cur_rect.height > mat->rows )
            ny = mat->rows - cur_rect.height;
        dx = nx - cur_rect.x;  // recompute the shift after clamping
        dy = ny - cur_rect.y;
        cur_rect.x = nx;       // the new window position
        cur_rect.y = ny;
        /* Check for coverage centers mass & window */
        if( dx*dx + dy*dy < eps )  // stop iterating
            break;
    }
    __END__;
    if( comp )  // return the rectangle and the zeroth moment
    {
        comp->rect = cur_rect;
        comp->area = (float)moments.m00;
    }
    return i;   // return the number of iterations
}
The CamShift implementation:
CV_IMPL int
cvCamShift( const void* imgProb, CvRect windowIn,
            CvTermCriteria criteria,
            CvConnectedComp* _comp,
            CvBox2D* box )
{
    const int TOLERANCE = 10;
    CvMoments moments;
    double m00 = 0, m10, m01, mu20, mu11, mu02, inv_m00;
    double a, b, c, xc, yc;
    double rotate_a, rotate_c;
    double theta = 0, square;
    double cs, sn;
    double length = 0, width = 0;
    int itersUsed = 0;
    CvConnectedComp comp;
    CvMat cur_win, stub, *mat = (CvMat*)imgProb;
    CV_FUNCNAME( "cvCamShift" );
    comp.rect = windowIn;  // initialize comp
    __BEGIN__;
    CV_CALL( mat = cvGetMat( mat, &stub ));
    CV_CALL( itersUsed = cvMeanShift( mat, windowIn, criteria, &comp ));  // run MeanShift to find the centroid
    windowIn = comp.rect;  // the converged window position
    // for tolerance, grow the window by TOLERANCE on all four sides
    windowIn.x -= TOLERANCE;
    if( windowIn.x < 0 )
        windowIn.x = 0;
    windowIn.y -= TOLERANCE;
    if( windowIn.y < 0 )
        windowIn.y = 0;
    windowIn.width += 2 * TOLERANCE;
    if( windowIn.x + windowIn.width > mat->width )
        windowIn.width = mat->width - windowIn.x;
    windowIn.height += 2 * TOLERANCE;
    if( windowIn.y + windowIn.height > mat->height )
        windowIn.height = mat->height - windowIn.y;
    CV_CALL( cvGetSubRect( mat, &cur_win, windowIn ));  // pointer to the enlarged window's data
    /* Calculating moments in new center mass */
    CV_CALL( cvMoments( &cur_win, &moments ));  // recompute the moments inside the window
    m00 = moments.m00;
    m10 = moments.m10;
    m01 = moments.m01;
    mu11 = moments.mu11;
    mu20 = moments.mu20;
    mu02 = moments.mu02;
    if( fabs(m00) < DBL_EPSILON )
        EXIT;
    inv_m00 = 1. / m00;
    xc = cvRound( m10 * inv_m00 + windowIn.x );  // the new center coordinates
    yc = cvRound( m01 * inv_m00 + windowIn.y );
    a = mu20 * inv_m00;
    b = mu11 * inv_m00;
    c = mu02 * inv_m00;
    /* Calculating width & height */
    square = sqrt( 4 * b * b + (a - c) * (a - c) );
    /* Calculating orientation */
    theta = atan2( 2 * b, a - c + square );
    /* Calculating width & length of figure */
    cs = cos( theta );
    sn = sin( theta );
    rotate_a = cs * cs * mu20 + 2 * cs * sn * mu11 + sn * sn * mu02;
    rotate_c = sn * sn * mu20 - 2 * cs * sn * mu11 + cs * cs * mu02;
    length = sqrt( rotate_a * inv_m00 ) * 4;  // the major and minor axis lengths
    width = sqrt( rotate_c * inv_m00 ) * 4;
    /* In case, when tetta is 0 or 1.57... the Length & Width may be exchanged */
    if( length < width )
    {
        double t;
        CV_SWAP( length, width, t );
        CV_SWAP( cs, sn, t );
        theta = CV_PI*0.5 - theta;
    }
    /* Saving results */
    // recomputing width and height here is what makes the window size adapt
    if( _comp || box )
    {
        int t0, t1;
        int _xc = cvRound( xc );  // round to integers
        int _yc = cvRound( yc );
        t0 = cvRound( fabs( length * cs ));
        t1 = cvRound( fabs( width * sn ));
        t0 = MAX( t0, t1 ) + 2;  // recompute the width
        comp.rect.width = MIN( t0, (mat->width - _xc) * 2 );  // keep the width inside the image
        t0 = cvRound( fabs( length * sn ));
        t1 = cvRound( fabs( width * cs ));
        t0 = MAX( t0, t1 ) + 2;  // recompute the height
        comp.rect.height = MIN( t0, (mat->height - _yc) * 2 );  // keep the height inside the image
        comp.rect.x = MAX( 0, _xc - comp.rect.width / 2 );
        comp.rect.y = MAX( 0, _yc - comp.rect.height / 2 );
        comp.rect.width = MIN( mat->width - comp.rect.x, comp.rect.width );
        comp.rect.height = MIN( mat->height - comp.rect.y, comp.rect.height );
        comp.area = (float) m00;
    }
    __END__;
    if( _comp )  // return the rectangle and the zeroth moment
        *_comp = comp;
    if( box )
    {
        box->size.height = (float)length;
        box->size.width = (float)width;
        box->angle = (float)(theta*180./CV_PI);
        box->center = cvPoint2D32f( comp.rect.x + comp.rect.width*0.5f,
                                    comp.rect.y + comp.rect.height*0.5f );
    }
    return itersUsed;
}
Below is a usage-oriented explanation written in English by another author:
OpenCV's face tracker uses an algorithm called Camshift. Camshift consists of four steps: (1) create a histogram; (2) calculate a "face probability" for each pixel; (3) shift to a new location; (4) calculate size and angle.

Here's how each step works:

1. Create a histogram. Camshift represents the face it's tracking as a histogram (also called a barchart) of color values. Figure 1 shows two example histograms produced by the Camshift demo program that ships with OpenCV. The height of each colored bar indicates how many pixels in an image region have that "hue." Hue is one of three values describing a pixel's color in the HSV (Hue, Saturation, Value) color model. (For more on color and color models, see "The World of Color," SERVO Magazine, November 2005.)

2. Calculate a "face probability" for each pixel. "Face probability" sounds terribly complicated, and heavily mathematical, but it's neither! Here's how it works. Figure 2 shows the bars from a histogram stacked one atop the other. After stacking them, it's clear that the rightmost bar accounts for about 45% of the pixels in the region. That means the probability that a pixel selected randomly from this region would fall into the rightmost bin is 45%. That's the "face probability" for a pixel with this hue. The same reasoning indicates that the face probability for the next histogram bin to the right is about 20%, since it accounts for about 20% of the stack's total height. That's all there is to it. As new video frames arrive, the hue value for each pixel is determined. From that, the face histogram is used to assign a face probability to the pixel. This process is called "histogram backprojection" in OpenCV. There's a built-in method that implements it, called cvCalcBackProject(). Figure 3 shows the face-probability image in one video frame as Camshift tracks my face. Black pixels have the lowest probability value, and white, the highest. Gray pixels lie somewhere in the middle.

3. Shift to a new location. With each new video frame, Camshift "shifts" its estimate of the face location, keeping it centered over the area with the highest concentration of bright pixels in the face-probability image. It finds this new location by starting at the previous location and computing the center of gravity of the face-probability values within a rectangle. It then shifts the rectangle so it's right over the center of gravity. It does this a few times to center the rectangle well. The OpenCV function cvCamShift() implements the steps for shifting to the new location. This process of shifting the rectangle to correspond with the center of gravity is based on an algorithm called "Mean Shift," by Dorin Comaniciu. In fact, Camshift stands for "Continuously Adaptive Mean Shift."

4. Calculate size and angle. The OpenCV method is called "Continuously Adaptive," and not just "Mean Shift," because it also adjusts the size and angle of the face rectangle each time it shifts it. It does this by selecting the scale and orientation that are the best fit to the face-probability pixels inside the new rectangle location.