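The mean-shift algorithm takes the target's known coordinates in the previous frame, computes a 2D histogram feature of the target region in that frame, matches it against the neighbourhood of the same position in the next frame, and iteratively converges on the target's coordinates in that next frame.
Setting the mathematical derivation aside for now, the implementation breaks down roughly into the following parts: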
1. Building the 2D histogram feature of the target region
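import cv2
import numpy as np

def Colourfeature(red, green, blue):
    # Scale the 8-bit channels to [0, 1), then map them to Cb/Cr.
    # Dividing by 4 squeezes both values into the 64 histogram bins.
    r = red / 256.0
    g = green / 256.0
    b = blue / 256.0
    Cb = (128 - 37.79*r - 74.203*g + 112*b) / 4
    Cr = (128 + 112*r - 93.786*g - 18.214*b) / 4
    return Cb, Cr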
# s: [x, y], prior centre of the target
# regionRadius: [W, H], distance from the centre to the edges
# h: kernel bandwidth, defined later in this post as W^2 + H^2
def Density(image, s, regionRadius, h=2548):
    x = int(s[0])
    y = int(s[1])
    box_w = int(regionRadius[0])
    box_h = int(regionRadius[1])
    rows, cols = image.shape[:2]
    b, g, r = cv2.split(image)
    histSize = 64
    histGram = np.zeros((histSize, histSize))
    total = 0.0
    # The window spans x-box_w .. x+box_w and y-box_h .. y+box_h
    for deltaRow in range(-box_h, box_h + 1):
        for deltaCol in range(-box_w, box_w + 1):
            col = x + deltaCol
            row = y + deltaRow
            # Skip pixels that fall outside the image
            if not (0 <= row < rows and 0 <= col < cols):
                continue
            Cb, Cr = Colourfeature(r[row, col], g[row, col], b[row, col])
            # Gaussian weight by squared distance from the window centre
            w = np.exp(-(deltaRow*deltaRow + deltaCol*deltaCol) / (2.0*h*h))
            histGram[int(Cb), int(Cr)] += w
            total += w
    # Normalise so the histogram sums to 1
    if total > 0:
        histGram = histGram / total
    return histGram
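The double loop above can also be written with NumPy array operations. The following is only a sketch under a few assumptions: the names colour_feature and density_vectorized are made up for this illustration, the image is a BGR uint8 array, and the window lies entirely inside the image (no bounds handling).

```python
import numpy as np

def colour_feature(patch_bgr):
    """Map a BGR uint8 patch to integer (Cb, Cr) bin indices (hypothetical helper)."""
    bgr = patch_bgr.astype(np.float64) / 256.0
    b, g, r = bgr[..., 0], bgr[..., 1], bgr[..., 2]
    cb = (128 - 37.79 * r - 74.203 * g + 112 * b) / 4
    cr = (128 + 112 * r - 93.786 * g - 18.214 * b) / 4
    return cb.astype(np.intp), cr.astype(np.intp)

def density_vectorized(image, s, region_radius, h=2548, hist_size=64):
    """Kernel-weighted Cb/Cr histogram of the window centred on s.

    Assumes the whole window lies inside the image.
    """
    x, y = int(s[0]), int(s[1])
    box_w, box_h = int(region_radius[0]), int(region_radius[1])
    patch = image[y - box_h:y + box_h + 1, x - box_w:x + box_w + 1]
    cb, cr = colour_feature(patch)
    # Row and column offsets of every pixel from the window centre
    d_row, d_col = np.mgrid[-box_h:box_h + 1, -box_w:box_w + 1]
    weights = np.exp(-(d_row ** 2 + d_col ** 2) / (2.0 * h * h))
    hist = np.zeros((hist_size, hist_size))
    # np.add.at accumulates correctly when several pixels hit the same bin
    np.add.at(hist, (cb, cr), weights)
    return hist / weights.sum()
```

Plain fancy-index assignment (hist[cb, cr] += weights) would keep only one contribution per repeated bin, which is why np.add.at is used here.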
2. For a given frame, predict the target's position from the target's histogram feature and its position in the previous frame
# image: the current frame
# q: histogram feature of the target
# s, regionRadius: position and half-size of the target in the previous frame
# h: again defined as W^2 + H^2
def mean_shift(image, q, s, regionRadius, h=2548):
    b, g, r = cv2.split(image)
    x = int(s[0])
    y = int(s[1])
    box_w = int(regionRadius[0])
    box_h = int(regionRadius[1])
    rows, cols = image.shape[:2]
    # One weight per window pixel, indexed by its offset inside the window
    w = np.zeros((2*box_h + 1, 2*box_w + 1))
    # Histogram of the candidate region at the current position
    qs = Density(image, s, regionRadius, h)
    for deltaRow in range(2*box_h + 1):
        for deltaCol in range(2*box_w + 1):
            col = x + (deltaCol - box_w)
            row = y + (deltaRow - box_h)
            # Pixels outside the image keep a weight of 0
            if not (0 <= row < rows and 0 <= col < cols):
                continue
            Cb, Cr = Colourfeature(r[row, col], g[row, col], b[row, col])
            if qs[int(Cb), int(Cr)] > 0:
                w[deltaRow, deltaCol] = np.sqrt(q[int(Cb), int(Cr)] / qs[int(Cb), int(Cr)])
    meanSum_row = 0.0
    meanSum_col = 0.0
    kernelSum = 0.0
    for deltaRow in range(2*box_h + 1):
        for deltaCol in range(2*box_w + 1):
            col = x + (deltaCol - box_w)
            row = y + (deltaRow - box_h)
            # Offsets from the window centre, used for the Gaussian kernel
            dRow = deltaRow - box_h
            dCol = deltaCol - box_w
            kernel = np.exp(-(dRow*dRow + dCol*dCol) / (2.0*h*h))
            meanSum_row += w[deltaRow, deltaCol] * kernel * row
            meanSum_col += w[deltaRow, deltaCol] * kernel * col
            kernelSum += w[deltaRow, deltaCol] * kernel
    # No pixel matched the target histogram: keep the old position
    if kernelSum == 0:
        return x, y
    # y = row, x = col
    new_X = int(round(meanSum_col / kernelSum))
    new_Y = int(round(meanSum_row / kernelSum))
    return new_X, new_Y
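For reference, the two loops in mean_shift correspond to the usual mean-shift update written out below. The notation is chosen for this note and does not come from the code: q is the target histogram, p(y) is the candidate histogram at the current centre y (qs in the code), u(i) is the (Cb, Cr) bin of pixel i, and x_i is that pixel's coordinate.

```latex
w_i = \sqrt{\frac{q_{u(i)}}{p_{u(i)}(y)}}, \qquad
g_i = \exp\!\left(-\frac{\lVert x_i - y \rVert^2}{2h^2}\right), \qquad
y_{\text{new}} = \frac{\sum_i w_i \, g_i \, x_i}{\sum_i w_i \, g_i}
```

The first loop fills in the w_i, and the second accumulates the numerator (meanSum_row, meanSum_col) and the denominator (kernelSum).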
3. Call repeatedly until the coordinates converge
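# max_iter is an assumed safety cap (not in the original post): rounding can
# make the position oscillate between two pixels, so the loop needs a limit
def track(image, q, s, regionRadius, h, max_iter=20):
    x = s[0]
    y = s[1]
    num = 0
    while num < max_iter:
        num = num + 1
        x_last = x
        y_last = y
        x, y = mean_shift(image, q, [x, y], regionRadius, h)
        # Converged: the estimate stopped moving
        if x == x_last and y == y_last:
            break
    return x, y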
The resulting x, y is the position at which the target was tracked in the current frame.
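In the code above, the h parameter is set to the sum of the squares of H and W. Because mean-shift tracks a region of fixed size, h can also be treated as a fixed value; in cam-shift, by contrast, the region adapts via back projection, which is not covered here.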
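To get a feel for what this choice of h means, here is a small sketch. The helper names bandwidth and corner_weight are made up for this note, and the half-size (42, 28) is only an assumption: it happens to reproduce the default of 2548, but the post does not state the actual region size.

```python
import math

def bandwidth(region_radius):
    """h as described in the post: W^2 + H^2 (hypothetical helper)."""
    w, h = region_radius
    return w * w + h * h

def corner_weight(region_radius):
    """Gaussian kernel weight at a window corner, relative to 1.0 at the centre."""
    h = bandwidth(region_radius)
    # Squared distance from the centre to a corner of the window
    d2 = region_radius[0] ** 2 + region_radius[1] ** 2
    return math.exp(-d2 / (2.0 * h * h))

print(bandwidth((42, 28)))       # assumed half-size, see the note above
print(corner_weight((42, 28)))
```

Since the squared corner distance equals h itself, the exponent reduces to -1/(2h), so with an h this large the kernel is very nearly uniform over the window.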
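Recording a video and applying the method above frame after frame then tracks the target through every frame. The processing speed still leaves room for improvement, but the results are decent enough, and it is good fun to play with, haha.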
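The frame-by-frame loop could look roughly like the sketch below. The names read_frames and track_video are made up for this illustration, and track_fn stands for a function with the same signature as track above; the video path is whatever file was recorded.

```python
def read_frames(path):
    """Yield BGR frames from a video file using cv2.VideoCapture."""
    import cv2  # imported here so the tracking loop itself does not need OpenCV
    cap = cv2.VideoCapture(path)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            yield frame
    finally:
        cap.release()

def track_video(frames, q, s, regionRadius, h, track_fn):
    """Run track_fn on each frame, seeding it with the previous frame's result."""
    x, y = s
    path = []
    for frame in frames:
        x, y = track_fn(frame, q, [x, y], regionRadius, h)
        path.append((x, y))
    return path
```

In this sketch the target histogram q would be built once, for example with Density on the first frame at the initial position, and then reused for every later frame.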