Mean-Shift Algorithm and Its Application

Quebradawill

Categories: Pattern Recognition, Computer Vision

Article link: https://blog.csdn.net/qiudw/article/details/8623922

Issues for density estimation

how to represent density
how to extract the important information
- local maxima, minima
- gradient
- mode

Multivariate kernel density estimation

$f(x) = \frac{1}{nh^d} \sum_{i=1}^n K \left( \frac{x-x_i}{h} \right )$

Kernels

Gaussian

$K_N = (2 \pi)^{-d/2} \exp \left( - \frac{1}{2} ||x||^2 \right )$

Epanechnikov

$K_E = \begin{cases} \frac{1}{2}c_d^{-1}(d+2)(1-||x||^2) & \textrm{if} \ ||x|| < 1 \\ 0 & \textrm{otherwise} \\ \end{cases}$

- $c_d$ : volume of the unit $d$-dimensional sphere

Uniform

$K_U = \begin{cases} c & ||x|| \leqslant 1 \\ 0 & \textrm{otherwise} \\ \end{cases}$
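
The kernels and the estimator can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard estimator $f(x) = \frac{1}{nh^d} \sum_i K((x-x_i)/h)$ and taking $c_d$ to be the volume of the unit $d$-dimensional ball; the function names are made up for this example and points are plain lists of floats.

```python
import math

def gaussian_kernel(x):
    """Multivariate normal kernel K_N for a point x (list of floats)."""
    d = len(x)
    sq_norm = sum(v * v for v in x)
    return (2.0 * math.pi) ** (-d / 2.0) * math.exp(-0.5 * sq_norm)

def unit_ball_volume(d):
    """Volume c_d of the unit ball in d dimensions."""
    return math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)

def epanechnikov_kernel(x):
    """Epanechnikov kernel K_E, zero outside the unit ball."""
    d = len(x)
    sq_norm = sum(v * v for v in x)
    if sq_norm >= 1.0:
        return 0.0
    return 0.5 * (d + 2) * (1.0 - sq_norm) / unit_ball_volume(d)

def kde(x, samples, h, kernel=gaussian_kernel):
    """Fixed-bandwidth estimate f(x) = 1/(n h^d) * sum_i K((x - x_i) / h)."""
    n = len(samples)
    d = len(x)
    total = 0.0
    for xi in samples:
        u = [(a - b) / h for a, b in zip(x, xi)]
        total += kernel(u)
    return total / (n * h ** d)
```

For example, `kde([0.0], [[-0.1], [0.0], [0.2]], 0.5)` evaluates the Gaussian estimate at the origin for three one-dimensional samples.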

Basic idea

based on kernel density estimation
finding local optimum (mode)
density gradient estimation
iterative hill climbing algorithm

Benefit over the direct computation

computational complexity
- fewer density function evaluations
- only local computation

Gradient computation

Always converges to a nearby stationary point of the density estimate, normally a local maximum (mode)

Variable Bandwidth Mean-Shift

Abramson's rule

$h_i (x_i) = h_0 \left[ \frac{\lambda}{f(x_i)} \right ]^{1/2}$

- $h_0$ : fixed bandwidth used for the initial (pilot) density estimate $f$
- $\lambda$ : proportionality constant, the geometric mean of the pilot values $f(x_i)$
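
A small sketch of how the per-sample bandwidths could be computed for one-dimensional data. It assumes a Gaussian pilot estimate with bandwidth `h0`; the function names are hypothetical.

```python
import math

def pilot_density_1d(x, samples, h0):
    """Fixed-bandwidth Gaussian pilot estimate f(x) for 1-D samples."""
    n = len(samples)
    total = sum(math.exp(-0.5 * ((x - xi) / h0) ** 2) for xi in samples)
    return total / (n * h0 * math.sqrt(2.0 * math.pi))

def abramson_bandwidths(samples, h0):
    """Return h_i = h0 * sqrt(lambda / f(x_i)) for every sample."""
    f = [pilot_density_1d(xi, samples, h0) for xi in samples]
    # lambda is the geometric mean of the pilot values, computed via logs
    lam = math.exp(sum(math.log(v) for v in f) / len(f))
    return [h0 * math.sqrt(lam / v) for v in f]
```

Samples in sparse regions (small pilot density) receive a larger bandwidth, samples in dense regions a smaller one.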

Mean-shift vector

$m(x) = \frac{\sum_{i=1}^n x_i g \left( || \frac{x-x_i}{h} ||^2 \right)}{\sum_{i=1}^n g \left( || \frac{x-x_i}{h} ||^2 \right)} - x$

Mean Shift Mode Detection

what happens if the iteration stops at a saddle point instead of a mode?
perturb the converged position slightly, rerun the iteration, and check whether it returns to the same point
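
The perturbation check can be sketched as follows for one-dimensional data with a Gaussian kernel. In one dimension the unstable stationary point is a local minimum rather than a true saddle, but the test is the same. All names and the default values of `eps` and `radius` are assumptions made for this example.

```python
import math

def shift_1d(y, samples, h):
    """One mean shift update: weighted mean of the samples around y."""
    num = den = 0.0
    for xi in samples:
        g = math.exp(-0.5 * ((y - xi) / h) ** 2)
        num += g * xi
        den += g
    return num / den

def converge_1d(y, samples, h, tol=1e-9, max_iter=10000):
    """Iterate the update until the step is smaller than tol."""
    for _ in range(max_iter):
        y_next = shift_1d(y, samples, h)
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    return y

def is_stable_mode(y, samples, h, eps=1e-2, radius=1e-3):
    """Perturb y by +/- eps and check that the iteration returns to y."""
    for delta in (-eps, eps):
        if abs(converge_1d(y + delta, samples, h) - y) > radius:
            return False
    return True
```

With samples `[-1.0, 1.0]` and `h = 0.5`, the midpoint `0.0` is a fixed point of the update but fails the check, while the point reached from `0.9` passes it.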

Original mean shift

find mode of

$c \sum_{i=1}^n k \left( \left|\left| \frac{y - x_i}{h} \right| \right|^2 \right )$

using

$y_1 = \frac{\sum_{i=1}^n x_i g \left( \left|\left| \frac{y_0 - x_i}{h} \right| \right|^2 \right )}{\sum_{i=1}^n g \left( \left|\left| \frac{y_0 - x_i}{h} \right| \right|^2 \right )}$
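
The update above can be iterated until the step becomes negligible. The following is a minimal sketch, assuming a Gaussian profile so that $g(s) = \exp(-s/2)$ up to a constant; `mean_shift_step` and `find_mode` are illustrative names, and points are lists of floats.

```python
import math

def gaussian_g(s):
    """Weight g for the normal kernel at squared scaled distance s."""
    return math.exp(-0.5 * s)

def mean_shift_step(y, samples, h, g=gaussian_g):
    """Compute y_1 from y_0 = y as the g-weighted mean of the samples."""
    d = len(y)
    num = [0.0] * d
    den = 0.0
    for xi in samples:
        s = sum(((a - b) / h) ** 2 for a, b in zip(y, xi))
        w = g(s)
        den += w
        for j in range(d):
            num[j] += w * xi[j]
    # den is zero only if every weight vanishes (e.g. y far from all samples)
    return [v / den for v in num]

def find_mode(y0, samples, h, tol=1e-6, max_iter=500):
    """Repeat the update until the shift is shorter than tol."""
    y = list(y0)
    for _ in range(max_iter):
        y_next = mean_shift_step(y, samples, h)
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(y_next, y)))
        y = y_next
        if shift < tol:
            break
    return y
```

Running `find_mode` from every sample and grouping the samples by the mode they reach gives a simple clustering.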

Extended mean shift

find mode of

$c \sum_{i=1}^n w_i k \left( \left|\left| \frac{y - x_i}{h} \right| \right|^2 \right )$

using

$y_1 = \frac{\sum_{i=1}^n x_i w_i g \left( \left|\left| \frac{y_0 - x_i}{h} \right| \right|^2 \right )}{\sum_{i=1}^n w_i g \left( \left|\left| \frac{y_0 - x_i}{h} \right| \right|^2 \right )}$
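
The weighted update differs from the original one only by the factor $w_i$ in both sums. A one-dimensional sketch, again assuming a Gaussian profile and a hypothetical function name:

```python
import math

def weighted_mean_shift_step(y, samples, weights, h):
    """One weighted update: y_1 = sum(x_i w_i g_i) / sum(w_i g_i)."""
    num = 0.0
    den = 0.0
    for xi, wi in zip(samples, weights):
        g = wi * math.exp(-0.5 * ((y - xi) / h) ** 2)
        num += g * xi
        den += g
    return num / den
```

With all weights equal, this reduces to the unweighted update; with a very large `h` it approaches the plain weighted mean of the samples.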

Mean Shift Properties

automatic convergence speed - the size of the mean shift vector depends on the density gradient itself
near maxima, the steps are small and refined
convergence is guaranteed
for the uniform kernel, convergence is achieved in a finite number of steps
the normal (Gaussian) kernel exhibits a smooth trajectory, but converges more slowly than the uniform kernel

Mean Shift strengths

application independent tool
suitable for real data analysis
does not assume any prior shape (e.g. elliptical) on data clusters
can handle arbitrary feature spaces
only ONE parameter to choose: the window size h
h has a physical meaning, unlike k in k-means

Mean Shift weaknesses

choosing the window size (bandwidth selection) is not trivial
an inappropriate window size can cause modes to be merged or can generate additional "shallow" modes, so an adaptive window size should be used
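
The effect of the window size can be seen on a toy example. The sketch below counts the distinct modes reached from each sample of a one-dimensional data set, assuming a Gaussian kernel; the helper names and the merge threshold of `h / 2` are choices made for this illustration only.

```python
import math

def find_mode_1d(y, samples, h, tol=1e-7, max_iter=10000):
    """Run the mean shift iteration from y until the step is below tol."""
    for _ in range(max_iter):
        num = den = 0.0
        for xi in samples:
            g = math.exp(-0.5 * ((y - xi) / h) ** 2)
            num += g * xi
            den += g
        y_next = num / den
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    return y

def count_modes(samples, h):
    """Start from every sample and count modes more than h/2 apart."""
    modes = []
    for xi in samples:
        m = find_mode_1d(xi, samples, h)
        if all(abs(m - other) > h / 2.0 for other in modes):
            modes.append(m)
    return len(modes)

data = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2, 3.0, 3.1, 3.2]
for h in (0.02, 0.1, 5.0):
    print(h, count_modes(data, h))
```

The data contain three groups of three points. A window that is too small treats every point as its own shallow mode, and a window that is too large merges all groups into one.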

Applications

pattern recognition
- clustering
image processing
- filtering
- segmentation
  - run filtering (discontinuity preserving smoothing)
  - cluster the clusters which are closer than window size
- discontinuity preserving smoothing

$K(x) = C \cdot k_s \left( \left|\left|\frac{x_s}{h_s} \right|\right| \right ) \cdot k_r \left( \left|\left|\frac{x_r}{h_r} \right|\right| \right )$
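
A rough sketch of discontinuity preserving smoothing with this product kernel, applied to a one-dimensional signal. It assumes Gaussian profiles for both $k_s$ (sample position) and $k_r$ (sample value); `joint_filter_1d` is a hypothetical name, not a library function.

```python
import math

def joint_filter_1d(signal, h_s, h_r, tol=1e-4, max_iter=100):
    """Mean shift filtering in the joint (position, value) domain."""
    out = []
    for i, v in enumerate(signal):
        ys, yr = float(i), float(v)  # start at the sample itself
        for _ in range(max_iter):
            num_s = num_r = den = 0.0
            for j, vj in enumerate(signal):
                # product of the spatial and the range kernel weights
                w = (math.exp(-0.5 * ((ys - j) / h_s) ** 2)
                     * math.exp(-0.5 * ((yr - vj) / h_r) ** 2))
                num_s += w * j
                num_r += w * vj
                den += w
            new_s, new_r = num_s / den, num_r / den
            done = abs(new_s - ys) + abs(new_r - yr) < tol
            ys, yr = new_s, new_r
            if done:
                break
        out.append(yr)  # output: range value of the convergence point
    return out
```

Samples on opposite sides of a large jump get almost no weight from the range kernel, so each side is smoothed separately and the edge stays sharp.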

density estimation
- particle filter
mid-level application
- tracking
- background subtraction

Application - tracking

target representation (start from the position of the model in the current frame)

$\hat{q}_u = C \sum_{i=1}^n K(x_i^*)\delta [b(x_i^*) - u]$

candidate representation (search in the model's neighborhood in next frame)

$\hat{p}_u (y) = C_h \sum_{i=1}^{n_h} K \left(\frac{y-x_i}{h} \right) \delta [b(x_i) - u]$
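
Both representations are kernel-weighted histograms. A minimal sketch, assuming an Epanechnikov-shaped profile (weight $1 - s$ for squared scaled distance $s < 1$), pixels given as `(x, y, value)` tuples, and a caller-supplied `bin_index` function playing the role of $b(\cdot)$; all names are illustrative.

```python
def kernel_weighted_histogram(pixels, center, h, num_bins, bin_index):
    """Histogram in which pixels near the center count more."""
    hist = [0.0] * num_bins
    for px, py, value in pixels:
        s = ((px - center[0]) / h) ** 2 + ((py - center[1]) / h) ** 2
        k = max(0.0, 1.0 - s)  # Epanechnikov-shaped profile, zero outside the window
        hist[bin_index(value)] += k
    total = sum(hist)
    # normalising so the bins sum to 1 plays the role of the constant C (or C_h)
    return [v / total for v in hist] if total > 0 else hist
```

For 8-bit intensities and four bins, `bin_index` could be `lambda v: v * 4 // 256`.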

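Bhattacharyya coefficient - the cosine of the angle between the unit vectors $(\sqrt{\hat{p}_1}, \dots, \sqrt{\hat{p}_m})$ and $(\sqrt{\hat{q}_1}, \dots, \sqrt{\hat{q}_m})$ (find the best candidate by maximizing this similarity function)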

$\hat{\rho} (y) = \rho[\hat{p}(y), \hat{q}] = \sum_{u=1}^m \sqrt{\hat{p}_u(y) \hat{q}_u}$
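
The coefficient is straightforward to compute for two normalized histograms. A short sketch with illustrative function names; the distance $\sqrt{1 - \rho}$ is the form commonly used together with this coefficient in kernel-based tracking.

```python
import math

def bhattacharyya_coefficient(p, q):
    """Sum over bins of sqrt(p_u * q_u) for two normalized histograms."""
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))

def bhattacharyya_distance(p, q):
    """Distance sqrt(1 - rho), clamped against rounding errors."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))
```

Identical histograms give a coefficient of 1 (distance 0), and histograms with no common non-empty bin give a coefficient of 0 (distance 1).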

$\rho[\hat{p}(y), \hat{q}] \approx \frac{1}{2} \sum_{u=1}^m \sqrt{\hat{p}_u (\hat{y}_0) \hat{q}_u} +\frac{1}{2} \sum_{u=1}^m \hat{p}_u(y)\sqrt{\frac{\hat{q}_u}{\hat{p}_u (\hat{y}_0)}}$

$\rho[\hat{p}(y), \hat{q}] \approx \frac{1}{2} \sum_{u=1}^m \sqrt{\hat{p}_u (\hat{y}_0) \hat{q}_u} + \frac{C_h}{2} \sum_{i=1}^{n_h} w_i k \left( \left| \left| \frac{y - x_i}{h} \right| \right|^2 \right )$
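
The weights $w_i$ are not defined above. Writing $K(x) = k(||x||^2)$ and substituting the candidate representation into the linearized coefficient suggests the following definition, which matches the usual kernel-based tracking formulation:

$w_i = \sum_{u=1}^m \sqrt{\frac{\hat{q}_u}{\hat{p}_u (\hat{y}_0)}} \delta [b(x_i) - u]$

The first term does not depend on $y$, so maximizing the coefficient amounts to maximizing the second term. That term has the same form as the weighted density in the extended mean shift, so, with $g = -k'$, the new location would be obtained as

$\hat{y}_1 = \frac{\sum_{i=1}^{n_h} x_i w_i g \left( \left|\left| \frac{\hat{y}_0 - x_i}{h} \right| \right|^2 \right )}{\sum_{i=1}^{n_h} w_i g \left( \left|\left| \frac{\hat{y}_0 - x_i}{h} \right| \right|^2 \right )}$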