Part-based Object Detection Method
Circle Hough Transform
- Radius-known case: the model parameter space is two-dimensional, namely the coordinates of the circle center.
- Procedure:
- Discretize the model parameter space and create an accumulator matrix;
- Compute the gradient (magnitude and direction) at each pixel, and keep the pixels with a sufficiently strong response as key points;
- Each key point votes for a set of model parameters; models with strong image support accumulate the most votes.
- Radius-Unknown Case:
- Either we add a radius dimension to the model parameter space, then discretize the enlarged space and vote again.
- This runs into the curse of dimensionality: complexity increases exponentially with the number of model parameters.
- Or, knowing the gradient direction at a key point, we note that the center must lie on the line through the key point along that direction. This reduces the conical surface of votes in parameter space to a line.
- Generalized Hough Transform (GHT): extends the idea by introducing a predefined reference point and a pre-computed R-table.
- The purpose of GHT is still to construct a model parameter space; detection corresponds to finding the most strongly supported reference point.
- Pros:
- Points are processed independently: robustness to noise and occlusion
- Cons:
- Non-object points can also cast votes and produce spurious peaks in the model space.
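The voting procedure above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `hough_circles`, its parameters, and the choice to vote on both sides of the key point (because the edge polarity is not assumed known) are all assumptions made for this example. Passing a single radius gives the radius-known case; passing several candidate radii makes each key point vote along its gradient line.

```python
import numpy as np

def hough_circles(image, radii, grad_thresh):
    """Accumulate circle-center votes; returns acc[k, cy, cx] for radii[k].

    Hypothetical helper for illustration only.
    """
    image = np.asarray(image, dtype=float)
    gy, gx = np.gradient(image)           # derivatives along rows (y) and columns (x)
    mag = np.hypot(gx, gy)                # gradient magnitude
    h, w = image.shape
    acc = np.zeros((len(radii), h, w), dtype=int)

    # Key points: pixels whose gradient response is strong enough.
    ys, xs = np.nonzero(mag > grad_thresh)
    for y, x in zip(ys, xs):
        ux, uy = gx[y, x] / mag[y, x], gy[y, x] / mag[y, x]  # unit gradient direction
        for k, r in enumerate(radii):
            # The center lies on the gradient line at distance r from the key point.
            # The sign is unknown (bright or dark circle), so vote on both sides.
            for sign in (1, -1):
                cx = int(round(x + sign * r * ux))
                cy = int(round(y + sign * r * uy))
                if 0 <= cx < w and 0 <= cy < h:
                    acc[k, cy, cx] += 1
    return acc
```

The strongest cell of the accumulator, for example via `np.unravel_index(np.argmax(acc), acc.shape)`, gives the index of the best radius and the center coordinates.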
Implicit Shape Model
- Divided into two procedures:
- Learning an object model:
- Extract interest points, compute their descriptors and their offsets to a reference point;
- Cluster the descriptors and offsets to form a visual dictionary
- Object detection:
- Extract interest points and compute their descriptors
- Find the most similar descriptor in the dictionary and vote for the possible reference points
- Non-maxima suppression: find the strongest response
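The detection stage can be sketched as follows. This is a simplified illustration with hard assignment to the single nearest dictionary entry, as the notes describe; the function name `ism_vote`, the data layout (one array of offsets per dictionary entry), and the equal weighting of an entry's offsets are assumptions made for this example.

```python
import numpy as np

def ism_vote(keypoints, descriptors, codebook, offsets, shape):
    """Vote for object reference points (hypothetical helper).

    keypoints:   (N, 2) array of (x, y) interest-point positions
    descriptors: (N, D) array, one descriptor per interest point
    codebook:    (K, D) array of dictionary entries (cluster centers)
    offsets:     list of K arrays of shape (M_k, 2), the (dx, dy) offsets
                 to the reference point stored with each entry
    shape:       (height, width) of the voting space
    """
    acc = np.zeros(shape, dtype=float)
    for (x, y), desc in zip(keypoints, descriptors):
        # Most similar dictionary entry (Euclidean distance).
        k = int(np.argmin(np.linalg.norm(codebook - desc, axis=1)))
        for dx, dy in offsets[k]:
            cx, cy = int(round(x + dx)), int(round(y + dy))
            if 0 <= cy < shape[0] and 0 <= cx < shape[1]:
                # Split one unit of vote among the offsets stored for this entry.
                acc[cy, cx] += 1.0 / len(offsets[k])
    return acc
```

Taking the maximum of `acc` is the crudest form of the non-maxima suppression step: it keeps only the single strongest reference-point hypothesis.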
Window-based Object Detection Method
- Sliding window: direct template matching with normalized cross-correlation
- Viola-Jones detector, designed for human face detection:
- A very unbalanced tree (a cascade that rejects most windows early)
- Deformable part model: the relative placement of parts is allowed to change
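As an illustration of the sliding-window item, here is a minimal sketch of template matching with normalized cross-correlation. The function name `ncc_map` is an assumption for this example, and the explicit double loop is chosen for clarity rather than speed.

```python
import numpy as np

def ncc_map(image, template):
    """Normalized cross-correlation of a template at every window position.

    Hypothetical helper; both inputs are 2-D grayscale arrays.
    """
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    t = template - template.mean()        # zero-mean template
    t_norm = np.linalg.norm(t)
    out = np.zeros((image.shape[0] - th + 1, image.shape[1] - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            win = image[i:i + th, j:j + tw]
            w = win - win.mean()          # zero-mean window
            denom = np.linalg.norm(w) * t_norm
            # Score in [-1, 1]; flat windows or templates get a score of 0.
            out[i, j] = (w * t).sum() / denom if denom > 0 else 0.0
    return out
```

Subtracting the means and dividing by the norms makes the score insensitive to a gain or offset change in brightness; the window with the highest score is the best match.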
Window proposal:
- Instead of searching every window, only search the candidate windows that may contain an object.
- Often goes together with image segmentation
- Not suitable for diffuse or low-contrast objects
- Few false positives, but more false negatives.
Histogram of oriented gradient
In the HOG feature descriptor, the distributions (histograms) of gradient directions (oriented gradients) are used as features.
- Procedure:
- Compute the gradient (magnitude and direction) at each pixel. For a 3-channel image, simply take the maximum magnitude among the three channels and its corresponding direction (angle).
- Divide the whole image patch into 8×8 subpatches. Note that the image patch has already been scaled to 64×128×3.
- Each 8×8 subpatch produces a histogram. The horizontal axis of the histogram is divided into 9 bins, corresponding to 9 ranges of gradient direction (angle). Here the angle ranges from 0 to 180 degrees, so each bin covers 20 degrees.
- Each pixel contributes to the bin matching its direction, with a contribution proportional to its gradient magnitude.
- Normalize the histograms over 16×16 blocks (2×2 subpatches each) to reduce the effect of brightness variation. Each block yields a 36×1 vector (4 histograms × 9 bins).
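The procedure above can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions: the function name `hog_descriptor` is hypothetical, each pixel votes into a single bin (no interpolation between neighboring bins), blocks are assumed to slide by one 8×8 subpatch, and a small `eps` guards against division by zero. Under the assumed stride, a 64×128 patch gives 7×15 blocks and a final vector of 7 × 15 × 36 = 3780 values.

```python
import numpy as np

def hog_descriptor(patch, cell=8, bins=9, eps=1e-6):
    """Simplified HOG for a patch of shape (128, 64, 3) (hypothetical helper)."""
    patch = np.asarray(patch, dtype=float)
    gy, gx = np.gradient(patch, axis=(0, 1))   # per-channel derivatives
    mag = np.hypot(gx, gy)

    # Per pixel, keep the channel with the largest gradient magnitude.
    idx = mag.argmax(axis=2)[..., None]
    mag = np.take_along_axis(mag, idx, axis=2)[..., 0]
    gx = np.take_along_axis(gx, idx, axis=2)[..., 0]
    gy = np.take_along_axis(gy, idx, axis=2)[..., 0]

    # Unsigned direction in [0, 180) degrees, mapped to one of `bins` bins.
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0
    b = np.minimum((ang // (180.0 / bins)).astype(int), bins - 1)

    # One magnitude-weighted histogram per cell (subpatch).
    n_cy, n_cx = patch.shape[0] // cell, patch.shape[1] // cell
    hist = np.zeros((n_cy, n_cx, bins))
    for i in range(n_cy):
        for j in range(n_cx):
            sl = (slice(i * cell, (i + 1) * cell), slice(j * cell, (j + 1) * cell))
            hist[i, j] = np.bincount(b[sl].ravel(), weights=mag[sl].ravel(),
                                     minlength=bins)

    # L2-normalize each block of 2x2 cells (assumed stride: one cell).
    blocks = []
    for i in range(n_cy - 1):
        for j in range(n_cx - 1):
            v = hist[i:i + 2, j:j + 2].ravel()          # 4 cells x 9 bins = 36 values
            blocks.append(v / np.sqrt((v ** 2).sum() + eps ** 2))
    return np.concatenate(blocks)
```

Because every block is divided by its own norm, multiplying the whole patch by a constant brightness factor leaves the descriptor essentially unchanged.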