1 Motivation
Objects of interest sometimes vary significantly in size.
1.1 Sliding Window Method
Apply a classification network at multiple locations and multiple scales of an image. However, many viewing windows may contain a perfectly identifiable portion of the object (say, the head of a dog), but not the entire object, nor even the center of the object. This leads to decent classification but poor localization.
Thus, we train the system not only to produce a distribution over categories for each window, but also to predict the location and size of the bounding box containing the object relative to the window.
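The two ideas above can be illustrated with a minimal Python sketch. It is not the method's actual implementation: the function names `sliding_windows` and `to_image_coords`, the square windows, and the convention that a relative box is expressed in window-size units (0 to 1 spans the window) are all assumptions made for illustration.

```python
from typing import Iterator, Tuple

# (x_min, y_min, x_max, y_max); hypothetical box convention for this sketch
Box = Tuple[float, float, float, float]


def sliding_windows(image_w: int, image_h: int, window: int, stride: int,
                    scales: Tuple[float, ...]) -> Iterator[Box]:
    """Yield square windows, in original-image pixels, at several scales."""
    for s in scales:
        # Resizing the image by s with a fixed window is equivalent to using
        # a window of side window / s on the original image.
        side = window / s
        step = stride / s
        y = 0.0
        while y + side <= image_h:
            x = 0.0
            while x + side <= image_w:
                yield (x, y, x + side, y + side)
                x += step
            y += step


def to_image_coords(window_box: Box, rel_box: Box) -> Box:
    """Map a box predicted relative to a window into image pixels.

    Assumed convention: rel_box is in window-size units, so values outside
    [0, 1] describe an object that extends beyond the window.
    """
    wx0, wy0, wx1, wy1 = window_box
    w, h = wx1 - wx0, wy1 - wy0
    rx0, ry0, rx1, ry1 = rel_box
    return (wx0 + rx0 * w, wy0 + ry0 * h, wx0 + rx1 * w, wy0 + ry1 * h)
```

Under this convention, a window that sees only the dog's head can still report a box for the whole dog by predicting relative coordinates outside the 0 to 1 range, which is what lets a well-classified but off-center window contribute to localization.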
1.2 Classification and Regression Network
We run classification and reg