1. KSH Formulation
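Data samples: $\mathcal{X} = \{x_1, \dots, x_n\} \subset \mathbb{R}^d$
Hashing function: $h(x) = \mathrm{sgn}(f(x))$, with $f(x) = \sum_{j=1}^{m} \kappa(x_{(j)}, x)\, a_j - b$
$\kappa$ is the kernel function; $x_{(1)}, \dots, x_{(m)}$ are $m$ samples selected uniformly at random from $\mathcal{X}$, and $a_j$ is the weight for the sample $x_{(j)}$ (similar to an SVM, where $x_{(j)}$ is the support vector and $a_j$ is the corresponding weight); $b$ is the threshold. For a balanced bit, $\sum_{i=1}^{n} h(x_i) = 0$, $b$ should be the median of $\{\sum_{j=1}^{m} \kappa(x_{(j)}, x_i)\, a_j\}_{i=1}^{n}$. In practice the mean is used as a fast replacement for the median: $b = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m} \kappa(x_{(j)}, x_i)\, a_j$.
Thus, $f(x) = \sum_{j=1}^{m} \left( \kappa(x_{(j)}, x) - \mu_j \right) a_j$, where $\mu_j = \frac{1}{n} \sum_{i=1}^{n} \kappa(x_{(j)}, x_i)$.
In vector format: $f(x) = a^\top \bar{k}(x)$, where $a = [a_1, \dots, a_m]^\top$ and $\bar{k}(x) = [\kappa(x_{(1)}, x) - \mu_1, \dots, \kappa(x_{(m)}, x) - \mu_m]^\top$.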
Since the weight vector $a$ completely determines a hash function $h(x)$, we seek to learn $a$ by leveraging supervised information so that the resulting hash function is discriminative.
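The construction of a single hash bit can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the Gaussian RBF kernel, its bandwidth `sigma`, the random weight vector `a`, and all function names (`rbf_kernel`, `fit_centering`, `kbar`, `hash_bit`) are choices made for this sketch, not something prescribed by the text.

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    """Gaussian RBF kernel matrix between rows of X and rows of Y (assumed kernel)."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def fit_centering(X, support, sigma=1.0):
    """mu_j = (1/n) * sum_i kappa(x_(j), x_i), one value per support sample."""
    return rbf_kernel(X, support, sigma).mean(axis=0)

def kbar(X, support, mu, sigma=1.0):
    """Zero-centered kernel features; row i is kbar(x_i)^T."""
    return rbf_kernel(X, support, sigma) - mu

def hash_bit(X, support, mu, a, sigma=1.0):
    """h(x) = sgn(a^T kbar(x)); a score of exactly zero is mapped to +1."""
    f = kbar(X, support, mu, sigma) @ a
    return np.where(f >= 0, 1, -1)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                           # n = 200 samples, d = 5
support = X[rng.choice(200, size=10, replace=False)]    # m = 10 support samples
mu = fit_centering(X, support)
a = rng.normal(size=10)                                 # untrained weights, for illustration

scores = kbar(X, support, mu) @ a                       # f(x_i) for every sample
bits = hash_bit(X, support, mu, a)
```

Because the mean replaces the median as the threshold, the scores `f(x_i)` average to zero over the data, but the resulting bits are only approximately balanced.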
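Define the label matrix $S \in \mathbb{R}^{l \times l}$ over the $l$ labeled samples: $S_{ij} = 1$ if $(x_i, x_j)$ is a similar pair, $S_{ij} = -1$ if $(x_i, x_j)$ is a dissimilar pair, and $S_{ij} = 0$ otherwise.
The purpose of supervised hashing is to make a pair with $S_{ij} = 1$ have the minimal Hamming distance 0, while a pair with $S_{ij} = -1$ takes on the maximal Hamming distance $r$. But the Hamming distance $D_h(x_i, x_j) = |\{k \mid h_k(x_i) \neq h_k(x_j)\}|$ is nonconvex, so optimizing it directly is nontrivial. Code inner products are easier to manipulate and optimize, and the relation between the Hamming distance and the code inner product is as follows:
$\mathrm{code}_r(x_i) \circ \mathrm{code}_r(x_j) = r - 2 D_h(x_i, x_j)$, where $\mathrm{code}_r(x) = [h_1(x), \dots, h_r(x)]$ and $r$ is the number of hashing bits.
Thus $\mathrm{code}_r(x_i) \circ \mathrm{code}_r(x_j) \in [-r, r]$ and $\frac{1}{r}\, \mathrm{code}_r(x_i) \circ \mathrm{code}_r(x_j) \in [-1, 1]$, and we let this normalized inner product fit $S_{ij}$.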
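The relation between the Hamming distance and the code inner product is easy to check numerically. The short sketch below uses random codes; the helper name `hamming_distance` and the code length `r = 16` are illustrative choices.

```python
import numpy as np

def hamming_distance(c1, c2):
    """Number of positions where two {-1, +1} codes differ."""
    return int(np.sum(c1 != c2))

rng = np.random.default_rng(1)
r = 16                                   # number of hash bits (illustrative)
c1 = rng.choice([-1, 1], size=r)
c2 = rng.choice([-1, 1], size=r)

d = hamming_distance(c1, c2)
inner = int(c1 @ c2)                     # agreeing bits add +1, differing bits add -1
print(inner, r - 2 * d)                  # the two numbers coincide
```

Each agreeing bit contributes +1 and each differing bit contributes -1, so the inner product is `(r - d) - d = r - 2d`.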
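Based on the above description, the objective function can be given in least-squares style:
$$\min_{H_l \in \{1, -1\}^{l \times r}} Q = \left\| \frac{1}{r} H_l H_l^\top - S \right\|_F^2$$
where $H_l = [\mathrm{code}_r(x_1); \dots; \mathrm{code}_r(x_l)]$ denotes the Hamming code matrix of the labeled samples, and $\|\cdot\|_F$ represents the Frobenius norm: $\|M\|_F = \sqrt{\sum_{i,j} M_{ij}^2}$.
And, $H_l = \mathrm{sgn}(\bar{K}_l A)$, with $\bar{K}_l = [\bar{k}(x_1), \dots, \bar{k}(x_l)]^\top \in \mathbb{R}^{l \times m}$ and $A = [a_1, \dots, a_r] \in \mathbb{R}^{m \times r}$.
Thus, after multiplying by the constant $r^2$ (which does not change the minimizer), the objective function can be rewritten as:
$$\min_{A} \left\| \sum_{k=1}^{r} \mathrm{sgn}(\bar{K}_l a_k)\, \mathrm{sgn}(\bar{K}_l a_k)^\top - rS \right\|_F^2$$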
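As a sanity check on the two forms of the objective, the sketch below evaluates both on random inputs and confirms that they differ only by the factor $r^2$. The matrix `Kbar_l` is a random stand-in for the centered kernel matrix, the pairwise labels come from random class labels, and the function names are assumptions of this sketch.

```python
import numpy as np

def ksh_objective(Kbar_l, A, S):
    """Q(A) = || (1/r) H H^T - S ||_F^2 with H = sgn(Kbar_l A)."""
    r = A.shape[1]
    H = np.where(Kbar_l @ A >= 0, 1.0, -1.0)
    return np.linalg.norm(H @ H.T / r - S, "fro") ** 2

def ksh_objective_sum_form(Kbar_l, A, S):
    """|| sum_k sgn(Kbar_l a_k) sgn(Kbar_l a_k)^T - r S ||_F^2, one bit at a time."""
    l, r = Kbar_l.shape[0], A.shape[1]
    total = np.zeros((l, l))
    for k in range(r):
        h_k = np.where(Kbar_l @ A[:, k] >= 0, 1.0, -1.0)
        total += np.outer(h_k, h_k)
    return np.linalg.norm(total - r * S, "fro") ** 2

rng = np.random.default_rng(2)
l, m, r = 30, 8, 4
Kbar_l = rng.normal(size=(l, m))                 # stand-in for centered kernel features
labels = rng.integers(0, 3, size=l)              # random class labels (illustrative)
S = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
A = rng.normal(size=(m, r))

q = ksh_objective(Kbar_l, A, S)
q_sum = ksh_objective_sum_form(Kbar_l, A, S)     # equals r**2 * q
```

The sum form makes the contribution of each bit explicit, which is what the greedy procedure exploits.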
2. Greedy Optimization
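The above objective function inspires a greedy idea for solving the $a_k$'s sequentially: at each step, only one vector $a_k$ is solved, given the previously solved vectors $a_1^*, \dots, a_{k-1}^*$.
Define the residue matrix: $R_{k-1} = rS - \sum_{t=1}^{k-1} \mathrm{sgn}(\bar{K}_l a_t^*)\, \mathrm{sgn}(\bar{K}_l a_t^*)^\top$, with $R_0 = rS$.
Thus, the objective function for $a_k$ can be rewritten as:
$$\left\| \mathrm{sgn}(\bar{K}_l a_k)\, \mathrm{sgn}(\bar{K}_l a_k)^\top - R_{k-1} \right\|_F^2 = -2\, \mathrm{sgn}(\bar{K}_l a_k)^\top R_{k-1}\, \mathrm{sgn}(\bar{K}_l a_k) + l^2 + \mathrm{tr}(R_{k-1}^2)$$
where $l^2 = \left( \mathrm{sgn}(\bar{K}_l a_k)^\top \mathrm{sgn}(\bar{K}_l a_k) \right)^2$ and $\mathrm{tr}(R_{k-1}^2)$ are constants.
The equivalent optimization problem:
$$\min_{a_k} g(a_k) = -\mathrm{sgn}(\bar{K}_l a_k)^\top R_{k-1}\, \mathrm{sgn}(\bar{K}_l a_k)$$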
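The expansion behind the equivalent problem can be verified numerically. In the sketch below, the previously solved code columns and the candidate column `b` are random sign vectors standing in for $\mathrm{sgn}(\bar{K}_l a_t^*)$ and $\mathrm{sgn}(\bar{K}_l a_k)$; all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
l, r = 25, 6
labels = rng.integers(0, 2, size=l)
S = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)

# Pretend two bits were already solved; each is a column of signs.
solved = [rng.choice([-1.0, 1.0], size=l) for _ in range(2)]
R = r * S - sum(np.outer(h, h) for h in solved)      # residue matrix

b = rng.choice([-1.0, 1.0], size=l)                  # candidate code column
lhs = np.linalg.norm(np.outer(b, b) - R, "fro") ** 2
rhs = -2.0 * (b @ R @ b) + l ** 2 + np.trace(R @ R)  # expansion, using symmetric R
```

Only the term `b @ R @ b` depends on the candidate column, so minimizing the Frobenius distance is the same as maximizing that quadratic form.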
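2.1 Spectral Relaxation (remove sign() directly)
Based on the above, dropping the sign functions and fixing the norm of the relaxed code gives the following objective function, which is a generalized eigenvalue problem:
$$\max_{a_k} (\bar{K}_l a_k)^\top R_{k-1} (\bar{K}_l a_k) \quad \text{s.t.} \quad (\bar{K}_l a_k)^\top (\bar{K}_l a_k) = l$$
Setting the gradient of the Lagrangian to zero shows that $a_k$ is a generalized eigenvector of the matrix pair $(\bar{K}_l^\top R_{k-1} \bar{K}_l,\ \bar{K}_l^\top \bar{K}_l)$, thus
$$\bar{K}_l^\top R_{k-1} \bar{K}_l\, a_k = \lambda\, \bar{K}_l^\top \bar{K}_l\, a_k$$
Based on which, the objective equals $\lambda l$, so the solution $a_k^0$ is the eigenvector associated with the largest eigenvalue $\lambda_{\max}$, scaled to satisfy the constraint, and the maximum value is $\lambda_{\max}\, l$.
But it might deviate far away from the optimal solution under larger $l$ (e.g., ≥ 5,000) due to the amplified relaxation error. It is therefore used as the initialization to a more principled optimization scheme as follows.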
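A minimal NumPy sketch of the spectral initialization is given below. It reduces the generalized eigenvalue problem to an ordinary symmetric one through a Cholesky factor; the small `ridge` term, the function name `spectral_init`, and the random stand-in for $\bar{K}_l$ are assumptions of this sketch. If SciPy is available, `scipy.linalg.eigh(M, B)` solves the same generalized problem directly.

```python
import numpy as np

def spectral_init(Kbar_l, R, ridge=1e-8):
    """Top generalized eigenvector of (K^T R K, K^T K), scaled so ||K a||^2 = l."""
    l, m = Kbar_l.shape
    M = Kbar_l.T @ R @ Kbar_l
    B = Kbar_l.T @ Kbar_l + ridge * np.eye(m)    # ridge keeps B positive definite (assumption)
    L = np.linalg.cholesky(B)                    # B = L L^T
    Linv = np.linalg.inv(L)
    C = Linv @ M @ Linv.T                        # ordinary symmetric problem C v = lambda v
    eigvals, eigvecs = np.linalg.eigh((C + C.T) / 2.0)   # eigenvalues in ascending order
    a = Linv.T @ eigvecs[:, -1]                  # map back: a = L^{-T} v
    a = a * np.sqrt(l / np.sum((Kbar_l @ a) ** 2))       # enforce (K a)^T (K a) = l
    return a, eigvals[-1]

rng = np.random.default_rng(4)
l, m, r = 40, 6, 8
Kbar_l = rng.normal(size=(l, m))                 # stand-in for centered kernel features
labels = rng.integers(0, 2, size=l)
S = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
R0 = r * S                                       # residue before any bit is solved

a0, lam = spectral_init(Kbar_l, R0)
z = Kbar_l @ a0
relaxed_value = z @ R0 @ z                       # approximately lam * l
```

The returned vector only initializes the bit; the signs of `Kbar_l @ a0` need not minimize the original discrete objective.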
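2.2 Sigmoid Smoothing (replace sign() with a sigmoid function)
Define the sigmoid function: $\varphi(x) = \frac{2}{1 + \exp(-x)} - 1$, applied elementwise. It maps into $(-1, 1)$ and is a smooth surrogate for $\mathrm{sgn}(x)$.
The objective function is:
$$\min_{a_k} \tilde{g}(a_k) = -\varphi(\bar{K}_l a_k)^\top R_{k-1}\, \varphi(\bar{K}_l a_k)$$
Use gradient descent to solve the above optimization problem, with the gradient
$$\nabla \tilde{g}(a_k) = -\bar{K}_l^\top \left( (R_{k-1} b_k) \odot (\mathbf{1} - b_k \odot b_k) \right), \quad b_k = \varphi(\bar{K}_l a_k)$$
where $\odot$ denotes the elementwise product.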
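The smoothed cost and its gradient can be sketched as below, assuming a symmetric residue matrix. The identity $\varphi(x) = \tanh(x/2)$ is used for numerical stability. The backtracking step-size rule, the iteration count, and the function names are assumptions of this sketch rather than part of the method description.

```python
import numpy as np

def phi(x):
    """Sigmoid 2 / (1 + exp(-x)) - 1, written as tanh(x / 2) for stability."""
    return np.tanh(x / 2.0)

def smooth_cost(a, Kbar_l, R):
    """Smoothed cost: -phi(K a)^T R phi(K a)."""
    b = phi(Kbar_l @ a)
    return -(b @ R @ b)

def smooth_grad(a, Kbar_l, R):
    """Gradient -K^T ((R b) * (1 - b * b)), valid for symmetric R."""
    b = phi(Kbar_l @ a)
    return -Kbar_l.T @ ((R @ b) * (1.0 - b * b))

def gradient_descent(a0, Kbar_l, R, iters=100, step=1.0):
    """Gradient descent with step halving until the cost strictly decreases."""
    a, cost = a0.copy(), smooth_cost(a0, Kbar_l, R)
    for _ in range(iters):
        g = smooth_grad(a, Kbar_l, R)
        t = step
        while t > 1e-12:
            cand = a - t * g
            cand_cost = smooth_cost(cand, Kbar_l, R)
            if cand_cost < cost:
                a, cost = cand, cand_cost
                break
            t *= 0.5
        else:
            break                                  # no improving step was found
    return a, cost

rng = np.random.default_rng(5)
l, m, r = 30, 5, 8
Kbar_l = 0.5 * rng.normal(size=(l, m))             # stand-in for centered kernel features
labels = rng.integers(0, 2, size=l)
S = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
R = r * S
a_init = rng.normal(size=m)
a_opt, final_cost = gradient_descent(a_init, Kbar_l, R)
```

Any first-order or quasi-Newton solver that accepts a cost and a gradient callback could replace the simple loop here.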
3. The pseudo-code of the algorithm
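As a rough sketch of how the steps of Section 2 fit together, the Python code below solves the bits one at a time: spectral initialization, refinement on the sigmoid-smoothed cost, then a residue update. Several details are assumptions of this sketch: the ridge term, the step-halving refinement, keeping whichever of the initial and refined vectors scores better on the discrete cost, the RBF bandwidth, and the toy two-cluster data.

```python
import numpy as np

def sgn(x):
    return np.where(x >= 0, 1.0, -1.0)

def phi(x):
    return np.tanh(x / 2.0)                      # equals 2 / (1 + exp(-x)) - 1

def spectral_init(K, R, ridge=1e-8):
    """Top generalized eigenvector of (K^T R K, K^T K), scaled so ||K a||^2 = l."""
    l, m = K.shape
    L = np.linalg.cholesky(K.T @ K + ridge * np.eye(m))
    Linv = np.linalg.inv(L)
    C = Linv @ (K.T @ R @ K) @ Linv.T
    _, vecs = np.linalg.eigh((C + C.T) / 2.0)
    a = Linv.T @ vecs[:, -1]
    return a * np.sqrt(l / np.sum((K @ a) ** 2))

def refine(a, K, R, iters=50):
    """Descend the smoothed cost -phi(Ka)^T R phi(Ka) with step halving."""
    def cost(v):
        b = phi(K @ v)
        return -(b @ R @ b)
    best = cost(a)
    for _ in range(iters):
        b = phi(K @ a)
        g = -K.T @ ((R @ b) * (1.0 - b * b))
        t = 1.0
        while t > 1e-12 and cost(a - t * g) >= best:
            t *= 0.5
        if t <= 1e-12:
            break                                # no improving step was found
        a = a - t * g
        best = cost(a)
    return a

def train_ksh(K, S, r):
    """Greedy training: returns A (m x r), one weight vector per bit."""
    R = r * S                                    # R_0 = r S
    A = np.zeros((K.shape[1], r))
    for k in range(r):
        a0 = spectral_init(K, R)
        a1 = refine(a0, K, R)
        h0, h1 = sgn(K @ a0), sgn(K @ a1)
        # Keep the vector whose sign code has the larger h^T R h (assumption of this sketch).
        a, h = (a0, h0) if h0 @ R @ h0 > h1 @ R @ h1 else (a1, h1)
        A[:, k] = a
        R = R - np.outer(h, h)                   # residue update
    return A

# Toy data: two well-separated clusters, all samples labeled.
rng = np.random.default_rng(6)
l, m, r = 40, 6, 4
y = np.repeat([1.0, -1.0], l // 2)
X = 3.0 * y[:, None] + 0.3 * rng.normal(size=(l, 2))
support = X[rng.choice(l, size=m, replace=False)]
sq = ((X[:, None, :] - support[None, :, :]) ** 2).sum(axis=2)
K = np.exp(-sq / (2.0 * 3.0 ** 2))               # RBF kernel, sigma = 3 (assumption)
K = K - K.mean(axis=0)                           # zero-center each kernel feature
S = np.outer(y, y)                               # +1 same cluster, -1 otherwise

A = train_ksh(K, S, r)
H = sgn(K @ A)                                   # l x r code matrix
Q = np.linalg.norm(H @ H.T / r - S, "fro") ** 2
```

On this easy toy problem every bit ends up separating the two clusters, so the objective `Q` reaches zero; real data will generally leave a nonzero residue.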
4. libLBFGS optimization library
libLBFGS is a C implementation of the Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method; it can be used to solve the smooth optimization problem above.
5. Reference
1. http://www.ee.columbia.edu/~wliu/CVPR12_ksh.pdf
2. http://www.sanjivk.com/SSH_CVPR10.pdf
3. http://www.chokkan.org/software/liblbfgs/
4. http://www.sanjivk.com/EECS6898/ApproxNearestNeighbors_1.pdf
5. http://www.sanjivk.com/EECS6898/ApproxNearestNeighbors_2.pdf