1. Goal
This paper mainly deals with sparse principal component analysis (PCA) using a subspace method, i.e. it works with the subspace spanned by the leading PCs rather than with individual PCs.
2. Theory
2.1 How to get their formulation
Notation: the eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$ of the covariance matrix $\Sigma$ are in decreasing order.
From Ky Fan's maximum principle, we know that

$$\sum_{i=1}^{d} \lambda_i \;=\; \max_{V \in \mathbb{R}^{p \times d},\, V'V = I_d} \operatorname{tr}(V'\Sigma V) \;=\; \max_{V'V = I_d} \operatorname{tr}(\Sigma\, VV').$$
If we regard the last formula as a function of $VV'$, it is linear. Since a linear function attains its maximum over a compact set at an extreme point, changing the constraint set to its convex hull does not change the optimization problem. From the less well known observation that

$$\operatorname{conv}\{VV' : V \in \mathbb{R}^{p \times d},\ V'V = I_d\} \;=\; \{H : 0 \preceq H \preceq I_p,\ \operatorname{tr}(H) = d\} \;=:\; \mathcal{F}^d,$$

where $\mathcal{F}^d$ is known as the Fantope,
from all the analysis, we get

$$\sum_{i=1}^{d} \lambda_i \;=\; \max_{H \in \mathcal{F}^d} \operatorname{tr}(\Sigma H).$$
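As a quick sanity check on these identities, here is a minimal numpy sketch (my own illustration, not code from the paper): it verifies that $\operatorname{tr}(\Sigma\, VV')$ at the top-$d$ eigenvectors equals the sum of the top $d$ eigenvalues, and that the maximizer $H = VV'$ lies in the Fantope.

```python
import numpy as np

# Sanity check of the identities above (illustration, not the paper's code).
rng = np.random.default_rng(0)
p, d = 10, 3
A = rng.standard_normal((p, p))
Sigma = A @ A.T                        # a random symmetric PSD matrix

evals, evecs = np.linalg.eigh(Sigma)   # eigenvalues in increasing order
V = evecs[:, -d:]                      # top-d eigenvectors, V'V = I_d
H = V @ V.T                            # the projector VV'

print(np.sum(evals[-d:]))              # sum of the top-d eigenvalues ...
print(np.trace(Sigma @ H))             # ... equals tr(Sigma VV')

# H is an extreme point of the Fantope: eigenvalues in [0, 1], trace = d.
h_evals = np.linalg.eigvalsh(H)
print(np.all((h_evals > -1e-9) & (h_evals < 1 + 1e-9)),
      np.isclose(np.trace(H), d))
```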
How do we introduce sparsity, and which norm is suitable? The goal of this paper is to obtain sparse PCs, so we should choose a penalty that makes the solution $V^* \in \mathbb{R}^{p \times d}$ sparse. For a matrix, there are two common ways to define sparsity (illustrated in the sketch after this list):
- Column-wise sparsity: for a matrix $A$, each of its columns is sparse, i.e. only a few elements of each column $A_{*i}$ are nonzero.
- Row sparsity: for a matrix $A$, only a few of its rows are nonzero, which produces group sparsity (a whole row of coefficients is kept or dropped together).
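To make the two patterns concrete, here is a small numpy illustration with hypothetical matrices (not from the paper); it also computes the group norms over rows used as penalties in the next paragraph.

```python
import numpy as np

# Column-wise sparsity: each column has few nonzeros, but the nonzero
# positions differ across columns, so many rows are still active.
A_col = np.array([[1.0, 0.0],
                  [0.0, 2.0],
                  [3.0, 0.0],
                  [0.0, 4.0],
                  [0.0, 0.0]])

# Row sparsity: entire rows are zero, so the same few variables are
# active in every column (group sparsity over rows).
A_row = np.array([[1.0, 2.0],
                  [0.0, 0.0],
                  [3.0, 4.0],
                  [0.0, 0.0],
                  [0.0, 0.0]])

# Group norms over rows.
row_norms = np.linalg.norm(A_row, axis=1)
print("||A||_{2,0} =", np.count_nonzero(row_norms))  # number of nonzero rows
print("||A||_{2,1} =", row_norms.sum())              # sum of row l2 norms
```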
For sparse PCA, to select the important features, this paper uses row sparsity. An intuitive penalty is $\|V\|_{2,0}$, the number of nonzero rows of $V$. But in the high dimensional situation, optimizing with the $\ell_0$-type norm is NP hard. A common trick is to replace $\ell_0$ with $\ell_1$, so the penalty becomes $\|V\|_{2,1}$, the sum of the row-wise $\ell_2$ norms. But our model is a function of $H = VV'$, so the question becomes: what sparsity penalty on $H$ approximates the row sparsity of $V$ well?
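Note that $H_{ij} = \langle V_{i*}, V_{j*} \rangle$, so a zero row of $V$ zeroes out the corresponding row and column of $H$; a row sparse $V$ therefore makes $H = VV'$ entrywise sparse. The following numpy sketch (again my own illustration, not the paper's code) demonstrates this:

```python
import numpy as np

# Row sparse V  =>  entrywise sparse H = VV'.
rng = np.random.default_rng(1)
p, d, s = 8, 2, 3                       # p variables, d PCs, s active rows
V = np.zeros((p, d))
V[:s], _ = np.linalg.qr(rng.standard_normal((s, d)))  # orthonormal columns
H = V @ V.T                             # H_ij = <V_i*, V_j*>

print("nonzero rows of V:", np.count_nonzero(np.linalg.norm(V, axis=1)))
print("nonzero entries of H:", np.count_nonzero(np.abs(H) > 1e-12),
      "out of", p * p)                  # only the s-by-s active block
print("entrywise l1 norm of H:", np.abs(H).sum())
```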