Some Notes on PCA EigenFaces

References:

http://laid.delanover.com/explanation-face-recognition-using-eigenfaces/

https://www.learnopencv.com/eigenface-using-opencv-c-python/

http://jmcspot.com/Eigenface/

Question 1:

What are Eigenfaces?

Reshape eigenvectors to obtain EigenFaces: the eigenvectors obtained this way will have a length of 30k if our dataset contains images of size 100 x 100 x 3. We can reshape each eigenvector into a 100 x 100 x 3 image to obtain an EigenFace.
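As a minimal illustration of that reshaping step (a NumPy sketch, with a random placeholder vector standing in for a real eigenvector):

```python
import numpy as np

# Placeholder for one eigenvector of length 100 * 100 * 3 = 30,000,
# e.g. a single column of the eigenvector matrix.
eigenvector = np.random.rand(30000)

# Reshape it back into image form so it can be displayed as an EigenFace.
eigenface = eigenvector.reshape(100, 100, 3)
print(eigenface.shape)  # (100, 100, 3)
```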

 

Question 2:

It seems strange at first to see \frac{1}{M} A^{T} A used to compute the eigenfaces, because the most common way to compute eigenvectors is through the covariance matrix. For a set of normalized face images arranged as an N x M matrix A (N is the number of pixels in one face image, M is the number of face images in A), the covariance matrix is C = \frac{1}{M} A A^{T}. But C computed this way is N x N, so finding its eigenvectors is very expensive. That is why, on various blogs, you will see \frac{1}{M} A^{T} A used to compute the eigenfaces instead. Why is this substitution valid?

Explanation 1:

This can be understood through the Singular Value Decomposition (SVD).

A = U \Sigma V^{T}                                                                               (1)

A A^{T} = U \Sigma V^{T} V \Sigma^{T} U^{T} = U (\Sigma \Sigma^{T}) U^{T} = U \Sigma^{2} U^{T}                (2)

A^{T} A = V \Sigma^{T} U^{T} U \Sigma V^{T} = V (\Sigma^{T} \Sigma) V^{T} = V \Sigma^{2} V^{T}                (3)

From equations (1), (2), and (3) we can see that:

1. The eigenfaces are the columns of U.

2. A A^{T} and A^{T} A have the same nonzero eigenvalues.

3. By computing the eigenvalues \Sigma^{2} and eigenvectors V of A^{T} A and substituting them into (1), we can obtain the eigenfaces U.

In summary, we can work with A^{T} A instead, and recover the eigenfaces U as U = A V \Sigma^{-1}.
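A minimal NumPy sketch of this trick (a random placeholder matrix stands in for the mean-centered face matrix A), checking both that the small decomposition yields orthonormal eigenfaces and that they are indeed eigenvectors of A A^{T}:

```python
import numpy as np

N, M = 1000, 20                      # N pixels per face, M faces (N >> M)
A = np.random.rand(N, M)             # placeholder for the mean-centered face matrix

# Eigen-decompose the small M x M matrix A^T A instead of the N x N matrix A A^T.
eigvals, V = np.linalg.eigh(A.T @ A)
order = np.argsort(eigvals)[::-1]    # sort by decreasing eigenvalue
eigvals, V = eigvals[order], V[:, order]

# Recover the eigenfaces: U = A V Sigma^{-1}, where Sigma = diag(sqrt(eigenvalues)).
sigma = np.sqrt(eigvals)
U = A @ V / sigma                    # broadcasting scales each column separately

# Columns of U are orthonormal eigenvectors of A A^T.
print(np.allclose(U.T @ U, np.eye(M)))                          # True
print(np.allclose((A @ A.T) @ U[:, 0], eigvals[0] * U[:, 0]))   # True
```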

 

Explanation 2:

This explanation is given in the paper "Eigenfaces for Recognition" by Matthew Turk and Alex Pentland. The key observation there: if v_{i} is an eigenvector of A^{T} A with eigenvalue \mu_{i}, then left-multiplying A^{T} A v_{i} = \mu_{i} v_{i} by A gives A A^{T} (A v_{i}) = \mu_{i} (A v_{i}), so A v_{i} is an eigenvector of the covariance matrix A A^{T} with the same eigenvalue.

 

Reposted article (source: http://jmcspot.com/Eigenface/):

Steps to Create Eigenfaces:

 

1) Load your training faces.

You can get faces from the PubFig: Public Figures Face Database [7] and the LFWcrop Face Dataset [8]. The second site offers a cropped version of the first. You can also get your faces from other places.

This part is completely up to you. I am using gray-scaled faces from the LFWcrop Face Dataset [8] website. They are in PGM file format, which I read with ShaniSoft's class object. It is not very difficult to read the file myself, but I prefer to use existing solutions to avoid introducing new bugs. Existing solutions are usually more reliable and more efficient than anything we would write from scratch.

I also include a feature to convert color BMP, JPEG, and GIF images into gray-scale images, although it is slower than reading images that are already gray-scaled.

 

2) Convert your images into column vectors.

Put each image's pixels in an array and make it one column of a matrix. If you have 30 faces, you will have 30 columns. If each face has 100 gray-scaled pixels, you will have 100 rows.

Now your matrix should be [Image1, Image2, Image3, ..., ImageM]:

 

Image1 Pixel 1   Image2 Pixel 1   Image3 Pixel 1   Image4 Pixel 1   .....   ImageM Pixel 1
Image1 Pixel 2   Image2 Pixel 2   Image3 Pixel 2   Image4 Pixel 2   .....   ImageM Pixel 2
Image1 Pixel 3   Image2 Pixel 3   Image3 Pixel 3   Image4 Pixel 3   .....   ImageM Pixel 3
   .....            .....            .....            .....         .....      .....
Image1 Pixel N   Image2 Pixel N   Image3 Pixel N   Image4 Pixel N   .....   ImageM Pixel N
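A sketch of steps 1 and 2 together (NumPy plus the Pillow image reader; the folder path and file pattern are placeholders, not from the original C# project):

```python
import glob

import numpy as np
from PIL import Image  # Pillow; any image reader that yields pixel arrays works

# Hypothetical folder of equally sized face images, e.g. from LFWcrop.
paths = sorted(glob.glob("faces/*.pgm"))

columns = []
for path in paths:
    img = Image.open(path).convert("L")                         # force gray scale
    columns.append(np.asarray(img, dtype=np.float64).ravel())   # flatten to one column

# Stack so that column j holds all N pixels of image j: A is N x M.
A = np.stack(columns, axis=1)
print(A.shape)  # (pixelCount N, imageCount M)
```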

 

 

3) Calculate the mean.

Imagine merging all your columns into one: add all the image columns together pixel-by-pixel (row-by-row), then divide by M, the total number of images. You should end up with a single column containing the mean pixels. This is what we call the "Mean Face".

Now your column should have:

 

Mean Image Pixel 1
Mean Image Pixel 2
Mean Image Pixel 3

.....

.....

Mean Image Pixel N

 

 

4) Reduce the matrix created in step 2 using the mean from step 3.

For each column in the matrix, subtract the corresponding mean pixel from each image pixel value.

WHY? Before we can perform SVD, the data needs to be centered at the origin. Only then can we perform the matrix rotation and scaling. Rotation and scaling happen about the center of the coordinate system, so we need to move our dataset to the center.
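In the NumPy sketch, steps 3 and 4 are one line each thanks to broadcasting (the random matrix is a placeholder for the step-2 matrix):

```python
import numpy as np

A = np.random.rand(10000, 30)              # placeholder for the N x M matrix from step 2

mean_face = A.mean(axis=1, keepdims=True)  # step 3: the N x 1 "Mean Face" column
A_centered = A - mean_face                 # step 4: center every image column
```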

 

5) Calculate Covariance Matrix C

C = (A)(A_Transpose), where A is your matrix from step 4.

I did this with the IL Numerics library. Remember that if you are using IL Numerics, converting a 2-D array from C# gives you the transpose of your original array. Make sure you double-check that your matrix has the correct pixelCount-by-imageCount dimensions.

 

Covariance Matrix vs. Correlation Matrix

In statistics, the correlation matrix is preferred because no column is biased over another. Here we don't need a correlation matrix, because our covariance matrix already serves that purpose: all our columns share the same range of 0 to 255 gray-scale values. If our columns had different ranges, we would need to normalize them, which is one of the steps of building a correlation matrix.

 

 

6) Calculate SVD on matrix C.

There is not much to say: we use the algorithm provided by an existing library, such as svd(C). You can implement SVD yourself, but it is not recommended. An existing library will be faster, more accurate, more reliable, and less resource-consuming than an implementation written from scratch.

Here, we use U as the eigen-vectors.
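Steps 5 and 6 in the sketch (the placeholder matrix is small; for real face sizes this N x N covariance matrix is exactly what makes the direct method slow):

```python
import numpy as np

A_centered = np.random.rand(1000, 20)  # placeholder: centered N x M matrix from step 4
M = A_centered.shape[1]

C = (A_centered @ A_centered.T) / M    # step 5: N x N covariance matrix
U, S, Vt = np.linalg.svd(C)            # step 6: columns of U are the eigen-vectors
eigenfaces = U[:, :M]                  # only the first M eigenfaces are meaningful
```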

 

7) Alternative Methods.

Doing steps 5 and 6 when you have many pixels and few images is slow, because we have an N^2 = pixelCount * pixelCount matrix to decompose, which is much slower than the much smaller M^2 = imageCount * imageCount matrix. You will usually have more pixels than images, so you can use this alternative method presented by M. Turk and A. Pentland [9]. If you have more images than pixels, use steps 5 and 6 instead.

The argument is that we only need to keep M eigen-vectors to capture most of the features. So even though we don't use all N eigen-vectors, M is enough. In practice, we often use even fewer than M.

7.1)

Instead of creating covariance matrix

C = (A)(A_Transpose)

We do

C_small = (A_Transpose)(A)

7.2)

Perform SVD on C_small.

Here, use V as the eigen-vectors "temporarily". We need to do extra work to get the desired eigen-vectors because we didn't use steps 5 and 6.

7.3)

Create a new eigenValuesPowerNHalf by

eigenValuesPowerNHalf = eigenValues^(-0.5)

7.4)

Create the new eigen-vectors that we actually want.

For each eigenvectorV[i]

matrixMultiplied[i] = (A)(eigenvectorV[i])

eigenVectorU[i] = matrixMultiplied[i] * eigenValuesPowerNHalf[i]

End loop

The eigenVectors U are the eigen-vectors we want.

7.5)

Normalize the eigen-vectors by their vector length (Euclidean norm). This step is often only implicitly described in other papers.

For each eigenVectorU[i] in eigenVectorU

sumOfSquares = 0

For each valueA in eigenVectorU[i]

sumOfSquares = sumOfSquares + valueA^2

End loop

vectorLength = SquareRoot(sumOfSquares)

eigenVectorU[i] = eigenVectorU[i] / vectorLength

End loop

Now we should have M normalized eigen-vectors U.
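A NumPy sketch of steps 7.1 through 7.5, mirroring the pseudocode above with vectorized operations (the random matrix is a placeholder; the eigen-vectors come out of 7.4 already close to unit length, so 7.5 only cleans up rounding):

```python
import numpy as np

A = np.random.rand(10000, 30)  # placeholder: centered N x M matrix, N >> M

# 7.1) build the small M x M matrix instead of the N x N covariance matrix.
C_small = A.T @ A

# 7.2) eigen-decompose it; the columns of V are the "temporary" eigen-vectors.
eigenValues, V = np.linalg.eigh(C_small)
order = np.argsort(eigenValues)[::-1]                  # largest eigenvalue first
eigenValues, V = eigenValues[order], V[:, order]

# 7.3) lambda^(-0.5) for each eigenvalue.
eigenValuesPowerNHalf = eigenValues ** -0.5

# 7.4) map back to pixel space: U[:, i] = (A @ V[:, i]) * lambda_i^(-0.5).
eigenVectorU = (A @ V) * eigenValuesPowerNHalf

# 7.5) normalize each eigen-vector to unit length.
eigenVectorU /= np.sqrt((eigenVectorU ** 2).sum(axis=0))

print(eigenVectorU.shape)  # (N, M): M normalized eigen-vectors U
```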

 

8) Visualize Eigenfaces.

Now you should have normalized eigen-vectors from step 6 or step 7. To create an image, you need to scale the values properly: the smallest value in your vector should be converted to 0, the largest to 255, and any value in between should be scaled linearly within the min-max range.

For example, newPixel[i] = 255 * (eigenVectorPixel[i] - min) / (max - min)
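The same scaling in the NumPy sketch (the random vector is a placeholder for a real eigen-vector):

```python
import numpy as np

eigenvector = np.random.randn(10000)  # placeholder eigen-vector from step 6 or 7

lo, hi = eigenvector.min(), eigenvector.max()
pixels = (255 * (eigenvector - lo) / (hi - lo)).astype(np.uint8)
# Reshape to the original image dimensions before displaying, e.g. pixels.reshape(100, 100).
```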

 

9) Reconstructing Faces.

First we compute the weight W[i] of the new input face for each EigenVector[i].

We do this by taking the dot product of the (mean-subtracted) new face column with each eigen-vector column. It is the same as multiplying by the transpose of the eigen-vector matrix. We get a single scalar W[i] per eigen-vector[i].

Because the eigen-vectors are normalized, we simply multiply each W[i] by EigenVector[i], sum the results, and add the mean face. Note that we add the mean face because we subtracted it at step 4; in order to get the actual face back, it must be added again.

Here we can see that we can reconstruct even untrained faces using the existing eigen-vectors, as long as we have enough training faces and enough eigen-vectors. The orientations of the faces are not affected here.
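Step 9 in the sketch, with placeholder shapes (the QR factorization just manufactures an orthonormal stand-in for the real eigen-vectors):

```python
import numpy as np

N, K = 10000, 20
U = np.linalg.qr(np.random.randn(N, K))[0]  # placeholder: K orthonormal eigen-vectors
mean_face = np.random.rand(N)               # placeholder mean face from step 3
new_face = np.random.rand(N)                # incoming face as a column vector

# Weights: one dot product per eigen-vector, i.e. U^T (face - mean).
W = U.T @ (new_face - mean_face)

# Reconstruction: the weighted sum of eigen-vectors, plus the mean face added back.
reconstructed = mean_face + U @ W
```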

 

10) Recognition.

We use the same technique from step 9 to calculate the W weights. We obtain weights for the input image and for the trained images in our dataset. There are many ways to calculate the distance between them. One simple way is Euclidean distance, where we (see the sketch below):

  1. Calculate the sum of squares of the scaled-down differences.
    • sum(square( ( difference of each W[i] ) / pixel count ))
    • Notice we scale the weight difference down by the pixel count, so larger pictures still have the same scale.
  2. Take the square root of the sum.
  3. Divide it by the square root of the number of vectors used.
    • Similar to the previous argument: when we use more eigen-vectors, the distance is scaled up by them, so we need to scale it back down. Since we are using Euclidean distance, the scale factor is the square root of the number of dimensions.

 

We use Euclidean distance because the eigen-vectors are orthogonal to each other. If the vectors were not orthogonal, we could not use this approach.
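The distance computation sketched in NumPy (the pixel count, vector count, and weight vectors are placeholders):

```python
import numpy as np

pixel_count, K = 10000, 20
W_input = np.random.rand(K)    # weights of the input face (step 9)
W_trained = np.random.rand(K)  # weights of one trained face

diff = (W_input - W_trained) / pixel_count  # 1. scale the differences down by pixel count
dist = np.sqrt(np.sum(diff ** 2))           # 2. square root of the sum of squares
dist /= np.sqrt(K)                          # 3. scale down by sqrt(number of vectors used)
print(dist)
```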

My program returns the top-10 images that are considered close enough to the existing trained images. The threshold defaults to 0.05, but you can change it at any time in the provided GUI. However, I do not recommend increasing the threshold, as it will return faces that are not similar enough to the input face. The "Want Faces" row contains all the images trained for the selected person. Because this program will train more than one face per person when applicable, it has better recognition accuracy.

However, note that the faces used in this project are not well suited to recognition, because they have different orientations and facial expressions. The algorithm is one of the early facial-recognition techniques; it only gives adequate recognition results on more constrained images, such as passport photos where the faces all point in the same direction and carry a single type of facial expression. There are other techniques that address these problems, such as Fisherfaces [10] or Kernel Eigenfaces [11], but that is outside this project's objective.

 

Result:

My program always trains on the first image of a person. A person is identified by the file name; file names differing by more than two characters are considered different names. The remaining images are each added to the training set with a probability of 20%.

 

img3.1: Faces in gray are the first image of a person, and thus are always trained.

img3.2: Faces in green are randomly sampled training faces.

img3.3: Faces in red are not trained.

 

My project supports PGM, BMP, JPEG, and GIF. Other image formats may be supported, but that depends on the capabilities of .NET 2.0's System.Drawing.Bitmap. I recommend using PGM because the other formats are slower. For gray images in formats other than PGM, the program first randomly samples 10 pixels to determine whether the image is gray-scaled. If it is, the program uses the red channel as the gray-scale values. If it is a color image, the program spends more time converting each pixel to gray scale, using the 0.3, 0.59, 0.11 weighting factors instead of a naive average, because human vision is more sensitive to some colors than others. For more information about gray-scale conversion, please refer to the Wikipedia article on grayscale [12].

My program will train, reconstruct, and find the selected image/person. If you select gray or green faces, the program is expected to retrieve the exact same face from the training set. If you select red faces, try selecting more eigenfaces if the result is not satisfactory.
