I am trying to code a face-recognition program in Python (I am going to apply k-nn algorithm to classify).
First of all, I converted the images into greyscale and then I created a long column vector (by using Opencv's imagedata function) with the image's pixels (128x128= 16384 features total)
So I got a dataset like the following (last column is the class label and I only showed first 7 features of the dataset instead of 16384).
176, 176, 175, 175, 177, 173, 178, 1
162, 161, 167, 162, 167, 166, 166, 2
But when I apply k-nn to this dataset, I get awkward results. Do I need to apply additional processes to this dataset instead of just converting the image to pixel representation?
Thanks.
解决方案
Usually, a face recognition pipeline needs several stages in order to be effective. Some degree of geometric normalization is critical to accuracy. You either need to manually label fiducial points and acquire a transform for each image, or automatically detect fiducial points, for which there are open source fiducial point detectors. Try opencv's getAffineTransform function. Also, lighting discrepancies can cause huge problems. You might try lighting normalization techniques (e.g., self quotient image), as they work pretty well for diffuse reflection and shadows (not so much specular reflection). For dimensionality reduction, principal components analysis (PCA) or linear discriminant analysis (LDA) are good places to start. Rather than raw pixel features, though, you might consider more meaningful features like LBP, HOG, or SIFT. Also, you will be able to attain higher accuracy than KNN with more sophisticated (although more complicated) classifiers such as SVM's.