introduction
- FER systems can be divided into two main categories according to the feature representations: static image FER and dynamic sequence FER (i.e., whether spatio-temporal information is exploited).
- The majority of the traditional methods have used handcrafted features or shallow learning (e.g., local binary patterns (LBP) [12], LBP on three orthogonal planes (LBP-TOP) [15], non-negative matrix factorization (NMF) [19] and sparse learning [20]) for FER.
- However, many competitions have collected relatively sufficient training data from challenging real-world scenarios. Meanwhile, owing to dramatically increased chip processing power (e.g., GPUs) and well-designed network architectures, studies in various fields have begun to shift to deep learning methods.
database
deep facial expression recognition
1.pre-processing
- face alignment (face detection followed by landmark localization)
- Kim et al. [76] considered different inputs (original image and histogram equalized image) and different face detection models (V&J [72] and MoT [56]), and the landmark set with the highest confidence provided by the Intraface [73] was selected.
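The geometric core of landmark-based alignment is rotating the face so that the eye line becomes horizontal. A minimal NumPy sketch (the eye-coordinate inputs and the helper name are illustrative, not from any specific paper):

```python
import numpy as np

def eye_rotation_angle(left_eye, right_eye):
    """Return the in-plane rotation angle (degrees) needed to make the
    line through the two eye centers horizontal -- the first step of
    landmark-based face alignment."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return np.degrees(np.arctan2(dy, dx))

# e.g., eyes at (10, 20) and (30, 40) give a 45-degree tilt to undo
angle = eye_rotation_angle((10, 20), (30, 40))
```

In practice the image is then warped by a similarity transform (rotation plus scaling to a canonical inter-ocular distance) using this angle.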
- data augmentation (to enlarge the database)
- Data augmentation techniques can be divided into two groups: on-the-fly data augmentation and offline data augmentation.
- Usually, on-the-fly data augmentation is embedded in deep learning toolkits to alleviate overfitting. During the training step, the input samples are randomly cropped from the four corners and center of the image and then flipped horizontally.
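The four-corners-plus-center crop with horizontal flipping described above yields ten views per image. A self-contained NumPy sketch (crop size and helper name are illustrative):

```python
import numpy as np

def five_crop_flip(img, crop):
    """Crop the four corners and the center of an H x W x C image,
    then horizontally flip each crop: 10 augmented samples per image."""
    h, w = img.shape[:2]
    top_lefts = [(0, 0), (0, w - crop), (h - crop, 0), (h - crop, w - crop),
                 ((h - crop) // 2, (w - crop) // 2)]
    crops = [img[t:t + crop, l:l + crop] for t, l in top_lefts]
    crops += [c[:, ::-1] for c in crops]  # horizontal flips
    return crops

samples = five_crop_flip(np.zeros((48, 48, 3), dtype=np.uint8), 40)
# 10 crops, each 40 x 40 x 3
```

Deep learning toolkits (e.g., torchvision's `RandomCrop`/`RandomHorizontalFlip`) apply the same idea stochastically at every training step.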
- Besides the elementary on-the-fly data augmentation, various offline data augmentation operations have been designed to further expand data in both size and diversity. The most frequently used operations include random perturbations and transforms, e.g., rotation, shifting, skew, scaling, noise, contrast and color jittering. Furthermore, deep learning based technology can be applied for data augmentation, e.g., CNN- or GAN (generative adversarial network)-based synthesis.
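Three of the offline perturbations listed above (additive noise, contrast jitter, and a small shift) can be sketched in a few lines of NumPy; the magnitudes chosen here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb(img):
    """One offline augmentation pass: Gaussian noise, contrast jitter,
    and a random +/-3 px shift (wrap-around via np.roll)."""
    out = img.astype(np.float32)
    out += rng.normal(0.0, 5.0, out.shape)                          # noise
    out = (out - out.mean()) * rng.uniform(0.8, 1.2) + out.mean()   # contrast
    dy, dx = rng.integers(-3, 4, size=2)                            # shift
    out = np.roll(out, (dy, dx), axis=(0, 1))
    return np.clip(out, 0, 255).astype(np.uint8)
```

Running this several times per training image yields an enlarged, more diverse offline dataset.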
- face normalization (to ameliorate illumination and head-pose variation)
- illumination normalization
a.several algorithms: isotropic diffusion (IS)-based normalization, discrete cosine transform (DCT)-based normalization [85] and difference of Gaussian (DoG) filtering
b.homomorphic filtering based normalization & histogram equalization combined with illumination normalization etc.
c.a weighted summation approach combining histogram equalization and linear mapping (to avoid overemphasizing local contrast)
d.global contrast normalization (GCN), local normalization and histogram equalization.
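Two of the techniques above, histogram equalization and global contrast normalization, are simple enough to sketch directly in NumPy (a minimal illustration, not any paper's exact pipeline):

```python
import numpy as np

def hist_equalize(gray):
    """Histogram equalization of an 8-bit grayscale image: map pixel
    values through the normalized cumulative histogram."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum().astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # scale to [0, 1]
    return (cdf[gray] * 255).astype(np.uint8)

def global_contrast_normalize(gray, eps=1e-8):
    """GCN: subtract the per-image mean and divide by the per-image
    standard deviation, giving a zero-mean, unit-variance image."""
    x = gray.astype(np.float32)
    return (x - x.mean()) / (x.std() + eps)
```

Histogram equalization spreads intensities over the full dynamic range, while GCN standardizes each image so that illumination-driven brightness offsets do not dominate the learned features.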
- pose normalization
a.Specifically, after localizing facial landmarks, a 3D texture reference model generic to all faces is generated to efficiently estimate visible facial components.