Each folder should contain images of a single person. Clustering is performed on every folder using features extracted by a deep neural network. The number of cluster centers per folder is set to 2, and the center with more points is taken as the main center. Mislabeled face images are removed by comparing the distance between each image's extracted features and the main center: if the distance is larger than twice the average distance of the corresponding points, the image is removed. After clustering, most folders in the base set contain face images of only a single person.
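The per-folder cleaning rule described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes Euclidean distance and a plain Lloyd's 2-means with a deterministic farthest-point initialisation. It also reads "the average distance of the corresponding points" as the mean distance of the main cluster's members to the main center. None of these choices is stated in the text, and the names `two_means`, `filter_folder`, and `factor` are hypothetical.

```python
import numpy as np


def two_means(features, n_iter=50):
    """Plain Lloyd's k-means with k=2 (assumed; the text names no algorithm)."""
    # Deterministic initialisation: the first point and the point farthest from it.
    first = features[0]
    farthest = features[np.linalg.norm(features - first, axis=1).argmax()]
    centers = np.stack([first, farthest]).astype(float)
    for _ in range(n_iter):
        # Distance of every point to both centers, shape (n, 2).
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for k in range(2):
            if np.any(labels == k):  # keep the old center if a cluster is empty
                new_centers[k] = features[labels == k].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)


def filter_folder(features, factor=2.0):
    """Return a boolean mask: True for images to keep in one folder."""
    features = np.asarray(features, dtype=float)
    if len(features) < 2:
        return np.ones(len(features), dtype=bool)
    centers, labels = two_means(features)
    # The main center is the one that owns more points.
    main = np.bincount(labels, minlength=2).argmax()
    dist_to_main = np.linalg.norm(features - centers[main], axis=1)
    # Assumption: the reference distance is averaged over the main cluster only.
    avg = dist_to_main[labels == main].mean()
    # Remove images farther than `factor` times the average distance.
    return dist_to_main <= factor * avg
```

Here `features` would be an array with one row per image in a folder, holding the network's feature vector for that image; rows where the mask is `False` are the candidates for removal.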