Comparing PCA and K-means for Face Reconstruction on the Olivetti Faces Dataset

骆驼穿针眼

Last modified 2024-07-23 13:57:13


Category: Computer Vision and Deep Learning
Tags: computer vision, artificial intelligence

First published 2024-07-23 09:34:42

Article link: https://blog.csdn.net/weixin_55982578/article/details/140625467


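The Olivetti dataset

Brief information about the Olivetti dataset:

Capture period and background
- The face images were taken between April 1992 and April 1994.
- All face images were taken against a dark, homogeneous background.
Number of images and subjects
- The dataset contains images of 40 different people.
- Each person has 10 different images.
- There are 400 face images in total.
Image variability
- The images were taken at different times.
- Lighting conditions vary.
- Facial expressions and facial details vary.
Image characteristics
- The images are grayscale.
- Each image is 64x64 pixels.
- Pixel values are scaled to the [0, 1] interval.
Subject encoding
- The identities of the 40 people are encoded as integers from 0 to 39.

Kaggle download link:
https://www.kaggle.com/datasets/imrandude/olivetti

The complete code and the downloaded dataset are provided at the end of the article.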

Loading the dataset

import numpy as np
data=np.load("/content/drive/MyDrive/faces_dataset/olivetti_faces.npy")
target=np.load("/content/drive/MyDrive/faces_dataset/olivetti_faces_target.npy")

Print basic information about the dataset

print("There are {} images in the dataset".format(len(data)))
print("There are {} unique targets in the dataset".format(len(np.unique(target))))
print("Size of each image is {}x{}".format(data.shape[1],data.shape[2]))
print("Pixel values were scaled to [0,1] interval. e.g:{}".format(data[0][0,:4]))

There are 400 images in the dataset
There are 40 unique targets in the dataset
Size of each image is 64x64
Pixel values were scaled to [0,1] interval. e.g:[0.30991736 0.3677686 0.41735536 0.44214877]

Print the labels it contains

print("unique target number:",np.unique(target))
#unique target number: [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39]

Show the 10 face images of each selected subject, by label

import matplotlib.pyplot as plt

def show_10_faces_of_n_subject(images, subject_ids):
    cols = 10  # each subject has 10 distinct face images
    rows = len(subject_ids)  # one row of plots per subject

    # squeeze=False keeps axarr two-dimensional even when only one subject is given
    fig, axarr = plt.subplots(nrows=rows, ncols=cols, figsize=(18, 9), squeeze=False)

    for i, subject_id in enumerate(subject_ids):
        for j in range(cols):
            # images are stored subject by subject, 10 consecutive images each
            image_index = subject_id * 10 + j
            axarr[i, j].imshow(images[image_index], cmap="gray")
            axarr[i, j].set_xticks([])
            axarr[i, j].set_yticks([])
            axarr[i, j].set_title("face id:{}".format(subject_id))

# You can play around with subject_ids to see other people's faces
show_10_faces_of_n_subject(images=data, subject_ids=[1, 4, 25, 20, 39])

Reshaping the images for the machine learning model

# Flatten each 64x64 image into a 4096-element row for the machine learning model
X=data.reshape((data.shape[0],data.shape[1]*data.shape[2]))
print("X shape:",X.shape) #X shape: (400, 4096)
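Reconstructed faces come out of a model as flat rows, so they have to be reshaped back to 64x64 before they can be displayed. The sketch below illustrates that round trip on a synthetic array; `images`, `X_flat`, and `restored` are illustrative names, and the random values only stand in for the real face data.

```python
import numpy as np

# Synthetic stand-in for the (400, 64, 64) image array; random values, not real faces.
rng = np.random.default_rng(0)
images = rng.random((400, 64, 64), dtype=np.float32)

# Flatten each image into one row of 4096 features; -1 lets NumPy infer 64 * 64.
X_flat = images.reshape(len(images), -1)

# Restore the image shape, e.g. before passing a reconstructed row to imshow.
restored = X_flat.reshape(-1, 64, 64)

# Reshaping only changes the view of the data, so the round trip is lossless.
roundtrip_ok = np.array_equal(images, restored)
print(X_flat.shape, restored.shape, roundtrip_ok)
```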

Splitting into training, test, and validation sets

from sklearn.model_selection import train_test_split

# Assume the feature data X and the label data target already exist

# Split the data into a training set and a temporary set (70% training, 30% temporary)
X_train_temp, X_test_val, y_train_temp, y_test_val =
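The listing above is cut off in the source, so the author's exact arguments are not known. The sketch below is one possible way to carry out a 70/30 split followed by a second split of the temporary set. The 50/50 division into validation and test sets, the `stratify` argument, the `random_state` value, and all variable names are assumptions, and the data is synthetic.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 40 subjects x 10 images, 4096 features per image.
rng = np.random.default_rng(0)
X_demo = rng.random((400, 4096))
y_demo = np.repeat(np.arange(40), 10)

# Step 1: 70% training, 30% temporary. Stratifying on the labels keeps
# every subject represented in the training set.
X_tr, X_tmp, y_tr, y_tmp = train_test_split(
    X_demo, y_demo, test_size=0.3, stratify=y_demo, random_state=0
)

# Step 2 (assumed proportions): halve the temporary set into validation and test.
X_val, X_te, y_val, y_te = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=0
)

print(X_tr.shape, X_val.shape, X_te.shape)
```

With only 10 images per subject, a stratified split is a common choice because a purely random split can leave some subjects with very few training images.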


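KNN (K Nearest Neighbors) is one of the simplest and most commonly used classification algorithms. It is a supervised learning algorithm; although its name resembles that of the unsupervised K-means algorithm, the two are fundamentally different. The core idea of KNN is that, when predicting a new sample x, the class of x is decided from the classes of its K nearest neighbors in feature space. PCA (Principal Component Analysis) is one of the most common techniques for feature dimensionality reduction and the most basic unsupervised dimensionality-reduction algorithm.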
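The comparison named in the title can be sketched as follows: PCA reconstructs a sample by projecting it onto k principal components and mapping it back, while K-means reconstructs a sample by replacing it with the centroid of its cluster. This is a minimal illustration on synthetic data, not the author's code; the value of `k`, the array sizes, and all variable names are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Synthetic stand-in for flattened images (200 samples, 64 features), kept small for speed.
rng = np.random.default_rng(0)
X_demo = rng.random((200, 64))

k = 10  # assumed: number of principal components and number of clusters

# PCA reconstruction: project onto k components, then map back to the input space.
pca = PCA(n_components=k, random_state=0).fit(X_demo)
X_pca = pca.inverse_transform(pca.transform(X_demo))

# K-means reconstruction: replace every sample with its cluster centroid.
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_demo)
X_km = km.cluster_centers_[km.labels_]

# Mean squared reconstruction error of each method.
mse_pca = np.mean((X_demo - X_pca) ** 2)
mse_km = np.mean((X_demo - X_km) ** 2)
print("PCA MSE:", mse_pca, "K-means MSE:", mse_km)
```

For the same k, the PCA error should not exceed the K-means error on the data both were fitted to, because the k centroids lie in an affine subspace of dimension at most k-1, and PCA picks the best affine subspace of dimension k. On real face images, each reconstructed row would be reshaped to 64x64 for display.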