Facial keypoints detection: Kaggle competition series

3.2 Facial keypoints detection

The main task of this problem is to detect the locations of facial keypoints.

Detect the location of keypoints on face images

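Problem statement

In this problem the task is to compute the positions of the facial keypoints, i.e. each keypoint's coordinates expressed as a fraction of the image size.
The problem therefore amounts to regression onto values in the range [0, 1], and of course it is also a multi-output regression problem.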
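
As a small illustration of the rescaling involved, the sketch below maps pixel coordinates of a 96-pixel-wide image to a bounded range and back. It mirrors the `(y-48)/48.` and `predicted*48.+48.` steps in the source code further down, which target [-1, 1] rather than [0, 1]. The function names `to_unit` and `to_pixels` are hypothetical.

```python
import numpy as np

HALF = 48.0  # half of the 96-pixel image side

def to_unit(pixels):
    """Rescale pixel coordinates in [0, 96] to [-1, 1]."""
    return (np.asarray(pixels, dtype=np.float64) - HALF) / HALF

def to_pixels(unit):
    """Invert to_unit: map [-1, 1] back to pixel coordinates."""
    return np.asarray(unit, dtype=np.float64) * HALF + HALF

coords = np.array([0.0, 24.0, 48.0, 96.0])
scaled = to_unit(coords)        # [-1.0, -0.5, 0.0, 1.0]
restored = to_pixels(scaled)    # back to the original pixel values
```

Centering the targets around zero is a common choice for regression; either range works as long as predictions are mapped back with the inverse transform before submission.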

Each image is given together with the fractional positions of its 30 labels. The labels are:

1                            2                            3
left_eye_center_x            left_eye_center_y            right_eye_center_x
right_eye_center_y           left_eye_inner_corner_x      left_eye_inner_corner_y
left_eye_outer_corner_x      left_eye_outer_corner_y      right_eye_inner_corner_x
right_eye_inner_corner_y     right_eye_outer_corner_x     right_eye_outer_corner_y
left_eyebrow_inner_end_x     left_eyebrow_inner_end_y     left_eyebrow_outer_end_x
left_eyebrow_outer_end_y     right_eyebrow_inner_end_x    right_eyebrow_inner_end_y
right_eyebrow_outer_end_x    right_eyebrow_outer_end_y    nose_tip_x
nose_tip_y                   mouth_left_corner_x          mouth_left_corner_y
mouth_right_corner_x         mouth_right_corner_y         mouth_center_top_lip_x
mouth_center_top_lip_y       mouth_center_bottom_lip_x    mouth_center_bottom_lip_y

2140 of the images have complete labels, and each image is 96×96 pixels.

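Approach

The solution proceeds as follows:
Step 1. Choose a fitter (SVR or KernelRidge) and the corresponding kernel.
Step 2. Select the hyperparameters by cross-validation, enumerating candidate values.
Step 3. With the hyperparameters fixed, train the fitter on the full training set.
Step 4. Predict on the test set and write out the results.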
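
The four steps above can be sketched end to end on synthetic data. This is a minimal illustration, not the script used for the reported scores: the arrays `X`, `Y`, `X_test`, the candidate grid, and the helper `rmse` are all made up for the example.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.rand(120, 16)                       # synthetic stand-in for image pixels
W = rng.randn(16, 4)
Y = np.tanh(X @ W)                          # synthetic stand-in for 4 label columns
X_test = rng.rand(10, 16)

def rmse(pred, target):
    """Root-mean-square error over all elements."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))

# Step 1: choose the fitter family and kernel (KernelRidge with an RBF kernel).
# Step 2: enumerate candidate gamma values and score each on held-out splits.
candidates = np.logspace(-3, 0, 4)
scores = {}
for gamma in candidates:
    fold_scores = []
    for seed in range(3):                   # a different split per repetition
        X1, X2, Y1, Y2 = train_test_split(X, Y, test_size=0.3, random_state=seed)
        model = KernelRidge(kernel='rbf', gamma=gamma, alpha=1e-2)
        fold_scores.append(rmse(model.fit(X1, Y1).predict(X2), Y2))
    scores[gamma] = np.mean(fold_scores)
best_gamma = min(scores, key=scores.get)

# Step 3: refit on the full training set with the selected gamma.
final = KernelRidge(kernel='rbf', gamma=best_gamma, alpha=1e-2).fit(X, Y)

# Step 4: predict on the test set.
pred = final.predict(X_test)
print('best gamma:', best_gamma, 'prediction shape:', pred.shape)
```

Averaging over several differently seeded splits gives a less noisy score per candidate than a single split would.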

Experimental results

First idea:
Using 30 fitters, one per label, I got an RMSE of 3.48060.

Second idea:
Using 1 fitter to fit all 30 labels, I got an RMSE of 3.43998 [better].

Third idea:
Adding mirrored (symmetrical) training data led to abnormal results, such as predicted positions greater than 96.
So I can see that valid fitting results only cover [0, 96] (or [0, 1]).
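
One possible contributing factor, offered as an assumption rather than a diagnosis: the `imageSym` function in the source code negates the x coordinates but leaves every value in its original column, so after the flip a `left_eye_*` column holds a point that now sits on the other side of the face. The sketch below flips the images and also swaps the left/right columns. The helper `mirror` and the `SWAP_PAIRS` table are hypothetical, derived from the label order listed above, and were not part of the reported experiments.

```python
import numpy as np

# Keypoint indices (column pairs 2k, 2k+1) that trade places under a horizontal flip,
# based on the label order listed above (assumed to match the csv column order).
SWAP_PAIRS = [(0, 1), (2, 4), (3, 5), (6, 8), (7, 9), (11, 12)]

def mirror(X, y, side=96):
    """Flip images left-right and adjust labels that are scaled to [-1, 1]."""
    nX = X.reshape(-1, side, side)[:, :, ::-1].reshape(X.shape[0], -1)
    ny = y.copy()
    ny[:, 0::2] = -ny[:, 0::2]              # negate x, as imageSym does
    for a, b in SWAP_PAIRS:                 # swap left/right keypoints (x and y together)
        cols_a = [2 * a, 2 * a + 1]
        cols_b = [2 * b, 2 * b + 1]
        ny[:, cols_a + cols_b] = ny[:, cols_b + cols_a]
    return nX, ny

# Tiny check on made-up data: one 96x96 image and 15 keypoints.
X = np.arange(96 * 96, dtype=np.float32).reshape(1, -1)
y = np.linspace(-0.9, 0.9, 30, dtype=np.float32).reshape(1, -1)
mX, my = mirror(X, y)
```

Whether this changes the outcome of the third idea is untested here; it only shows what a label-consistent mirror would look like.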

Notes

Hyperparameter selection: gamma
for G in G_para:
    scores = list()
    for i in range(3):
        # use a different seed per repetition; a fixed seed repeats the same split 3 times
        X1, X2, y1, y2 = train_test_split(train_X, train_y, test_size=0.3, random_state=i)
        clf = KernelRidge(kernel='rbf', gamma=G, alpha=1e-2)
        pred = clf.fit(X1, y1).predict(X2)
        sco = calbais(pred, y2)
        scores.append(sco)
    print('G:', G, 'Score:', scores)
The tuning method and results for the 30 fitters are as follows:
Fitter: KernelRidge(kernel='rbf', gamma=2e-4, alpha=1e-2)
Fitting error per label on a 0.7:0.3 split of the training set:
[0] 0.7792    [10] 0.9744    [20] 1.0985
[1] 0.6383    [11] 0.7451    [21] 1.2300
[2] 0.7714    [12] 0.9513    [22] 1.2636
[3] 0.6482    [13] 0.9299    [23] 1.1784
[4] 0.7355    [14] 1.0870    [24] 1.2469
[5] 0.6005    [15] 1.1898    [25] 1.2440
[6] 0.9636    [16] 0.9012    [26] 0.9444
[7] 0.7063    [17] 0.9462    [27] 1.3718
[8] 0.7214    [18] 1.1349    [28] 0.9961
[9] 0.6089    [19] 1.1669    [29] 1.5076
pandas usage:
Count non-missing values: DataFrame.count()
Drop rows with missing values: DataFrame.dropna()
Split a string into numbers: Series = Series.apply(lambda im: numpy.fromstring(im, sep=' '))
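
The three calls above can be tried on a tiny made-up table laid out like training.csv (label columns followed by an `Image` column of space-separated pixel values). The data below is invented for illustration, and the string parsing uses `str.split` with `np.array`, which gives the same array as `np.fromstring(im, sep=' ')` for well-formed input.

```python
import numpy as np
import pandas as pd

# Made-up two-row table in the same layout as training.csv: label columns, then Image.
df = pd.DataFrame({
    'nose_tip_x': [44.0, np.nan],
    'nose_tip_y': [57.0, 60.0],
    'Image': ['0 128 255 64', '10 20 30 40'],
})

counts = df.count()            # non-missing values per column
complete = df.dropna()         # keep only rows with every label present

# Parse each space-separated pixel string into a float array.
images = complete['Image'].apply(lambda im: np.array(im.split(), dtype=np.float64))
X = np.vstack(images.values) / 255.0   # one row per image, pixels scaled to [0, 1]
```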
Worth noting:
Mirrored images do not seem to help when this problem is solved with the kernel ridge fitter.
Conclusion
Replacing the 30 fitters with a single fitter gives a better score.
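
A hedged aside on interpreting this comparison: scikit-learn's KernelRidge accepts a 2-D target, and for a fixed kernel and a scalar alpha its closed-form solution treats each target column independently. One multi-output fitter and 30 per-label fitters with identical hyperparameters should therefore give matching predictions. The two runs in the source code use different settings (gamma=2e-4, alpha=1e-2 versus gamma=6e-4, alpha=2e-2), which may account for the difference in score. The sketch below checks the equivalence on synthetic data; all array names are made up.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.RandomState(0)
X = rng.rand(60, 8)
Y = rng.rand(60, 3)                 # 3 synthetic label columns
X_new = rng.rand(5, 8)

params = dict(kernel='rbf', gamma=0.5, alpha=1e-2)

# One fitter trained on all columns at once.
joint = KernelRidge(**params).fit(X, Y).predict(X_new)

# One fitter per column, same hyperparameters.
separate = np.column_stack([
    KernelRidge(**params).fit(X, Y[:, j]).predict(X_new) for j in range(Y.shape[1])
])

max_gap = float(np.max(np.abs(joint - separate)))
print('largest difference:', max_gap)
```

The single fitter still has a practical advantage: the kernel matrix is built and factorized once instead of 30 times.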

Source code

import csv
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.utils import shuffle
from sklearn.svm import SVR
from sklearn.kernel_ridge import KernelRidge
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import cross_val_score, train_test_split

train_file = 'training.csv'         # training data
test_file = 'test.csv'              # test data: 1783 images
test_type = 'IdLookupTable.csv'     # test lookup table: row id, image id, label name

pd.set_option('chained_assignment',None)

# Read a csv file and return a pandas DataFrame
def csvFileRead(filename):
    print('Loading', filename)
    df = pd.read_csv(filename, header=0, encoding='GBK')
    print('Loaded')

    # Drop rows with missing values (training data only)
    if 'train' in filename:
        df = df.dropna()
    ''' Inspect the data
    print('\nTable shape: ', df.values.shape)
    print('Count per column:\n')
    print(df.count(), '\n')
    '''
    return df

# Save the results
def csvSave(filename, ids, predicted):
    # newline='' keeps the csv module from writing blank rows on Windows
    with open(filename, 'w', newline='') as mycsv:
        mywriter = csv.writer(mycsv)
        mywriter.writerow(['RowId','Location'])
        mywriter.writerows(zip(ids, predicted))

# Preprocess the training data
def preTrain():
    print('-----------------Training reading...-----------------')
    df = csvFileRead(train_file)

    print('Image: str -> ndarray')
    df.Image = df.Image.apply(lambda im: np.fromstring(im, sep=' '))
    print('Image transferred.\n')

    # problem: 7049*9046 MemoryError -> df.dropna()
    X = np.vstack(df.Image.values) / 255.
    X = X.astype(np.float32)    # astype returns a copy, so assign the result

    y = df[df.columns[:-1]].values
    y = (y-48)/48.              # rescale pixel coordinates from [0, 96] to [-1, 1]
    y = y.astype(np.float32)
    '''
    # Add artificially mirrored images
    print('Adding mirrored images...')
    X, y = imageSym(X, y)
    '''
    X, y = shuffle(X, y, random_state=42)

    # Map each label name to its column index
    yd = dict()
    for i in range(len(df.columns[:-1].values)):
        yd[df.columns[i]] = i

    return X, y, yd

# Preprocess the test data
def preTest():
    print('-----------------Test reading...-----------------')
    df = csvFileRead(test_file)

    print('Image: str -> ndarray')
    df.Image = df.Image.apply(lambda im: np.fromstring(im, sep=' '))
    print('Image transferred.\n')
    # Test images
    X = np.vstack(df.Image.values) / 255.
    X = X.astype(np.float32)    # astype returns a copy, so assign the result

    # What to predict: row id, image id, label name
    df = csvFileRead(test_type)
    RowId = df.RowId.values
    ImageId = df.ImageId.values - 1     # convert to a 0-based row index
    FeatureName = df.FeatureName.values

    return RowId, ImageId, FeatureName, X

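# Hand-made feature: mirrored images
# Note: only the x coordinates are negated; left/right label columns are not swapped.
def imageSym(X, y):
    nX = np.zeros(X.shape)
    ny = np.zeros(y.shape)
    for i in range(X.shape[0]):
        temp = X[i,:].reshape(96, 96)
        temp = temp[:,::-1]             # flip left-right
        nX[i,:] = temp.reshape(-1)
        ny[i,0::2] = -y[i,0::2]         # x coordinates change sign
        ny[i,1::2] = y[i,1::2]          # y coordinates are unchanged
    X = np.vstack((X, nX))
    y = np.vstack((y, ny))
    return X, y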

# Fit with 30 fitters
def modelfit(train_X, train_y, test_X, yd, ImageId, FeatureName):
    # Fitting code.

    # 30 fitters, one per label column
    n_clf = 30
    clfs = [KernelRidge(kernel='rbf', gamma=2e-4, alpha=1e-2) for _ in range(n_clf)]

    print('-----------------Training...------------------')

    # Hyperparameter candidates
    C_para = np.logspace(-2, 4, 7)      # SVR.C
    G_para = np.logspace(-4, -3, 6)     # kernel = 'rbf'.gamma
    A_para = np.logspace(-3, 1, 5)      # KernelRidge.alpha

    # Train
    for i in range(n_clf):
        print('Training', i, 'clf...')
        clfs[i].fit(train_X, train_y[:,i])

    # Print the training error
    predict = np.zeros([train_y.shape[0], 30]).astype(np.float32)
    for i in range(n_clf):
        predict[:,i] = clfs[i].predict(train_X)
    print(calbais(predict, train_y))
    print()

    print('-----------------Predicting...------------------')

    # Predict
    pred = np.zeros([test_X.shape[0], 30]).astype(np.float32)
    for i in range(n_clf):
        pred[:,i] = clfs[i].predict(test_X)
    # Look up the value requested by each row of IdLookupTable.csv
    predicted = np.zeros(len(FeatureName))
    for i in range(len(FeatureName)):
        if i % 500 == 0:
            print('i =', i)
        imageID = ImageId[i]
        clfID = yd[FeatureName[i]]
        predicted[i] = pred[imageID, clfID]
    predicted = predicted*48.+48.       # back to pixel coordinates

    return predicted

# A single fitter that fits all 30 labels at once
def modelfitOne(train_X, train_y, test_X, yd, ImageId, FeatureName):
    n_clf = 1
    # Fitter
    clf = KernelRidge(kernel='rbf', gamma=6e-4, alpha=2e-2)
    # Train
    print('-----------------Training...------------------')
    clf.fit(train_X, train_y)
    # Predict
    print('-----------------Predicting...------------------')
    pred = clf.predict(test_X)
    # Look up the value requested by each row of IdLookupTable.csv
    predicted = np.zeros(len(FeatureName))
    for i in range(len(FeatureName)):
        if i % 500 == 0:
            print('i =', i)
        imageID = ImageId[i]
        clfID = yd[FeatureName[i]]
        predicted[i] = pred[imageID, clfID]
    predicted = predicted*48.+48.       # back to pixel coordinates
    return predicted

# Root-mean-square error (note: divides by the number of rows, len(y2))
def calbais(pred, y2):
    y_diff = pred - y2
    y_diff = y_diff.reshape(-1)
    sco = np.linalg.norm(y_diff)/(len(y2)**0.5)
    return sco

# Helper for hyperparameter selection

# Hyperparameter tuning on X-y
def testfit(clf, train_X, train_y):
    scores = list()
    for i in range(3):
        # use a different seed per repetition; a fixed seed repeats the same split 3 times
        X1, X2, y1, y2 = train_test_split(train_X, train_y, test_size=0.3, random_state=i)
        pred = clf.fit(X1, y1).predict(X2)
        sco = calbais(pred, y2)
        scores.append(sco)
    print(scores)

# Plot an image with its keypoints
def plotface(x, y):
    img = x.reshape(96, 96)
    plt.imshow(img, cmap='gray')
    y = y * 48 + 48
    plt.scatter(y[0::2], y[1::2], marker='x', s=20)
    plt.show()

# Read the training data (preTrain loads training.csv itself)
train_X, train_y, yd = preTrain()

# Read the test data
RowId, ImageId, FeatureName, test_X = preTest()

# 1) Fit: 30 fitters
predicted = modelfit(train_X, train_y, test_X, yd, ImageId, FeatureName)

# 2) Fit: 1 fitter (overwrites the result of option 1)
predicted = modelfitOne(train_X, train_y, test_X, yd, ImageId, FeatureName)

# Save the results
csvSave('KernelRidge.csv', np.arange(1, len(predicted) + 1), predicted)
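
A note on the `calbais` helper in the listing above: it divides the norm of the residuals by the square root of the number of rows. For a 1-D target that is the usual RMSE; for an (n, k) target it is sqrt(k) times the RMSE taken over all n*k elements. The sketch below checks this on made-up numbers. `rmse_all` is a hypothetical helper, not part of the original script.

```python
import numpy as np

def calbais(pred, y2):
    """Same computation as the script's helper: residual norm / sqrt(number of rows)."""
    y_diff = (pred - y2).reshape(-1)
    return np.linalg.norm(y_diff) / (len(y2) ** 0.5)

def rmse_all(pred, y2):
    """RMSE averaged over every element of the arrays."""
    return float(np.sqrt(np.mean((pred - y2) ** 2)))

rng = np.random.RandomState(0)
y_true = rng.rand(50, 30)
y_pred = y_true + 0.1 * rng.randn(50, 30)

# For a 2-D target the two measures differ by a factor of sqrt(number of columns).
ratio = calbais(y_pred, y_true) / rmse_all(y_pred, y_true)

# For a single column they coincide.
single_a = calbais(y_pred[:, 0], y_true[:, 0])
single_b = rmse_all(y_pred[:, 0], y_true[:, 0])
```

This matters when comparing the per-label errors of the 30-fitter run with the error of the single multi-output fitter: the two are on the same scale only after accounting for that factor.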