Training a Custom Gesture Recognition Model with MediaPipe

fukioston

Tags: artificial intelligence

Article link: https://blog.csdn.net/fukioston/article/details/135696922

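MediaPipe supports face detection, object detection, image classification, portrait segmentation, gesture recognition, text classification, and audio classification. Each task comes with its own model, but the stock model may not fit your needs. For example, the gestures that the default gesture recognizer knows may not be the ones you want. In that case you can train your own gesture recognition model. The steps are as follows: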

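1. Build a dataset

My method is fairly crude, because I could not find an existing dataset for the gestures I wanted.

First, record a video of each gesture, moving the hand slightly while recording so that the frames show it from different angles. (Asking several people to record gives somewhat better results.)

Then run the code below, which turns the video into a reasonably usable dataset: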

import os

import cv2

# Path to the input video
video_path = '61.mp4'

# Folder that receives the extracted images
output_folder = 'lastmy/6'

# Make sure the output folder exists
os.makedirs(output_folder, exist_ok=True)

# Open the video file
cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
    raise IOError(f'Cannot open video: {video_path}')

frame_count = 0
while True:
    # Read the video frame by frame
    success, frame = cap.read()
    if not success:
        break  # No more frames, so leave the loop

    # Build the output file name
    output_filename = os.path.join(output_folder, f'frame_{frame_count:04d}.jpg')

    # Save the frame as an image
    cv2.imwrite(output_filename, frame)

    frame_count += 1

# Release the video object
cap.release()

print(f'Saved {frame_count} frames.')
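
Consecutive video frames are nearly identical, so saving every frame can produce many redundant images. One possible refinement, which is not part of the script above, is to keep only evenly spaced frames. The sketch below shows just the index selection in plain Python; `select_frame_indices` and its parameters are hypothetical names introduced here for illustration.

```python
def select_frame_indices(total_frames, max_images):
    """Return up to max_images evenly spaced frame indices in [0, total_frames)."""
    if total_frames <= 0 or max_images <= 0:
        return []
    if max_images >= total_frames:
        # Fewer frames than requested images: keep every frame
        return list(range(total_frames))
    step = total_frames / max_images  # spacing between kept frames (> 1)
    return [int(i * step) for i in range(max_images)]


# Example: keep 10 of 300 frames
keep = set(select_frame_indices(total_frames=300, max_images=10))
```

In the extraction loop, a frame would then be written only when `frame_count` is in `keep`. The total can be read with `cap.get(cv2.CAP_PROP_FRAME_COUNT)`, which may be approximate for some video files.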

2. Train on Colab

Hand gesture recognition model customization guide | MediaPipe | Google for Developers

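Open the guide page first and read through what the code does, then click "Run in Colab" to open the notebook on Colab.

First, upload your own dataset to Google Drive.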
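
MediaPipe Model Maker expects the dataset as one subfolder per gesture label with the images inside, and one of the labels must be named `none` (hands that show none of the target gestures). A minimal sketch for checking the layout before uploading follows. `summarize_dataset` and `missing_none_label` are hypothetical helpers, and both the accepted image extensions and the case-insensitive comparison are assumptions.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of image extensions


def summarize_dataset(root):
    """Map each label subfolder under root to its number of image files."""
    counts = {}
    for label_dir in sorted(Path(root).iterdir()):
        if label_dir.is_dir():
            counts[label_dir.name] = sum(
                1
                for f in label_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def missing_none_label(counts):
    """Return True when no label folder is named 'none' (compared case-insensitively)."""
    return all(name.lower() != "none" for name in counts)
```

For the folder used earlier, `summarize_dataset('lastmy')` would list the image count per gesture folder, and `missing_none_label` would flag a dataset that still lacks the `none` folder.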

The upload then starts and can be a little slow. Once it finishes, you can create your own notebook file: click New, then More, then Google Colaboratory.

Next, you need to mount the drive, using the file panel on the left side of the Colab page.
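
Mounting can also be done from a notebook cell instead of the sidebar. The sketch below uses Colab's `drive.mount`; the wrapper `mount_drive_if_colab` is a hypothetical convenience added here so that the snippet does nothing outside Colab.

```python
def mount_drive_if_colab(mount_point="/content/drive"):
    """Mount Google Drive when running in Colab; return False elsewhere."""
    try:
        from google.colab import drive  # only available inside Colab
    except ImportError:
        return False
    drive.mount(mount_point)  # asks for authorization in the notebook
    return True
```

With the default mount point, the relative path `./drive/MyDrive/xx` used in the training code resolves from Colab's default working directory, `/content`.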

Once the drive is mounted, you can start training by following the code, which is very simple:

# In Colab, install the package first: !pip install mediapipe-model-maker
from google.colab import files
import os
import tensorflow as tf
assert tf.__version__.startswith('2')

from mediapipe_model_maker import gesture_recognizer

import matplotlib.pyplot as plt

# Dataset folder on the mounted drive (replace xx with your own folder)
dataset_path = "./drive/MyDrive/xx"

# Load the images; each subfolder name is used as a gesture label
data = gesture_recognizer.Dataset.from_folder(
    dirname=dataset_path,
    hparams=gesture_recognizer.HandDataPreprocessingParams()
)
# Split into 80% training, 10% validation, and 10% test data
train_data, rest_data = data.split(0.8)
validation_data, test_data = rest_data.split(0.5)
# Training hyperparameters and the directory the model is exported to
hparams = gesture_recognizer.HParams(export_dir="exported_model2", epochs=15, batch_size=8)
options = gesture_recognizer.GestureRecognizerOptions(hparams=hparams)
# Train the model
model = gesture_recognizer.GestureRecognizer.create(
    train_data=train_data,
    validation_data=validation_data,
    options=options
)
# Evaluate on the held-out test data
loss, acc = model.evaluate(test_data, batch_size=1)
print(f"Test loss:{loss}, Test accuracy:{acc}")

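After training, export the gesture_recognizer.task file directly with the following commands:

model.export_model()
!ls exported_model2
files.download('exported_model2/gesture_recognizer.task')

Remember to change the paths to your own; the directory must match the export_dir passed to HParams.

After exporting, you can test gesture recognition with the following code: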

import cv2
import mediapipe as mp

# Aliases for the gesture recognizer classes
BaseOptions = mp.tasks.BaseOptions
GestureRecognizer = mp.tasks.vision.GestureRecognizer
GestureRecognizerOptions = mp.tasks.vision.GestureRecognizerOptions
VisionRunningMode = mp.tasks.vision.RunningMode

# Create a gesture recognizer instance with the video mode:
options = GestureRecognizerOptions(
    base_options=BaseOptions(model_asset_path='static/gesture_recognizer.task'),
    running_mode=VisionRunningMode.VIDEO)

# Create the gesture recognizer instance
with GestureRecognizer.create_from_options(options) as recognizer:
    # Open the default camera
    cap = cv2.VideoCapture(0)
    frame_count = 0  # Frame counter, used as an increasing timestamp
    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break  # The camera frame could not be read
        # OpenCV delivers BGR frames; MediaPipe expects RGB
        img_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=img_rgb)
        # Video mode requires monotonically increasing timestamps
        recognition_result = recognizer.recognize_for_video(mp_image, frame_count)
        frame_count += 1
        if recognition_result.gestures:
            # First detected hand, first listed gesture category
            t = recognition_result.gestures[0][0].category_name
        else:
            t = "none"
        print(t)
        cv2.putText(frame, t, (50, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2, cv2.LINE_AA)
        cv2.imshow("HandsImage", frame)  # Show the annotated frame
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break  # Press q to quit

    # Release the camera and close the window
    cap.release()
    cv2.destroyAllWindows()

Alternatively, you can recognize gestures from still images; the official documentation explains how in detail.
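
As a rough sketch of the image route, the snippet below runs the exported model on a single image file using the IMAGE running mode. The helper names `top_gesture` and `recognize_image` are hypothetical, and the paths in the usage note are only examples.

```python
def top_gesture(result, default="none"):
    """Return (name, score) of the first listed gesture for the first detected hand."""
    if not result.gestures:
        return default, 0.0
    best = result.gestures[0][0]  # first hand, first listed category
    return best.category_name, best.score


def recognize_image(model_path, image_path):
    """Run the exported model on one image file (IMAGE running mode)."""
    import mediapipe as mp  # imported lazily so top_gesture works without it

    options = mp.tasks.vision.GestureRecognizerOptions(
        base_options=mp.tasks.BaseOptions(model_asset_path=model_path),
        running_mode=mp.tasks.vision.RunningMode.IMAGE,
    )
    with mp.tasks.vision.GestureRecognizer.create_from_options(options) as recognizer:
        image = mp.Image.create_from_file(image_path)
        return top_gesture(recognizer.recognize(image))
```

A call such as `recognize_image('static/gesture_recognizer.task', 'frame_0000.jpg')` would return the label and score for that image, or `("none", 0.0)` when no gesture is reported.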