感谢恩培大佬对项目进行了完整的实现,并将代码进行开源,供大家交流学习。
一、项目简介
本项目最终达到的效果为手势控制操作鼠标。如下所示
项目用python实现,调用opencv,mediapipe,pytorch等库,由以下步骤组成:
1、使用OpenCV读取摄像头视频流;
2、识别手掌关键点像素坐标;
3、根据识别得到的手掌关键点信息,以图的方式构建数据结构;
4、用Pytorch提供的GCN图卷积神经网络训练数据并手势进行分类;
二、知识拆解
1、mediapipe
mediapipe是谷歌推出的一个深度学习常用功能库。封装了人脸检测,物体检测,语义分割,运动追踪等常用功能,并且支持Android、IOS、C++、Python多种平台和版本。安装简单,调用方便。
下面演示以下python版本,手指检测的使用方式。
安装库:
pip install mediapipe
调用示例:
import cv2
import mediapipe as mp
mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_hands = mp.solutions.hands
# For static images:
IMAGE_FILES = []
with mp_hands.Hands(
static_image_mode=True,
max_num_hands=2,
min_detection_confidence=0.5) as hands:
for idx, file in enumerate(IMAGE_FILES):
# Read an image, flip it around y-axis for correct handedness output (see
# above).
image = cv2.flip(cv2.imread(file), 1)
# Convert the BGR image to RGB before processing.
results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
# Print handedness and draw hand landmarks on the image.
print('Handedness:', results.multi_handedness)
if not results.multi_hand_landmarks:
continue
image_height, image_width, _ = image.shape
annotated_image = image.copy()
for hand_landmarks in results.multi_hand_landmarks:
print('hand_landmarks:', hand_landmarks)
print(
f'Index finger tip coordinates: (',
f'{hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].x * image_width}, '
f'{hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].y * image_height})'
)
mp_drawing.draw_landmarks(
annotated_image,
hand_landmarks,
mp_hands.HAND_CONNECTIONS,
mp_drawing_styles.get_default_hand_landmarks_style(),
mp_drawing_styles.get_default_hand_connections_style())
cv2.imwrite(
'/tmp/annotated_image' + str(idx) + '.png', cv2.flip(annotated_image, 1))
# Draw hand world landmarks.
if not results.multi_hand_world_landmarks:
continue
for hand_world_landmarks in results.multi_hand_world_landmarks:
mp_drawing.plot_landmarks(
hand_world_landmarks, mp_hands.HAND_CONNECTIONS, azimuth=5)
# For webcam input:
cap = cv2.VideoCapture(0)
with mp_hands.Hands(
model_complexity=0,
min_detection_confidence=0.5,
min_tracking_confidence=0.5) as hands:
while cap.isOpened():
success, image = cap.read()
if not success:
print("Ignoring empty came