Learning MediaPipe Hand Detection and Gesture Recognition

1 Hand Detection

1.0 Demo

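import time
import cv2
import mediapipe as mp


mpHands = mp.solutions.hands
hands = mpHands.Hands(model_complexity=0)  # lighter landmark model
mpDraw = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)  # default camera

timep = time.time()  # timestamp of the previous frame

while True:
    ret, img = cap.read()

    if ret:
        # MediaPipe expects RGB input, while OpenCV delivers BGR frames.
        img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        result = hands.process(img_rgb)

        # Draw the landmarks and their connections for every detected hand.
        if result.multi_hand_landmarks:
            for handLms in result.multi_hand_landmarks:
                mpDraw.draw_landmarks(img, handLms, mpHands.HAND_CONNECTIONS)

        # FPS is the reciprocal of the time elapsed since the previous frame.
        timen = time.time()
        fps = 1 / max(timen - timep, 1e-6)
        timep = timen

        cv2.putText(img, f"FPS: {fps:.2f}", (50, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0), 1, cv2.LINE_AA)

        cv2.imshow("IMG", img)

    # Press q to quit.
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the camera and close the preview window.
cap.release()
cv2.destroyAllWindows()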

1.1 mediapipe.solutions.hands.Hands

First, let's look at the Hands class in MediaPipe.

class Hands(SolutionBase):
  """MediaPipe Hands.

  MediaPipe Hands processes an RGB image and returns the hand landmarks and
  handedness (left v.s. right hand) of each detected hand.

  Note that it determines handedness assuming the input image is mirrored,
  i.e., taken with a front-facing/selfie camera (
  https://en.wikipedia.org/wiki/Front-facing_camera) with images flipped
  horizontally. If that is not the case, use, for instance, cv2.flip(image, 1)
  to flip the image first for a correct handedness output.

  Please refer to https://solutions.mediapipe.dev/hands#python-solution-api for
  usage examples.
  """

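The Hands class provided by MediaPipe processes an RGB image and returns, for each detected hand, its landmarks (the joint keypoints, "hand landmarks") and its handedness (left or right hand).

Note: handedness is determined on the assumption that the input image is mirrored, as with a front-facing (selfie) camera whose frames are flipped horizontally. If the input is not mirrored, flip it horizontally first, for instance with cv2.flip(image, 1), to get the correct handedness.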
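If flipping the frame is not desirable (for example, when the unmirrored image should be displayed as is), one possible alternative is to swap the reported label instead. The sketch below assumes that only the handedness label matters; correct_handedness is a hypothetical helper written for this note, not part of MediaPipe.

```python
def correct_handedness(label: str, image_is_mirrored: bool) -> str:
    """Return the handedness label adjusted for an unmirrored input image.

    Hypothetical helper, not part of MediaPipe. MediaPipe assumes a mirrored
    (selfie-style) frame, so for an unmirrored frame the label is swapped.
    """
    if image_is_mirrored:
        return label
    swapped = {"Left": "Right", "Right": "Left"}
    # Labels other than "Left"/"Right" are passed through unchanged.
    return swapped.get(label, label)


print(correct_handedness("Right", image_is_mirrored=False))  # Left
```

Flipping the image with cv2.flip(image, 1) remains the approach recommended by the docstring; swapping the label only avoids the extra image copy.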

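1.1.1 Initializing Hands

Hands accepts 5 initialization parameters:

  1. static_image_mode: static image input mode, default False. Whether to treat the input images as a batch of static, possibly unrelated images rather than a video stream.
  2. max_num_hands: maximum number of hands to detect, default 2.
  3. model_complexity: complexity of the hand landmark model, default 1, allowed values 0 or 1. A higher value generally gives more accurate landmarks at the cost of higher inference latency.
  4. min_detection_confidence: minimum detection confidence, default 0.5, range 0.0 to 1.0. A higher value filters detections more strictly, so hands are harder to detect; a lower value makes false detections more likely.
  5. min_tracking_confidence: minimum tracking confidence, default 0.5, range 0.0 to 1.0. A higher value filters tracked landmarks more strictly, so a hand is lost more easily; a lower value makes false tracking more likely.
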
  def __init__(self,
               static_image_mode=False,
               max_num_hands=2,
               model_complexity=1,
               min_detection_confidence=0.5,
               min_tracking_confidence=0.5):
    """Initializes a MediaPipe Hand object.

    Args:
      static_image_mode: Whether to treat the input images as a batch of static
        and possibly unrelated images, or a video stream. See details in
        https://solutions.mediapipe.dev/hands#static_image_mode.
      max_num_hands: Maximum number of hands to detect. See details in
        https://solutions.mediapipe.dev/hands#max_num_hands.
      model_complexity: Complexity of the hand landmark model: 0 or 1.
        Landmark accuracy as well as inference latency generally go up with the
        model complexity. See details in
        https://solutions.mediapipe.dev/hands#model_complexity.
      min_detection_confidence: Minimum confidence value ([0.0, 1.0]) for hand
        detection to be considered successful. See details in
        https://solutions.mediapipe.dev/hands#min_detection_confidence.
      min_tracking_confidence: Minimum confidence value ([0.0, 1.0]) for the
        hand landmarks to be considered tracked successfully. See details in
        https://solutions.mediapipe.dev/hands#min_tracking_confidence.
    """
    super().__init__(
        binary_graph_path=_BINARYPB_FILE_PATH,
        side_inputs={
            'model_complexity': model_complexity,
            'num_hands': max_num_hands,
            'use_prev_landmarks': not static_image_mode,
        },
        calculator_params={
            'palmdetectioncpu__TensorsToDetectionsCalculator.min_score_thresh':
                min_detection_confidence,
            'handlandmarkcpu__ThresholdingCalculator.threshold':
                min_tracking_confidence,
        },
        outputs=[
            'multi_hand_landmarks', 'multi_hand_world_landmarks',
            'multi_handedness'
        ])

In addition, Hands passes one constant, _BINARYPB_FILE_PATH, to its parent class:

_BINARYPB_FILE_PATH = 'mediapipe/modules/hand_landmark/hand_landmark_tracking_cpu.binarypb'
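To make the documented value ranges concrete, here is a minimal sketch that checks the five options before they are handed to Hands. build_hands_options is a hypothetical helper for illustration only; the range checks mirror the docstring above, and MediaPipe itself is not imported.

```python
def build_hands_options(static_image_mode=False,
                        max_num_hands=2,
                        model_complexity=1,
                        min_detection_confidence=0.5,
                        min_tracking_confidence=0.5):
    """Validate the five documented options and return them as a dict.

    Hypothetical helper, not part of MediaPipe.
    """
    if model_complexity not in (0, 1):
        raise ValueError("model_complexity must be 0 or 1")
    for name, value in (("min_detection_confidence", min_detection_confidence),
                        ("min_tracking_confidence", min_tracking_confidence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within [0.0, 1.0]")
    return {
        "static_image_mode": static_image_mode,
        "max_num_hands": max_num_hands,
        "model_complexity": model_complexity,
        "min_detection_confidence": min_detection_confidence,
        "min_tracking_confidence": min_tracking_confidence,
    }


options = build_hands_options(model_complexity=0, max_num_hands=1)
print(options)
```

With MediaPipe installed, the resulting dict can be unpacked into the constructor, e.g. mp.solutions.hands.Hands(**options).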

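1.1.2 Detection with process

The process function takes a numpy array in RGB format and returns a NamedTuple with 3 fields:

  1. multi_hand_landmarks: the landmark coordinates of each detected hand.
  2. multi_hand_world_landmarks: the landmarks of each detected hand in real-world 3D coordinates (in meters), with the origin at the hand's approximate geometric center.
  3. multi_handedness: the handedness (left or right hand) of each detected hand.
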
  def process(self, image: np.ndarray) -> NamedTuple:
    """Processes an RGB image and returns the hand landmarks and handedness of each detected hand.

    Args:
      image: An RGB image represented as a numpy ndarray.

    Raises:
      RuntimeError: If the underlying graph throws any error.
      ValueError: If the input image is not three channel RGB.

    Returns:
      A NamedTuple object with the following fields:
        1) a "multi_hand_landmarks" field that contains the hand landmarks on
           each detected hand.
        2) a "multi_hand_world_landmarks" field that contains the hand landmarks
           on each detected hand in real-world 3D coordinates that are in meters
           with the origin at the hand's approximate geometric center.
        3) a "multi_handedness" field that contains the handedness (left v.s.
           right hand) of the detected hand.
    """

    return super().process(input_data={'image': image})

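Check the type of the return value. The output shows that the returned object is the SolutionOutputs class itself rather than an instance, which is why type(result) prints <class 'type'>:

print(type(result))
print(result)

<class 'type'>
<class 'mediapipe.python.solution_base.SolutionOutputs'>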

Parse multi_hand_landmarks. The returned x and y coordinates are normalized relative to the image width and height.

print(type(result.multi_hand_landmarks))
print(result.multi_hand_landmarks)

for handLms in result.multi_hand_landmarks:
    print(type(handLms))
    print(handLms)
    print(type(handLms.landmark))
    print(handLms.landmark)
    
    for index, lm in enumerate(handLms.landmark):
        print(type(lm))
        print(lm)
        print(type(lm.x))
        print(index, lm.x, lm.y, lm.z)

# result.multi_hand_landmarks
<class 'list'>
[landmark {
  x: 0.871795416
  y: 1.01455748
  z: 1.16892895e-008
}
...]

# handLms 
<class 'mediapipe.framework.formats.landmark_pb2.NormalizedLandmarkList'>
landmark {
  x: 0.871795416
  y: 1.01455748
  z: 1.16892895e-008
}
...

# handLms.landmark
<class 'google._upb._message.RepeatedCompositeContainer'>
[x: 0.871795416
y: 1.01455748
z: 1.16892895e-008
,
...]

# lm
<class 'mediapipe.framework.formats.landmark_pb2.NormalizedLandmark'>
x: 0.871795416
y: 1.01455748
z: 1.16892895e-008

# lm.x
<class 'float'>
0 0.8717954158782959 1.0145574808120728 1.1689289536320757e-08
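Because the coordinates are normalized, they have to be scaled by the frame size before they can be used for drawing. Below is a minimal sketch of that conversion; to_pixel is a hypothetical helper, and the 640x480 frame size is an assumption made for the example.

```python
def to_pixel(x_norm, y_norm, width, height):
    """Convert normalized landmark coordinates to integer pixel coordinates.

    Hypothetical helper. Also reports whether the point lies inside the frame,
    since normalized values can fall outside [0, 1] near the image border.
    """
    px = int(x_norm * width)
    py = int(y_norm * height)
    in_frame = 0 <= px < width and 0 <= py < height
    return px, py, in_frame


# Landmark 0 from the output above, on an assumed 640x480 frame.
print(to_pixel(0.871795416, 1.01455748, 640, 480))  # (557, 486, False)
```

Here y is greater than 1, so the landmark is estimated to lie just below the bottom edge of the frame.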

Parse multi_hand_world_landmarks. Its structure is the same as that of multi_hand_landmarks; the difference is that the coordinates are in meters.

# handWLms
<class 'mediapipe.framework.formats.landmark_pb2.LandmarkList'>
# lm
<class 'mediapipe.framework.formats.landmark_pb2.Landmark'>
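Since world landmarks are in meters, distances between them have a physical meaning, which is handy for simple gestures. The sketch below measures the thumb-tip to index-tip distance (landmark indices 4 and 8 in the MediaPipe hand model) and treats a small distance as a pinch. The helper names and the 3 cm threshold are assumptions for illustration; the points are expected as (x, y, z) tuples, e.g. built with [(lm.x, lm.y, lm.z) for lm in handWLms.landmark].

```python
import math

THUMB_TIP = 4          # index of the thumb tip in the 21-point hand model
INDEX_FINGER_TIP = 8   # index of the index finger tip


def landmark_distance_m(points, i, j):
    """Euclidean distance in meters between two world landmarks."""
    return math.dist(points[i], points[j])


def is_pinch(points, threshold_m=0.03):
    """Hypothetical rule: thumb and index tips closer than threshold_m."""
    return landmark_distance_m(points, THUMB_TIP, INDEX_FINGER_TIP) < threshold_m


# Made-up world coordinates: 21 points at the origin, index tip moved away.
points = [(0.0, 0.0, 0.0)] * 21
points[INDEX_FINGER_TIP] = (0.03, 0.04, 0.0)
print(landmark_distance_m(points, THUMB_TIP, INDEX_FINGER_TIP))
print(is_pinch(points))
```

The threshold would need tuning against real data; the value here is only a placeholder.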

Parse multi_handedness. Each entry contains an index, a confidence score, and the handedness label.

print(type(result.multi_handedness))
print(result.multi_handedness)

for handedness in result.multi_handedness:
    print(type(handedness))
    print(handedness)
    print(type(handedness.classification))
    print(handedness.classification)
    
    for index, cf in enumerate(handedness.classification):
        print(type(cf))
        print(cf)
        print(type(cf.index))
        print(cf.index)
        print(type(cf.score))
        print(cf.score)
        print(type(cf.label))
        print(cf.label)

# result.multi_handedness
<class 'list'>
[classification {
  index: 1
  score: 0.71049273
  label: "Right"
}
...]

# handedness
<class 'mediapipe.framework.formats.classification_pb2.ClassificationList'>
classification {
  index: 1
  score: 0.71049273
  label: "Right"
}
...

# handedness.classification
<class 'google._upb._message.RepeatedCompositeContainer'>
[index: 1
score: 0.71049273
label: "Right",
...]

# cf
<class 'mediapipe.framework.formats.classification_pb2.Classification'>
index: 1
score: 0.71049273
label: "Right"

# cf.index
<class 'int'>
1
# cf.score
<class 'float'>
0.710492730140686
# cf.label
<class 'str'>
Right
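Putting the fields together, the following sketch pairs each hand's landmarks with its handedness. It assumes that multi_hand_landmarks and multi_handedness list the hands in the same order, and that both may be empty or None when no hand is found (as the check in the demo suggests). summarize_hands is a hypothetical helper, and the result object is faked with SimpleNamespace so the example runs without a camera.

```python
from types import SimpleNamespace


def summarize_hands(result):
    """Return (label, score, landmark_count) for every detected hand.

    Hypothetical helper; assumes both result lists share the same hand order.
    """
    landmarks = result.multi_hand_landmarks or []
    handedness = result.multi_handedness or []
    summary = []
    for hand_lms, hand_cls in zip(landmarks, handedness):
        top = hand_cls.classification[0]
        summary.append((top.label, top.score, len(hand_lms.landmark)))
    return summary


# Stand-in objects that mimic the attribute layout printed above.
fake_cls = SimpleNamespace(
    classification=[SimpleNamespace(index=1, score=0.71, label="Right")])
fake_lms = SimpleNamespace(
    landmark=[SimpleNamespace(x=0.0, y=0.0, z=0.0)] * 21)
fake_result = SimpleNamespace(multi_hand_landmarks=[fake_lms],
                              multi_handedness=[fake_cls])

print(summarize_hands(fake_result))  # [('Right', 0.71, 21)]
```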

========== 2024/08/04 Work in progress ==========
