[Gesture Recognition: Paper Notes] Video-based Hand Manipulation Capture Through Composite Motion Control

(SIGGRAPH 2013) Video-based Hand Manipulation Capture Through Composite Motion Control

This was a quick read: the method runs a global optimization over the whole video sequence, so it is far from real-time and only loosely related to my project. (I did not read the later reconstruction and optimization stages in detail.)

Paper overview:

The authors aim to capture the fine motion of a hand while it manipulates an object, and then reproduce the hand-object interaction with a virtual hand in a 3D environment. Because the entire video sequence is simulated together, they also want to avoid "unpleasant visual artifacts such as motion jerkiness, hand-object penetration, and improper interaction between the hand and object."

The pipeline is:

  1. Build a 3D hand model from scanner data: 16 joints and 28 degrees of freedom (DOF). At every frame a PD controller drives the model, computing the angle, angular velocity, and related quantities for each DOF (a minimal PD-control sketch follows this list).
  2. As the 3D model moves, the mesh is deformed to match and projected from multiple viewpoints, yielding one synthetic image per camera view; these are the hypotheses (see the projection sketch below).
  3. Six cameras capture the real scene (six of them... awkward for practical use); these are the observations.
  4. The goal is to make the observations and the hypotheses match as closely as possible, which requires a matching criterion: the authors combine silhouette, color, and edge cues to score the agreement (see the cost sketch below).
  5. Find a globally optimal solution over the whole video sequence. Optimizing over every frame at once makes the solution space too high-dimensional, so the authors split the sequence at the moments when the hand makes or breaks contact with the object, which lowers the dimensionality (see the segmentation sketch below).
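
Step 1 drives each of the 28 DOFs with a PD controller. Below is a minimal sketch of such a controller on a single joint; the gains kp and kd, the explicit Euler integration, and the zero target velocity are illustrative assumptions, not values from the paper.

```python
import numpy as np

def pd_step(theta, omega, theta_target, dt, kp=50.0, kd=2.0 * np.sqrt(50.0)):
    """One Euler step of a PD controller driving a joint angle.

    theta, omega : current joint angle (rad) and angular velocity (rad/s)
    theta_target : desired angle for this DOF
    kp, kd       : assumed gains; kd is chosen near critical damping for kp
    """
    # PD law: acceleration proportional to the angle error,
    # damped by the current velocity (target velocity taken as zero).
    alpha = kp * (theta_target - theta) - kd * omega
    omega = omega + alpha * dt
    theta = theta + omega * dt
    return theta, omega

# Drive one DOF toward 0.8 rad over one second of simulated time.
theta, omega = 0.0, 0.0
for _ in range(100):
    theta, omega = pd_step(theta, omega, theta_target=0.8, dt=0.01)
print(f"final angle: {theta:.3f} rad")
```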
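
Steps 2 and 3 compare rendered hypothesis images against the six real camera views. Here is a minimal sketch of the standard pinhole projection that would place posed mesh vertices into one camera's image plane; the intrinsics K and extrinsics R, t below are placeholders, and the paper's actual rendering is more involved than a point projection.

```python
import numpy as np

def project_points(verts, K, R, t):
    """Project Nx3 world-space vertices to pixel coordinates with a
    pinhole camera model: x ~ K (R X + t), then divide by depth."""
    cam = verts @ R.T + t          # world frame -> camera frame
    uvw = cam @ K.T                # camera frame -> homogeneous pixels
    return uvw[:, :2] / uvw[:, 2:3]

# Placeholder camera: identity rotation, 2 m in front of the hand.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 2.0])

verts = np.random.rand(100, 3) * 0.1   # stand-in for hand mesh vertices
pixels = project_points(verts, K, R, t)
```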
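
For step 4, the sketch below shows one plausible shape for a weighted per-view matching cost over silhouette, color, and edge cues; the weights and the exact form of each term are my assumptions, since the paper defines its own versions of these terms.

```python
import numpy as np

def frame_cost(obs_sil, hyp_sil, obs_img, hyp_img, obs_edges, hyp_edges,
               w_sil=1.0, w_col=0.5, w_edge=0.5):
    """Disagreement between one observed camera view and the rendered
    hypothesis. Masks and edge maps are boolean arrays; images are
    floats in [0, 1]. The weights are assumed values, not the paper's."""
    # Silhouette term: fraction of pixels where the binary masks differ.
    sil = np.mean(obs_sil != hyp_sil)
    # Color term: mean absolute difference inside the observed silhouette.
    col = np.mean(np.abs(obs_img[obs_sil] - hyp_img[obs_sil])) if obs_sil.any() else 0.0
    # Edge term: fraction of pixels where the edge maps differ.
    edge = np.mean(obs_edges != hyp_edges)
    return w_sil * sil + w_col * col + w_edge * edge
```

The per-frame cost would then be summed over the six views, and the sequence-level objective over all frames.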
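
Step 5's decomposition can be illustrated with a small helper that splits the frame range wherever the contact state flips. The per-frame boolean contact signal is assumed to be given; how the paper detects contact events was not part of this quick read.

```python
def split_at_contact_changes(contact):
    """Split frame indices [0, T) into segments bounded by the frames
    where the hand-object contact state flips (contact made or broken).
    `contact` is a per-frame boolean sequence, assumed to be given."""
    segments, start = [], 0
    for f in range(1, len(contact)):
        if contact[f] != contact[f - 1]:   # contact state changed here
            segments.append((start, f))
            start = f
    segments.append((start, len(contact)))
    return segments

contact = [False, False, True, True, True, False, False]
print(split_at_contact_changes(contact))   # [(0, 2), (2, 5), (5, 7)]
```

Each segment can then be optimized with far fewer unknowns than the full sequence, which is the dimensionality reduction the authors are after.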
The demo video looks quite good. I also agree with several of the authors' observations:
  • Data-driven methods, whether generative or discriminative, can only find solutions similar to cases they have already seen, so they generalize poorly to substantially different, unseen situations. For example, the hand grasps different objects in different ways, and it is impossible to cover every case with training samples; incorporating an explicit model into hand pose estimation is therefore necessary.
  • Methods that rely purely on image appearance struggle to be stable and precise, especially with two or fewer cameras: occlusion becomes severe, observations drop out, and the hand jerks across the sequence. Capturing subtle motions from images alone is also very hard.
