Pose Estimation Outline

Top-down

  • Cascaded Pyramid Network
  • Stacked Hourglass Networks
  • AlphaPose
  • Simple Baseline
  • HRNet

Bottom-up

  • OpenPose

Simple Baseline

Author's view
          The key feature-extraction question in pose estimation is how to turn small feature maps back into large, high-resolution ones. As the paper puts it: obtaining high resolution feature maps is crucial, but no matter how…
Overview
         Our method combines the upsampling and convolutional parameters into deconvolutional layers in a much simpler way, without using skip layer connections.


    Our method simply adds a few deconvolutional layers over the last convolution stage in the ResNet, called C5.
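
A minimal sketch of this head in PyTorch: the three 4×4, stride-2, 256-filter deconv layers and the final 1×1 conv follow the paper's default configuration, while the torchvision backbone and the 256×192 input size are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torchvision

class DeconvHead(nn.Module):
    """Sketch of the Simple Baseline head: a few deconv layers over C5.

    Defaults follow the paper: 256 filters, 4x4 kernels, stride 2,
    BN + ReLU, then a 1x1 conv producing K joint heatmaps.
    """
    def __init__(self, in_channels=2048, num_joints=17):
        super().__init__()
        layers, c = [], in_channels
        for _ in range(3):  # each deconv doubles the spatial resolution
            layers += [
                nn.ConvTranspose2d(c, 256, kernel_size=4, stride=2,
                                   padding=1, bias=False),
                nn.BatchNorm2d(256),
                nn.ReLU(inplace=True),
            ]
            c = 256
        self.deconv = nn.Sequential(*layers)
        self.final = nn.Conv2d(256, num_joints, kernel_size=1)

    def forward(self, c5):
        return self.final(self.deconv(c5))

# Illustrative assumption: torchvision's ResNet-50, minus avgpool and fc.
backbone = nn.Sequential(*list(torchvision.models.resnet50().children())[:-2])
head = DeconvHead()
x = torch.randn(1, 3, 256, 192)     # typical input size in the paper
heatmaps = head(backbone(x))        # -> (1, 17, 64, 48)
```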

Loss

Mean Squared Error (MSE) is used as the loss between the predicted heatmaps and targeted heatmaps.
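
A sketch of this loss, assuming heatmaps of shape (N, K, H, W); the optional per-joint visibility weight is a common implementation detail in open-source repos, not something stated in the quote above.

```python
import torch
import torch.nn.functional as F

def heatmap_mse(pred, target, joint_weights=None):
    """MSE between predicted and targeted heatmaps.

    pred, target: (N, K, H, W). joint_weights: optional (N, K) mask
    that zeroes out invisible joints (an assumed convention).
    """
    loss = F.mse_loss(pred, target, reduction='none').mean(dim=(2, 3))  # (N, K)
    if joint_weights is not None:
        loss = loss * joint_weights
    return loss.mean()
```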

Key concepts

Difference between transposed convolution and upsampling

  • transpose conv
            Transposed convolution reverses the convolution operation: standard convolution is many-to-one (many input positions contribute to one output value), while transposed convolution is one-to-many (one input position spreads over many output values); see the sketch after this list.
  • upsample
            Nearest neighbor interpolation
            Bilinear interpolation
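
A small comparison to make the difference concrete; the channel count, kernel size, and stride here are arbitrary illustrative choices.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 8, 16, 16)

# Transposed convolution: learned upsampling; each input position
# "spreads" its value over a kxk output window (one-to-many).
deconv = nn.ConvTranspose2d(8, 8, kernel_size=4, stride=2, padding=1)
print(deconv(x).shape)                  # torch.Size([1, 8, 32, 32])

# Interpolation: a fixed rule with no learnable parameters.
up_nearest = nn.Upsample(scale_factor=2, mode='nearest')
up_bilinear = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)
print(up_nearest(x).shape, up_bilinear(x).shape)  # both [1, 8, 32, 32]
```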

Heatmap Gaussian distribution
          The targeted heatmap H_k for joint k is generated by applying a 2D Gaussian centered on the k-th joint's ground-truth location; the heatmap size is 64×64.
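
A sketch of the target-heatmap generation just described; the joint coordinates are in 64×64 heatmap space, and sigma is an assumed hyperparameter (these papers typically fix it to a small constant).

```python
import numpy as np

def make_target_heatmap(joint_xy, size=64, sigma=2.0):
    """2D Gaussian centered on the joint's ground-truth location.

    joint_xy: (x, y) in heatmap coordinates; returns a (size, size) array
    with peak value 1.0 at the joint. sigma is an assumed value.
    """
    xs = np.arange(size)
    ys = np.arange(size)[:, None]
    x0, y0 = joint_xy
    return np.exp(-((xs - x0) ** 2 + (ys - y0) ** 2) / (2 * sigma ** 2))

H_k = make_target_heatmap((20.0, 31.5))   # target heatmap for joint k
```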



HRNet

High-Resolution Network
Explanation:
          Maintain high-resolution representations through the whole process.

Another view:
          Repeated multi-scale fusions such that each of the high-to-low resolution representations receives information from other parallel representations over and over, leading to rich high-resolution representations.


          We perform repeated multi-scale fusions to boost the high-resolution representations with the help of the low-resolution representations of the same depth and similar level.

One small net and one big net: HRNet-W32 and HRNet-W48, where 32 and 48 represent the widths (C) of the high-resolution subnetworks in the last three stages.

Multi-scale fusion

          Need a separate low-to-high upsampling process and aggregate low-level and high-level representations.
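
A minimal sketch of one low-to-high fusion step in the spirit of HRNet's exchange units; the channel widths and the 1×1-conv-then-nearest-upsample choice are assumptions based on the paper's description, not its released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FuseLowToHigh(nn.Module):
    """Add an upsampled, channel-matched low-res branch to the high-res one."""
    def __init__(self, c_low=64, c_high=32):
        super().__init__()
        self.match = nn.Conv2d(c_low, c_high, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(c_high)

    def forward(self, x_high, x_low):
        y = self.bn(self.match(x_low))                          # align channels
        y = F.interpolate(y, size=x_high.shape[2:], mode='nearest')
        return F.relu(x_high + y)                               # fuse by summation

fuse = FuseLowToHigh()
x_high = torch.randn(1, 32, 64, 48)   # high-resolution branch
x_low = torch.randn(1, 64, 32, 24)    # low-resolution branch (half size)
out = fuse(x_high, x_low)             # -> (1, 32, 64, 48)
```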

Loss

          We regress the heatmaps simply from the high-resolution representations output by the last exchange unit.
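
A sketch of that regression head, assuming the last exchange unit outputs a (N, 32, 64, 48) high-resolution feature map; 32 (the width C of HRNet-W32) and 17 (COCO joints) are illustrative values.

```python
import torch
import torch.nn as nn

# Regress K heatmaps from the high-resolution output with a 1x1 conv;
# training then applies the same heatmap MSE loss as above.
head = nn.Conv2d(32, 17, kernel_size=1)
features = torch.randn(1, 32, 64, 48)   # output of the last exchange unit
heatmaps = head(features)               # -> (1, 17, 64, 48)
```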



OpenPose

Overview
       We present an approach to efficiently detect the 2D pose of multiple people in an image. The architecture encodes global context, allowing a greedy bottom-up parsing step that maintains high accuracy while achieving realtime performance, irrespective of the number of people in the image.
       We present the first bottom-up representation of association scores via Part Affinity Fields (PAFs), a set of 2D vector fields that encode the location and orientation of limbs over the image domain.

Method
  • input
    RGB image

  • output

    • a set of 2D confidence maps S of body part locations
    • a set of 2D vector fields L of part affinities (L: at each pixel, a 2D direction vector stored as its x and y components)


  • finally
    The confidence maps and the affinity fields are parsed by greedy inference to output the 2D keypoints for all people in the image (see the matching sketch below).
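
A sketch of the greedy step for a single limb type. The candidate lists stand in for peaks extracted from the confidence maps, and `connection_score` (a sketch of it appears in the PAF section below) is a placeholder for the PAF-based association score.

```python
def greedy_match(cands_a, cands_b, connection_score):
    """Greedy bipartite matching for one limb type.

    cands_a / cands_b: candidate (x, y) detections for the limb's two
    joint types. Connections are sorted by PAF score and accepted
    greedily, so each candidate is used at most once.
    """
    scored = [(connection_score(a, b), i, j)
              for i, a in enumerate(cands_a)
              for j, b in enumerate(cands_b)]
    scored.sort(reverse=True)
    used_a, used_b, limbs = set(), set(), []
    for s, i, j in scored:
        if i not in used_a and j not in used_b:
            limbs.append((i, j, s))
            used_a.add(i)
            used_b.add(j)
    return limbs
```
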
Key concepts

Part Affinity Fields (PAFs)
Function: preserves both location and orientation information across the region of support of the limb

Essence: The part affinity is a 2D vector field for each limb.
        2D vector encodes the direction that points from one part of the limb to the other.
        Each type of limb has a corresponding affinity field joining its two associated body parts.
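
A sketch of the association score derived from such a field: sample the PAF along the segment between two candidate joints and average the dot product with the limb's unit direction. The paper approximates this line integral by sampling uniformly spaced points; `n_samples=10` is an assumed value.

```python
import numpy as np

def connection_score(p1, p2, paf_x, paf_y, n_samples=10):
    """Line integral of the PAF along the candidate limb p1 -> p2.

    paf_x, paf_y: (H, W) x/y components of this limb's affinity field.
    Assumes both points lie inside the field. Returns the mean dot
    product between the field and the limb's unit direction.
    """
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    d = p2 - p1
    norm = np.linalg.norm(d)
    if norm < 1e-8:
        return 0.0
    d /= norm                                   # unit direction of the limb
    score = 0.0
    for u in np.linspace(0.0, 1.0, n_samples):  # uniformly spaced samples
        x, y = (p1 + u * (p2 - p1)).round().astype(int)
        score += paf_x[y, x] * d[0] + paf_y[y, x] * d[1]
    return score / n_samples
```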
