Crowd Counting: DRSAN--Crowd Counting using Deep Recurrent Spatial-Aware Network

目睹闰土刺猹的瓜

Category: Crowd Counting. Tags: crowd counting, crowd density estimation, convolutional neural networks, deep learning

Article link: https://blog.csdn.net/weixin_44585583/article/details/97615981


Goal:

Estimate the total number of people in unconstrained crowded scenes.

Highlight:

Crowd counting faces two main difficulties: variation in crowd scale, and camera perspective, which causes large appearance variations in people's scales and rotations. This paper addresses both.
The authors propose a unified neural network framework, named Deep Recurrent Spatial-Aware Network (DRSAN), which adaptively addresses the two issues in a learnable spatial transform module with a region-wise refinement process.

**Specifically:** the framework incorporates a Recurrent Spatial-Aware Refinement (RSAR) module that iteratively performs two components:

i) a Spatial Transformer Network that dynamically locates an attentional region from the crowd density map and transforms it to the suitable scale and rotation for optimal crowd estimation;

ii) a Local Refinement Network that refines the density map of the attended region with residual learning.
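
To make the "suitable scale and rotation" idea concrete, here is a minimal sketch of how a spatial transformer can express a region as a 2x3 affine matrix. It assumes an isotropic scale, a rotation angle, and a translation in normalized [-1, 1] coordinates; this post does not give the paper's exact parameterization, so the function names `affine_theta` and `sampling_grid` and the parameter choices are illustrative assumptions only.

```python
import math


def affine_theta(scale, angle_rad, tx, ty):
    """Build a 2x3 affine matrix from scale, rotation and translation (assumed form)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [
        [scale * c, -scale * s, tx],
        [scale * s, scale * c, ty],
    ]


def sampling_grid(theta, out_h, out_w):
    """Map every output pixel (normalized to [-1, 1]) to a source location.

    Requires out_h >= 2 and out_w >= 2 so the normalization is well defined.
    """
    grid = []
    for i in range(out_h):
        y = -1.0 + 2.0 * i / (out_h - 1)
        row = []
        for j in range(out_w):
            x = -1.0 + 2.0 * j / (out_w - 1)
            src_x = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            src_y = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((src_x, src_y))
        grid.append(row)
    return grid


# Hypothetical region: half the image size, rotated by 90 degrees, shifted right.
theta = affine_theta(scale=0.5, angle_rad=math.pi / 2, tx=0.25, ty=0.0)
grid = sampling_grid(theta, out_h=3, out_w=3)
```

In a real spatial transformer the matrix entries are predicted by a small localization network, and the source locations in `grid` are sampled with bilinear interpolation so the whole step stays differentiable.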

Contribution:

• We provide an adaptive mode to simultaneously handle the effect of both scale and rotation variation by introducing a spatial transform module for crowd counting. To the best of our knowledge, we are the first to address the issue of the rotation variation on this task.

• We propose a novel deep recurrent spatial-aware network framework to recurrently select a region (with learnable scale and rotation parameters) from an initial density map for refinement, dependent on feature warping and residual learning.

Architecture:

The architecture includes a Global Feature Embedding (GFE) module and a Recurrent Spatial-Aware Refinement (RSAR) module. Specifically, the GFE module takes the whole image as input for global feature extraction, and the extracted features are used to estimate an initial crowd density map. The RSAR module is then applied to iteratively locate image regions with a spatial transformer-based attention mechanism and refine the attended density map region with residual learning.


There are two modules in the architecture: GFE and RSAR.

Global Feature Embedding:
Goal: transform the input image into high-dimensional feature maps, which are then used to generate an initial crowd density map of the image.

The GFE module is composed of three columns of CNNs, each of which has seven convolutional layers with different kernel sizes and channel numbers, as well as three max-pooling layers.
Given an image I, we extract its global feature g by feeding it into GFE and concatenating the outputs of all the columns. After obtaining the global feature g, we generate the initial crowd density map M0 of image I using a convolutional layer with a kernel size of 1 × 1.
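
The last step, turning the concatenated feature g into M0 with a 1 × 1 convolution, amounts to a per-pixel weighted sum over channels. The sketch below illustrates this in plain Python with toy values. The three 2x2 "column outputs", the weights, and the helper names `conv1x1` and `crowd_count` are made up for illustration and are not taken from the paper. It also uses the standard convention in density-based counting that the estimated count is the sum of the density map.

```python
def conv1x1(features, weights, bias=0.0):
    """Apply a 1x1 convolution with a single output channel.

    features: list of C channel maps, each an H x W list of lists.
    weights:  list of C scalars, one per input channel.
    """
    height, width = len(features[0]), len(features[0][0])
    return [
        [
            bias + sum(w * ch[i][j] for w, ch in zip(weights, features))
            for j in range(width)
        ]
        for i in range(height)
    ]


def crowd_count(density_map):
    """Estimated number of people: the sum over all density values."""
    return sum(sum(row) for row in density_map)


# Toy stand-ins for the outputs of the three CNN columns (one channel each).
col_a = [[1.0, 0.0], [0.0, 2.0]]
col_b = [[0.0, 1.0], [1.0, 0.0]]
col_c = [[2.0, 2.0], [0.0, 0.0]]

g = [col_a, col_b, col_c]  # channel-wise concatenation of the columns
m0 = conv1x1(g, weights=[0.5, 1.0, 0.25])  # hypothetical learned weights
count = crowd_count(m0)
```

With these toy numbers `m0` is `[[1.0, 1.5], [1.0, 1.0]]` and `count` is `4.5`. In the real network each column contributes many channels and the weights are learned.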

Recurrent Spatial-Aware Refinement:

The Recurrent Spatial-Aware Refinement (RSAR) module iteratively refines the crowd density map. It consists of two components that are performed alternately:
i) a Spatial Transformer Network dynamically locates an attentional region from the crowd density map;
ii) a Local Refinement Network refines the density map of the selected region with residual learning.
After n iterations of refinement, the result is a high-quality crowd density map with an accurately estimated crowd count.
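
The alternation between region selection and residual refinement can be sketched as a simple loop. This is a deliberately simplified illustration, not the paper's implementation: the region is an axis-aligned crop rather than a learned affine warp with scale and rotation, and `select_region` and `local_residual` are hypothetical stand-ins for the Spatial Transformer Network and the Local Refinement Network.

```python
def refine(density, select_region, local_residual, n_iters):
    """Refine a density map region by region using residual updates."""
    current = [row[:] for row in density]  # work on a copy of the initial map
    for step in range(n_iters):
        # Stand-in for the STN: choose the region to attend to.
        top, left, h, w = select_region(current, step)
        patch = [row[left:left + w] for row in current[top:top + h]]
        # Stand-in for the LRN: predict a correction for that region only.
        residual = local_residual(patch)
        # Residual learning: refined region = attended region + correction.
        for di in range(h):
            for dj in range(w):
                current[top + di][left + dj] = patch[di][dj] + residual[di][dj]
    return current


def select_region(density, step):
    """Hypothetical policy: alternate between two 2x2 corners of the map."""
    offset = 2 * (step % 2)
    return offset, offset, 2, 2


def local_residual(patch):
    """Hypothetical refiner: move each value halfway toward 1.0."""
    return [[0.5 * (1.0 - v) for v in row] for row in patch]


m0 = [[0.0] * 4 for _ in range(4)]  # toy initial density map
refined = refine(m0, select_region, local_residual, n_iters=2)
```

Only the attended region changes at each step, so cells that were never selected keep their initial values while the refined regions accumulate corrections over iterations.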

[Figure: architectures of the two RSAR components; image not included]
