Paper Reading: AU Intensity Estimation via Semantic Correspondence Learning with Dynamic Graph Convolution

BoeTh_00
Category: Paper Reading Notes. Tags: Computer Vision

Article link: https://blog.csdn.net/qq_34855411/article/details/122014824

Summary

The paper proposes a new learning framework that automatically learns the latent relationships among AUs by establishing semantic correspondences between feature maps.

Heatmap regression-based network: feature maps preserve rich semantic information associated with AU intensities and locations.
GCNN: describes the intrinsic relationship between various vertex nodes of the graph by learning an adjacency matrix, to explore the relationships among multiple feature maps.
Semantic correspondence convolution module (SCC): automatically learns the semantic relationships among feature maps to discover the latent co-occurrence relationships of AU intensities.

Key Contributions:

Leverage semantic correspondence to model the implicit co-occurrence relations of AU intensity levels in a heatmap regression framework, where the feature maps encode rich semantic descriptions and spatial distributions of AUs.
SCC module to dynamically compute the correspondences among feature maps layer by layer.

Heatmap Regression

In Stream 1:
Each deconvolutional layer is followed by an SCC module that models the relationship among multiple feature maps at that specific resolution level.
In Stream 2:
The ground-truth possibility heatmap $g_i(x)$ for a predefined AU location $L_i$ ($i = 1, \dots, N$) is generated by applying a Gaussian function centered on its corresponding coordinate $\hat{x}_i$:
$g_i(x)=\frac{I}{2\pi\sigma^2}\exp\left(-\frac{\|x-\hat{x}_i\|_2^2}{2\sigma^2}\right)$,
// $I$: the labeled intensity of the specific AU;
// $\sigma$: the standard deviation.
The network is trained by minimizing the L2 distance between the predicted heatmap $h_i(x;w,b)$ and the ground truth $g_i(x)$, i.e., with an MSE loss.
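The heatmap target and loss of Stream 2 can be sketched in a few lines of NumPy. This is only an illustration of the formula above, not the authors' code: the function names `gaussian_heatmap` and `mse_loss`, the 32×32 map size, and the example values of the center, intensity, and σ are all assumptions made for the example.

```python
import numpy as np

def gaussian_heatmap(size, center, intensity, sigma):
    """Ground-truth heatmap g_i(x) for one AU location.

    size: side length of the (assumed square) heatmap
    center: (row, col) coordinate x_hat_i of the AU location
    intensity: labeled AU intensity I
    sigma: standard deviation of the Gaussian
    """
    rows, cols = np.mgrid[0:size, 0:size]
    # Squared Euclidean distance ||x - x_hat_i||^2 for every pixel x.
    sq_dist = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    scale = intensity / (2.0 * np.pi * sigma ** 2)
    return scale * np.exp(-sq_dist / (2.0 * sigma ** 2))

def mse_loss(pred, target):
    """Mean squared error between predicted and ground-truth heatmaps."""
    return np.mean((pred - target) ** 2)

# Hypothetical example: an AU with intensity 3 located at pixel (10, 20).
g = gaussian_heatmap(size=32, center=(10, 20), intensity=3.0, sigma=2.0)
h = np.zeros_like(g)   # stand-in for a network prediction
loss = mse_loss(h, g)
```

Because the Gaussian is scaled by $I$, the peak value of the heatmap grows with the labeled intensity, so a single heatmap carries both the location and the intensity of the AU.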

SCC: Semantic Correspondence Convolution

The SCC module aims to model the correlation among feature channels, where each channel encodes a specific visual pattern of an AU. Feature channels with similar semantic patterns are activated simultaneously when a specific co-occurrence pattern of AU intensities emerges.
In SCC module:
– first construct the k-NN graph by grouping sets of closest feature maps to find different co-occurrence patterns;
– then apply the convolution operations on the edges that connect feature maps sharing similar semantic patterns to further exploit the edge information of the graph;
– afterwards, the aggregation function, i.e., MAX, is applied to summarize the most discriminative features for improving AU intensity estimation.

Graph Construction

The set of feature maps is denoted by $F=\{f_1,f_2,\dots,f_n\}\subseteq\mathbb{R}^{M\times M}$, where each feature map (channel) has size M×M.
Rearrange each M×M feature map into a feature vector of length L=M×M.
Construct the graph G as the k-nearest neighbor (k-NN) graph of F, and each node represents a specific feature map.
The edge feature is defined by $e_{ij}=h_{\Theta}(f_i,f_j)$ , where $h_{\Theta}:{\mathbb R}^L\times{\mathbb R}^L\rightarrow{\mathbb R}^{L'}$ is a nonlinear function with trainable parameters Θ.
Combine the global information encoded by $f_i$ , with its local neighborhood characteristics, captured by $f_j-f_i$ .
The edge feature function is formulated as $e'_{ijk}=\mathrm{ReLU}(\phi_k\cdot f_i+\omega_k\cdot(f_j-f_i))$,
// Θ: $(\phi_1,\dots,\phi_K,\omega_1,\dots,\omega_K)$, where K is the number of filters and the subscript k indexes a filter (not the k of the k-NN graph).
For each $f_i$ , the k-NN graph is built by computing a pairwise distance matrix (calculated based on the Euclidean distance) and then taking the closest k feature maps.
Adopt a channel-wise aggregation function, i.e., MAX, to summarize the edge features, as it can capture the most salient features.
The output of the SCC module at the i-th vertex is then produced by, $f'_{ik}=\max\limits_{j:(i,j)\in E}{e'_{ijk}}$ .
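The three steps of the SCC module (k-NN graph, edge features, MAX aggregation) can be put together in a short NumPy sketch. It follows the formulas above but is not the authors' implementation: the names `knn_indices` and `scc_layer`, the random weights, and the example sizes are hypothetical, and excluding each feature map from its own neighbor list is an assumption, since the notes do not say whether self-loops are kept.

```python
import numpy as np

def knn_indices(feats, k):
    """feats: (n, L) flattened feature maps. Returns (n, k) neighbor indices."""
    sq_norm = np.sum(feats ** 2, axis=1)
    # Pairwise squared Euclidean distances between all feature maps.
    dist = sq_norm[:, None] + sq_norm[None, :] - 2.0 * feats @ feats.T
    np.fill_diagonal(dist, np.inf)          # assumption: no self-loops
    return np.argsort(dist, axis=1)[:, :k]

def scc_layer(feats, phi, omega, k):
    """feats: (n, L); phi, omega: (K, L) filter banks. Returns (n, K)."""
    nbrs = knn_indices(feats, k)            # (n, k)
    f_i = feats[:, None, :]                 # (n, 1, L)
    f_j = feats[nbrs]                       # (n, k, L)
    global_term = f_i @ phi.T               # phi_k . f_i          -> (n, 1, K)
    local_term = (f_j - f_i) @ omega.T      # omega_k . (f_j - f_i) -> (n, k, K)
    edges = np.maximum(global_term + local_term, 0.0)  # ReLU, e'_{ijk}
    return edges.max(axis=1)                # MAX over neighbors j -> f'_{ik}

# Hypothetical sizes: n=6 feature maps of 4x4 pixels (L=16), K=4 filters.
rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 16))
phi = rng.normal(size=(4, 16))
omega = rng.normal(size=(4, 16))
out = scc_layer(feats, phi, omega, k=2)
```

Each output row has one entry per filter, so the length of a node feature changes from L to K after the layer.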

Dynamic Graph Update

The dynamic graph convolutions are performed on both low and high resolution feature maps, aiming to capture the high-order AU interactions.
The SCC module can be integrated into multiple convolutional layers, where it learns to semantically group similar feature channels that would be activated together for a specific co-occurrence pattern of AU intensities.
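"Dynamic" here means that the k-NN graph is not fixed: it is rebuilt from the current features at every layer. The sketch below illustrates only that idea, under several assumptions: the helper names `knn` and `scc`, the random weights, and the layer sizes are hypothetical, and the deconvolutional layers that sit between SCC modules in the actual network are omitted.

```python
import numpy as np

def knn(feats, k):
    """Indices of the k nearest other rows of feats (Euclidean distance)."""
    dist = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)          # assumption: no self-loops
    return np.argsort(dist, axis=1)[:, :k]

def scc(feats, phi, omega, k):
    """One SCC step; also returns the graph it was computed on."""
    nbrs = knn(feats, k)                    # graph rebuilt from current features
    f_i = feats[:, None, :]
    edges = np.maximum(f_i @ phi.T + (feats[nbrs] - f_i) @ omega.T, 0.0)
    return edges.max(axis=1), nbrs

rng = np.random.default_rng(0)
n, k = 8, 3
feats = rng.normal(size=(n, 64))            # 8 feature maps, flattened to length 64
graphs = []
for out_len in (32, 16):                    # hypothetical number of filters per layer
    phi = rng.normal(size=(out_len, feats.shape[1]))
    omega = rng.normal(size=(out_len, feats.shape[1]))
    feats, nbrs = scc(feats, phi, omega, k)
    graphs.append(nbrs)                     # one neighbor table per layer
```

Comparing the entries of `graphs` shows which feature maps are grouped together at each layer; the neighbor sets are free to change as the features are transformed.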

Correspondence with AU Heatmaps

The predicted heatmap $h_i$ for AU-i is computed as, $h_i=F^L\otimes W_i^L$ ,
// $\otimes$ : the tensor product;
// $F^L=\{f_1^L,...,f_C^L\}$ : the feature maps set generated from the last SCC layer;
// $W_i^L=\{w_{1i}^L,...,w_{Ci}^L\}$ , (i=1, 2, …, N): the 1×1 filter bank for a specific AU-i.
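Reading $W_i^L$ as a 1×1 filter bank, the prediction for AU-i is a weighted sum of the C feature maps with one scalar weight per channel. The sketch below shows that interpretation in NumPy; the function name `predict_heatmaps`, the random inputs, and the sizes are hypothetical, and treating the tensor product as a per-pixel weighted sum is an assumption based on the 1×1 filter description.

```python
import numpy as np

def predict_heatmaps(feature_maps, filters):
    """Apply N 1x1 filter banks to C feature maps.

    feature_maps: (C, M, M) output F^L of the last SCC layer
    filters: (N, C), row i holds the weights w_{1i}, ..., w_{Ci} for AU-i
    Returns: (N, M, M), one predicted heatmap h_i per AU.
    """
    # h_i(x) = sum_c w_{ci} * f_c(x) at every pixel x.
    return np.einsum("nc,chw->nhw", filters, feature_maps)

# Hypothetical sizes: C=5 feature maps of 8x8 pixels, N=3 AUs.
rng = np.random.default_rng(0)
F_last = rng.normal(size=(5, 8, 8))
W = rng.normal(size=(3, 5))
heatmaps = predict_heatmaps(F_last, W)
```

Because every AU reads from the same set of feature maps, the channel weights make explicit which semantic patterns contribute to each AU's heatmap.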
