Proposed Methodology
Heatmap Regression
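The problem of predicting an AU intensity vector is recast as predicting multiple AU heatmaps.
Fig. 2 gives the central location of each AU.
Q: Each of these points is presumably computed from the 68 landmarks by some rule, but the paper does not explain this in detail.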
Let the number of AUs be $N$. For the $i$-th AU, the location has coordinates $\hat{x}_i$; applying a Gaussian function centered at $\hat{x}_i$ generates the heatmap $g_i(x)$:
$$g_i(x)=\frac{I}{2\pi\sigma^2}\exp\left(-\frac{\left\|x-\hat{x}_i\right\|_2^2}{2\sigma^2}\right) \qquad (1)$$
where $I$ denotes the AU intensity and $\sigma$ is the standard deviation of the Gaussian function.
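To make Eq. (1) concrete, here is a minimal NumPy sketch that evaluates the Gaussian on a pixel grid. The function name `gaussian_heatmap`, the 64×64 map size, and `sigma=2.0` are illustrative assumptions, not values from the paper or the linked repository.

```python
import numpy as np

def gaussian_heatmap(center, intensity, size=(64, 64), sigma=2.0):
    """Evaluate Eq. (1) at every pixel of a (height, width) grid.

    center    : (col, row) location of the AU, i.e. x_hat_i
    intensity : AU intensity I
    size, sigma are illustrative defaults (assumptions, not from the paper).
    """
    height, width = size
    rows, cols = np.mgrid[0:height, 0:width]
    # Squared Euclidean distance of every pixel to the AU center.
    sq_dist = (cols - center[0]) ** 2 + (rows - center[1]) ** 2
    scale = intensity / (2.0 * np.pi * sigma ** 2)
    return scale * np.exp(-sq_dist / (2.0 * sigma ** 2))

# Hypothetical AU at column 20, row 30 with intensity 3.
heatmap = gaussian_heatmap(center=(20, 30), intensity=3.0)
```

Note that with this normalization the peak value is $I/(2\pi\sigma^2)$ rather than $I$ itself, and an intensity of 0 yields an all-zero map.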
The predicted heatmap output by the network is $h_i(x;w,b)$, where $x$ denotes the input facial image and $w$ and $b$ are the network parameters. The MSE is then used as the supervised loss:
$$L_{MSE}=\underset{w,b}{\min}\sum_{i=1}^{N}\sum_{x}\left\|h_i(x;w,b)-g_i(x)\right\|_2^2 \qquad (2)$$
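The quantity minimized in Eq. (2) can be sketched as a plain sum of squared differences over all $N$ heatmaps and all pixels. The helper name `heatmap_mse` and the `(N, H, W)` array layout are assumptions made for illustration; the linked repository computes its loss inside a TensorFlow graph.

```python
import numpy as np

def heatmap_mse(pred, target):
    """Sum of squared pixel differences over all heatmaps (objective of Eq. (2)).

    pred, target : arrays of shape (N, H, W), one heatmap per AU (assumed layout).
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.sum((pred - target) ** 2))

# Toy example: two 4x4 heatmaps that differ by 0.5 at a single pixel.
target = np.zeros((2, 4, 4))
target[0, 1, 2] = 1.0
pred = np.zeros((2, 4, 4))
pred[0, 1, 2] = 0.5
loss = heatmap_mse(pred, target)  # 0.25
```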
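To convert a predicted heatmap into a predicted label, simply take its maximum value (if an AU's intensity is 0, Eq. (1) gives an all-zero target, so the predicted heatmap should be entirely black).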
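The decoding step can be sketched as a per-heatmap maximum. One caveat follows from Eq. (1): the peak of the target heatmap is $I/(2\pi\sigma^2)$, so recovering $I$ requires multiplying the peak by $2\pi\sigma^2$. Whether the linked repository applies this normalization is not stated in these notes, so the optional `sigma` argument below, like the name `decode_intensity`, is an assumption.

```python
import numpy as np

def decode_intensity(heatmaps, sigma=None):
    """Read one intensity per AU from heatmaps of shape (N, H, W).

    If sigma is given, the peak is rescaled by 2*pi*sigma^2 to undo the
    normalization in Eq. (1); otherwise the raw peak is returned.
    """
    heatmaps = np.asarray(heatmaps, dtype=np.float64)
    peaks = heatmaps.reshape(heatmaps.shape[0], -1).max(axis=1)
    if sigma is not None:
        peaks = peaks * (2.0 * np.pi * sigma ** 2)
    return peaks

# Toy example: AU 0 has intensity 3, AU 1 is inactive (all-zero heatmap).
sigma = 2.0
heatmaps = np.zeros((2, 8, 8))
heatmaps[0, 3, 4] = 3.0 / (2.0 * np.pi * sigma ** 2)
intensities = decode_intensity(heatmaps, sigma=sigma)
```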
SCC: Semantic Correspondence Convolution
Given the co-occurrences of different AU intensities, the semantic representations of feature maps are highly correlated in spatial distributions.
The paper proposes SCC to model the correlation among feature channels; the idea is inspired by the dynamic graph convolutions used in geometry modeling.
Note: Wang Y, Sun Y, Liu Z, et al. Dynamic Graph CNN for Learning on Point Clouds[J]. ACM Transactions on Graphics (TOG), 2019, 38(5): 1-12.
The rest of SCC is somewhat involved and follows the TOG paper.
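For intuition only, the sketch below shows an EdgeConv-style step in the spirit of the cited Dynamic Graph CNN paper, applied to feature channels instead of points: each channel is a node, its k nearest channels in feature space are its neighbors, and edge features `[x_i, x_j - x_i]` are transformed and max-aggregated. This is not the paper's SCC module. The function name, the single linear layer with ReLU, the exclusion of self-neighbors, and all shapes are assumptions.

```python
import numpy as np

def edge_conv_over_channels(features, weight, k=3):
    """One EdgeConv-style step over channels (illustrative, not the SCC module).

    features : (C, D) array, one row per channel (e.g. a flattened feature map)
    weight   : (2 * D, D_out) linear map applied to edge features
    k        : number of nearest neighbors per channel
    """
    # Pairwise squared distances between channels in feature space.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(dist, np.inf)  # assumption: a channel is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]              # (C, k)

    center = np.repeat(features[:, None, :], k, axis=1)      # (C, k, D)
    neighbor_feats = features[neighbors]                     # (C, k, D)
    edge = np.concatenate([center, neighbor_feats - center], axis=-1)  # (C, k, 2D)

    out = np.maximum(edge @ weight, 0.0)  # linear map followed by ReLU
    return out.max(axis=1)                # max over neighbors -> (C, D_out)

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))   # 8 hypothetical channels, 16-dim each
w = rng.normal(size=(32, 4))       # random weights, for shape checking only
out = edge_conv_over_channels(feats, w, k=3)  # shape (8, 4)
```

The graph is "dynamic" in the sense that the neighbor sets are recomputed from the current features every time the step is applied, rather than being fixed in advance.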
Code
Data preprocessing
https://github.com/EvelynFan/FAU/blob/master/functions.py#L41-L44
Heatmap generation
https://github.com/EvelynFan/FAU/blob/master/model_graph.py#L110
Network graph definition
https://github.com/EvelynFan/FAU/blob/master/model_graph.py#L121