[Paper Reading] Semi-Supervised Deep Regression with Uncertainty Consistency and Variational Model Ensembling

Paper download
GitHub
bib:

@INPROCEEDINGS{DaiLi2023Semi,
  title     = {Semi-Supervised Deep Regression with Uncertainty Consistency and Variational Model Ensembling via Bayesian Neural Networks},
  author    = {Weihang Dai and Xiaomeng Li and Kwang-Ting Cheng},
  booktitle = {AAAI},
  year      = {2023},
  pages     = {1--10}
}

1. Abstract

Deep regression is an important problem with numerous applications.

These range from computer vision tasks such as age estimation from photographs, to medical tasks such as ejection fraction estimation from echocardiograms for disease tracking.

This sets up the application scenarios for semi-supervised regression, namely age estimation and medical tasks; the experiments later in the paper follow exactly these two settings.

Semi-supervised approaches for deep regression are notably under-explored compared to classification and segmentation tasks, however.

Spot on. Semi-supervised learning is booming, yet semi-supervised regression remains the under-explored corner of it.

Unlike classification tasks, which rely on thresholding functions for generating class pseudo-labels, regression tasks use real number target predictions directly as pseudo-labels, making them more sensitive to prediction quality.

In semi-supervised classification, pseudo-labels are obtained through a thresholding function, whereas in regression the pseudo-label is a real number used directly, which makes regression far more demanding on pseudo-label quality.

In this work, we propose a novel approach to semi-supervised regression, namely Uncertainty-Consistent Variational Model Ensembling (UCVME), which improves training by generating high-quality pseudo-labels and uncertainty estimates for heteroscedastic regression.

This sentence packs in the method's key terms:

  • Uncertainty-Consistent: the models' uncertainty estimates are constrained to agree
  • Variational Model: a model whose outputs are distributions rather than point values
  • Ensembling: averaging predictions across models
  • heteroscedastic regression: regression where the noise variance depends on the input

Given that aleatoric uncertainty is only dependent on input data by definition and should be equal for the same inputs, we present a novel uncertainty consistency loss for co-trained models.

Aleatoric uncertainty: the randomness inherent in the data itself, independent of the model.

Our consistency loss significantly improves uncertainty estimates and allows higher quality pseudo-labels to be assigned greater importance under heteroscedastic regression.

Furthermore, we introduce a novel variational model ensembling approach to reduce prediction noise and generate more robust pseudo-labels.

In short, the pseudo-label is the average of the two models' predictions, i.e., an ensemble.

We analytically show our method generates higher quality targets for unlabeled data and further improves training.

Experiments show that our method outperforms state-of-the-art alternatives on different tasks and can be competitive with supervised methods that use full labels.

2. Algorithm Description

2.1. Bayesian Neural Networks(BNN)

Since BNNs are only a building block here, I will not cover them in detail; the goal is just a rough sense of what they are and what they can do.

As the name suggests, a Bayesian Neural Network is still a neural network, best understood as a variant of an ordinary one. The key difference: in an ordinary network each parameter is a fixed constant, so the output is also a deterministic value. In a BNN each parameter is instead a random variable, and so is the output. The natural question is how a forward pass can even be computed when everything is a random variable. In practice, the problem is simplified by assuming the variables follow some distribution, most commonly a Gaussian.

Concretely, on a cats-vs-dogs dataset, an ordinary network might output [0.8, 0.2] for a cat image, meaning probability 0.8 for cat and 0.2 for dog. A BNN instead outputs distributions, e.g. $[\mathcal{N}(0.7, 0.1^2), \mathcal{N}(0.2, 0.01^2)]$, where the variance expresses how uncertain the network is about each prediction.
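The paper approximates BNNs with Monte Carlo dropout: keep dropout active at inference and run several stochastic forward passes, so the spread of the outputs reflects model uncertainty. A minimal sketch (the toy architecture, dimensions, and `T=20` are my own illustrative choices, not the paper's):

```python
import torch
import torch.nn as nn

class MCDropoutRegressor(nn.Module):
    """Toy regressor whose dropout stays active at inference,
    approximating a BNN via Monte Carlo dropout."""
    def __init__(self, in_dim=8, hidden=32, p=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

def mc_predict(model, x, T=20):
    """T stochastic forward passes; the sample mean approximates the
    predictive mean and the sample variance reflects uncertainty."""
    model.train()  # keep dropout turned on
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(T)])
    return samples.mean(0), samples.var(0)
```

Each call to `model(x)` samples a different dropout mask, which is what makes the output a random variable rather than a fixed value.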

2.2. UCVME

Notation:

  • $D := \{(x_i, y_i)\}_{i=1}^{N}$: labeled data
  • $D' = \{x'_{i'}\}_{i'=1}^{N'}$: unlabeled data
  • $f_m$, $m \in \{a, b\}$: two BNNs using Monte Carlo dropout
  • $\hat{y}_{i,m}$: prediction of model $f_m$ for target label $y_i$
  • $\hat{z}_{i,m}$: log-uncertainty prediction $\log(\sigma^2)$ of model $f_m$ for target label $y_i$

We denote the aleatoric uncertainty by $\sigma^2$, but in practice predict the log-uncertainty $\log(\sigma^2)$; this is the usual trick to avoid predicting a negative variance.

One way to make sense of this: taking the log shrinks the range of the uncertainty target, somewhat like normalization, and a smaller target range tends to make the prediction easier.

UCVME is based on two novel ideas: enforcing aleatoric uncertainty consistency to improve uncertainty-based loss weighting, and variational model ensembling for generating high-quality pseudo-labels.

Novel ideas:

  1. aleatoric uncertainty consistency
  2. variational model ensembling

Both serve the same goal: generating high-quality pseudo-labels.

heteroscedastic regression loss:
$$\mathcal{L}_{reg} = \frac{1}{N}\sum_{i=1}^{N}\left[\frac{(y_i-\hat{y}_i)^2}{2\sigma_i^2}+\frac{\ln(\sigma_i^2)}{2}\right]\tag{1}$$
Note that this loss formulation comes from existing work [1][2].
Maximum likelihood derivation:
$$\begin{aligned}
& \max_\theta \log p(y \mid x, \theta) \\
& = \max_\theta \sum_{i=1}^N \log p\left(y_i \mid \hat{y}_i(x_i, \theta), \sigma_i^2(x_i, \theta)\right) \\
& = \max_\theta \sum_{i=1}^N \log \mathcal{N}\left(\hat{y}_i, \sigma_i^2\right) \\
& = \max_\theta \sum_{i=1}^N \log \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left(-\frac{\left\|y_i-\hat{y}_i\right\|^2}{2\sigma_i^2}\right) \\
& = \max_\theta \sum_{i=1}^N \left\{-\frac{\left\|y_i-\hat{y}_i\right\|^2}{2\sigma_i^2} - \frac{\log\sigma_i^2}{2} - \frac{\log 2\pi}{2}\right\}
\end{aligned}$$
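Dropping the constant $\log(2\pi)/2$ term and substituting $z = \log(\sigma^2)$, eq. (1) is exactly this negative log-likelihood. A minimal sketch, assuming the network outputs both the prediction `y_hat` and the log-variance `z`:

```python
import torch

def heteroscedastic_loss(y, y_hat, z):
    """Negative Gaussian log-likelihood of eq. (1), constants dropped.
    z = log(sigma^2) is predicted instead of sigma^2, so the implied
    variance exp(z) is always positive by construction."""
    return ((y - y_hat) ** 2 / (2 * torch.exp(z)) + z / 2).mean()
```

With a fixed `z = 0` (i.e. $\sigma^2 = 1$) this reduces to half the mean squared error, which shows how the learned $\sigma_i^2$ simply re-weights each sample's squared error.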

  • labeled inputs:

    • uncertainty consistency loss
      $$\mathcal{L}_{unc}^{lb} = \frac{1}{N}\sum_{i=1}^{N}(\hat{z}_{i,a} - \hat{z}_{i,b})^2$$
    • heteroscedastic regression loss
      $$\mathcal{L}_{reg}^{lb} = \frac{1}{N}\sum_{m=a,b}\sum_{i=1}^{N}\left(\frac{(y_i-\hat{y}_{i,m})^2}{2\exp(\hat{z}_{i,m})}+\frac{\hat{z}_{i,m}}{2}\right)$$
  • unlabeled inputs:

    • uncertainty consistency loss
      $$\mathcal{L}_{unc}^{ulb} = \frac{1}{N'}\sum_{m=a,b}\sum_{i=1}^{N'}(\hat{z}_{i,m} - \widetilde{z}_i)^2$$
      where $\widetilde{z}_i = \frac{1}{T}\sum_{t=1}^{T}\frac{\hat{z}_{i,a}^t + \hat{z}_{i,b}^t}{2}$.
      Note that the pseudo log-uncertainty is simply the two models' predictions averaged over $T$ dropout forward passes (ensemble).
    • heteroscedastic regression loss
      $$\mathcal{L}_{reg}^{ulb} = \frac{1}{N'}\sum_{m=a,b}\sum_{i=1}^{N'}\left(\frac{(\hat{y}_{i,m}-\widetilde{y}_i)^2}{2\exp(\widetilde{z}_i)}+\frac{\widetilde{z}_i}{2}\right)$$
      where $\widetilde{y}_i = \frac{1}{T}\sum_{t=1}^{T}\frac{\hat{y}_{i,a}^t + \hat{y}_{i,b}^t}{2}$.
      Again the pseudo-label is just the average of the two models' predictions (ensemble). The authors justify this via a bias-variance decomposition, as the regression analogue of the thresholding function used for smoothing in classification.
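The pseudo-targets $\widetilde{y}_i$ and $\widetilde{z}_i$ above are just averages of $T$ dropout forward passes from both models. A sketch of this variational model ensembling step, assuming each model returns a `(y_hat, z_hat)` pair (the interface and `T=5` default are my assumptions):

```python
import torch

def variational_ensemble(model_a, model_b, x, T=5):
    """Average T stochastic (dropout-on) passes of two co-trained
    models to form pseudo-labels y_tilde and pseudo-log-uncertainties
    z_tilde for unlabeled inputs x."""
    model_a.train()
    model_b.train()  # keep dropout active for MC sampling
    ys, zs = [], []
    with torch.no_grad():
        for _ in range(T):
            ya, za = model_a(x)
            yb, zb = model_b(x)
            ys.append((ya + yb) / 2)
            zs.append((za + zb) / 2)
    return torch.stack(ys).mean(0), torch.stack(zs).mean(0)
```

Averaging over models and over dropout samples is what reduces the variance of the pseudo-labels relative to a single forward pass.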

Total Loss:
$$\mathcal{L} = \mathcal{L}_{reg}^{lb} + \mathcal{L}_{unc}^{lb} + \omega_{ulb}\left(\mathcal{L}_{reg}^{ulb}+\mathcal{L}_{unc}^{ulb}\right)$$
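Putting the four terms together, one possible shape for the total objective (the dict interface and the `w_ulb=10.0` default are my own illustrative assumptions, not values from the paper):

```python
import torch

def ucvme_total_loss(y, preds_lb, preds_ulb, pseudo, w_ulb=10.0):
    """Total UCVME objective: labeled heteroscedastic regression and
    uncertainty-consistency losses, plus their unlabeled counterparts
    weighted by w_ulb. preds_lb / preds_ulb map model name to a
    (y_hat, z_hat) pair; pseudo = (y_tilde, z_tilde) comes from
    variational model ensembling."""
    y_t, z_t = pseudo
    (ya, za), (yb, zb) = preds_lb["a"], preds_lb["b"]
    # labeled: heteroscedastic regression, summed over both models
    reg_lb = sum(((y - yh) ** 2 / (2 * torch.exp(zh)) + zh / 2).mean()
                 for yh, zh in ((ya, za), (yb, zb)))
    # labeled: the two models' log-uncertainties must agree
    unc_lb = ((za - zb) ** 2).mean()
    # unlabeled: same losses against the ensembled pseudo-targets
    reg_ulb = sum(((yh - y_t) ** 2 / (2 * torch.exp(z_t)) + z_t / 2).mean()
                  for yh, _ in (preds_ulb["a"], preds_ulb["b"]))
    unc_ulb = sum(((zh - z_t) ** 2).mean()
                  for _, zh in (preds_ulb["a"], preds_ulb["b"]))
    return reg_lb + unc_lb + w_ulb * (reg_ulb + unc_ulb)
```

When both models agree with the labels and the pseudo-targets and all log-uncertainties are zero, every term vanishes, which is a quick sanity check on the implementation.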

3. Experiments

3.1. Age Estimation from Photographs

3.2. Ejection Fraction Estimation from Echocardiogram Videos


  1. Kendall A, Gal Y. What uncertainties do we need in Bayesian deep learning for computer vision? Advances in Neural Information Processing Systems, 2017, 30. ↩︎

  2. https://zhuanlan.zhihu.com/p/568912284 ↩︎
