[Paper Reading] Learning Safe Prediction for Semi-Supervised Regression

来日可期1314

Last modified 2023-04-11 14:13:59

Category: Paper Reading. Tags: paper reading, semi-supervised regression

First published 2022-08-18 16:24:44

Article link: https://blog.csdn.net/ssjq123/article/details/126395930

"Learning Safe Prediction for Semi-Supervised Regression"
Code link

1. Abstract

Semi-supervised learning (SSL) concerns how to improve performance via the usage of unlabeled data. Recent studies indicate that the usage of unlabeled data might even deteriorate performance. Although some proposals have been developed to alleviate such a fundamental challenge for semi-supervised classification, the efforts on semi-supervised regression (SSR) remain to be limited. In this work we consider the learning of a safe prediction from multiple semi-supervised regressors, which is not worse than a direct supervised learner with only labeled data. We cast it as a geometric projection issue with an efficient algorithm. Furthermore, we show that the proposal is provably safe and has already achieved the maximal performance gain, if the ground-truth label assignment is realized by a convex linear combination of base regressors. This provides insight to help understand safe SSR. Experimental results on a broad range of datasets validate the effectiveness of our proposal.

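In short: unlabeled data can hurt as well as help, and this has been studied far less for regression than for classification. The paper learns a safe prediction from multiple semi-supervised regressors, one that is no worse than a supervised learner trained on the labeled data alone, by casting the problem as a geometric projection. The result is provably safe, and attains the maximal performance gain, when the ground-truth labels can be written as a convex linear combination of the base regressors.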

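Thoughts:

The problem this paper tackles is a fresh one that takes its own path: learn a safe prediction from multiple semi-supervised regressors, with a guarantee that it is no worse than a direct supervised learner trained only on labeled data.
A new problem is generally worth more than a new method: the former opens up a research direction, while the latter only studies techniques within one.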

2. Algorithm Description

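| Symbol | Meaning |
| --- | --- |
| $\{f_1, \dots, f_b\}$, where $f_i \in \mathbb{R}^u$ | Predictions of the $b$ semi-supervised regressors on the $u$ unlabeled samples |
| $f_0 \in \mathbb{R}^u$ | Predictions on the $u$ unlabeled samples from a regressor trained only on the labeled samples |
| $f^{*}$ | True labels of the unlabeled samples (completely unknown, of course; otherwise they would not be called unlabeled) |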

$\max_{f \in \mathbb{R}^u}\sum_{i=1}^{b}\alpha_i(\|f_0-f_i\|^2-\|f-f_i\|^2)\tag{1}$

Notice:

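$f$ is the output, i.e. $g(\{f_1, \dots, f_b\}, f_0)$, the prediction we want to obtain.
The objective has two parts. The first part, $\|f_0-f_i\|^2$, is constant with respect to $f$, but because it is multiplied by the weight $\alpha_i$ it cannot be dropped once the weights themselves become variables.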
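To make the objective concrete, here is a minimal NumPy sketch of Eq. (1) for a fixed weight vector. The function name `safe_gain`, the array layout (one row per regressor), and the toy numbers are my own illustrative choices, not taken from the paper or its released code.

```python
import numpy as np

def safe_gain(f, f0, F, alpha):
    """Objective of Eq. (1): sum_i alpha_i * (||f0 - f_i||^2 - ||f - f_i||^2).

    f, f0 : arrays of shape (u,)   candidate and supervised predictions
    F     : array of shape (b, u)  row i holds the predictions f_i
    alpha : array of shape (b,)    non-negative weights (assumed known here)
    """
    base = np.sum((f0 - F) ** 2, axis=1)   # ||f0 - f_i||^2, independent of f
    cand = np.sum((f - F) ** 2, axis=1)    # ||f - f_i||^2
    return float(alpha @ (base - cand))

# Toy data (hypothetical numbers, for illustration only).
F = np.array([[1.0, 2.0, 3.0],
              [2.0, 2.0, 2.0]])
f0 = np.array([0.0, 1.0, 5.0])
alpha = np.array([0.25, 0.75])

gain_at_f0 = safe_gain(f0, f0, F, alpha)          # returning f0 itself gains nothing
gain_at_mix = safe_gain(alpha @ F, f0, F, alpha)  # weighted average of the f_i
```

On this toy input `gain_at_f0` is 0 and `gain_at_mix` is 11.625, so moving from $f_0$ to the weighted average of the base predictions yields a positive gain under these particular weights.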

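In practice the weights are unknown as well, so $\alpha_i$ has to be modeled too, here by taking the worst case over a set $\mathcal{M}$ of candidate weights: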

$\max_{f \in \mathbb{R}^u}\min_{\alpha \in \mathcal{M}}\sum_{i=1}^{b}\alpha_i(\|f_0-f_i\|^2-\|f-f_i\|^2)\tag{2}$

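Setting the derivative of the objective in Eq. (2) with respect to $f$ to zero gives a closed-form solution (for weights that sum to one):
$f=\sum_{i=1}^{b}\alpha_if_i\tag{3}$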
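The derivative step can be written out as below. This is a sketch that treats $\alpha$ as fixed and assumes $\alpha_i \ge 0$ with $\sum_i \alpha_i = 1$; that normalization is my assumption about $\mathcal{M}$, which these notes do not define explicitly.

```latex
\begin{align*}
\frac{\partial}{\partial f}\sum_{i=1}^{b}\alpha_i\left(\|f_0-f_i\|^2-\|f-f_i\|^2\right)
  &= -2\sum_{i=1}^{b}\alpha_i\,(f-f_i) = 0 \\
\Longrightarrow\quad f\sum_{i=1}^{b}\alpha_i &= \sum_{i=1}^{b}\alpha_i f_i
\quad\Longrightarrow\quad f = \sum_{i=1}^{b}\alpha_i f_i .
\end{align*}
```

Because the objective is concave in $f$ when the weights are non-negative, this stationary point is a maximizer.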

Combining Eq. (2) and Eq. (3) gives:
$\min_{\alpha \in \mathcal{M}}\|\sum_{i=1}^{b}\alpha_if_i-f_0\|^2\tag{4}$

I puzzled over this step for a long time and could not work out how the two are combined. Headache 😂.
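One way the combination might go through is sketched below. It assumes $\sum_i \alpha_i = 1$, writes $\hat{f} = \sum_i \alpha_i f_i$ for the closed-form solution, and takes for granted that the inner maximizer may be plugged back in (that is, that max and min can be exchanged). These are my assumptions, not statements from the paper.

```latex
\begin{align*}
\sum_{i=1}^{b}\alpha_i\|f_0-f_i\|^2
  &= \|f_0\|^2 - 2\langle f_0,\hat{f}\rangle + \sum_{i=1}^{b}\alpha_i\|f_i\|^2, \\
\sum_{i=1}^{b}\alpha_i\|\hat{f}-f_i\|^2
  &= \|\hat{f}\|^2 - 2\langle \hat{f},\hat{f}\rangle + \sum_{i=1}^{b}\alpha_i\|f_i\|^2
   = \sum_{i=1}^{b}\alpha_i\|f_i\|^2 - \|\hat{f}\|^2, \\
\sum_{i=1}^{b}\alpha_i\left(\|f_0-f_i\|^2-\|\hat{f}-f_i\|^2\right)
  &= \|f_0\|^2 - 2\langle f_0,\hat{f}\rangle + \|\hat{f}\|^2
   = \Big\|\sum_{i=1}^{b}\alpha_i f_i - f_0\Big\|^2 .
\end{align*}
```

The $\sum_i \alpha_i\|f_i\|^2$ terms cancel, and what remains to be minimized over $\alpha \in \mathcal{M}$ is exactly the squared distance in Eq. (4).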

Here comes the elegant part: the authors solve Eq. (4) as a geometric projection problem.

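Let $\Omega = \{f \mid f=\sum_{i=1}^{b}\alpha_if_i,\ \alpha \in \mathcal{M}\}$.
Eq. (4) can then be rewritten as:
$\overline{f} = \operatorname*{arg\,min}_{f \in \Omega}\|f-f_0\|^2\tag{5}$
In words: find the $f$ in the set $\Omega$ that is closest to $f_0$. That $\overline{f}$ is exactly the projection of $f_0$ onto $\Omega$, and this fact is the key ingredient in the later proof that the algorithm is safe.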
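The projection in Eq. (5) can be approximated numerically. The sketch below assumes $\mathcal{M}$ is the probability simplex ($\alpha_i \ge 0$, $\sum_i \alpha_i = 1$) and uses plain projected gradient descent over $\alpha$; this is a technique I swapped in for illustration and is not necessarily the solver the authors use. The names `project_to_simplex` and `safe_prediction` and the toy data are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    n = v.size
    s = np.sort(v)[::-1]                       # entries sorted in descending order
    css = np.cumsum(s) - 1.0
    k = np.arange(1, n + 1)
    rho = np.nonzero(s - css / k > 0)[0][-1]   # last index that stays positive
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def safe_prediction(F, f0, n_iter=2000):
    """Approximate Eq. (4)/(5): min over alpha in the simplex of ||alpha @ F - f0||^2.

    F  : array of shape (b, u), row i holds the predictions f_i
    f0 : array of shape (u,), predictions of the supervised regressor
    """
    b = F.shape[0]
    alpha = np.full(b, 1.0 / b)                # start from uniform weights
    # Step size 1/L, with L = 2 * sigma_max(F)^2 the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.norm(F, 2) ** 2 + 1e-12)
    for _ in range(n_iter):
        grad = 2.0 * F @ (alpha @ F - f0)      # gradient of the squared distance
        alpha = project_to_simplex(alpha - step * grad)
    return alpha @ F, alpha

# Toy example: Omega is the segment between (0, 0) and (2, 0).
F = np.array([[0.0, 0.0],
              [2.0, 0.0]])
f0 = np.array([0.5, 1.0])
f_bar, alpha = safe_prediction(F, f0)
```

For this toy input the result is approximately `f_bar = [0.5, 0.0]` with `alpha = [0.75, 0.25]`, which is the foot of the perpendicular dropped from $f_0$ onto the segment.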
Theorem 1: $\|\overline{f}-f^{*}\|^2 \leq \|f_0-f^{*}\|^2$ if the ground-truth label assignment $f^{*} \in \Omega = \{f \mid f=\sum_{i=1}^{b}\alpha_if_i,\ \alpha \in \mathcal{M}\}$.
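A short sketch of why the projection property gives this inequality. It is the standard argument for projections onto a convex set and may differ from the proof in the paper. It assumes $\Omega$ is convex, which holds when $\mathcal{M}$ is convex.

```latex
\begin{align*}
&\text{Projection property: } \langle f_0-\overline{f},\; f^{*}-\overline{f}\rangle \le 0
  \quad \text{for every } f^{*} \in \Omega. \\
&\|f_0-f^{*}\|^2
  = \|f_0-\overline{f}\|^2 + \|\overline{f}-f^{*}\|^2
    + 2\langle f_0-\overline{f},\; \overline{f}-f^{*}\rangle
  \;\ge\; \|\overline{f}-f^{*}\|^2 .
\end{align*}
```

The last step holds because $\|f_0-\overline{f}\|^2 \ge 0$ and the inner-product term is non-negative by the projection property.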