TrustGeo Reference [22]: Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions

This paper proposes a novel trustworthy multimodal regression algorithm that uses a Mixture of Normal-inverse Gamma distributions (MoNIG) to estimate uncertainty effectively, improving both prediction accuracy and trustworthiness. The model dynamically perceives modality-specific noise and is robust to corrupted modalities, making it suitable for cost-sensitive domains such as superconductor critical-temperature prediction, CT slice localization, and multimodal sentiment analysis.

Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions

35th Conference on Neural Information Processing Systems (NeurIPS 2021), Sydney, Australia.

[22] Huan Ma, Zongbo Han, Changqing Zhang, Huazhu Fu, Joey Tianyi Zhou, and Qinghua Hu. 2021. Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions. Advances in Neural Information Processing Systems 34 (2021), 6881–6893.

Abstract

Multimodal regression is a fundamental task that integrates information from different sources to improve the performance of downstream applications. However, existing methods focus mainly on improving performance and often ignore the confidence of predictions under different circumstances. In this work, we address trustworthy multimodal regression, which is critical in cost-sensitive domains. To this end, we introduce a novel Mixture of Normal-inverse Gamma distributions (MoNIG) algorithm, which effectively estimates uncertainty for the adaptive integration of different modalities and produces trustworthy regression results. Our model can dynamically perceive the uncertainty of each modality and is robust to corrupted modalities. Furthermore, the proposed MoNIG explicitly represents both (modality-specific/global) epistemic uncertainty and aleatoric uncertainty. Experimental results on synthetic data and several real-world datasets demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks, such as superconductor critical-temperature prediction, relative location prediction of CT slices, and multimodal sentiment analysis.

1 Introduction

Multimodal data are ubiquitous in the real world, where we experience the world through different modalities [1]. For example, autonomous driving systems are typically equipped with multiple sensors to collect information from different perspectives [2]. In medical diagnosis [3], multimodal data usually come from different types of examinations, often including various clinical data. Effectively exploiting information from different sources to improve learning performance is a long-standing and challenging goal in machine learning.

[1] Tadas Baltrušaitis, Chaitanya Ahuja, and Louis-Philippe Morency. Multimodal machine learning: A survey and taxonomy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.

[2] Hyunggi Cho, Young-Woo Seo, BVK Vijaya Kumar, and Ragunathan Raj Rajkumar. A multi-sensor fusion system for moving object detection and tracking in urban driving environments. In ICRA, 2014.

[3] Richard J Perrin, Anne M Fagan, and David M Holtzman. Multimodal techniques for diagnosis and prognosis of alzheimer’s disease. Nature, 461(7266):916–922, 2009.

Most multimodal regression methods [4, 5, 6, 7] focus on improving regression performance by exploiting the complementary information among multiple modalities. Although effective, deploying these methods in cost-sensitive applications is rather risky due to their lack of reliability and interpretability. One potential flaw is that traditional models usually assume the quality of each modality to be essentially stable, which limits their ability to produce reliable predictions, especially when some modalities are noisy or even corrupted [8, 9]. Moreover, existing models output only (often overconfident) predictions [10, 11], which cannot well support safe decision-making and may be catastrophic for safety-critical applications.

[4] Quan Gan, Shangfei Wang, Longfei Hao, and Qiang Ji. A multimodal deep regression bayesian network for affective video content analyses. In ICCV, 2017.

[5] Guangnan Ye, Dong Liu, I-Hong Jhuo, and Shih-Fu Chang. Robust late fusion with rank minimization. In CVPR, 2012.

[6] Fabon Dzogang, Marie-Jeanne Lesot, Maria Rifqi, and Bernadette Bouchon-Meunier. Early fusion of low level features for emotion mining. Biomedical Informatics Insights, 2012.

[7] H. Gunes and M. Piccardi. Affect recognition from face and body: early fusion vs. late fusion. In IEEE International Conference on Systems, 2006.

[8] Paul Pu Liang, Zhun Liu, Yao-Hung Hubert Tsai, Qibin Zhao, Ruslan Salakhutdinov, and Louis-Philippe Morency. Learning representations from imperfect time series data via tensor rank regularization. arXiv:1907.01011, 2019.

[9] Michelle A. Lee, Matthew Tan, Yuke Zhu, and Jeannette Bohg. Detect, reject, correct: Cross-modal compensation of corrupted sensors. arXiv:2012.00201, 2020.

[10] Jasper Snoek, Yaniv Ovadia, Emily Fertig, Balaji Lakshminarayanan, Sebastian Nowozin, D. Sculley, Joshua V. Dillon, Jie Ren, and Zachary Nado. Can you trust your model’s uncertainty? evaluating predictive uncertainty under dataset shift. In NeurIPS, 2019.

[11] Dennis Ulmer and Giovanni Cinà. Know your limits: Uncertainty estimation with ReLU classifiers fails at reliable OOD detection. arXiv:2012.05329, 2021.

Uncertainty estimation provides a way toward trustworthy prediction [12, 13]. Decisions made by models without uncertainty estimation are untrustworthy, because such models are vulnerable to noise or limited training data. Characterizing the uncertainty in the learning of AI-based systems is therefore highly desirable. More specifically, when a model is given an input it has never seen, or one that is heavily contaminated, it should be able to express "I do not know." An unreliable model is easily attacked and may also lead to wrong decisions, whose cost in critical domains [14] is often unbearable.

[12] Moloud Abdar, Farhad Pourpanah, Sadiq Hussain, Dana Rezazadegan, Li Liu, Mohammad Ghavamzadeh, Paul Fieguth, Xiaochun Cao, Abbas Khosravi, U Rajendra Acharya, Vladimir Makarenkov, and Saeid Nahavandi. A review of uncertainty quantification in deep learning: Techniques, applications and challenges. arXiv:2011.06225, 2020.

[13] Zongbo Han, Changqing Zhang, Huazhu Fu, and Joey Tianyi Zhou. Trusted multi-view classification. In ICLR, 2021.

[14] Alex Kendall and Yarin Gal. What uncertainties do we need in bayesian deep learning for computer vision? In NeurIPS, 2017.

Essentially, a model can be endowed with trustworthiness by dynamically modeling uncertainty. We therefore propose a novel algorithm that performs multimodal regression in a trustworthy manner. Specifically, our algorithm is a unified framework that models uncertainty under a fully probabilistic formulation. It integrates multiple modalities by introducing a Mixture of Normal-inverse Gamma distributions (MoNIG), which hierarchically characterizes uncertainty and accordingly improves both regression accuracy and trustworthiness.
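The NIG parameterization at the heart of MoNIG is detailed later in the paper; as background, the deep evidential regression framework [31] that it builds on reads both uncertainty types directly off the NIG parameters (γ, ν, α, β). A minimal single-modality sketch (the function name is ours; the paper's multimodal fusion is not shown here):

```python
def nig_prediction(gamma, nu, alpha, beta):
    """Given Normal-inverse Gamma parameters (gamma, nu, alpha, beta)
    with alpha > 1, return the point prediction and the two kinds of
    uncertainty used in evidential regression:
      prediction          E[mu]      = gamma
      aleatoric (data)    E[sigma^2] = beta / (alpha - 1)
      epistemic (model)   Var[mu]    = beta / (nu * (alpha - 1))
    """
    assert alpha > 1, "the moments above require alpha > 1"
    aleatoric = beta / (alpha - 1)
    epistemic = beta / (nu * (alpha - 1))
    return gamma, aleatoric, epistemic

# More "evidence" (a larger nu) shrinks epistemic uncertainty while
# the estimated data noise (aleatoric) stays the same:
low_evidence = nig_prediction(gamma=2.0, nu=1.0, alpha=2.0, beta=1.0)
high_evidence = nig_prediction(gamma=2.0, nu=10.0, alpha=2.0, beta=1.0)
```

This separation is what lets MoNIG down-weight a modality whose epistemic uncertainty is high (e.g., because it is corrupted) while still reporting its perceived data noise.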

In summary, our contributions are as follows:

(1) We propose a novel trustworthy multimodal regression algorithm. Our method effectively fuses multiple modalities under an evidential regression framework equipped with both modality-specific and global uncertainty.

(2) To integrate different modalities, a novel MoNIG is designed to dynamically identify modality-specific noise/corruption, which is expected to support trustworthy decision-making and also significantly improves robustness.

(3) We conduct extensive experiments on synthetic and real-world application data, validating the effectiveness, robustness, and reliability of the proposed model on different multimodal regression tasks (e.g., critical-temperature prediction for superconductors [15], relative location of CT slices, and human multimodal sentiment analysis).

[15] K. Hamidieh. A data-driven statistical model for predicting the critical temperature of a superconductor. Computational Materials Science, 154:346–354, 2018.

2 Related Work

2.1 Uncertainty Estimation

Quantifying the uncertainty of machine learning models has received extensive attention [16, 17], especially when systems are deployed in safety-critical domains such as autonomous vehicle control [18] and medical diagnosis [3].

[16] Danijar Hafner, Dustin Tran, Timothy P. Lillicrap, Alex Irpan, and James Davidson. Noise contrastive priors for functional uncertainty. In UAI, 2019.

[17] Kefaya Qaddoum and E. L. Hines. Reliable yield prediction with regression neural networks. In WSEAS international conference on systems theory and scientific computation, 2012.

[18] Alireza Khodayari, Ali Ghaffari, Sina Ameli, and Jamal Flahatgar. A historical review on lateral and longitudinal control of autonomous vehicle motions. In International Conference on Mechanical & Electrical Technology, 2010.


Bayesian neural networks [19, 20] model uncertainty by placing a distribution over model parameters and marginalizing over these parameters to form a predictive distribution. Due to the huge parameter space of modern neural networks, Bayesian neural networks are highly non-convex and difficult to perform inference in.

[19] Radford M Neal. Bayesian learning for neural networks. Springer Science & Business Media, 2012.

[20] David JC MacKay. Bayesian interpolation. Neural computation, 4(3):415–447, 1992.

To address this, [21] extends Variational Dropout [22] to the case where the dropout rate is unbounded and proposes a way to reduce the variance of the gradient estimator. A more scalable alternative is MC Dropout [23], which is easy to implement and has been successfully applied to downstream tasks [14, 24].

[21] Dmitry Molchanov, Arsenii Ashukha, and Dmitry P. Vetrov. Variational dropout sparsifies deep neural networks. In ICML, 2017.

[22] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. Computer Science, 2012.

[23] Yarin Gal and Zoubin Ghahramani. Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In ICML, 2016.


[24] Jishnu Mukhoti and Yarin Gal. Evaluating bayesian deep learning methods for semantic segmentation. CoRR, 2018.
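The MC Dropout idea above can be sketched in a few lines of plain Python. The toy network and its weights below are made up for illustration; a real implementation simply keeps the dropout layers of a trained network active at test time:

```python
import random
import statistics

def mc_dropout_predict(forward, x, n_passes=100, seed=0):
    """Run a stochastic `forward` pass (dropout still active at test
    time) n_passes times; the mean of the samples is the prediction
    and their spread serves as an uncertainty estimate."""
    rng = random.Random(seed)
    samples = [forward(x, rng) for _ in range(n_passes)]
    return statistics.mean(samples), statistics.pvariance(samples)

def toy_forward(x, rng, p=0.5):
    """Toy 1-hidden-layer network with dropout rate p on two units."""
    h = [1.0 * x, 0.5 * x]                                      # hidden activations
    h = [v / (1 - p) if rng.random() > p else 0.0 for v in h]   # inverted dropout
    return sum(h)                                               # linear output layer

pred_mean, pred_var = mc_dropout_predict(toy_forward, x=2.0, n_passes=1000)
# pred_mean is close to the deterministic output 3.0; pred_var > 0
# reflects the model uncertainty induced by dropout.
```

Note the trade-off the text mentions: each prediction now costs n_passes forward passes, which is the price MC Dropout pays for its simplicity.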

Deep ensembles [25] have shown strong performance in both classification accuracy and uncertainty estimation. It has been observed that deep ensembles consistently outperform Bayesian neural networks trained with variational inference [10]. However, both the memory and the computational cost are quite high.

[25] Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. In NeurIPS, 2017.

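For regression, [25] combines the per-member Gaussian predictions by moment-matching their uniform mixture; disagreement between members inflates the predictive variance. A minimal sketch (the helper name and toy numbers are ours):

```python
def ensemble_predict(member_outputs):
    """Combine per-member Gaussian predictions (mu_m, sigma2_m) by
    moment-matching the uniform mixture, as in deep ensembles [25]:
        mu*     = mean(mu_m)
        sigma*2 = mean(sigma2_m + mu_m^2) - mu*^2
    """
    M = len(member_outputs)
    mu_star = sum(mu for mu, _ in member_outputs) / M
    sigma2_star = sum(s2 + mu * mu for mu, s2 in member_outputs) / M - mu_star ** 2
    return mu_star, sigma2_star

# Agreeing members: the variance stays near the average data noise.
mu_a, s2_a = ensemble_predict([(1.0, 0.1), (1.0, 0.1)])   # s2_a ≈ 0.1
# Disagreeing members: the spread of the means adds to the variance.
mu_d, s2_d = ensemble_predict([(0.0, 0.1), (2.0, 0.1)])   # s2_d ≈ 1.1
```

The memory/compute cost noted above comes from training and storing all M member networks and running each of them at inference time.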

To address this problem, [26] trains different-depth subnetworks with shared parameters for ensembling. Deterministic uncertainty methods aim to output uncertainty directly and to alleviate overconfidence. Built on RBF networks, [27] is able to identify out-of-distribution samples. [28] introduces a new target criterion for model confidence, known as the True Class Probability (TCP), to ensure low confidence for failure predictions. A recent method employs focal loss to calibrate deep neural networks [29]. [30] places Dirichlet priors over discrete classification predictions and regularizes their divergence from a well-defined prior. Our model is inspired by deep evidential regression [31], which was designed for single-modal data.

[26] Javier Antorán, James Urquhart Allingham, and José Miguel Hernández-Lobato. Depth uncertainty in neural networks. In NeurIPS, 2020.

[27] Joost van Amersfoort, Lewis Smith, Yee Whye Teh, and Yarin Gal. Uncertainty estimation using a single deep deterministic neural network. In ICML, 2020.

[28] Charles Corbière, Nicolas Thome, Avner Bar-Hen, Matthieu Cord, and Patrick Pérez. Addressing failure prediction by learning model confidence. In NeurIPS, 2019.

[29] Jishnu Mukhoti, Viveka Kulharia, Amartya Sanyal, Stuart Golodetz, Philip Torr, and Puneet Dokania. Calibrating deep neural networks using focal loss. In NeurIPS, 2020.

[30] Murat Sensoy, Lance M. Kaplan, and Melih Kandemir. Evidential deep learning to quantify classification uncertainty. In NeurIPS, 2018.

[31] Alexander Amini, Wilko Schwarting, Ava Soleimany, and Daniela Rus. Deep evidential regression. In NeurIPS, 2020.

2.2 Multimodal Learning

The goal of multimodal machine learning is to build models that can jointly exploit information from multiple modalities [1, 32].

