Towards Bayesian Deep Learning: A Survey

What is Bayesian deep learning?

1 : To achieve integrated intelligence that involves both perception and inference, it is naturally desirable to tightly integrate deep learning and Bayesian models within a principled probabilistic framework, which we call Bayesian deep learning.

The more important ability for an AI system is thinking, which is what BDL aims to address

2 : However, in order to build a real AI system, simply being able to see, read, and hear is far from enough. It should, above all, possess the ability of thinking.

3 : It is the thinking part that defines a doctor. Specifically, the ability of thinking here could involve causal inference, logic deduction, and dealing with uncertainty, which is apparently beyond the capability of conventional deep learning methods. Fortunately, another type of model, the probabilistic graphical model (PGM), excels at causal inference and dealing with uncertainty. The problem is that PGM is not as good as deep learning models at perception tasks. To address the problem, it is, therefore, a natural choice to tightly integrate deep learning and PGM within a principled probabilistic framework, which we call Bayesian deep learning (BDL) in this paper.

What does Bayesian deep learning do?

4 : With the tight and principled integration in Bayesian deep learning, the perception task and inference task are regarded as a whole and can benefit from each other. In the example above, being able to see the medical image could help with the doctor’s diagnosis and inference. On the other hand, diagnosis and inference can in return help with understanding the medical image. Suppose the doctor is not sure what a dark spot in a medical image is; if she is able to infer the etiology of the symptoms and disease, that can help her better decide whether the dark spot is a tumor or not.

Other applications

5 : Besides recommender systems, the need for Bayesian deep learning may also arise when we are dealing with control of non-linear dynamical systems with raw images as input. Consider controlling a complex dynamical system according to the live video stream received from a camera. This problem can be transformed into iteratively performing two tasks: perception from raw images and control based on dynamic models. The perception task can be taken care of using multiple layers of simple nonlinear transformation (deep learning), while the control task usually needs more sophisticated models like hidden Markov models and Kalman filters. The feedback loop is then completed by the fact that actions chosen by the control model can affect the received video stream in return. To enable an effective iterative process between the perception task and the control task, we need two-way information exchange between them. The perception component would be the basis on which the control component estimates its states, and the control component with a dynamic model built in would be able to predict the future trajectory (images). In such cases, Bayesian deep learning is a suitable choice.
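The perception-and-control loop described above can be sketched in a few lines of Python. This is only an illustrative toy, not the survey's model: the state is one-dimensional, the dynamics are assumed to be `x' = x + action`, and `fake_perception` is a hypothetical stand-in for a neural network that returns both an estimate and its variance. The point it shows is that the Kalman filter uses the variance reported by perception, not just the mean.

```python
import numpy as np

def kalman_step(mean, var, action, obs_mean, obs_var, process_var=0.01):
    """One predict/update step of a 1-D Kalman filter.

    mean, var         : current belief over the state
    action            : control input (assumed dynamics: x' = x + action)
    obs_mean, obs_var : perception output (estimate and its uncertainty)
    """
    # Predict: push the belief through the assumed dynamics.
    pred_mean = mean + action
    pred_var = var + process_var
    # Update: the gain weights the observation by its relative certainty.
    gain = pred_var / (pred_var + obs_var)
    new_mean = pred_mean + gain * (obs_mean - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var

def fake_perception(true_state, rng, noise_std):
    """Hypothetical perception component: returns (mean, variance)."""
    return true_state + rng.normal(0.0, noise_std), noise_std ** 2

rng = np.random.default_rng(0)
true_state, mean, var = 0.0, 0.0, 1.0
target = 5.0
for _ in range(50):
    action = 0.5 * (target - mean)  # controller acts on the believed state
    true_state += action            # the action changes what is observed next
    obs_mean, obs_var = fake_perception(true_state, rng, noise_std=0.3)
    mean, var = kalman_step(mean, var, action, obs_mean, obs_var)

print(f"true state {true_state:.2f}, belief {mean:.2f} +/- {np.sqrt(var):.2f}")
```

If `fake_perception` reported a larger variance, the gain would shrink and the controller would trust its own dynamic model more, which is the kind of two-way exchange the quoted passage asks for.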

The benefits of BDL

6 : Apart from the major advantage that BDL provides a principled way of unifying deep learning and PGM, another benefit comes from the implicit regularization built in BDL. By imposing a prior on hidden units, parameters defining a neural network, or the model parameters specifying the causal inference, BDL can to some degree avoid overfitting, especially when we do not have sufficient data.

What does BDL consist of?

7 : Usually, a BDL model consists of two components: a perception component that is a Bayesian formulation of a certain type of neural network, and a task-specific component that describes the relationship among different hidden or observed variables using PGM. Regularization is crucial for them both. Neural networks usually have large numbers of free parameters that need to be regularized properly. Regularization techniques like weight decay and dropout are shown to be effective in improving performance of neural networks and they both have Bayesian interpretations. In terms of the task-specific component, expert knowledge or prior information, as a kind of regularization, can be incorporated into the model through the prior we impose to guide the model when data are scarce.
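The Bayesian interpretation of weight decay can be checked numerically in the simplest setting, linear regression. The sketch below is an illustration under stated assumptions, not something taken from the survey: the data are synthetic, the noise is Gaussian with standard deviation `noise_std`, and the prior on the weights is a zero-mean Gaussian with standard deviation `prior_std`. Under those assumptions the L2-penalized solution with `lam = noise_std**2 / prior_std**2` coincides with the MAP estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 3                       # deliberately few data points
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
noise_std, prior_std = 0.5, 1.0   # assumed noise level and prior width
y = X @ w_true + rng.normal(scale=noise_std, size=n)

# Weight decay view: minimize ||Xw - y||^2 + lam * ||w||^2.
lam = noise_std**2 / prior_std**2
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Bayesian view: posterior mode under the prior w ~ N(0, prior_std^2 I).
post_precision = X.T @ X / noise_std**2 + np.eye(d) / prior_std**2
w_map = np.linalg.solve(post_precision, X.T @ y / noise_std**2)

# Unregularized maximum-likelihood fit for comparison.
w_mle = np.linalg.lstsq(X, y, rcond=None)[0]

print("ridge == MAP:", np.allclose(w_ridge, w_map))
print("norms (MLE, MAP):", np.linalg.norm(w_mle), np.linalg.norm(w_map))
```

The prior pulls the weights toward zero, so the MAP solution never has a larger norm than the maximum-likelihood one; this is the regularizing effect that matters most when data are scarce.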

Yet another advantage of using BDL for complex tasks (tasks that need both perception and inference) is that it provides a principled Bayesian approach of handling parameter uncertainty. When BDL is applied to complex tasks, there are three kinds of parameter uncertainty that need to be taken into account:

  1. Uncertainty on the neural network parameters.
  2. Uncertainty on the task-specific parameters.
  3. Uncertainty of exchanging information between the perception component and the task-specific component.

By representing the unknown parameters using distributions instead of point estimates, BDL offers a promising framework to handle these three kinds of uncertainty in a unified way. It is worth noting that the third uncertainty could only be handled under a unified framework like BDL. If we train the perception component and the task-specific component separately, it is equivalent to assuming no uncertainty when exchanging information between the two components.
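To make the third kind of uncertainty concrete, here is a minimal sketch under made-up assumptions: the perception component outputs a Gaussian over a single latent feature `z`, and the task-specific component is a fixed logistic model with a hypothetical `weight`. Passing only the mean corresponds to training the components separately; passing the whole distribution and averaging over it (here by Monte Carlo) is the unified treatment.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def task_prob_point(z_mean, weight=3.0):
    """Task component fed a point estimate: uncertainty in z is ignored."""
    return sigmoid(weight * z_mean)

def task_prob_marginal(z_mean, z_std, weight=3.0, n_samples=100_000, seed=0):
    """Task component fed a distribution: average over z ~ N(z_mean, z_std^2)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(z_mean, z_std, size=n_samples)
    return sigmoid(weight * z).mean()

p_point = task_prob_point(1.0)
p_confident = task_prob_marginal(1.0, z_std=0.01)  # perception is very sure
p_unsure = task_prob_marginal(1.0, z_std=2.0)      # perception is unsure

print(f"point estimate      : {p_point:.3f}")
print(f"marginal, small std : {p_confident:.3f}")
print(f"marginal, large std : {p_unsure:.3f}")
```

When perception is confident the two answers agree, but when it is unsure the marginalized prediction moves toward 0.5 while the point-estimate pipeline stays overconfident.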

The bottlenecks of BDL

Of course, there are challenges when applying BDL to real-world tasks. (1) First, it is nontrivial to design an efficient Bayesian formulation of neural networks with reasonable time complexity. This line of work is pioneered by [24], [37], [40], but it has not been widely adopted due to its lack of scalability. Fortunately, some recent advances in this direction [1], [7], [19], [22], [32] seem to shed light on the practical adoption of Bayesian neural networks. (2) The second challenge is to ensure efficient and effective information exchange between the perception component and the task-specific component. Ideally both the first-order and second-order information (e.g., the mean and the variance) should be able to flow back and forth between the two components. A natural way is to represent the perception component as a PGM and seamlessly connect it to the task-specific PGM, as done in [15], [59], [60].
