Paper: Mutual Information Neural Estimation
Summary
- Mutual information between high-dimensional continuous random variables can be estimated by gradient descent on a neural network. The paper constructs a lower bound on mutual information, uses it to define an estimator, and proves the tightness of the bound.
Problem Statement
- Many deep learning problems involve mutual information, yet mutual information is hard to estimate: traditional methods do not generalize well, so an estimation method with strong generality is needed.
Method
- Computing mutual information is equivalent to computing the KL divergence between the joint distribution and the product of the two marginals, I(X; Z) = D_KL(P_XZ || P_X ⊗ P_Z). The paper lower-bounds this KL divergence via the Donsker-Varadhan representation, parameterizing the test function with a neural network T_θ: I(X; Z) ≥ sup_θ E_{P_XZ}[T_θ] − log E_{P_X ⊗ P_Z}[e^{T_θ}].
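MINE's Donsker-Varadhan lower bound can be illustrated numerically. This is a minimal numpy sketch, not the paper's method: instead of training a network T_θ by gradient ascent, it plugs in the analytically optimal critic for a correlated Gaussian pair (the log density ratio), for which the bound is tight, so the estimate should match the closed-form MI = −½ log(1 − ρ²).

```python
import numpy as np

# Donsker-Varadhan (DV) lower bound that MINE maximizes over T_theta:
#   I(X;Z) >= E_{P_XZ}[T] - log E_{P_X x P_Z}[exp(T)]
# Sketch only: we use the analytically optimal critic for a standard
# bivariate Gaussian rather than a trained network.

rng = np.random.default_rng(0)
rho, n = 0.8, 200_000

# Sample (x, z) from a standard bivariate Gaussian with correlation rho.
x = rng.standard_normal(n)
z = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)

def critic(x, z, rho):
    """Optimal DV critic: log p(x,z) - log p(x) - log p(z)."""
    log_joint = (-(x**2 - 2 * rho * x * z + z**2) / (2 * (1 - rho**2))
                 - 0.5 * np.log(1 - rho**2))
    log_marginals = -(x**2 + z**2) / 2
    return log_joint - log_marginals

# First term: expectation of T under the joint distribution P_XZ.
joint_term = critic(x, z, rho).mean()

# Second term: log-expectation of exp(T) under the product of marginals;
# shuffling z breaks the dependence, giving samples from P_X x P_Z.
z_shuffled = rng.permutation(z)
marginal_term = np.log(np.exp(critic(x, z_shuffled, rho)).mean())

mi_estimate = joint_term - marginal_term
mi_true = -0.5 * np.log(1 - rho**2)  # closed form for Gaussians
print(f"DV estimate: {mi_estimate:.3f}, true MI: {mi_true:.3f}")
```

In the paper itself the critic is a neural network and both expectations are estimated from minibatches, with the bound maximized by stochastic gradient ascent; the shuffle trick above is how the product-of-marginals samples are obtained there as well.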
Evaluation
- Compare the estimator against the true mutual information.
- Apply the mutual information estimator to generative adversarial networks (GANs).
- Apply the mutual information estimator to the information bottleneck.
Conclusion
- Among the compared methods, the estimator obtained via mutual information neural estimation comes closest to the true mutual information.
Notes
- Mutual information quantifies the dependence of two random variables X and Z.
- Despite being a pivotal quantity across data science, mutual information has historically been difficult to compute.
- Generative adversarial networks (GANs) train a generative model without any explicit assumptions about the underlying distribution of the data.