Pyro Bayesian Neural Networks: HiddenLayer, a basic building block of Bayesian neural networks that represents a single hidden layer

Latest recommended article published on 2024-10-08 13:48:11

zhangfeng1133



Tags: deep learning, artificial intelligence

Article link: https://blog.csdn.net/zhangfeng1133/article/details/141979744


class HiddenLayer(X=None, A_mean=None, A_scale=None, non_linearity=<function relu>, KL_factor=1.0, A_prior_scale=1.0, include_hidden_bias=True, weight_space_sampling=False)

A ~ Normal(A_mean, A_scale)
output ~ non_linearity(AX)

Parameters

X (torch.Tensor) – B x D dimensional mini-batch of inputs
A_mean (torch.Tensor) – D x H dimensional tensor specifying the weight mean
A_scale (torch.Tensor) – D x H dimensional (diagonal covariance matrix) specifying weight uncertainty
non_linearity (callable) – a callable that specifies the non-linearity used. defaults to ReLU.
KL_factor (float) – scaling factor for the KL divergence. prototypically this is equal to the size of the mini-batch divided by the size of the whole dataset. defaults to 1.0.
A_prior_scale (float or torch.Tensor) – the prior over the weights is assumed to be normal with mean zero and scale A_prior_scale. default value is 1.0.
include_hidden_bias (bool) – controls whether the activations should be augmented with a 1, which can be used to incorporate bias terms. defaults to True.
weight_space_sampling (bool) – controls whether the local reparameterization trick is used. this is only intended to be used for internal testing. defaults to False.
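To make the roles of KL_factor and A_prior_scale concrete, here is a minimal sketch in plain Python. It uses the standard closed-form KL divergence between a Normal(mean, scale) variational distribution and a zero-mean Normal prior, summed over all weights and then multiplied by KL_factor. The helper names `gaussian_kl` and `scaled_weight_kl` are hypothetical, and this is not a copy of Pyro's internal implementation; it only illustrates the scaling described in the parameter list.

```python
import math


def gaussian_kl(mean, scale, prior_scale=1.0):
    """KL( Normal(mean, scale) || Normal(0, prior_scale) ) for one scalar weight."""
    return (
        math.log(prior_scale / scale)
        + (scale ** 2 + mean ** 2) / (2.0 * prior_scale ** 2)
        - 0.5
    )


def scaled_weight_kl(A_mean, A_scale, KL_factor=1.0, A_prior_scale=1.0):
    """Sum the per-weight KL over a D x H matrix (nested lists), then scale it."""
    total = 0.0
    for mean_row, scale_row in zip(A_mean, A_scale):
        for m, s in zip(mean_row, scale_row):
            total += gaussian_kl(m, s, A_prior_scale)
    return KL_factor * total


# Illustrative numbers only: a mini-batch of 32 out of 1024 data points.
batch_size, dataset_size = 32, 1024
KL_factor = batch_size / dataset_size

# A tiny 2 x 2 weight matrix (D = 2, H = 2), values chosen arbitrarily.
A_mean = [[0.0, 0.5], [-0.5, 1.0]]
A_scale = [[1.0, 0.5], [0.5, 2.0]]

kl_full = scaled_weight_kl(A_mean, A_scale)                        # unscaled KL
kl_batch = scaled_weight_kl(A_mean, A_scale, KL_factor=KL_factor)  # per-batch share
```

With this scaling, each mini-batch contributes only its share of the weight KL, so summing over all mini-batches in an epoch counts the full KL term once.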

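This distribution is a basic building block in a Bayesian neural network. It represents a single hidden layer, i.e. an affine transformation applied to a set of inputs X followed by a non-linearity. The uncertainty in the weights is encoded in a Normal variational distribution specified by the parameters A_scale and A_mean. The so-called "local reparameterization trick" is used to reduce variance (see reference below). In effect, this means the weights are never sampled directly; instead one samples in pre-activation space (i.e. before the non-linearity is applied). Since the weights are never directly sampled, when this distribution is used within the context of variational inference, care must be taken to correctly scale the KL divergence term that corresponds to the weight matrix. This term is folded into the log_prob method of this distribution.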
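The difference between sampling weights directly and sampling in pre-activation space can be sketched in a few lines of PyTorch. This is an illustrative sketch under stated assumptions, not Pyro's own code: the function names are hypothetical, ReLU is assumed as the non-linearity, the bias augmentation is omitted, and because X is B x D and A is D x H the product is computed as `X @ A`.

```python
import torch

torch.manual_seed(0)

B, D, H = 4, 3, 5                  # batch size, input dim, hidden dim (arbitrary)
X = torch.randn(B, D)              # mini-batch of inputs
A_mean = torch.randn(D, H)         # variational mean of the weights
A_scale = 0.1 * torch.ones(D, H)   # variational scale (diagonal covariance)


def sample_weight_space(X, A_mean, A_scale):
    """Draw one weight matrix A explicitly, then apply the layer."""
    A = A_mean + A_scale * torch.randn_like(A_scale)
    return torch.relu(X @ A)


def sample_preactivation_space(X, A_mean, A_scale):
    """Local reparameterization: sample the pre-activations, never the weights."""
    pre_mean = X @ A_mean                          # B x H mean of X @ A
    pre_std = ((X ** 2) @ (A_scale ** 2)).sqrt()   # B x H std of X @ A
    pre_act = pre_mean + pre_std * torch.randn_like(pre_std)
    return torch.relu(pre_act)


out_local = sample_preactivation_space(X, A_mean, A_scale)
out_weight = sample_weight_space(X, A_mean, A_scale)
```

Both functions give each pre-activation the same marginal distribution, since a weighted sum of independent Gaussians is Gaussian with the mean and variance computed above. The pre-activation version draws independent noise for every row of the mini-batch, which is where the variance reduction comes from.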

In effect, this distribution encodes the following generative process:

A ~ Normal(A_mean, A_scale)
output ~ non_linearity(AX)

Parameters