Gibbs Sampling

Chen_SL

Article link: https://blog.csdn.net/chenshulong/article/details/78928829

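This article describes the Gibbs sampling algorithm as a way of extending the Metropolis-Hastings algorithm to multivariate distributions. It discusses two strategies, blockwise updating and componentwise updating, and the advantages of Gibbs sampling, in particular its effectiveness when the conditional distributions are known. By updating the random variables one at a time, Gibbs sampling avoids the problem of choosing a proposal distribution and improves sampling efficiency.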

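Background

The previous article introduced the Metropolis-Hastings (MH) sampling algorithm, but only for generating samples from univariate distributions. This article discusses how to extend MH to multivariate distributions, with a focus on Gibbs sampling.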

Blockwise updating

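Blockwise updating is the simplest way to extend MH to multivariate distributions: choose a proposal distribution with the same dimensionality as the target distribution. For example, to generate samples from an $N$-variate distribution, we use an $N$-variate proposal distribution. At each state transition of the Markov chain, the whole proposal (all $N$ random variables) is accepted or rejected together. The algorithm proceeds as follows: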

Set the initial state of the Markov chain, $X_{0}=\vec x_{0}$.

For $t=0,1,2,3,\ldots$, repeat the following steps:

(1) The state of the chain at time $t$ is $X_{t} = \vec x_{t}$; sample $\vec y \sim q(\vec y|\vec x_{t})$.

(2) Sample $u \sim \mathrm{Uniform}[0,1]$.

(3) If $u < \alpha(\vec x_{t},\vec y)=\min\left[\frac{p(\vec y)q(\vec x_{t}|\vec y)}{p(\vec x_{t})q(\vec y|\vec x_{t})},1\right]$, accept the transition $\vec x_{t} \rightarrow \vec y$ and set $X_{t+1}=\vec y$.

(4) Otherwise reject the transition and set $X_{t+1}=\vec x_{t}$.

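The only difference from the earlier MH algorithm is that the scalar $x$ is replaced by the vector $\vec x$, where $\vec x=(x_{1},x_{2},\ldots,x_{N})$ denotes an $N$-dimensional random variable.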
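As a worked special case (an added illustration, not part of the algorithm statement itself): suppose the proposal does not depend on the current state and has constant density on a bounded region, for instance a uniform proposal on $[0,8]^2$ like the one used in the example code in this article. Then $q(\vec y|\vec x_{t}) = q(\vec x_{t}|\vec y) = \frac{1}{64}$ and the proposal terms cancel in the acceptance probability:

$$
\alpha(\vec x_{t},\vec y)
=\min\left[\frac{p(\vec y)\cdot\frac{1}{64}}{p(\vec x_{t})\cdot\frac{1}{64}},\,1\right]
=\min\left[\frac{p(\vec y)}{p(\vec x_{t})},\,1\right]
$$

Only the ratio of target densities is needed in this case, so any normalizing constant of $p$ cancels as well.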

Example

We use blockwise updating to sample from a bivariate exponential distribution. Its probability density function is:

function y = bivexp(theta1,theta2)
%% Returns the (unnormalized) density of the bivariate exponential distribution
lambda1 = 0.5;
lambda2 = 0.1;
lambda = 0.01;
maxval = 8;
y = exp(-(lambda1+lambda)*theta1-(lambda2+lambda)*theta2-lambda*maxval);

The sampling code is as follows:

%% Blockwise updating to sample from Bivariate Exponential

%% Initialize the MH sampler
T=10000; % Set the maximum number of iterations
x_min = [ 0 0 ]; % define minimum for theta1 and theta2
x_max = [ 8 8 ]; % define maximum for theta1 and theta2
seed=1; rng( seed ); % set the random seed (rng supersedes the legacy rand('state',...) syntax)
x = zeros( 2 , T ); % Init storage space for our samples
% Use a uniform proposal distribution
x(1,1) = unifrnd( x_min(1) , x_max(1) ); % Start value for theta1
x(2,1) = unifrnd( x_min(2) , x_max(2) ); % Start value for theta2

%% Start sampling 
t=1;
while t < T % Iterate until we have T samples
    t = t + 1;
    % Propose a new value for theta
    y = unifrnd( x_min , x_max ); % uniform over [x_min, x_max], independent of the current state
    pratio = bivexp(y(1),y(2))/bivexp(x(1,t-1),x(2,t-1));
    alpha = min([1,pratio]); % Calculate the acceptance ratio
    u = rand; % Draw a uniform deviate from [ 0 1 ]
    if u < alpha % Do we accept this proposal?
        x(:,t) = y; % proposal becomes new value for theta
    else
        x(:,t) = x(:,t-1); % copy old value of theta
    end
end

%% Display histogram of our samples
figure( 1 ); clf;
subplot( 1,2,1 );
nbins = 10;
xbins1 = linspace( x_min(1) , x_max(1) , nbins );
xbins2 = linspace( x_min(2) , x_max(2) , nbins );
hist3( x' , 'Edges' , {xbins1 xbins2} );
xlabel( 'x_1' ); ylabel('x_2' ); zlabel( 'counts' );
az = 61; el = 30;
view(az, el);

%% Plot the theoretical density
subplot(1,2,2);
nbins = 20;
xbins1 = linspace( x_min(1) , x_max(1) , nbins );
xbins2 = linspace( x_min(2) , x_max(2) , nbins );
[ x1grid , x2grid ] = meshgrid( xbins1 , xbins2 );
ygrid = bivexp( x1grid , x2grid );
mesh( x1grid , x2grid , ygrid );
xlabel( 'x_1' ); ylabel('x_2' );
zlabel( 'f(x_1,x_2)' );
view(az, el);

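The results are shown below:
[Figure: left, 3-D histogram of the samples over $x_1$ and $x_2$; right, the theoretical density $f(x_1,x_2)$]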
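To put a number on how often the blockwise sampler rejects, one can estimate the acceptance rate from a stored chain. The helper below is a small sketch and not part of the original code; the name `accept_rate` and the toy matrix `x_demo` are made up for illustration. It assumes the chain is stored with one column per iteration, as `x` is in the script above, and counts a step as accepted whenever the state changed. That is reasonable for a continuous proposal, where an accepted proposal almost surely differs from the current state.

```matlab
% Hypothetical helper: fraction of iterations in which the state changed.
% chain is D-by-T, one column per iteration.
accept_rate = @(chain) mean( any( diff( chain , 1 , 2 ) ~= 0 , 1 ) );

% Toy chain with 5 states: the state changes at steps 2->3 and 4->5 only.
x_demo = [ 1 1 2 2 2 ; ...
           3 3 4 4 5 ];
r_demo = accept_rate( x_demo ); % 2 changes out of 4 transitions = 0.5
disp( r_demo );
```

After running the blockwise script, `accept_rate( x )` gives the corresponding estimate for the sampled chain.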

Componentwise updating

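With MH, choosing a suitable proposal distribution is difficult, and even more so for high-dimensional distributions. The blockwise updating method described above often has a high rejection rate, which makes the algorithm inefficient. The alternative is componentwise updating. Whereas blockwise updating proposes all $N$ elements at once and then accepts or rejects the whole proposal $\vec y$, componentwise updating proposes a new value $y_{i}$ for the $i$-th element of $\vec x_{t}$ only, and then accepts or rejects that single proposal.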
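The idea can be sketched on the bivariate exponential example as follows. This is a minimal illustration under stated assumptions, not necessarily the author's implementation: the target `p` repeats the constants of `bivexp`, each component gets a uniform proposal on $[0,8]$ (so the proposal terms cancel in the acceptance ratio), and the iteration count `T` and the counter `n_accept` are made up for this sketch. Only base MATLAB functions are used.

```matlab
% Componentwise MH sketch for the bivariate exponential example (assumed setup).
% Unnormalized target density, same constants as bivexp.
p = @( th1 , th2 ) exp( -(0.5+0.01)*th1 - (0.1+0.01)*th2 - 0.01*8 );

T = 5000;                        % number of iterations (assumed value)
x_min = [ 0 0 ]; x_max = [ 8 8 ];  % support box, as in the blockwise example
x = zeros( 2 , T );              % storage for the samples
x(:,1) = ( x_min + ( x_max - x_min ) .* rand( 1 , 2 ) )'; % random start in the box
n_accept = zeros( 2 , 1 );       % accepted proposals per component

for t = 2:T
    cur = x(:,t-1);              % state that is updated during this sweep
    for i = 1:2
        y = cur;                 % proposal differs from cur in component i only
        y(i) = x_min(i) + ( x_max(i) - x_min(i) ) * rand; % uniform proposal for y_i
        % Uniform proposal density is constant, so only the target ratio remains
        alpha = min( 1 , p( y(1) , y(2) ) / p( cur(1) , cur(2) ) );
        if rand < alpha
            cur = y;             % accept the new value of component i
            n_accept(i) = n_accept(i) + 1;
        end
    end
    x(:,t) = cur;                % state after both components were visited
end
fprintf( 'acceptance rate per component: %.2f %.2f\n' , n_accept / ( T - 1 ) );
```

Each sweep makes one accept/reject decision per component, so a poor proposal for one coordinate does not force the other coordinate to be rejected with it.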