Drawing Random Numbers in MATLAB and the Principles of MCMC

Latest recommended article published on 2025-02-22 12:25:18

orchidzouqr

Categories: Statistics, matlab. Tags: random numbers, MCMC

Article link: https://blog.csdn.net/orchidzouqr/article/details/60753151

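1. Built-in MATLAB functions for drawing random numbers
Note: only the function names are listed here; use help to look up how each function is used.
(1) Normal random numbers: randn(), normrnd(), mvnrnd(); the last one draws from a joint (multivariate) normal distribution.
(2) Uniform random numbers: rand()
(3) Beta random numbers: betarnd()
(4) Binomial random numbers: binornd()
(5) Chi-square random numbers: chi2rnd()
(6) Exponential random numbers: exprnd()
(7) Extreme value random numbers: evrnd()
Other random number functions, as listed in the toolbox help: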
frnd - F random numbers.
gamrnd - Gamma random numbers.
geornd - Geometric random numbers.
gevrnd - Generalized extreme value random numbers.
gprnd - Generalized Pareto random numbers.
hygernd - Hypergeometric random numbers.
iwishrnd - Inverse Wishart random matrix.
johnsrnd - Random numbers from the Johnson system of distributions.
lognrnd - Lognormal random numbers.
mhsample - Metropolis-Hastings algorithm. mhsample() draws a Markov chain, so it can be used for MCMC sampling.
mnrnd - Multinomial random vectors.
mvnrnd - Multivariate normal random vectors.
mvtrnd - Multivariate t random vectors.
nbinrnd - Negative binomial random numbers.
ncfrnd - Noncentral F random numbers.
nctrnd - Noncentral t random numbers.
ncx2rnd - Noncentral Chi-square random numbers.
normrnd - Normal (Gaussian) random numbers.
pearsrnd - Random numbers from the Pearson system of distributions.
poissrnd - Poisson random numbers.
randg - Gamma random numbers (unit scale).
random - Random numbers from specified distribution.
randsample - Random sample from finite population.
raylrnd - Rayleigh random numbers.
slicesample - Slice sampling method (the slice sampler used in MCMC).
trnd - T random numbers.
unidrnd - Discrete uniform random numbers.
unifrnd - Uniform random numbers.
wblrnd - Weibull random numbers.
wishrnd - Wishart random matrix.
Reference: [Statistical analysis functions in MATLAB] http://wenku.baidu.com/link?url=fxtUOBzUiRwhPl0JD1H8gt_1Gce_YqTxAYWct-G_pehbkRIZYKTVo508rCKHi1OGvqq3M6QYSyRx43hZ5QCG3zSofx80o2wxLxzcfWsJcq7
2. Principles of MCMC
This part mainly discusses two forms of MCMC: Metropolis-Hastings and Gibbs sampling.
First, understand the two ideas behind MCMC: Monte Carlo integration and Markov chains.
I. Monte Carlo Integration
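
As a quick illustration of how a few of the functions listed above are called, here is a minimal sketch. The parameter values and sample sizes are arbitrary choices for illustration, not from the original post, and every call other than `rand`, `randn` and `rng` assumes the Statistics and Machine Learning Toolbox is installed.

```matlab
rng(0);                          % fix the seed so the run is reproducible

u = rand(1000, 1);               % Uniform(0,1) draws, base MATLAB
z = randn(1000, 1);              % standard normal draws, base MATLAB

% The calls below assume the Statistics and Machine Learning Toolbox.
x = normrnd(2, 0.5, 1000, 1);    % normal with mean 2 and standard deviation 0.5
e = exprnd(3, 1000, 1);          % exponential with mean 3
b = betarnd(2, 5, 1000, 1);      % Beta(2,5)

mu    = [0 0];                   % mean vector (illustrative)
Sigma = [1 0.8; 0.8 1];          % covariance matrix (illustrative)
m = mvnrnd(mu, Sigma, 1000);     % 1000-by-2 matrix, one bivariate normal draw per row
```

Each row of `m` is one draw from the joint normal distribution, which is why `mvnrnd` is the function to use when the components are correlated.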

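Many problems in probabilistic and statistical inference require computing complex integrals or summing over a very large outcome space. For example, consider the expectation of a function $g(x)$ , where $x$ is a random variable with density $p(x)$ . If $x$ is a continuous random variable, then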

$$E[g(x)]=\int g(x)p(x)\,dx$$

If $x$ is a discrete random variable, then

$$E[g(x)]=\sum_x g(x)p(x)$$

The general idea of Monte Carlo integration is to use samples to approximate the expectation of a complex distribution.

Let $x^{(t)},t=1,2,...,n$ be independent samples drawn from the distribution $p(x)$ . We can then approximate the integral above by a finite sum:

$$E[g(x)]\approx\frac{1}{n}\sum\limits_{t=1}^{n} g(x^{(t)})$$

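In general, the approximation becomes more accurate as the number of samples $n$ increases. Crucially, the accuracy also depends on how correlated the samples are: when the samples are correlated, the effective sample size decreases. This is not an issue with the rejection sampler but a potential problem with MCMC approaches. Note that "rejection sampler" here means rejection sampling, which produces independent draws, and not the accept/reject step inside Metropolis-Hastings. Both Metropolis-Hastings and Gibbs sampling build each new state from the previous one, so consecutive samples are correlated in both. In Metropolis-Hastings a rejected proposal even repeats the previous value, which adds to the correlation.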
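
The finite-sum approximation can be sketched in a few lines of MATLAB. The choice $g(x)=x^2$ with $x$ standard normal is an assumption made for illustration (the exact expectation is then 1), and the sample size is arbitrary.

```matlab
rng(0);                          % fix the seed so the run is reproducible
n = 1e5;                         % number of independent samples (illustrative)
g = @(x) x.^2;                   % function whose expectation we want (illustrative)

x   = randn(n, 1);               % independent draws from p(x) = N(0,1)
est = mean(g(x));                % (1/n) * sum of g(x^(t))
se  = std(g(x)) / sqrt(n);       % standard error, valid because the draws are independent

fprintf('Estimate of E[g(x)]: %.4f, standard error: %.4f\n', est, se);
```

The standard error formula relies on independence; for correlated MCMC output it would understate the true uncertainty.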

II. Markov chains
A Markov chain is a stochastic process where we transition from one state to another state using a simple sequential procedure. Let the initial state be $x^{(1)}$ and the transition function be $p(x^{(t)}|x^{(t-1)})$ , which determines the next state $x^{(2)}$ conditional on the last state. We then keep iterating to create a sequence of states:

$$x^{(1)}\rightarrow x^{(2)} \rightarrow \cdots \rightarrow x^{(t)} \rightarrow \cdots$$

The steps for generating a Markov chain with $T$ states are:
1. Set $t=1$
2. Generate an initial value $u$ , and set $x^{(t)}=u$ .
3. Repeat
　　　 $t=t+1$
　　　Sample a new value $u$ from the transition function $p(x^{(t)}|x^{(t-1)})$
　　　Set $x^{(t)}=u$
4. Until $t=T$ .
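
The steps above can be sketched as follows. The transition function $x^{(t)}|x^{(t-1)} \sim N(0.5\,x^{(t-1)},1)$ is a hypothetical choice for illustration, as are the chain length and the starting value.

```matlab
rng(0);                          % fix the seed so the run is reproducible
T = 5000;                        % number of states to generate (illustrative)

x    = zeros(T, 1);
x(1) = 10;                       % initial value u, deliberately far from the typical range

for t = 2:T
    % Sample the next state from the assumed transition function
    % p(x^(t) | x^(t-1)) = N(0.5 * x^(t-1), 1).
    x(t) = 0.5 * x(t-1) + randn;
end
```

With this transition function the chain forgets its starting value within a few dozen steps and then fluctuates around 0, which can be checked by plotting `x`.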

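The rest of this part focuses on MCMC and discusses three methods: Metropolis, Metropolis-Hastings, and Gibbs sampling.
The two key distributions in MCMC are the target distribution and the proposal distribution. The goal of MCMC is to draw samples from the target distribution.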

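Metropolis algorithm
　　The Metropolis algorithm is the simplest of all MCMC methods. It is a special case of Metropolis-Hastings in which the proposal distribution must be symmetric ( $q(\theta^{(t)}|\theta^{(t-1)})=q(\theta^{(t-1)}|\theta^{(t)})$ ).
　　Algorithm steps: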
　　1. Set $t=1$
　　2. Generate an initial value $u$ , and set $\theta ^{(t)}=u$ .
　　3. Repeat
　　　　　 $t=t+1$
　　　　　Generate a proposal $\theta^{*}$ from $q(\theta|\theta^{(t-1)})$
　　　　　Evaluate the acceptance probability $\alpha=\min(1,\frac{p(\theta ^*)}{p(\theta^{(t-1)})})$
　　　　　Generate a $u$ from a Uniform(0,1) distribution
　　　　　If $u\leq \alpha$ , accept the proposal and set $\theta ^{(t)}=\theta ^*$ ; otherwise set $\theta^{(t)}=\theta^{(t-1)}$ .
　　4. Until $t=T$ .
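Note that the proposal distribution is in fact a conditional distribution. The acceptance probability depends on the target only through the ratio $p(\theta^*)/p(\theta^{(t-1)})$ , in which any normalizing constant cancels, so the target distribution may be unnormalized.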
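
A minimal sketch of the Metropolis steps in base MATLAB follows. The unnormalized target $p(\theta)\propto e^{-\theta^2/2}$ , the normal random-walk proposal, its standard deviation and the burn-in length are all assumptions chosen for illustration. The toolbox function `mhsample` mentioned earlier offers a ready-made alternative.

```matlab
rng(1);                                   % fix the seed so the run is reproducible
T     = 20000;                            % chain length (illustrative)
sigma = 2;                                % proposal standard deviation (tuning choice)
p     = @(th) exp(-th.^2 / 2);            % unnormalized target: standard normal kernel

theta    = zeros(T, 1);
theta(1) = 3;                             % initial value u
nAccept  = 0;

for t = 2:T
    thetaStar = theta(t-1) + sigma * randn;             % symmetric random-walk proposal
    alpha     = min(1, p(thetaStar) / p(theta(t-1)));   % acceptance probability
    if rand <= alpha
        theta(t) = thetaStar;                           % accept the proposal
        nAccept  = nAccept + 1;
    else
        theta(t) = theta(t-1);                          % reject: repeat the previous value
    end
end

kept    = theta(1001:end);                % discard the first 1000 states as burn-in
accRate = nAccept / (T - 1);              % fraction of accepted proposals
```

Because the target here is a standard normal kernel, `mean(kept)` should be close to 0 and `var(kept)` close to 1, which gives a simple sanity check on the sampler.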

Metropolis-Hastings algorithm
The Metropolis-Hastings (MH) algorithm is a generalized version of the Metropolis algorithm. The steps are the same, but the acceptance probability becomes

$$\alpha=\min\left(1,\frac{p(\theta^*)}{p(\theta^{(t-1)})} \frac{q(\theta^{(t-1)}|\theta^*)}{q(\theta^*|\theta^{(t-1)})}\right)$$

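　　Principles for choosing the proposal distribution
　　In both the Metropolis and MH algorithms, the proposal distribution plays a very important role. In principle it can be chosen freely, and two simple choices are common. The first is the random-walk chain: the new value $y$ is the current value $x$ plus a random variable $z$ , i.e. $y=x+z$ , in which case $q_{y_{t+1}|y_t}(y|x)=q(y-x)$ , where $q$ is any probability density. The second is the independence chain: the new value $y$ does not depend on the current value $x$ , i.e. $q_{y_{t+1}|y_t}(y|x)=q(y)$ , where $q$ is any probability density. For bounded random variables, take care to construct a suitable proposal distribution. Generally, a good rule is to use a proposal distribution that has positive density on the same support as the target distribution. For example, if the target distribution has support over $0\leq\theta<\infty$ , the proposal distribution should have the same support.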
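
The following sketch illustrates an independence chain with a non-symmetric proposal whose support matches that of the target. The target $p(\theta)\propto\theta^2e^{-\theta}$ on $\theta>0$ (a Gamma(3,1) kernel, mean 3), the exponential proposal with mean 3 and the burn-in length are assumptions made for illustration only.

```matlab
rng(3);                                        % fix the seed so the run is reproducible
T = 20000;                                     % chain length (illustrative)
m = 3;                                         % mean of the exponential proposal (assumed)

p = @(th) (th > 0) .* th.^2 .* exp(-th);       % unnormalized target on theta > 0
q = @(th) (th > 0) .* exp(-th / m) / m;        % proposal density, independent of the current state

theta    = zeros(T, 1);
theta(1) = 1;                                  % initial value inside the support

for t = 2:T
    cur  = theta(t-1);
    prop = -m * log(rand);                     % exponential draw by inversion, always > 0
    % MH acceptance probability; q does not cancel because it is not symmetric.
    alpha = min(1, (p(prop) / p(cur)) * (q(cur) / q(prop)));
    if rand <= alpha
        theta(t) = prop;
    else
        theta(t) = cur;
    end
end

kept = theta(2001:end);                        % discard burn-in
```

Drawing proposals on $(0,\infty)$ means no proposal is wasted outside the support of the target, which is the point of the rule of thumb above.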

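MH for multivariate sampling
　　There are two strategies: blockwise updating and componentwise updating. This article focuses on the latter, because with blockwise updating it is hard to find a suitable high-dimensional proposal distribution and the rejection rate is often very high.
　　The steps of a two-dimensional componentwise MH sampler are: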
　　1. Set $t=1$ .
　　2. Generate an initial value $u=(u_1,u_2)$ and set $\theta^{(t)}=u$ .
　　3. Repeat
　　　　 $t=t+1$
　　　　Generate a proposal $\theta_1^*$ from $q(\theta_1|\theta_1^{(t-1)})$
　　　　Evaluate the acceptance probability $\alpha=\min(1,\frac{p(\theta_1^*,\theta_2^{(t-1)})}{p(\theta_1^{(t-1)},\theta_2^{(t-1)})} \frac{q(\theta_1^{(t-1)}|\theta_1^*)}{q(\theta_1^*|\theta_1^{(t-1)})})$
　　　　Generate a $u$ from a Uniform(0,1) distribution
　　　　If $u\leq \alpha$ , accept the proposal and set $\theta_1^{(t)}=\theta_1^*$ ; otherwise set $\theta_1^{(t)}=\theta_1^{(t-1)}$ .
　　　　Generate a proposal $\theta_2^*$ from $q(\theta_2|\theta_2^{(t-1)})$
　　　　Evaluate the acceptance probability $\alpha=\min(1,\frac{p(\theta_1^{(t)},\theta_2^*)}{p(\theta_1^{(t)},\theta_2^{(t-1)})}\frac{q(\theta_2^{(t-1)}|\theta_2^*)}{q(\theta_2^*|\theta_2^{(t-1)})})$
　　　　Generate a $u$ from a Uniform(0,1) distribution
　　　　If $u\leq \alpha$ , accept the proposal and set $\theta_2^{(t)}=\theta_2^*$ ; otherwise set $\theta_2^{(t)}=\theta_2^{(t-1)}$ .
　　4. Until $t=T$ .
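
The componentwise steps can be sketched as below. The bivariate normal target with correlation $\rho=0.7$ , the symmetric normal random-walk proposals (for which the $q$ ratio equals 1 and is therefore omitted), the step size and the burn-in length are assumptions made for illustration.

```matlab
rng(2);                                   % fix the seed so the run is reproducible
T   = 20000;                              % chain length (illustrative)
rho = 0.7;                                % target correlation (illustrative)
s   = 1;                                  % random-walk step size (tuning choice)

% Unnormalized bivariate normal target with unit variances and correlation rho.
p = @(th) exp(-(th(1)^2 - 2*rho*th(1)*th(2) + th(2)^2) / (2 * (1 - rho^2)));

theta      = zeros(T, 2);
theta(1,:) = [2, -2];                     % initial value u = (u_1, u_2)

for t = 2:T
    cur = theta(t-1, :);

    % Update component 1 while holding component 2 at its previous value.
    prop    = cur;
    prop(1) = cur(1) + s * randn;
    if rand <= min(1, p(prop) / p(cur))
        cur = prop;
    end

    % Update component 2, conditioning on the new value of component 1.
    prop    = cur;
    prop(2) = cur(2) + s * randn;
    if rand <= min(1, p(prop) / p(cur))
        cur = prop;
    end

    theta(t, :) = cur;
end

kept = theta(1001:end, :);                % discard burn-in
R    = corrcoef(kept);                    % R(1,2) should be close to rho
```

Each update only needs a one-dimensional proposal, which is what makes the componentwise strategy easier to tune than a blockwise one.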
　　
Gibbs sampling
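　In Gibbs sampling there is no rejection, which improves computational efficiency. Another advantage is that there is no need to find a suitable proposal distribution. However, we need to know the conditional distributions of the multivariate distribution: the Gibbs sampler can only be applied in situations where we know the full conditional distributions of each component in the multivariate distribution conditioned on all other components.
　Steps of Gibbs sampling in the bivariate case:
　1. Set $t=1$ .
　2. Generate an initial value $u=(u_1,u_2)$ and set $\theta^{(t)}=u$ .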
　3. Repeat
　　　 $t=t+1$
　　　Sample $\theta_1^{(t)}$ from the conditional distribution $f(\theta_1|\theta_2=\theta_2^{(t-1)})$
　　　Sample $\theta_2^{(t)}$ from the conditional distribution $f(\theta_2|\theta_1=\theta_1^{(t)})$
　4. Until $t=T$ .
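
A minimal sketch of the bivariate Gibbs steps follows, assuming a standard bivariate normal target with correlation $\rho$ . For that target the full conditionals are known in closed form: $\theta_1|\theta_2 \sim N(\rho\,\theta_2,\,1-\rho^2)$ and $\theta_2|\theta_1 \sim N(\rho\,\theta_1,\,1-\rho^2)$ . The value of $\rho$ , the chain length and the burn-in are illustrative choices.

```matlab
rng(4);                                   % fix the seed so the run is reproducible
T      = 10000;                           % chain length (illustrative)
rho    = 0.7;                             % target correlation (illustrative)
condSd = sqrt(1 - rho^2);                 % standard deviation of each full conditional

theta      = zeros(T, 2);
theta(1,:) = [2, -2];                     % initial value u = (u_1, u_2)

for t = 2:T
    % Sample theta_1 from f(theta_1 | theta_2 = theta_2^(t-1)).
    theta(t, 1) = rho * theta(t-1, 2) + condSd * randn;
    % Sample theta_2 from f(theta_2 | theta_1 = theta_1^(t)).
    theta(t, 2) = rho * theta(t, 1) + condSd * randn;
end

kept = theta(501:end, :);                 % discard burn-in
R    = corrcoef(kept);                    % R(1,2) should be close to rho
```

Every draw is kept, so there is no accept/reject step, but consecutive states are still correlated and the correlation grows as $\rho$ approaches 1.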
　
Reference: [Computational statistics with matlab] http://psiexp.ss.uci.edu/research/teachingP205C/205C.pdf