Probability and Statistics Notes: Implementing Bayesian Regression in Python

0 Theory

For the derivations, see: Probability and Statistics Notes: Bayesian Linear Regression (UQI-LIUWJ's blog, CSDN)

1 Dataset

1.1 Creating the dataset

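import numpy as np
import matplotlib.pyplot as plt

# True parameters of the data-generating line y = a * x + b
a_true = 2
b_true = 1
tau_true = 1  # noise precision (inverse variance)

n = 50
x = np.random.uniform(low=0, high=4, size=n)
# np.random.normal expects a standard deviation, hence 1 / sqrt(precision)
y = np.random.normal(a_true * x + b_true, 1 / np.sqrt(tau_true))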

x \sim U(0,4) (uniform distribution)

y \sim N(2x+1, 1) (normal distribution with mean 2x+1 and variance 1)
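As a quick sanity check on the simulated data, an ordinary least-squares fit should land near a = 2 and b = 1. The sketch below is not part of the original post: the helper name `simulate`, the fixed seed, and the use of `np.random.default_rng` are assumptions made here so the draw is reproducible.

```python
import numpy as np

def simulate(n, seed=0, a_true=2.0, b_true=1.0, tau_true=1.0):
    """Hypothetical helper: draw x ~ U(0, 4) and y ~ N(a*x + b, 1/tau)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 4.0, size=n)
    # The normal sampler takes a standard deviation, so convert the precision
    y = rng.normal(a_true * x + b_true, 1.0 / np.sqrt(tau_true))
    return x, y

x_demo, y_demo = simulate(n=50)
# np.polyfit with deg=1 returns the coefficients as [slope, intercept]
slope, intercept = np.polyfit(x_demo, y_demo, deg=1)
print(f"OLS slope = {slope:.3f}, intercept = {intercept:.3f}")
```

With only 50 points the estimates will scatter around the true values; the Gibbs sampler in section 3 should concentrate in roughly the same region.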

1.2 Visualizing the dataset

fig = plt.figure(figsize=(10, 8))
ax = plt.subplot(111)
# Observed (simulated) data points
plt.plot(x, y, "o", label=r"$x_i\sim U\left(0,4\right),y_i\sim\mathcal{N}\left(2x+1,1\right),\forall i$")
plt.xlabel('x')
plt.ylabel('y')
# True line that the regression should recover
plt.plot([0, 4], [1, 9], label=r"$y_i=2x_i+1,i=1,2,...,n$")
ax.legend()

2 Initializing parameters and hyperparameters

maxiter = 1000
init = {"a": 0, "b": 0, "tau_epsilon": 2}
hypers = {"mu_1": 0, "tau_1": 1, "mu_2": 0, "tau_2": 1, "alpha": 2, "beta": 1}
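The model these hyperparameters belong to can be read back from the conditional distributions used in section 3. The block below is inferred from those conditionals rather than quoted from the theory post, so treat it as an assumption. Here \gamma_1 and \gamma_2 are the prior precisions stored as `tau_1` and `tau_2` in `hypers`, and the Gamma distribution is written in its shape/rate form.

```latex
\begin{aligned}
y_i \mid x_i, a, b, \tau_\epsilon &\sim N\left(a x_i + b,\ \tau_\epsilon^{-1}\right), \quad i = 1, \dots, n \\
a &\sim N\left(\mu_1,\ \gamma_1^{-1}\right) \\
b &\sim N\left(\mu_2,\ \gamma_2^{-1}\right) \\
\tau_\epsilon &\sim \mathrm{Gamma}(\alpha, \beta)
\end{aligned}
```

With the values above, both a and b get a standard normal prior, and \tau_\epsilon gets a Gamma(2, 1) prior with mean \alpha / \beta = 2.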

3 Sampling

3.1 Sampling a

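(a \mid x, y, b, \tau_\epsilon) \sim N\left(\frac{\mu_1 \gamma_1 + \tau_\epsilon \sum_{i=1}^n (y_i-b)x_i}{\gamma_1 + \tau_\epsilon \sum_{i=1}^n x_i^2},\ \left(\gamma_1 + \tau_\epsilon \sum_{i=1}^n x_i^2\right)^{-1}\right)

Here \gamma_1 is the prior precision of a; in the code it is the hyperparameter tau_1.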

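import numpy as np

def sample_para_a(x, y, b, tau_epsilon, mu_1, tau_1):
    # Posterior precision: gamma_1 + tau_epsilon * sum(x_i^2)
    tilde_tau_1 = tau_1 + tau_epsilon * np.sum(x * x)

    # Numerator of the posterior mean: mu_1 * gamma_1 + tau_epsilon * sum((y_i - b) * x_i)
    tilde_mu_1 = tau_1 * mu_1 + tau_epsilon * np.sum((y - b) * x)

    # Posterior mean
    tilde_mu_1 /= tilde_tau_1

    # np.random.normal expects a standard deviation, hence 1 / sqrt(precision)
    return np.random.normal(tilde_mu_1, 1. / np.sqrt(tilde_tau_1))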
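For readers who want to see where this conditional comes from, the following is a brief sketch by completing the square. It assumes a normal prior a \sim N(\mu_1, \gamma_1^{-1}) and a normal likelihood with precision \tau_\epsilon, which is what the sampling formula implies; the theory post may present the steps differently.

```latex
\begin{aligned}
p(a \mid x, y, b, \tau_\epsilon)
&\propto \exp\left(-\frac{\tau_\epsilon}{2} \sum_{i=1}^n (y_i - a x_i - b)^2\right)
         \exp\left(-\frac{\gamma_1}{2} (a - \mu_1)^2\right) \\
&\propto \exp\left(-\frac{1}{2}\left[\left(\gamma_1 + \tau_\epsilon \sum_{i=1}^n x_i^2\right) a^2
         - 2\left(\mu_1 \gamma_1 + \tau_\epsilon \sum_{i=1}^n (y_i - b) x_i\right) a\right]\right)
\end{aligned}
```

Matching this against the kernel of a normal density with precision \tilde{\gamma}_1 and mean \tilde{\mu}_1, namely \exp\left(-\frac{\tilde{\gamma}_1}{2} a^2 + \tilde{\gamma}_1 \tilde{\mu}_1 a\right), gives \tilde{\gamma}_1 = \gamma_1 + \tau_\epsilon \sum x_i^2 and \tilde{\mu}_1 = \left(\mu_1 \gamma_1 + \tau_\epsilon \sum (y_i - b) x_i\right) / \tilde{\gamma}_1, which are the quantities `tilde_tau_1` and `tilde_mu_1` in the code.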

3.2 Sampling b

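(b \mid x, y, a, \tau_\epsilon) \sim N\left(\frac{\mu_2 \gamma_2 + \tau_\epsilon \sum_{i=1}^n (y_i - a x_i)}{\gamma_2 + n \tau_\epsilon},\ \left(\gamma_2 + n \tau_\epsilon\right)^{-1}\right)

Here \gamma_2 is the prior precision of b; in the code it is the hyperparameter tau_2.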

def sample_para_b(x, y, a, tau_epsilon, mu_2, tau_2):
    n = len(y)

    # Posterior precision: gamma_2 + n * tau_epsilon
    tilde_tau_2 = tau_2 + n * tau_epsilon

    # Numerator of the posterior mean: mu_2 * gamma_2 + tau_epsilon * sum(y_i - a * x_i)
    tilde_mu_2 = tau_2 * mu_2 + tau_epsilon * np.sum(y - a * x)

    # Posterior mean
    tilde_mu_2 /= tilde_tau_2
    return np.random.normal(tilde_mu_2, 1. / np.sqrt(tilde_tau_2))

3.3 Sampling τε

(\tau_\epsilon \mid x, y, a, b) \sim \mathrm{Gamma}\left(\alpha+\frac{n}{2},\ \beta+\frac{\sum_{i=1}^n (y_i - a x_i - b)^2}{2}\right)

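def sample_tau_epsilon(x, y, a, b, alpha, beta):
    n = len(y)
    # Posterior shape: alpha + n / 2
    tilde_alpha = alpha + n / 2

    # Posterior rate: beta + sum((y_i - a * x_i - b)^2) / 2
    residual = y - a * x - b
    tilde_beta = beta + np.sum(residual * residual) / 2

    # np.random.gamma takes (shape, scale), so pass scale = 1 / rate
    return np.random.gamma(tilde_alpha, 1 / tilde_beta)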
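The last line passes `1 / tilde_beta` because NumPy's gamma sampler is parameterized by shape and scale, while the conditional distribution is written with shape and rate. A small check, not from the original post, illustrates the conversion; the shape and rate values and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

shape, rate = 27.0, 30.0  # arbitrary illustrative values
# Generator.gamma takes (shape, scale); scale is the reciprocal of the rate
samples = rng.gamma(shape, 1.0 / rate, size=200_000)

empirical_mean = samples.mean()
empirical_var = samples.var()
expected_mean = shape / rate      # mean of Gamma(shape, rate)
expected_var = shape / rate ** 2  # variance of Gamma(shape, rate)

print(f"mean: empirical {empirical_mean:.4f}, expected {expected_mean:.4f}")
print(f"var:  empirical {empirical_var:.4f}, expected {expected_var:.4f}")
```

Passing the rate directly as the second argument would instead give a mean of shape * rate, which is a common source of silently wrong precision samples.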

3.4 Gibbs sampling

For background, see: MCMC Notes: Gibbs Sampling (UQI-LIUWJ's blog, CSDN)

import pandas as pd

def gibbs(x, y, maxiter, init, hypers):
    # Start the chain from the initial values
    a = init["a"]
    b = init["b"]
    tau_epsilon = init["tau_epsilon"]

    # Row 0 holds the initial state; row t holds the state after iteration t
    trace = np.zeros((maxiter + 1, 3))
    trace[0, :] = np.array((a, b, tau_epsilon))

    for it in range(maxiter):
        # Sample a_t given b_(t-1) and tau_epsilon_(t-1)
        a = sample_para_a(x, y, b, tau_epsilon, hypers["mu_1"], hypers["tau_1"])

        # Sample b_t given a_t and tau_epsilon_(t-1)
        b = sample_para_b(x, y, a, tau_epsilon, hypers["mu_2"], hypers["tau_2"])

        # Sample tau_epsilon_t given a_t and b_t
        tau_epsilon = sample_tau_epsilon(x, y, a, b, hypers["alpha"], hypers["beta"])

        # Store the state after this iteration
        trace[it + 1, :] = np.array((a, b, tau_epsilon))

    # Return the chain as a pandas DataFrame
    trace = pd.DataFrame(trace)
    trace.columns = ['a', 'b', 'tau_epsilon']
    return trace

4 Running the sampler and visualizing the trace

 

# Run the Gibbs sampler
trace = gibbs(x, y, maxiter, init, hypers)


# Plot the trace of each parameter
fig = plt.figure(figsize = (10, 6))
plt.plot(trace['a'], label = r"$a$")
plt.plot(trace['b'], label = r"$b$")
plt.plot(trace['tau_epsilon'], label = r"$\tau_{\epsilon}$")
plt.xlabel("Iterations")
plt.ylabel("Parameter Value")
plt.legend()
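Beyond eyeballing the trace plot, the chain can be summarized numerically after discarding an initial burn-in. The sketch below is an addition to the original post: the helper `summarize_trace`, the burn-in length of 200, and the synthetic `demo_trace` standing in for the real output of `gibbs` are all assumptions for illustration.

```python
import numpy as np
import pandas as pd

def summarize_trace(trace, burn_in=200):
    """Hypothetical helper: posterior summaries after dropping burn-in rows."""
    kept = trace.iloc[burn_in:]
    return pd.DataFrame({
        "mean": kept.mean(),
        "std": kept.std(),
        "q2.5": kept.quantile(0.025),
        "q97.5": kept.quantile(0.975),
    })

# Synthetic stand-in with the same columns as the DataFrame returned by gibbs()
rng = np.random.default_rng(1)
demo_trace = pd.DataFrame({
    "a": rng.normal(2.0, 0.1, size=1001),
    "b": rng.normal(1.0, 0.2, size=1001),
    "tau_epsilon": rng.gamma(25.0, 1.0 / 25.0, size=1001),
})
demo_trace.iloc[:50] = 0.0  # mimic an unconverged start that burn-in should drop

summary = summarize_trace(demo_trace, burn_in=200)
print(summary)
```

On the real chain, `summarize_trace(trace)` would give posterior means and 95% intervals for a, b and tau_epsilon; the burn-in length should be chosen by looking at where the trace plot settles.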

Reference: A Brief Look at Bayesian Tensor Factorization (II): A Simple Bayesian Linear Regression Model, Zhihu (zhihu.com)
