A Gentle Introduction to Probabilistic Programming Languages

weixin_26752765

Original article: https://medium.com/swlh/a-gentle-introduction-to-probabilistic-programming-languages-bf1e19042ab6

Probabilistic thinking is an incredibly valuable tool for decision making. From economists to poker players, people who can think in terms of probabilities tend to make better decisions when faced with uncertain situations. Probability theory has been an established field for centuries and game theory for decades, and both are now experiencing a renaissance with the rapid evolution of artificial intelligence (AI). Can we incorporate probabilities as a first-class citizen of software code? Welcome to the world of probabilistic programming languages (PPLs).

The use of statistics to overcome uncertainty is one of the pillars of a large segment of the machine learning market. Probabilistic reasoning has long been considered one of the foundations of inference algorithms and is represented in all major machine learning frameworks and platforms. Recently, probabilistic reasoning has seen major adoption within tech giants like Uber, Facebook and Microsoft, helping to push the research and technology agenda in the space. Specifically, PPLs have become one of the most active areas of development in machine learning, sparking the release of some new and exciting technologies.

What Are Probabilistic Programming Languages?

Conceptually, probabilistic programming languages (PPLs) are domain-specific languages that describe probabilistic models and the mechanics to perform inference in those models. The magic of PPLs lies in combining the inference capabilities of probabilistic methods with the representational power of programming languages.

In a PPL program, assumptions are encoded as prior distributions over the variables of the model. During execution, a PPL program launches an inference procedure that automatically computes the posterior distributions of the model's parameters based on observed data. In other words, inference adjusts the prior distribution using the observed data to give a more precise model. The output of a PPL program is a probability distribution, which allows the programmer to explicitly visualize and manipulate the uncertainty associated with a result.

To illustrate the simplicity of PPLs, let's use one of the most famous problems of modern statistics: a biased coin toss. The idea of this problem is to calculate the bias of a coin. Let's assume that x_i = 1 if the result of the i-th coin toss is heads and x_i = 0 if it is tails. We assume that individual coin tosses are independent and identically distributed (IID) and that each toss follows a Bernoulli distribution with parameter θ: p(x_i = 1 | θ) = θ and p(x_i = 0 | θ) = 1 − θ. The latent (i.e., unobserved) variable θ is the bias of the coin. The task is to infer θ given the results of previously observed coin tosses, that is, to compute p(θ | x_1, x_2, ..., x_N).
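
For this particular problem the posterior can be written down by hand, which is useful for checking what an inference engine returns. The derivation below is a standard conjugacy result rather than something stated in the original text, and it assumes a uniform prior on θ over [0, 1] (the prior used in the code listings of this article). Here h denotes the number of heads among the N tosses:

```latex
\begin{aligned}
p(\theta \mid x_1, \dots, x_N)
  &\propto p(\theta) \prod_{i=1}^{N} p(x_i \mid \theta) \\
  &= 1 \cdot \prod_{i=1}^{N} \theta^{x_i} (1 - \theta)^{1 - x_i} \\
  &= \theta^{h} (1 - \theta)^{N - h},
  \qquad h = \sum_{i=1}^{N} x_i .
\end{aligned}
```

This is the kernel of a Beta(h + 1, N − h + 1) distribution, whose mean is (h + 1) / (N + 2).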

Modeling a simple problem like the biased coin toss in a general-purpose programming language can result in hundreds of lines of code. However, PPLs like Edward express this problem in a few simple lines of code:

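import edward as ed
import numpy as np
import tensorflow as tf
from edward.models import Bernoulli, Empirical, Uniform

# Model: uniform prior on the bias, ten Bernoulli tosses
theta = Uniform(0.0, 1.0)
x = Bernoulli(probs=theta, sample_shape=10)
# Data: observed tosses (1 = heads, 0 = tails)
data = np.array([0, 1, 0, 0, 0, 0, 0, 0, 0, 1])
# Inference: Hamiltonian Monte Carlo, keeping 1000 posterior samples
qtheta = Empirical(tf.Variable(tf.ones(1000) * 0.5))
inference = ed.HMC({theta: qtheta}, data={x: data})
inference.run()
# Results
mean, stddev = ed.get_session().run([qtheta.mean(), qtheta.stddev()])
print("Posterior mean:", mean)
print("Posterior stddev:", stddev)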
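
As a sanity check on what the sampler should report, the coin model is simple enough to solve exactly. The sketch below is not part of the original article: it assumes the same uniform prior, written as Beta(1, 1), and the same ten observations, and uses the standard Beta-Bernoulli conjugate update. The helper names `beta_posterior` and `beta_mean_std` are made up for illustration.

```python
import math

def beta_posterior(tosses, prior_a=1.0, prior_b=1.0):
    """Parameters (a, b) of the Beta posterior for Bernoulli data
    under a Beta(prior_a, prior_b) prior."""
    heads = sum(tosses)
    tails = len(tosses) - heads
    return prior_a + heads, prior_b + tails

def beta_mean_std(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)

# Same observations as in the listing above (1 = heads, 0 = tails).
data = [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
a, b = beta_posterior(data)
mean, std = beta_mean_std(a, b)
print("Posterior: Beta(%g, %g)" % (a, b))   # Beta(3, 9)
print("Posterior mean: %.4f" % mean)        # 0.2500
print("Posterior stddev: %.4f" % std)       # 0.1201
```

A sampling-based engine such as HMC should land near these values, with some Monte Carlo noise.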

The Holy Grail: Deep PPL

For decades, the machine learning space was divided into two irreconcilable camps: statistics and neural networks. One camp gave birth to probabilistic programming while the other was behind transformational movements such as deep learning. Recently, the two schools of thought have come together to combine deep learning and Bayesian modeling in single programs. The ultimate expression of this effort is deep probabilistic programming languages (Deep PPLs).

Conceptually, Deep PPLs can express Bayesian neural networks with probabilistic weights and biases. Practically speaking, Deep PPLs have materialized as new probabilistic languages and libraries that integrate seamlessly with popular deep learning frameworks.
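
To make "probabilistic weights and biases" concrete, here is a minimal pure-Python sketch. It is not taken from any of the libraries discussed in this article, and the names `sample_network`, `forward` and `predictive` are hypothetical. Every weight and bias is drawn from an assumed standard normal prior, so pushing one input through many sampled networks yields a distribution over outputs instead of a single number. Note that this shows only the prior predictive distribution; a Deep PPL would additionally condition the weights on training data.

```python
import math
import random
import statistics

def sample_network(rng, n_hidden=3):
    """Draw one set of weights and biases from independent N(0, 1) priors."""
    return {
        "w1": [rng.gauss(0.0, 1.0) for _ in range(n_hidden)],
        "b1": [rng.gauss(0.0, 1.0) for _ in range(n_hidden)],
        "w2": [rng.gauss(0.0, 1.0) for _ in range(n_hidden)],
        "b2": rng.gauss(0.0, 1.0),
    }

def forward(net, x):
    """One hidden tanh layer followed by a linear output, for a scalar input x."""
    hidden = [math.tanh(w * x + b) for w, b in zip(net["w1"], net["b1"])]
    return sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]

def predictive(x, n_samples=2000, seed=0):
    """Mean and spread of the output over many networks sampled from the prior."""
    rng = random.Random(seed)
    outputs = [forward(sample_network(rng), x) for _ in range(n_samples)]
    return statistics.mean(outputs), statistics.stdev(outputs)

mean, spread = predictive(0.5)
print("Prior predictive mean:", round(mean, 3))
print("Prior predictive stddev:", round(spread, 3))
```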

3 Deep PPLs You Need to Know About

The field of probabilistic programming languages (PPLs) has been exploding with research and innovation in recent years. Most of that innovation has come from combining PPLs and deep learning methods to build neural networks that can efficiently handle uncertainty. Tech giants such as Google, Microsoft and Uber have been responsible for pushing the boundaries of Deep PPLs into large-scale scenarios. Those efforts have translated into completely new Deep PPL stacks that are becoming increasingly popular within the machine learning community. Let's explore some of the most recent advancements in the Deep PPL space.

Edward

Edward is a Turing-complete probabilistic programming language (PPL) written in Python. Edward was originally championed by the Google Brain team but now has an extensive list of contributors. The original research paper on Edward was published in March 2017, and since then the stack has seen a lot of adoption within the machine learning community. Edward fuses three fields: Bayesian statistics and machine learning, deep learning, and probabilistic programming. The library integrates seamlessly with deep learning frameworks such as Keras and TensorFlow.

import edward as ed
import numpy as np
import tensorflow as tf
from edward.models import Bernoulli, Beta, Empirical, Uniform

# Model
theta = Uniform(0.0, 1.0)
x = Bernoulli(probs=theta, sample_shape=10)
# Data
data = np.array([0, 1, 0, 0, 0, 0, 0, 0, 0, 1])
# Inference (Hamiltonian Monte Carlo)
qtheta = Empirical(
    tf.Variable(tf.ones(1000) * 0.5))
inference = ed.HMC({theta: qtheta},
                   data={x: data})
inference.run()
# Results
mean, stddev = ed.get_session().run(
    [qtheta.mean(), qtheta.stddev()])
print("Posterior mean:", mean)
print("Posterior stddev:", stddev)

# Alternative: variational inference, replacing the HMC step above.
# Inference guide: a Beta family with learnable parameters.
# (In practice these are often passed through tf.nn.softplus to keep them positive.)
qalpha = tf.Variable(1.0)
qbeta = tf.Variable(1.0)
qtheta = Beta(qalpha, qbeta)
# Inference
inference = ed.KLqp({theta: qtheta}, {x: data})
inference.run()

Pyro

Pyro is a deep probabilistic programming language (PPL) released by Uber AI Labs. Pyro is built on top of PyTorch and is based on four fundamental principles:

Universal: Pyro is a universal PPL: it can represent any computable probability distribution. How? By starting from a universal language with iteration and recursion (arbitrary Python code), and then adding random sampling, observation, and inference.
Scalable: Pyro scales to large data sets with little overhead above hand-written code. How? By building modern black box optimization techniques, which use mini-batches of data, to approximate inference.
Minimal: Pyro is agile and maintainable. How? Pyro is implemented with a small core of powerful, composable abstractions. Wherever possible, the heavy lifting is delegated to PyTorch and other libraries.
Flexible: Pyro aims for automation when you want it and control when you need it. How? Pyro uses high-level abstractions to express generative and inference models, while allowing experts to easily customize inference.
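
The "universal" recipe (ordinary code with recursion, plus random sampling, observation, and inference) can be imitated in a few lines of plain Python. The sketch below is not Pyro code; `flip`, `geometric` and `rejection_query` are hypothetical helpers. The model is a recursive program whose number of random choices is itself random, and inference is done by rejection sampling, one of the simplest inference strategies: run the program many times and keep only the runs that agree with the observation.

```python
import random
from collections import Counter

def flip(rng, p):
    """Random sampling primitive: True with probability p."""
    return rng.random() < p

def geometric(rng, p):
    """Number of failures before the first success, defined by recursion."""
    return 0 if flip(rng, p) else 1 + geometric(rng, p)

def rejection_query(model, condition, n_samples, seed=0):
    """Approximate the distribution of model() given that condition holds."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < n_samples:
        value = model(rng)
        if condition(value):      # observation: discard runs that disagree
            kept.append(value)
    return kept

# Observation: there were at least two failures.
samples = rejection_query(lambda rng: geometric(rng, 0.5),
                          lambda n: n >= 2,
                          n_samples=5000)
freq = Counter(samples)
for n in sorted(freq)[:4]:
    print(n, round(freq[n] / len(samples), 3))
```

Because the geometric distribution is memoryless, the estimated probability of exactly two failures given at least two should come out close to 0.5. Rejection sampling becomes wasteful when the observation is unlikely, which is one reason real PPLs offer smarter inference algorithms.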

Like other PPLs, Pyro combines deep learning models and statistical inference using a simple syntax, as illustrated in the following code:

# Written against the early Pyro 0.1.x API (Variable, pyro.infer.Marginal,
# SVI(..., loss="ELBO")); later releases changed parts of this API.
import numpy as np
import pyro
import pyro.infer
import torch
from pyro.distributions import Bernoulli, Beta, Uniform
from pyro.infer import SVI
from pyro.optim import Adam
from torch.autograd import Variable

# Model
def coin():
    theta = pyro.sample("theta", Uniform(
        Variable(torch.Tensor([0])),
        Variable(torch.Tensor([1]))))
    pyro.sample("x", Bernoulli(
        theta * Variable(torch.ones(10))))

# Data
data = {"x": Variable(torch.Tensor(
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]))}
# Inference (importance sampling)
cond = pyro.condition(coin, data=data)
sampler = pyro.infer.Importance(cond,
                                num_samples=1000)
post = pyro.infer.Marginal(sampler, sites=["theta"])
# Result
samples = [post()["theta"].data[0] for _ in range(1000)]
print("Posterior mean:", np.mean(samples))
print("Posterior stddev:", np.std(samples))

# Alternative: variational inference.
# Inference guide: a Beta family with learnable parameters.
def guide():
    qalpha = pyro.param("qalpha", Variable(torch.Tensor([1.0]), requires_grad=True))
    qbeta = pyro.param("qbeta", Variable(torch.Tensor([1.0]), requires_grad=True))
    pyro.sample("theta", Beta(qalpha, qbeta))

# Inference
svi = SVI(cond, guide, Adam({}), loss="ELBO", num_particles=7)
for step in range(1000):
    svi.step()
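
What an importance sampler does for this model can be sketched without Pyro. The code below is an illustration under stated assumptions, not Pyro's implementation: it assumes the Uniform(0, 1) prior is used as the proposal, so each sampled θ is simply weighted by the likelihood of the observed tosses. The function names `likelihood` and `importance_posterior` are hypothetical.

```python
import math
import random

def likelihood(theta, tosses):
    """Bernoulli likelihood of the observed tosses for a given bias theta."""
    heads = sum(tosses)
    return theta ** heads * (1.0 - theta) ** (len(tosses) - heads)

def importance_posterior(tosses, num_samples=20000, seed=0):
    """Self-normalized importance sampling estimate of the posterior mean and stddev."""
    rng = random.Random(seed)
    # Proposal: draw theta from the Uniform(0, 1) prior.
    thetas = [rng.random() for _ in range(num_samples)]
    # With the prior as proposal, the importance weight is just the likelihood.
    weights = [likelihood(t, tosses) for t in thetas]
    total = sum(weights)
    mean = sum(w * t for w, t in zip(weights, thetas)) / total
    var = sum(w * (t - mean) ** 2 for w, t in zip(weights, thetas)) / total
    return mean, math.sqrt(var)

data = [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
mean, std = importance_posterior(data)
print("Posterior mean:", round(mean, 3))
print("Posterior stddev:", round(std, 3))
```

The estimates should land close to the exact Beta(3, 9) values of 0.25 and roughly 0.12 that follow from a uniform prior with two heads in ten tosses.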

Infer.NET

Microsoft recently open sourced Infer.NET, a framework that simplifies probabilistic programming for .NET developers. Microsoft Research has been working on Infer.NET since 2004, but it is only recently, with the emergence of deep learning, that the framework has become really popular. Infer.NET provides some strong differentiators that make it a solid choice for developers venturing into the Deep PPL space:

Rich modelling language: Support for univariate and multivariate variables, both continuous and discrete. Models can be constructed from a broad range of factors including arithmetic operations, linear algebra, range and positivity constraints, Boolean operators, Dirichlet-Discrete, Gaussian, and many others.
Multiple inference algorithms: Built-in algorithms include Expectation Propagation, Belief Propagation (a special case of EP), Variational Message Passing and Gibbs sampling.
Designed for large scale inference: Infer.NET compiles models into inference source code which can be executed independently with no overhead. It can also be integrated directly into your application.
User-extendable: Probability distributions, factors, message operations and inference algorithms can all be added by the user. Infer.NET uses a plug-in architecture which makes it open-ended and adaptable.

Let's look at a coin toss example in Infer.NET, here the probability that two fair coins both land heads:

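using System;
using Microsoft.ML.Probabilistic.Models;

// Two fair coins and a derived variable that is true only if both are heads
Variable<bool> firstCoin = Variable.Bernoulli(0.5);
Variable<bool> secondCoin = Variable.Bernoulli(0.5);
Variable<bool> bothHeads = firstCoin & secondCoin;
// Infer the marginal distribution of bothHeads (a Bernoulli with parameter 0.25)
InferenceEngine engine = new InferenceEngine();
Console.WriteLine("Probability both coins are heads: " + engine.Infer(bothHeads));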
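
Infer.NET answers this query with its built-in inference algorithms. For a model this small, the same answer can be cross-checked by brute-force enumeration of all four outcomes. The Python sketch below is only such a check and does not use the Infer.NET API; `enumerate_posterior` is a hypothetical helper. It also shows conditioning: once we observe that the coins are not both heads, the probability that the first coin is heads drops from 1/2 to 1/3.

```python
from itertools import product

def enumerate_posterior(query, condition=lambda world: True, p_heads=0.5):
    """Exact P(query | condition) for two independent coins,
    computed by summing over all four possible worlds."""
    num = den = 0.0
    for first, second in product([True, False], repeat=2):
        prob = (p_heads if first else 1 - p_heads) * \
               (p_heads if second else 1 - p_heads)
        world = {"first": first, "second": second, "both": first and second}
        if condition(world):
            den += prob
            if query(world):
                num += prob
    return num / den

p_both = enumerate_posterior(lambda w: w["both"])
p_first_given_not_both = enumerate_posterior(lambda w: w["first"],
                                             lambda w: not w["both"])
print("P(both heads):", p_both)                                   # 0.25
print("P(first heads | not both heads):", p_first_given_not_both)  # about 0.333
```

Enumeration scales exponentially with the number of variables, which is why practical engines rely on message passing or sampling instead.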

The field of Deep PPL is steadily becoming an important building block of the machine learning ecosystem. Pyro, Edward and Infer.NET are just three recent examples of Deep PPLs, but not the only relevant ones. The intersection of deep learning frameworks and PPLs offers an incredibly large footprint for innovation, and new use cases are likely to push the boundaries of Deep PPLs in the near future.