cayenne: A Python package for stochastic simulations

TL;DR; We just released v1.0 of cayenne, our Python package for stochastic simulations! Read on to find out if you should model your system as a stochastic process, and why you should try out cayenne.


Mathematical models breathe structure into our understanding of the world. Oftentimes, they are cast as ordinary differential equations that describe how things change with time. In most cases, the insight from such models is sufficient for the question at hand. However, if you are trying to model the number of people who are going to be infected by SARS-CoV-2, a differential equation may not be the right tool for at least a couple of reasons:


Variables in differential equations are continuous and take values like 0.3, 2.7, etc. This isn’t ideal if you are talking about the number of people, which can only be integers like 1 or 2.
A differential equation always predicts the same output for a given input. However, not everyone who comes in contact with a COVID-19 infected person will develop the disease — the event is actually probabilistic, or stochastic.

Let’s look at this with a concrete example. Infection with and recovery from SARS-CoV-2 can be modeled with the Susceptible-Infectious-Recovered (SIR) model. In this model, when an infectious person comes in contact with a susceptible person, the susceptible person gets infected and can now infect others. Eventually, the infected person recovers and can no longer spread the disease. These two processes are represented by the equation below:


[Figure: The infection and recovery process]
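Since the original figure is not reproduced here, the following is a reconstruction from the description above rather than the authors' own notation. It assumes mass-action kinetics, with k1 taken to be the infection rate constant and k2 the recovery rate constant. The first line gives the two reactions, and the remaining lines give the corresponding ordinary differential equations:

```latex
\begin{align*}
S + I &\xrightarrow{k_1} 2I, & I &\xrightarrow{k_2} R \\[4pt]
\frac{dS}{dt} &= -k_1 S I \\
\frac{dI}{dt} &= k_1 S I - k_2 I \\
\frac{dR}{dt} &= k_2 I
\end{align*}
```

Adding the three equations gives zero, so the total population S + I + R stays constant over time.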

Choosing values for the parameters k1 and k2, we can simulate the model as an ordinary differential equation in Python like this:

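The original code for this step is not shown, so here is a minimal sketch of what such a simulation could look like. It assumes mass-action SIR kinetics and uses a fixed-step fourth-order Runge-Kutta integrator written with the standard library only; a library solver such as SciPy's would serve equally well. The parameter values, initial counts and function names are illustrative assumptions, not the numbers used in the post.

```python
def sir_rhs(state, k1, k2):
    """Right-hand side of the SIR equations under mass-action kinetics."""
    s, i, r = state
    infection = k1 * s * i   # flow from susceptible to infectious
    recovery = k2 * i        # flow from infectious to recovered
    return (-infection, infection - recovery, recovery)


def rk4_step(state, dt, k1, k2):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    def shifted(base, slope, h):
        return tuple(b + h * d for b, d in zip(base, slope))

    a = sir_rhs(state, k1, k2)
    b = sir_rhs(shifted(state, a, dt / 2), k1, k2)
    c = sir_rhs(shifted(state, b, dt / 2), k1, k2)
    d = sir_rhs(shifted(state, c, dt), k1, k2)
    return tuple(
        y + dt / 6 * (da + 2 * db + 2 * dc + dd)
        for y, da, db, dc, dd in zip(state, a, b, c, d)
    )


def simulate_sir_ode(s0, i0, r0, k1, k2, t_end, dt=0.1):
    """Integrate the SIR equations from t = 0 to t_end; returns times and states."""
    times = [0.0]
    states = [(float(s0), float(i0), float(r0))]
    for n in range(1, round(t_end / dt) + 1):
        states.append(rk4_step(states[-1], dt, k1, k2))
        times.append(n * dt)
    return times, states


# Illustrative values only (assumed, not taken from the original post).
times, states = simulate_sir_ode(150, 10, 0, k1=0.003, k2=0.1, t_end=100)
s, i, r = states[-1]
print(f"t = {times[-1]:.0f}: S = {s:.2f}, I = {i:.2f}, R = {r:.2f}")
```

The printed values are generally not whole numbers, and running the script again produces exactly the same numbers.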

If you plot the result of the simulation, it will look like this:


You can see that i) the number of people is represented by fractional values, and ii) there is no notion of “probability” in the plot; it is just a smooth curve over time.


To address these limitations of differential equations, we use a more elegant, purpose-built mathematical construct called a Markov jump process. Such a process is well suited to modeling quantities like the number of people. When the number of infected people increases from 5 to 6, there is an instantaneous “jump” between the integer values, instead of the gradual change (e.g. 5.1, 5.2, …, 6) seen in differential equations. Below we see the same SIR model as above, but simulated as a Markov jump process.


The number of people in this Markov process jumps between integer values at random time points, at a frequency determined by the rate constants k1 and k2. The simulation stops when the number of infected individuals reaches zero, at around t = 70 days. Repeating this simulation with a different random number seed gives a different simulation trajectory, but with the same general trend:


Here we see that the infection ends sooner (around t = 25) because the number of infected individuals reaches zero earlier. Since each simulation trajectory is probabilistic, we generally run a large number of them to get an overall picture of the process.

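To make the idea of a jump process concrete, here is a minimal sketch of one standard way to simulate it: the direct method of Gillespie's stochastic simulation algorithm. It is written with the standard library only and is not meant to represent how cayenne implements its algorithms. The function name, parameter values and initial counts are illustrative assumptions.

```python
import random


def gillespie_sir(s0, i0, r0, k1, k2, t_max, seed=None):
    """Simulate one SIR trajectory as a Markov jump process (direct method)."""
    rng = random.Random(seed)
    t, s, i, r = 0.0, s0, i0, r0
    times, states = [t], [(s, i, r)]
    while i > 0:  # the epidemic ends once nobody is infectious
        infection_rate = k1 * s * i
        recovery_rate = k2 * i
        total_rate = infection_rate + recovery_rate
        # Waiting time to the next event is exponentially distributed.
        t += rng.expovariate(total_rate)
        if t > t_max:
            break
        # Pick which event fires, with probability proportional to its rate.
        if rng.random() * total_rate < infection_rate:
            s, i = s - 1, i + 1   # one susceptible person becomes infectious
        else:
            i, r = i - 1, r + 1   # one infectious person recovers
        times.append(t)
        states.append((s, i, r))
    return times, states


# Illustrative values only (assumed); each seed gives a different trajectory.
for seed in range(5):
    times, states = gillespie_sir(150, 10, 0, k1=0.003, k2=0.1, t_max=200, seed=seed)
    print(f"seed {seed}: {len(times) - 1} jumps, last event at t = {times[-1]:.1f}, "
          f"final (S, I, R) = {states[-1]}")
```

Every recorded state consists of integers, and the event times differ from seed to seed, which is why many repetitions are needed to characterize the process.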

For simulating these Markov jump processes (also known as continuous-time Markov chains), we have developed an accurate, easy-to-use Python package called cayenne. You should be able to install it easily from here. And below is the minimal code for simulating the SIR model 10 times and plotting the simulation trajectories.


If you are someone who runs stochastic simulations, we think you will enjoy trying out our package (get it here, examples here and a tutorial here) if you care about:


Accuracy: We tested all our algorithms for accuracy using the stochastic SBML test suite. In our benchmarks, the other packages we tested were not as accurate as ours.
Fast code: Our backend is written in Cython for a nice balance between speed and ease of writing new algorithms. Repetitions of a simulation can also be run across multiple CPU cores out of the box.
Quick prototyping: We leverage the cool antimony library to provide an easy and intuitive model writing interface. There is no need to write out the stoichiometric matrices.
The little things: Stochastic simulations usually mean outputs logged at stochastic time points (e.g. at t = 19.3, 19.7). But with cayenne, you can get an accurately interpolated value at the time points of your choice (e.g. at t = 19.5).
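To illustrate the last point, here is a minimal sketch of how a value at a chosen time point can be read off a jump trajectory. It relies on the fact that a jump process is piecewise constant, so the state at time t is the state set by the most recent jump at or before t. This is an illustration of the idea only, not cayenne's own implementation; the function name and the infected counts are hypothetical.

```python
from bisect import bisect_right


def state_at(jump_times, states, query_time):
    """Return the state of a piecewise-constant trajectory at query_time."""
    if query_time < jump_times[0]:
        raise ValueError("query_time is before the start of the trajectory")
    # Index of the last jump that happened at or before query_time.
    index = bisect_right(jump_times, query_time) - 1
    return states[index]


# Hypothetical trajectory: jumps logged at irregular, random time points.
jump_times = [0.0, 19.3, 19.7, 21.2]
infected = [10, 11, 12, 11]

print(state_at(jump_times, infected, 19.5))  # → 11, the value set by the jump at t = 19.3
```

Looking up every trajectory on the same regular time grid this way makes it straightforward to average or compare repetitions whose jumps occurred at different times.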