Randomized Algorithms: how to tackle the Contention Resolution Problem

by Chad Malla

Original article: https://www.freecodecamp.org/news/randomized-algorithms-part-1-d89986bb685b/

Randomized algorithms are very important in theoretical computer science as well as in real-world applications. For many problems, a deterministic algorithm (one that always returns the same answer for the same input) is computationally heavy and may not run in polynomial time.

When we let the algorithm use some randomness in addition to its input, we expect a better running time. Alternatively, we expect a solution within a known ratio of the optimal one, together with a good upper bound on the number of iterations needed to reach it.

These algorithms are often simple to come up with, but analyzing them and proving their running time or correctness is much harder. It is worth noting that probabilistic analysis is not the same as the analysis of randomized algorithms. In probabilistic analysis, the algorithm is deterministic and its input is assumed to be drawn from a probability distribution. In a randomized algorithm, the algorithm itself draws random numbers in addition to reading its input. The original post illustrates the distinction with images from Stanford lecture slides.

In this article, I will go through the contention resolution problem from Algorithm Design by Kleinberg and Tardos.

Contention Resolution

The setup: n processors share a single database D, and time is divided into k discrete intervals (rounds). The database can serve at most one processor per round.

Goal: Divide the rounds among the n processors in an equitable fashion.

Keep in mind that the processors cannot communicate with each other in any way to plan when to access the database. If every processor keeps trying to access D in the same round, the database is locked out and serves 0 processors. Not ideal. By randomizing the sequence of access attempts from the processors, we can “smooth out” the contention and avoid permanent lockouts.
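To make the access rule concrete, here is a minimal Python sketch of a single round. The function name `served_processor` and the list-of-booleans representation are my own assumptions for illustration; the only thing taken from the text is the rule that the database serves a processor only when exactly one processor attempts.

```python
from typing import Optional, Sequence


def served_processor(attempts: Sequence[bool]) -> Optional[int]:
    """Return the index of the processor served in this round, or None.

    attempts[i] is True when processor i tries to access D in this round.
    The database serves a processor only if it is the only one attempting.
    """
    attempting = [i for i, tried in enumerate(attempts) if tried]
    return attempting[0] if len(attempting) == 1 else None


if __name__ == "__main__":
    print(served_processor([True, True, True]))     # everyone attempts: lockout
    print(served_processor([False, True, False]))   # a single attempt is served
    print(served_processor([False, False, False]))  # nobody attempts
```

If all processors attempt in every round, the function returns None forever, which is the lockout described above.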

Events

We need to specify some events and the probabilities associated with them.

Think about what is happening here: a process P[i] is trying to access D at round t. Let that be our first event.

E1 = A[i,t]: P[i] (process i) attempts to access D at round t

This event has a complement: the process does not attempt to access D at round t.

E1^ = A[i,t]^: P[i] doesn’t attempt to access D at round t

Let p be the probability that A[i,t] occurs. Since the probabilities of an event and its complement add up to 1, the probability of A[i,t]^ is 1-p.

Pr[A[i,t]] = p, Pr[A[i,t]^] = 1-p

After attempting to access the database, one of two things happens for process P[i]: it either succeeds or it doesn’t.

E2 = S[i,t]: P[i] succeeds in accessing D at round t

E2^ = S[i,t]^: P[i] doesn’t succeed in accessing D at round t

Success happens only when P[i] attempts to access D and no other process does. That is the intersection of A[i,t] with the complement events A[j,t]^ for every other process j.

S[i,t] = A[i,t] ∩ (∩j≠i A[j,t]^)

Each process decides independently whether to attempt, so the probability of S[i,t] is the probability of A[i,t] multiplied by the product of the probabilities of the complement events A[j,t]^.

Pr[S[i,t]] = Pr[A[i,t]] * ∏j≠i Pr[A[j,t]^] = p(1-p)^(n-1)
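One way to sanity-check the formula p(1-p)^(n-1) is a small Monte Carlo experiment. The sketch below is only an illustration under my own assumptions: the function names, the choice n = 10 and p = 0.1, and the fixed seed are not from the original article.

```python
import random


def success_probability(n: int, p: float) -> float:
    """Closed form from the text: Pr[S[i,t]] = p * (1 - p)^(n - 1)."""
    return p * (1 - p) ** (n - 1)


def simulate_success_rate(n: int, p: float, rounds: int, seed: int = 0) -> float:
    """Estimate how often process 0 is the only process attempting in a round."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(rounds):
        # Every process independently attempts with probability p.
        attempts = [rng.random() < p for _ in range(n)]
        if attempts[0] and sum(attempts) == 1:
            successes += 1
    return successes / rounds


if __name__ == "__main__":
    n, p = 10, 0.1
    print("closed form:", success_probability(n, p))
    print("simulated:  ", simulate_success_rate(n, p, rounds=200_000))
```

The two printed numbers should be close; the gap shrinks as the number of simulated rounds grows.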

Which p makes this as large as possible? Remember derivatives: they equal 0 at interior minimums and maximums. Let f(p) = p(1-p)^(n-1). Then the derivative of f(p) is

f’(p) = (1-p)^(n-1) - (n-1)*p*(1-p)^(n-2)

Factoring out (1-p)^(n-2) gives f’(p) = (1-p)^(n-2) * (1 - n*p), so for n > 2 the derivative is 0 at p = 1 and at p = 1/n. When p = 1, all the processes attempt at the same time in every round, and at the endpoint p = 0 none of the processes ever attempts to access the database. In both situations f(p) = 0, so we are not interested in them. The maximum is at the only other candidate, p = 1/n.

Set p = 1/n and we get

Pr[S[i,t]] = (1/n)(1 - 1/n)^(n-1)
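As a numerical cross-check of the choice p = 1/n, the following sketch searches a grid of p values for the one that maximizes f(p) = p(1-p)^(n-1). The grid search and the names `best_p_on_grid` and `steps` are assumptions made for this illustration; the calculus argument is the actual justification.

```python
def success_probability(n: int, p: float) -> float:
    """f(p) = p * (1 - p)^(n - 1), the single-round success probability."""
    return p * (1 - p) ** (n - 1)


def best_p_on_grid(n: int, steps: int = 10_000) -> float:
    """Return the grid point in [0, 1] with the largest success probability."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda p: success_probability(n, p))


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(n, best_p_on_grid(n), 1 / n)
```

For each n, the grid maximizer should sit next to 1/n, up to the grid resolution.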

From calculus, there are two facts we will use.

(1 - 1/n)^n converges monotonically from 1/4 up to 1/e as n increases from 2
(1 - 1/n)^(n-1) converges monotonically from 1/2 down to 1/e as n increases from 2
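These two facts are easy to check numerically. The short sketch below (function names are my own, chosen for illustration) evaluates both sequences so you can watch one climb toward 1/e and the other fall toward it.

```python
import math


def lower_sequence(n: int) -> float:
    """(1 - 1/n)^n, which increases toward 1/e."""
    return (1 - 1 / n) ** n


def upper_sequence(n: int) -> float:
    """(1 - 1/n)^(n - 1), which decreases toward 1/e."""
    return (1 - 1 / n) ** (n - 1)


if __name__ == "__main__":
    print("1/e =", 1 / math.e)
    for n in (2, 3, 10, 100, 1000):
        print(n, lower_sequence(n), upper_sequence(n))
```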

So there is a bound we can use for every n ≥ 2.

(1/n)(1 - 1/n)^(n-1) lies between (1/e)*(1/n) and (1/2)*(1/n)

1/(en) ≤ Pr[S[i,t]] ≤ 1/(2n)

So Pr[S[i,t]] is asymptotically Θ(1/n).

Another event, failures…

E3 = F[i,t]: the “failure event” that P[i] does not succeed in accessing D in any of the rounds 1 through t

That is the intersection of the events S[i,r]^ (no success at round r) for r = 1…t.

Because the rounds are independent, the probability of this intersection is the product of the individual event probabilities, which turns Pr[F[i,t]] into a nice computable expression.

Pr[F[i,t]] = Pr[⋂r=1.to.t (S[i,r]^)] = ∏r=1.to.t (Pr[S[i,r]^]) = (1 - p(1-p)^(n-1))^t

Here Pr[S[i,r]] = p(1-p)^(n-1) and Pr[S[i,r]^] = 1 - Pr[S[i,r]].

Remember the calculus convergence properties we saw earlier? With p = 1/n, we use them here to get

Pr[F[i,t]] = (1 - p(1-p)^(n-1))^t = (1 - (1/n)(1 - 1/n)^(n-1))^t ≤ (1 - 1/(en))^t

Here p = 1/n is the value that maximizes the success probability, and every one of the n processors uses the same p.

Now let us take a look at parameter t.

If we set t = ceiling(en), rounding up to make sure it is an integer, we get

Pr[F[i,t]] ≤ (1 - 1/(en))^ceiling(en) ≤ (1 - 1/(en))^(en) ≤ 1/e

This bound tells us that the probability of process i not succeeding in any of the rounds 1 to ceiling(en) is upper-bounded by 1/e, a constant independent of n.

Set t = ceiling(en)*(c*ln(n)) and we have

Pr[F[i,t]] ≤ (1 - 1/(en))^t = ((1 - 1/(en))^ceiling(en))^(c*ln(n)) ≤ (1/e)^(c*ln(n)) = 1/n^c = n^(-c)
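To see the bound with actual numbers, the sketch below computes the exact failure probability for p = 1/n and compares it with n^(-c). One assumption of mine: since ceiling(en)*c*ln(n) is usually not an integer, `rounds_for_bound` rounds it up, which can only make failure less likely. The function names are also hypothetical.

```python
import math


def single_round_success(n: int) -> float:
    """Pr[S[i,t]] with p = 1/n, i.e. (1/n) * (1 - 1/n)^(n - 1)."""
    return (1 / n) * (1 - 1 / n) ** (n - 1)


def failure_probability(n: int, t: int) -> float:
    """Exact Pr[F[i,t]]: process i fails in each of t independent rounds."""
    return (1 - single_round_success(n)) ** t


def rounds_for_bound(n: int, c: float) -> int:
    """t = ceiling(en) * c * ln(n), rounded up to a whole number of rounds."""
    return math.ceil(math.ceil(math.e * n) * c * math.log(n))


if __name__ == "__main__":
    for n in (2, 10, 50):
        t = rounds_for_bound(n, c=2)
        print(n, t, failure_probability(n, t), n ** -2)
```

In each printed row, the exact failure probability should be no larger than n^(-2).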

One last event… our goal is for every process to succeed at least once in as few rounds as possible.

E4 = F[t]: the protocol fails after t rounds, meaning that some process has not yet succeeded in accessing D. We would like the smallest t for which this event is unlikely.

F[t] occurs if and only if at least one of the events F[i,t] occurs: it takes only one failing process to say the protocol has failed. F[t] is therefore the union of the events F[i,t] for processes i = 1…n.

F[t] = ⋃i=1.to.n (F[i,t])

Union Bound

This probability is hard to compute exactly because the events F[i,t] are not independent. The easy solution is the union bound, which holds for any events, independent or not: the probability of a union is at most the sum of the individual probabilities.

Given events E1, …, En, we have Pr[⋃i=1.to.n (Ei)] ≤ ∑i=1.to.n (Pr[Ei])

Pr[F[t]] ≤ ∑i=1.to.n (Pr[F[i,t]])

Recall that t = ceiling(en)*c*ln(n) gives the upper bound Pr[F[i,t]] ≤ n^(-c).

Let c = 2 and we have Pr[F[t]] ≤ ∑i=1.to.n (n^(-2)) = n*n^(-2) = 1/n

What is the probability that all the processes succeed in accessing D at least once within t = 2*ceiling(en)*ln(n) rounds?

Take the complement of F[t], F[t]^, and we arrive at a probability of at least

1 - 1/n.
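The whole protocol can also be simulated end to end. The sketch below is a rough experiment under my own assumptions (function names, n = 10, the number of trials, the seed, and rounding t up to an integer are all mine): each process attempts with probability 1/n per round, and a run counts as a failure if some process is never served within t = 2*ceiling(en)*ln(n) rounds.

```python
import math
import random


def protocol_fails(n: int, t: int, rng: random.Random) -> bool:
    """Run t rounds; return True if some process never accessed D."""
    p = 1 / n
    served = set()
    for _ in range(t):
        attempting = [i for i in range(n) if rng.random() < p]
        if len(attempting) == 1:  # exactly one attempt, so it succeeds
            served.add(attempting[0])
    return len(served) < n


def estimate_failure_rate(n: int, trials: int, seed: int = 0) -> float:
    """Fraction of simulated runs in which the protocol fails."""
    rng = random.Random(seed)
    t = math.ceil(2 * math.ceil(math.e * n) * math.log(n))
    failures = sum(protocol_fails(n, t, rng) for _ in range(trials))
    return failures / trials


if __name__ == "__main__":
    n = 10
    print("estimated failure rate:", estimate_failure_rate(n, trials=1000))
    print("union bound 1/n:       ", 1 / n)
```

The estimated failure rate should come out below 1/n, since the union bound is an upper bound rather than an exact value.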

Wrapping up

This was quite a lengthy analysis, as most analyses of randomized algorithms are. It is essentially a trade-off: randomized algorithms are easier to design and understand than complex deterministic algorithms, which are computationally heavy in arriving at the correct solution, but they are harder to analyze. With randomized algorithms, we accept a small probability of error in exchange for efficiency.

Thank you for reading. I am new to the blogging game, so any feedback would be appreciated.
