Ants, Dice, and Bayes' Theorem

So far on this blog, we have solved algorithmic and coding problems asked in programming interviews at some of the best companies in the world. Today we are starting a new series on Data Science interview questions asked at those same companies. Many of these places do not have a specialized Data Scientist role (although that is changing rapidly). At the rest, a Data Scientist candidate typically solves a logical or mathematical puzzle on a whiteboard. A few of these interviews also ask deep technical questions on SQL. The remaining interviews are almost identical to the algorithmic/coding interviews that we have seen so far in this blog. Today, we will solve three simple problems asked at Facebook for a Data Science role. Read along…

Problem 1:

Three ants are sitting at the corners of an equilateral triangle. Each ant randomly picks a direction and starts moving along an edge of the triangle. What is the probability that none of the ants collide? Now, what if there are k ants, one on each corner of a regular k-sided polygon?

Solution:

Let’s start simple, shall we? Here are two ways of solving this problem, both should be obvious after a little thought:

  1. How many ways can an ant (at any vertex) move? Precisely two! The movements of the three ants are independent of each other, so the total number of movement combinations for all three ants is 2³ = 8. How many of these result in no collision? Again, precisely two: (1) all ants moving clockwise, and (2) all ants moving counter-clockwise. In every other combination, some pair of neighboring ants walks toward each other along the same edge. As a result, the probability of no collision is simply 2/8 = 0.25. How about k ants? Each ant can again move in two ways, and hence k ants can move in 2ᵏ different ways. Just like in the case of a triangle, precisely two of these avoid a collision: all ants moving clockwise, or all ants moving counter-clockwise. As a result, the probability of no collision is 2/2ᵏ = 1/2ᵏ⁻¹.
  2. The above method was frequency counting, where you count the number of ways the event of interest occurs and divide it by the total number of ways. Alternatively, we can calculate the probability of no collision directly, as the sum of the probabilities of two mutually exclusive events:

  • All ants moving clockwise: The probability of one ant moving clockwise is 1/2. Since all ants move independently, the probability of all k ants moving clockwise is (1/2)ᵏ.

  • All ants moving counter-clockwise: By similar logic, the probability of all ants moving counter-clockwise is also (1/2)ᵏ.

No collision occurs exactly when all ants move clockwise or all ants move counter-clockwise. These two events are mutually exclusive (they cannot both happen), so the total probability of no collision is simply their sum, i.e. (1/2)ᵏ + (1/2)ᵏ = (1/2)ᵏ⁻¹. For a triangle (k = 3), the probability is (1/2)³⁻¹ = 0.25.
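
The counting argument can be checked by brute force. The sketch below is only an illustration, not part of the original solution: the function name `no_collision_probability` is hypothetical, and it assumes that a collision means two neighboring ants walking toward each other along the same edge.

```python
from fractions import Fraction
from itertools import product


def no_collision_probability(k: int) -> Fraction:
    """Exact no-collision probability for k ants on a k-gon, by enumeration."""
    safe = 0
    # +1 means clockwise, -1 means counter-clockwise; 2**k combinations in total.
    for dirs in product((+1, -1), repeat=k):
        # Assumption: ants i and i+1 collide when ant i walks clockwise
        # (toward i+1) while ant i+1 walks counter-clockwise (toward i).
        head_on = any(
            dirs[i] == +1 and dirs[(i + 1) % k] == -1 for i in range(k)
        )
        if not head_on:
            safe += 1
    return Fraction(safe, 2 ** k)


for k in (3, 4, 5, 6):
    print(k, no_collision_probability(k))  # 1/4, 1/8, 1/16, 1/32
```

For every k tried, only the two uniform direction choices survive, which matches the closed form 1/2ᵏ⁻¹.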

Problem 2:

Say you roll three dice and observe the sum of the three rolls. What is the probability that the sum of the outcomes is 12, given that all three rolls show different values?

Solution:

Let’s solve this problem in two different ways as well!

Method 1:

Like before, let’s employ frequency counting. The denominator in our frequency counting is the number of ordered outcomes in which all three dice show different values. There are 6 possible outcomes for the first die. Once we fix its value, there are only 5 possible outcomes for the second die (to avoid repetition). Finally, once the first two dice have fixed values, only 4 possible outcomes remain for the third die. As a result, our denominator is simply 6 * 5 * 4 = 120.

What about the numerator? Here, the simplest thing we can do is to enumerate all possible outcomes that add up to 12, taking care to avoid the outcomes that contain repetitions.

(1, 5, 6), (1, 6, 5)
(2, 4, 6), (2, 6, 4)
(3, 4, 5), (3, 5, 4)
(4, 2, 6), (4, 3, 5), (4, 5, 3), (4, 6, 2)
(5, 1, 6), (5, 3, 4), (5, 4, 3), (5, 6, 1)
(6, 1, 5), (6, 2, 4), (6, 4, 2), (6, 5, 1)

There are 18 total ways to get an outcome of 12. Thus, the total probability is 18/120 = 3/20.
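
The same count can be reproduced mechanically. This is a small sketch using Python's standard library; the variable names are illustrative only, and dice faces are assumed to be the integers 1 to 6.

```python
from fractions import Fraction
from itertools import permutations

# Every ordered triple of distinct faces: 6 * 5 * 4 of them.
distinct_rolls = list(permutations(range(1, 7), 3))

# Keep only the triples whose faces add up to 12.
sum_is_12 = [roll for roll in distinct_rolls if sum(roll) == 12]

probability = Fraction(len(sum_is_12), len(distinct_rolls))
print(len(distinct_rolls), len(sum_is_12), probability)  # 120 18 3/20
```

Printing `sum_is_12` lists the same 18 triples enumerated by hand above.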

Method 2:

The second method uses the definition of conditional probability: P(A|B) = P(A∩B)/P(B). For this problem, A is the event that the sum of the three dice is 12, and B is the event that all three dice show different values. What is P(B)? Just like in the method above, there are 120 outcomes with three different dice values, out of a total of 6³ = 216. Hence, P(B) = 120/216 = 5/9.

How about P(A∩B)? Again, just like the previous calculation, there are 18 possible outcomes (out of 216) for the sum to be equal to 12 and the dice values to be different. Hence, P(A∩B) = 18/216 = 1/12. Thus, P(A|B) = (1/12)/(5/9)= 3/20.
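
As a sanity check on the conditional-probability route, the sketch below computes P(B), P(A∩B) and their ratio over the full 216-outcome sample space. The helper names (`prob`, `is_distinct`, `sums_to_12`) are made up for this illustration.

```python
from fractions import Fraction
from itertools import product

# Full sample space: all 6**3 = 216 ordered rolls of three dice.
all_rolls = list(product(range(1, 7), repeat=3))


def prob(event) -> Fraction:
    """Probability of an event as (favourable outcomes) / (all outcomes)."""
    return Fraction(sum(1 for roll in all_rolls if event(roll)), len(all_rolls))


def is_distinct(roll) -> bool:  # event B
    return len(set(roll)) == 3


def sums_to_12(roll) -> bool:  # event A
    return sum(roll) == 12


p_b = prob(is_distinct)
p_a_and_b = prob(lambda roll: sums_to_12(roll) and is_distinct(roll))
p_a_given_b = p_a_and_b / p_b
print(p_b, p_a_and_b, p_a_given_b)  # 5/9 1/12 3/20
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared directly with the 3/20 obtained by frequency counting.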

Problem 3:

Facebook has a content team that labels pieces of content on the platform as spam or not spam. 90% of the raters are diligent and will label 20% of the content as spam and 80% as non-spam. The remaining 10% are non-diligent and will label 0% of the content as spam and 100% as non-spam. Assume that each rater labels pieces of content independently of one another. Given that a rater has labeled 4 pieces of content as good (non-spam), what is the probability that they are a diligent rater?

Solution:

This is a straightforward application of Bayes' theorem. Let’s recall how it works here.

Let D be the event that the rater is diligent, and G the event that 4 independent pieces of content are labeled as good. Then, by Bayes' theorem, the conditional probability in the question is given by:

P(D|G) = (P(G|D) * P(D)) / P(G).

Let’s calculate each term of the formula above:

  1. P(D) is simply the probability that a rater is diligent, which is given as 90%, or 0.9.

  2. P(G|D) is the probability that a diligent rater rates 4 pieces of content as good. A diligent rater has a 0.8 probability of rating a single piece as good. Since the pieces are rated independently, P(G|D) = 0.8⁴ = 0.4096.

  3. P(G) is the toughest of the three to calculate. P(G) is simply the probability that 4 pieces of content are rated as good by any rater. The rater could be a diligent one or a non-diligent one. Let N be the event that the rater is non-diligent. Since D and N are mutually exclusive and every rater is one or the other, we can simply add the probabilities of G occurring together with each type of rater:

P(G) = P(G ∩ D) + P(G ∩ N)

Finally, by using the formula for conditional probabilities from above,

P(G) = P(G | D) * P(D) + P(G | N) * P(N)

Now we have all the ingredients to compute P(G)! Recall that P(N) = 0.1, and P(G|N)=1, since non-diligent raters rate everything as good. Thus,

P(G) = P(G | D) * P(D) + P(G | N) * P(N) 
= 0.4096 * 0.9 + 1 * 0.1
= 0.46864

Finally,

P(D|G) = (P(G|D) * P(D)) / P(G)
= (0.4096 * 0.9) / 0.46864
≈ 0.7866
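
The whole calculation fits in a few lines of code. The function below is a minimal sketch under the problem's stated numbers; the name `posterior_diligent` and its parameter names are hypothetical, chosen only for this example.

```python
def posterior_diligent(
    n_good: int,
    prior_diligent: float = 0.9,   # P(D), given in the problem
    p_good_diligent: float = 0.8,  # chance a diligent rater labels one piece good
    p_good_lazy: float = 1.0,      # non-diligent raters label everything good
) -> float:
    """P(D | G) when a rater has labeled n_good independent pieces as good."""
    likelihood_d = p_good_diligent ** n_good  # P(G | D)
    likelihood_n = p_good_lazy ** n_good      # P(G | N)
    # Law of total probability: P(G) = P(G|D) P(D) + P(G|N) P(N)
    evidence = likelihood_d * prior_diligent + likelihood_n * (1 - prior_diligent)
    return likelihood_d * prior_diligent / evidence


print(f"{posterior_diligent(4):.3f}")  # 0.787
```

Calling the function with larger values of `n_good` shows the posterior falling steadily: the more pieces a rater marks as good without ever flagging spam, the less likely it is that they are diligent.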

Originally published at https://cppcodingzen.com on September 20, 2020.

Translated from: https://medium.com/@cppcodingzen/ants-dice-and-bayes-theorem-607ca06d7a64
