[Probability] Conditional Probability

Conditional Probability:

If A and B are events with P(B) > 0, then the conditional probability of A given B, denoted by P(A|B), is defined as P(A|B) = P(A ∩ B) / P(B)

Two Children:

(Two children). Martin Gardner posed the following puzzle in the 1950s, in his column in Scientific American.

Mr. Jones has two children. The older child is a girl.

What is the probability that both children are girls?   1/2

Mr. Smith has two children. At least one of them is a boy.

What is the probability that both children are boys?  1/3

The possible combinations for the two children are: {BB, BG, GB, GG} where B represents a boy and G represents a girl.

At first glance this problem seems like it should be a simple application of conditional probability, but for decades there have been controversies about whether or why the two parts of the problem should have different answers, and the extent to which the problem is ambiguous. Gardner gave the answers 1/2 and 1/3 to the two parts, respectively, which may seem paradoxical: why should it matter whether we learn the older child’s gender, as opposed to just learning one child’s gender?

It is important to clarify the assumptions of the problem. Several implicit assumptions are being made to obtain the answers that Gardner gave.

• It assumes that gender is binary, so that each child can be definitively categorized as a boy or a girl. In fact, many people don’t neatly fit into either of the categories “male” or “female”, and identify themselves as having a non-binary gender.

• It assumes that P(boy) = P(girl), both for the elder child and for the younger child. In fact, in most countries slightly more boys are born than girls. For example, in the United States it is commonly estimated that 105 boys are born for every 100 girls who are born.

• It assumes that the genders of the two children are independent, i.e., knowing the elder child’s gender gives no information about the younger child’s gender, and vice versa. This would be unrealistic if, e.g., the children were identical twins.
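Under these assumptions, both of Gardner's answers can be checked by enumerating the sample space. A minimal sketch in Python, assuming the four equally likely outcomes listed above:

from itertools import product

# The four equally likely outcomes {BB, BG, GB, GG};
# position 0 is the elder child, position 1 is the younger.
sample_space = list(product('BG', repeat=2))

# Mr. Jones: condition on "the elder child is a girl"
jones = [s for s in sample_space if s[0] == 'G']
print(sum(s == ('G', 'G') for s in jones) / len(jones))    # 0.5

# Mr. Smith: condition on "at least one child is a boy"
smith = [s for s in sample_space if 'B' in s]
print(sum(s == ('B', 'B') for s in smith) / len(smith))    # 0.333...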

Think about:

Suppose a family has two children, and we learn that at least one of them is a winter-born boy (assume the four seasons of birth are equally likely and independent of gender). What is the probability that both children are boys?

We want to find P(A|B), the probability that both children are boys given that at least one of them is a winter-born boy. Define the events:

  • Event A: both children are boys. P(A) = 1/4.
  • Event B: at least one child is a winter-born boy. P(B) = 1 − P(C) = 15/64.
  • Event C: neither child is a winter-born boy. Each child is a winter-born boy with probability (1/2)(1/4) = 1/8, so P(C) = (7/8)^2 = 49/64.
  • P(A ∩ B) = 7/64 (notice that A and B are not independent).
  • P(A|B) = P(A ∩ B) / P(B) = (7/64)/(15/64) = 7/15.

Don't trust your intuition!

To see where P(A ∩ B) = 7/64 comes from, think about: P(both are boys, and there is a winter-born child) equals

P(both are boys) × P(there is a winter-born child) = 1/4 × 7/16 = 7/64,

because these two events are independent: the seasons of birth carry no information about the genders. And when both children are boys, "at least one winter-born boy" is the same event as "at least one winter-born child", so this product is exactly P(A ∩ B).
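The same numbers fall out of a direct enumeration over gender and season of birth. A minimal sketch, assuming the equally likely model above (2 genders × 4 seasons = 8 equally likely possibilities per child):

from itertools import product

# Each child: (gender, season); all 8 combinations equally likely per child
children = list(product('BG', ['winter', 'spring', 'summer', 'fall']))
families = list(product(children, repeat=2))   # 64 equally likely families

def winter_boy(child):
    return child == ('B', 'winter')

B = [f for f in families if any(winter_boy(c) for c in f)]   # at least one winter-born boy
A_and_B = [f for f in B if all(c[0] == 'B' for c in f)]      # ... and both children are boys

print(len(B), len(A_and_B))     # 15 7 (out of 64)
print(len(A_and_B) / len(B))    # 0.4666... = 7/15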

Independence

Events A and B are independent if and only if P(A ∩ B) = P(A)P(B).

Equivalently, if P(B) > 0, then P(A|B) = P(A): learning that B occurred gives no information about A.

For example:

A: you flip a coin and get "H".

B: you get an A in the course.

Knowing one of these tells you nothing about the other.
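Independence can also be verified by enumeration. A sketch with a standard two-dice example (not from the notes above): A = "first die shows 6" and B = "the total is 7" turn out to be independent:

from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes

A = {r for r in rolls if r[0] == 6}            # first die shows 6
B = {r for r in rolls if sum(r) == 7}          # the total is 7

def p(event):
    return Fraction(len(event), len(rolls))

print(p(A), p(B), p(A & B))       # 1/6 1/6 1/36
print(p(A & B) == p(A) * p(B))    # True: A and B are independent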

(Independence of complementary events)

If A and B are independent, then A and B^c are independent, A^c and B are independent, and A^c and B^c are independent.

Proof. Let A and B be independent. We first show that A and B^c are independent.

If P(A) = 0, then A is independent of every event, including B^c.

So assume P(A) ≠ 0. Then

P(B^c|A) = 1 − P(B|A) = 1 − P(B) = P(B^c),

so A and B^c are independent. Swapping the roles of A and B shows that A^c and B are independent, and applying the result twice shows that A^c and B^c are independent. ∎

(Independence of three events).

Events A, B, and C are said to be independent if all of the following equations hold:

P(A ∩ B) = P(A)P(B),

P(A ∩ C) = P(A)P(C),

P(B ∩ C) = P(B)P(C),

P(A ∩ B ∩ C) = P(A)P(B)P(C).

On the other hand, P(A ∩ B ∩ C) = P(A)P(B)P(C) does not imply pairwise independence; this can be seen quickly by looking at the extreme case P(A) = 0, when the equation becomes 0 = 0, which tells us nothing about B and C. We can define independence of any number of events similarly. 

Intuitively, independence is about binary information: the answer to the question "did A occur?" (yes/no) gives no insight into B, and vice versa. Now take three events, with A and B independent. For C to be independent of A and B, the answers to

Q1: did A occur?

Q2: did B occur?

must, in every combination, leave the probability of C unchanged, so in principle all of these combinations need to be checked. It turns out that the four equations above suffice: the remaining conditions follow by taking complements, just as in the proof for complementary events. Independence of any number of events is defined similarly, and the same complement trick shows that the k-th event is unaffected by the others.

(Pairwise independence doesn't imply independence)

For example, toss two fair coins, and let A be the event that the first toss lands heads, B the event that the second toss lands heads, and C the event that both tosses land the same way. Any two of these events are independent, but P(A ∩ B ∩ C) = 1/4 while P(A)P(B)P(C) = 1/8, so the three events are not independent; see the sketch below.
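A quick check of this two-coin example by enumeration (a minimal sketch using exact fractions):

from fractions import Fraction
from itertools import product

outcomes = list(product('HT', repeat=2))       # 4 equally likely two-coin outcomes

A = {o for o in outcomes if o[0] == 'H'}       # first toss heads
B = {o for o in outcomes if o[1] == 'H'}       # second toss heads
C = {o for o in outcomes if o[0] == o[1]}      # both tosses land the same way

def p(event):
    return Fraction(len(event), len(outcomes))

print(p(A & B) == p(A) * p(B))                 # True
print(p(A & C) == p(A) * p(C))                 # True: pairwise independent
print(p(B & C) == p(B) * p(C))                 # True
print(p(A & B & C) == p(A) * p(B) * p(C))      # False: 1/4 != 1/8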

(Conditional independence).

Events A and B are said to be conditionally independent given E if P(A ∩ B|E) = P(A|E)P(B|E).

But beware:

It is easy to make terrible blunders stemming from confusing independence and conditional independence. Two events can be conditionally independent given E, but not conditionally independent given E^c, and conditional independence given E does not imply unconditional independence.

(Conditional independence doesn't imply independence)

For example, suppose a coin is chosen at random: with probability 1/2 it is fair, and with probability 1/2 it lands heads with probability 3/4. Given which coin was chosen, the tosses are conditionally independent; unconditionally they are not, since early heads are evidence for the biased coin and so make later heads more likely. See the sketch below.
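A numeric sketch of this coin example (the 3/4 bias is just an illustrative choice):

# A coin is chosen at random (fair, or biased with P(H) = 3/4) and tossed twice.
# A = first toss heads, B = second toss heads, E = the coin is fair.
p_coin = {'fair': 0.5, 'biased': 0.5}      # P(E) and P(E^c)
p_heads = {'fair': 0.5, 'biased': 0.75}    # P(heads | coin)

# Unconditional probabilities via the law of total probability;
# given the coin, the two tosses are independent by construction.
pA = sum(p_coin[c] * p_heads[c] for c in p_coin)         # P(A) = P(B) = 0.625
pAB = sum(p_coin[c] * p_heads[c] ** 2 for c in p_coin)   # P(A and B)

print(pA * pA)   # 0.390625 = P(A)P(B)
print(pAB)       # 0.40625 != P(A)P(B): A and B are not unconditionally independent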

(Probability of the intersection of two events).

For any events A and B with positive probabilities, P(A ∩ B) = P(B)P(A|B) = P(A)P(B|A)
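A standard instance of this rule (an illustrative example, not from the notes above): drawing two cards without replacement, P(both aces) = P(first is an ace) P(second is an ace | first is an ace):

from fractions import Fraction

p_first = Fraction(4, 52)          # P(first card is an ace)
p_second_given = Fraction(3, 51)   # P(second is an ace | first was an ace)
print(p_first * p_second_given)    # 1/221 = P(both cards are aces)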

Bayes' Rule:

Bayes' theorem, also known as Bayes' rule or Bayes' law, is a fundamental concept in probability theory. It provides a way to update probabilities based on new evidence or information. The theorem is named after the Reverend Thomas Bayes, an 18th-century mathematician and statistician.

The formula for Bayes' theorem is expressed as follows:

P(A|B) = P(B|A) P(A) / P(B),

which follows directly from the definition of conditional probability,

P(A|B) = P(A ∩ B) / P(B),

together with P(A ∩ B) = P(B|A) P(A).

Here's the breakdown of the terms:

  • P(A|B): the probability of event A occurring given that event B has occurred. This is the posterior probability.
  • P(B|A): the probability of event B occurring given that event A has occurred. This is the likelihood.
  • P(A): the probability of event A occurring. This is the prior probability.
  • P(B): the probability of event B occurring. This is the marginal likelihood, or the total probability of event B.

In words, Bayes' theorem states that the probability of A given B is proportional to the probability of B given A, multiplied by the prior probability of A, and then normalized by the overall probability of B.

Bayes' theorem is particularly useful in Bayesian statistics and machine learning, where it is employed in various applications such as Bayesian inference, spam filtering, medical diagnosis, and more. It allows for the incorporation of new evidence to update prior beliefs and make more informed decisions.
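As a small worked update (hypothetical numbers chosen only for illustration): suppose a disease has 1% prevalence, and a test has 95% sensitivity and 90% specificity.

# Prior and likelihoods (hypothetical numbers for illustration)
p_disease = 0.01             # P(A): prior probability of disease
p_pos_given_disease = 0.95   # P(B|A): sensitivity
p_pos_given_healthy = 0.10   # P(B|A^c): 1 - specificity

# Marginal likelihood P(B) via the law of total probability
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior P(A|B) by Bayes' rule
posterior = p_pos_given_disease * p_disease / p_pos
print(round(posterior, 4))   # 0.0876: even after a positive test, under 9%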

(Odds).

The odds of an event A are

odds(A) = P(A)/P(A^c).

For example, if P(A) = 2/3, we say the odds in favor of A are 2 to 1. (This is sometimes written as 2 : 1, and is sometimes stated as 1 to 2 odds against A.)

Of course we can also convert from odds back to probability:

P(A) = odds(A)/(1 + odds(A))
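Both conversions in code (a minimal sketch of the two formulas above, using exact fractions):

from fractions import Fraction

def prob_to_odds(p):
    # odds(A) = P(A) / P(A^c)
    return p / (1 - p)

def odds_to_prob(odds):
    # P(A) = odds(A) / (1 + odds(A))
    return odds / (1 + odds)

print(prob_to_odds(Fraction(2, 3)))   # 2 -> odds of 2 to 1 in favor of A
print(odds_to_prob(Fraction(2)))      # 2/3 -> back to P(A)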

An application of Bayes' rule: a naive Bayes classifier with Laplace smoothing, trained on a small discrete dataset and used to classify a new point.

import numpy as np

def loaddata():
    # Toy training set: 15 samples, 2 discrete features; labels are +1 / -1
    X = np.array([[1, 'S'], [1, 'M'], [1, 'M'], [1, 'S'], [1, 'S'],
                  [2, 'S'], [2, 'M'], [2, 'M'], [2, 'L'], [2, 'L'],
                  [3, 'L'], [3, 'M'], [3, 'M'], [3, 'L'], [3, 'L']])
    y = np.array([-1, -1, 1, 1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, -1])
    return X, y

def Train(trainset, train_labels):
    m, n = trainset.shape
    prior_probability = {}        # key: class label, value: P(y = label)
    conditional_probability = {}  # key: "label,feature index,feature value", value: count
    labels = set(train_labels)

    # Class counts with Laplace smoothing (+1); divided by the total only at the end
    for label in labels:
        prior_probability[label] = len(train_labels[train_labels == label]) + 1

    # Count each (label, feature index, feature value) combination
    for i in range(m):
        for j in range(n):
            key = str(train_labels[i]) + ',' + str(j) + ',' + str(trainset[i][j])
            conditional_probability[key] = conditional_probability.get(key, 0) + 1

    # Smoothed conditional probabilities:
    # P(x_j = v | y = c) = (count + 1) / (count(y = c) + N_j),
    # where N_j is the number of distinct values feature j can take
    conditional_probability_final = {}
    for key, count in conditional_probability.items():
        label, j, _ = key.split(',')
        Nj = len(set(trainset[:, int(j)]))
        label_count = prior_probability[int(label)] - 1   # raw class count (undo the +1)
        conditional_probability_final[key] = (count + 1) / (label_count + Nj)

    # Smoothed priors: P(y = c) = (count(y = c) + 1) / (m + number of classes)
    for label in labels:
        prior_probability[label] = prior_probability[label] / (m + len(labels))

    return prior_probability, conditional_probability_final, labels

def predict(data):
    result = {}
    for label in train_labels_set:
        # Naive Bayes score: P(y = label) * prod_j P(x_j = data[j] | y = label)
        result[label] = prior_probability[label]
        for j in range(len(data)):
            key = str(label) + ',' + str(j) + ',' + str(data[j])
            result[label] *= conditional_probability_final.get(key, 1e-9)
    print('result =', result)
    # Return the label with the highest score
    return sorted(result.items(), key=lambda x: x[1], reverse=True)[0][0]

X, y = loaddata()
prior_probability, conditional_probability_final, train_labels_set = Train(X, y)
r_label = predict([2, 'S'])
print('r_label =', r_label)