A Brief Introduction to Probability DP

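Classic problem: A piece of software has s subsystems and can produce n types of bugs. Each day, one person finds exactly one bug; that bug belongs to one type and occurs in one subsystem. Find the expected number of days needed to find all n types of bugs and to find at least one bug in every subsystem. Note that the number of bugs is infinite, so each bug found lies in any given subsystem with probability 1/s and belongs to any given type with probability 1/n.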

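Solution:

dp[i][j] is the expected number of days still needed to reach the target state, given that i types of bugs have been found and they span j subsystems.
Clearly dp[n][s] = 0, because the target has already been reached, and dp[0][0] is the answer we want.
From state dp[i][j], the next bug leads to one of four states:
dp[i][j]: the bug belongs to one of the i known types and lies in one of the j known subsystems
dp[i+1][j]: the bug is a new type but lies in one of the j known subsystems
dp[i][j+1]: the bug belongs to one of the i known types but lies in a new subsystem
dp[i+1][j+1]: the bug is a new type and lies in a new subsystem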

The probabilities of these four transitions are, respectively:
p1 = i * j / (n * s)
p2 = (n-i) * j / (n * s)
p3 = i * (s-j) / (n * s)
p4 = (n-i) * (s-j) / (n * s)
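
As a quick sanity check, the four cases are exhaustive, so their probabilities should add up to 1. Here p_1 through p_4 are just p1 through p4 written in LaTeX notation:

```latex
\begin{aligned}
p_1 + p_2 + p_3 + p_4
  &= \frac{ij + (n-i)j + i(s-j) + (n-i)(s-j)}{ns} \\
  &= \frac{\bigl(i + (n-i)\bigr)\bigl(j + (s-j)\bigr)}{ns}
   = \frac{ns}{ns} = 1
\end{aligned}
```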

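We also use the fact that an expectation can be written as a weighted sum of sub-expectations, where each weight is the probability of the corresponding case, together with linearity: E(aA + bB + …) = aE(A) + bE(B) + …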

Therefore (the +1 accounts for the day just spent):
dp[i][j] = p1 * dp[i][j] + p2 * dp[i+1][j] + p3 * dp[i][j+1] + p4 * dp[i+1][j+1] + 1

Rearranging:
dp[i][j] = ( 1 + p2 * dp[i+1][j] + p3 * dp[i][j+1] + p4 * dp[i+1][j+1] ) / ( 1 - p1 )
         = ( n*s + (n-i)*j * dp[i+1][j] + i*(s-j) * dp[i][j+1] + (n-i)*(s-j) * dp[i+1][j+1] ) / ( n*s - i*j )
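
The recurrence above can be sketched in C++ as follows. This is only a minimal illustration, not the original author's code: the function name expectedDays is hypothetical, and the table is padded by one extra row and column (an implementation choice made here) so that dp[i+1][j] and dp[i][j+1] can be read at the boundary, where their coefficients are zero anyway.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: expected number of days to see all n bug types
// and at least one bug in each of the s subsystems.
double expectedDays(int n, int s) {
    assert(n >= 1 && s >= 1);
    // dp[i][j]: expected remaining days once i types and j subsystems are covered.
    // One extra row/column of zeros keeps the i+1 and j+1 reads in range.
    std::vector<std::vector<double>> dp(n + 2, std::vector<double>(s + 2, 0.0));
    const double total = 1.0 * n * s;
    for (int i = n; i >= 0; --i) {
        for (int j = s; j >= 0; --j) {
            if (i == n && j == s) continue;  // target state: dp[n][s] = 0
            dp[i][j] = (total
                        + 1.0 * (n - i) * j * dp[i + 1][j]
                        + 1.0 * i * (s - j) * dp[i][j + 1]
                        + 1.0 * (n - i) * (s - j) * dp[i + 1][j + 1])
                       / (total - 1.0 * i * j);
        }
    }
    return dp[0][0];
}
```

For example, expectedDays(1, 2) evaluates to 3: the first day always covers one subsystem, and hitting the other one then takes 2 days on average.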

Example 1:

Given a dice with n sides, you have to find the expected number of times you have to throw that dice to see all its faces at least once. Assume that the dice is fair, that means when you throw the dice, the probability of occurring any face is equal.
For example, for a fair two sided coin, the result is 3. Because when you first throw the coin, you will definitely see a new face. If you throw the coin again, the chance of getting the opposite side is 0.5, and the chance of getting the same side is 0.5. So, the result is
1 + (1 + 0.5 * (1 + 0.5 * …))
= 2 + 0.5 + 0.5^2 + 0.5^3 + …
= 2 + 1 = 3
Input
Input starts with an integer T (≤ 100), denoting the number of test cases.
Each case starts with a line containing an integer n (1 ≤ n ≤ 10^5).
Output
For each case, print the case number and the expected number of times you have to throw the dice to see all its faces at least once. Errors less than 10^-6 will be ignored.
Sample Input

5
1
2
3
6
100

Sample Output

Case 1: 1
Case 2: 3

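Applying the same idea:
Let dp[i] be the expected number of additional throws needed to see every face, given that i distinct faces have already appeared.
The next throw has two possible outcomes:
a face that has already appeared, with probability i/n, which leaves us in state dp[i];
a face that has not appeared yet, with probability (n-i)/n, which moves us to state dp[i+1].
So dp[i] = (n-i)/n * dp[i+1] + i/n * dp[i] + 1.
Simplifying:
dp[i] = dp[i+1] + n/(n-i)
The boundary is dp[n] = 0,
and the answer is dp[0].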
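
Unrolling dp[i] = dp[i+1] + n/(n-i) from dp[n] = 0 down to dp[0] gives a closed form. The harmonic number H_n is notation introduced here, not used in the original text:

```latex
dp[0] = \sum_{i=0}^{n-1} \frac{n}{n-i}
      = n \sum_{k=1}^{n} \frac{1}{k}
      = n H_n
```

For n = 2 this gives 2 * (1 + 1/2) = 3, which matches the coin example in the problem statement.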
Code:

#include <cstdio>
#include <iostream>
using namespace std;

int main() {
    int t, v = 1, n;
    cin >> t;
    while (t--) {
        cin >> n;
        double ans = 0;
        // With k faces already seen, the next new face takes n/(n-k) throws
        // on average; 1 + k/(n-k) is the same value.
        for (int k = 0; k < n; k++) {
            ans += 1 + 1.0 * k / (n - k);
        }
        printf("Case %d: %.8f\n", v++, ans);
    }
    return 0;
}

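There is also another way to solve this problem:
(the geometric distribution method)

(Recap of the geometric distribution: let event A have probability p, and let X be the number of trials until A first occurs. Then X has the distribution P(X = k) = (1-p)^(k-1) * p for k = 1, 2, 3, …
A random variable X with this distribution is said to follow the geometric distribution with parameter p, written X ~ Geo(p).
The expectation of a geometric distribution is E[X] = 1/p.)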
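
The identity E[X] = 1/p can be checked numerically by summing the series k * (1-p)^(k-1) * p directly. The sketch below is only an illustration: the function name geometricMean and the truncation parameter terms are hypothetical, and the sum is cut off after a finite number of terms, so the result is an approximation.

```cpp
#include <cassert>

// Hypothetical helper: truncated sum of k * (1-p)^(k-1) * p for k = 1..terms,
// which approaches the geometric mean 1/p as terms grows.
double geometricMean(double p, int terms) {
    assert(p > 0.0 && p <= 1.0 && terms >= 1);
    double sum = 0.0;
    double q = 1.0;  // holds (1-p)^(k-1) for the current k
    for (int k = 1; k <= terms; ++k) {
        sum += k * q * p;
        q *= 1.0 - p;
    }
    return sum;
}
```

With p = 0.5 and a few hundred terms the sum is numerically indistinguishable from 2, the expected number of coin tosses until the first head.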

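Let p_i be the probability that the next throw shows a new face once i distinct faces have been seen, and E[i] the expected number of throws until that happens:
p_0 = n/n = 1; E[0] = 1;
p_1 = (n-1)/n; E[1] = n/(n-1);
p_2 = (n-2)/n; E[2] = n/(n-2);
…

ans = E[0] + E[1] + E[2] + … + E[n-1];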

#include <cstdio>
#include <iostream>
using namespace std;

int n;
double a[100000 + 5];  // a[i]: expected throws still needed once i faces are seen

int main() {
    int T;
    cin >> T;
    for (int f = 1; f <= T; f++) {
        cin >> n;
        a[n] = 0.0;
        // Each step adds E[i] = n/(n-i), the mean of Geo((n-i)/n).
        for (int i = n - 1; i >= 0; i--) {
            a[i] = a[i + 1] + (double)n / (n - i);
        }
        printf("Case %d: %.6f\n", f, a[0]);
    }
    return 0;
}

Example 2:

You are in a cave, a long cave! The cave can be represented by a 1 x N grid. Each cell of the cave can contain any amount of gold.
Initially you are in position 1. Now each turn you throw a perfect 6 sided dice. If you get X in the dice after throwing, you add X to your position and collect all the gold from the new position. If your new position is outside the cave, then you keep throwing again until you get a suitable result. When you reach the Nth position you stop your journey. Now you are given the information about the cave, you have to find out the expected number of gold you can collect using the given procedure.
Input
Input starts with an integer T (≤ 100), denoting the number of test cases.
Each case contains a blank line and an integer N (1 ≤ N ≤ 100) denoting the dimension of the cave. The next line contains N space separated integers. The ith integer of this line denotes the amount of gold you will get if you come to the ith cell. You may safely assume that all the given integers will be non-negative and no integer will be greater than 1000.
Output
For each case, print the case number and the expected number of gold you will collect. Errors less than 10^-6 will be ignored.
Sample Input

3
1
101
2
10 3
3
3 6 9

Sample Output

Case 1: 101.0000000000
Case 2: 13.000
Case 3: 15

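In short: at every cell you throw the die to decide which cell to move to next; if the throw would take you past the end of the array, you throw again, and you stop once you reach cell N.
Let dp[i] be the expected amount of gold collected from cell i until the end of the journey. Then dp[i] = gold[i] + (dp[i+1] + … + dp[i+6]) / 6. (This differs from the problems above: every cell carries its own gold, so instead of adding 1 per step we add the gold of cell i.)
Initialize dp[i] to the gold in cell i.
When fewer than 6 cells remain after cell i, only the k = N - i valid throws count, each with probability 1/k, so the sum runs over dp[i+1] … dp[N] and is divided by k.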
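
To make the boundary handling concrete, here is the third sample case (N = 3, gold 3 6 9) worked by hand. It assumes dp[i] means the expected gold collected from cell i onward, and g_i (a symbol introduced here) is the gold in cell i:

```latex
\begin{aligned}
dp[3] &= g_3 = 9 \\
dp[2] &= g_2 + \frac{dp[3]}{1} = 6 + 9 = 15 \\
dp[1] &= g_1 + \frac{dp[2] + dp[3]}{2} = 3 + \frac{15 + 9}{2} = 15
\end{aligned}
```

The result dp[1] = 15 agrees with the sample output for Case 3.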
Code:

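#include <cstdio>
#include <cstring>
#include <iostream>
using namespace std;

int main() {
    int T;
    cin >> T;
    int o = 0;
    while (T--) {
        int n;
        cin >> n;
        int a[105];
        memset(a, 0, sizeof(a));
        // Store the cells in reverse, so a[0] is the last cell of the cave.
        for (int i = n - 1; i >= 0; i--) {
            cin >> a[i];
        }
        printf("Case %d: ", ++o);
        // dp[i]: expected gold collected starting from the cell i steps before the end.
        double dp[105];
        for (int i = 0; i < n; i++) {
            dp[i] = a[i];
            if (i >= 6) {
                for (int j = 1; j <= 6; j++) dp[i] += dp[i - j] / 6;
            } else {
                // Fewer than 6 cells remain: only i throws are valid.
                for (int j = 1; j <= i; j++) dp[i] += dp[i - j] / i;
            }
        }
        printf("%.7f\n", dp[n - 1]);
    }
    return 0;
}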

That's all for now <( ̄︶ ̄)↗[GO!]
