HDU_5781_ATM_Mechine (probability / expectation DP)

_OTTFF

Categories: HDU, DP, probability/expectation DP, classic problems

Article link: https://blog.csdn.net/baidu_29410909/article/details/52101945

ATM Mechine

Time Limit: 6000/3000 MS (Java/Others) Memory Limit: 65536/65536 K (Java/Others)
Total Submission(s): 512 Accepted Submission(s): 221

Problem Description

Alice is going to take all her savings out of the ATM (Automatic Teller Machine). Alice has forgotten how much she has deposited, and this strange ATM doesn't support balance queries. The only information Alice has about her deposit is that its upper bound is K RMB (that means Alice's deposit x is a random integer between 0 and K, inclusive).
Each time, Alice can try to take some amount y out of the ATM. If her deposit is not smaller than y, the ATM gives Alice y RMB immediately. But if her deposit is smaller than y, Alice receives a warning from the ATM.
If Alice has been warned more than W times, she will be taken away by the police as a thief.
Alice hopes to operate as few times as possible.
As Alice is clever enough, she always takes the best strategy.
Please calculate the expected number of operations Alice needs to take all her savings out of the ATM and go home without being taken away by the police.

Input

The input contains multiple test cases.
Each test case contains two numbers K and W.

1≤K,W≤2000

Output

For each test case output the answer, rounded to 6 decimal places.

Sample Input

Sample Output

Author

ZSTU

Source

2016 Multi-University Training Contest 5

Recommend

wange2014

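Problem summary

A very clever woman does not know exactly how much money is in her account; she only knows an upper bound K,

that is, the balance may be any integer in [0,K].

She always withdraws using the optimal strategy.

There is a maximum number w of failed attempts (warnings) allowed.

If the requested amount is no more than the remaining balance, the money is dispensed; otherwise nothing is dispensed and the ATM returns an error.

Given k and w, find the expected number of withdrawal attempts under the optimal strategy.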

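Solution approach

The statement takes some effort to understand.

Consider k=1, w=1.

The best move is clearly to request 1.

If the ATM reports an error, she knows she has 0.

Otherwise she had 1 and has now withdrawn it.

Either way it takes one operation, so the expectation is (1+1)/2=1.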
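
The same reasoning extends to w=1 with any k: with a single warning available, the only safe move is to request 1 each time, because a failed larger request would leave an unknown balance and no warnings to spare. A minimal sketch of that count follows; the helper name `expectedWithOneWarning` is hypothetical and not part of the original solution.

```cpp
#include <cassert>

// Expected number of operations when only one warning is allowed (w = 1)
// and the balance x is uniform over 0..K. Hypothetical helper for illustration.
double expectedWithOneWarning(int K) {
    assert(K >= 0);
    long long total = 0;
    for (int x = 0; x <= K; ++x) {
        // Requesting 1 at a time: x successes, then one failed request
        // reveals the account is empty. If x == K the upper bound is
        // reached, so no failed request is needed.
        total += (x < K) ? x + 1 : K;
    }
    return static_cast<double>(total) / (K + 1);
}
```

For K=1 this gives (1+1)/2=1, matching the example above; in general it equals (K(K+1)/2+K)/(K+1).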

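For larger k and w:

Let E[k][w] be the optimal expected number of operations when the balance is uniformly distributed over [0,k] and w more warnings are allowed.

The optimal move must be to request some amount between 1 and k.

Suppose she requests i.

The range [0,k] then splits into two cases.

If the balance is less than i (probability i/(k+1)), she gets a warning and the state becomes E[i-1][w-1], contributing E[i-1][w-1]*i/(k+1).

If the balance is at least i (probability (k-i+1)/(k+1)), i is withdrawn without using up a warning and the state becomes E[k-i][w], contributing E[k-i][w]*(k-i+1)/(k+1).

Adding these two terms plus 1 for the current operation gives the expectation for this choice of i.

Taking the minimum over all i gives E[k][w] under the optimal strategy.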
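
The recurrence can be sketched as a small table-filling function. This is an illustrative version, not the author's code: the name `solveAtm` is hypothetical, and instead of a closed-form base case it marks states with no warnings left and a nonzero range as unreachable using a large sentinel. That choice rests on the assumption that she cannot confirm the account is empty without risking a warning.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Optimal expected number of operations for upper bound K and W allowed
// warnings, filled straight from the recurrence. Hypothetical helper.
double solveAtm(int K, int W) {
    assert(K >= 0 && W >= 0);
    const double INF = 1e18;  // sentinel for "no safe strategy"
    // E[k][w]: balance uniform over [0,k], w warnings still allowed.
    std::vector<std::vector<double>> E(K + 1, std::vector<double>(W + 1, 0.0));
    for (int k = 1; k <= K; ++k) E[k][0] = INF;  // assumption: cannot finish safely
    for (int k = 1; k <= K; ++k) {
        for (int w = 1; w <= W; ++w) {
            double best = INF;
            for (int i = 1; i <= k; ++i) {
                // Fail (balance < i): range becomes [0,i-1], one warning used.
                double fail = E[i - 1][w - 1] * i / (k + 1);
                // Succeed (balance >= i): range becomes [0,k-i], warnings kept.
                double ok = E[k - i][w] * (k - i + 1) / (k + 1);
                best = std::min(best, fail + ok + 1.0);
            }
            E[k][w] = best;
        }
    }
    return E[K][W];
}
```

For example, `solveAtm(1, 1)` returns 1, matching the k=1, w=1 case worked out earlier.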

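The DP as stated costs O(k^2*w), which is O(n^3) when w can be as large as k.

But when w is large, she can simply binary search, so she never actually needs all w warnings.

log2(2000)<12, so capping w at 12 is enough; any larger w makes no difference.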
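
To make the bound concrete: a binary search over the K+1 possible balances needs at most ceil(log2(K+1)) requests, so it can fail at most that many times. The sketch below computes this count. The name `maxUsefulWarnings` is hypothetical, and the function only illustrates the counting argument; it is not taken from the original solution.

```cpp
#include <cassert>

// Smallest b with 2^b >= K+1, i.e. the most requests (and hence warnings)
// a binary search over the K+1 candidate balances can use. Hypothetical helper.
int maxUsefulWarnings(int K) {
    assert(K >= 0);
    int b = 0;
    long long reach = 1;  // number of candidates distinguishable with b requests
    while (reach < static_cast<long long>(K) + 1) {
        reach *= 2;
        ++b;
    }
    return b;
}
```

For K=2000 it returns 11, which is within the cap of 12 used in the code.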

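#include <cstdio>
#include <algorithm>
using namespace std;

const int M=2005;   // K is at most 2000
const int MAXW=12;  // more than 12 warnings never helps, see above
// dp[i][j]: optimal expected number of operations when the balance is
// uniform over [0,i] and j warnings are still allowed; dp[0][j]=0
double dp[M][MAXW+1];

void init()
{
    // j=1: the only safe move is to take 1 at a time, so balance x<i costs
    // x+1 operations and x=i costs i; sum-1 is the total over x=0..i
    int sum=1;
    for(int i=1;i<M;i++)
    {
        sum+=i+1;
        dp[i][1]=(double)(sum-1)/(i+1);
    }
    for(int i=1;i<M;i++)
    {
        for(int j=2;j<=MAXW;j++)
        {
            dp[i][j]=1e18;
            // try every amount k: success leaves [0,i-k] with j warnings,
            // failure leaves [0,k-1] with j-1 warnings
            for(int k=1;k<=i;k++)
                dp[i][j]=min(dp[i][j],dp[i-k][j]*(i-k+1)/(i+1)+dp[k-1][j-1]*k/(i+1)+1);
        }
    }
}

int main()
{
    int k,w;
    init();
    while(scanf("%d%d",&k,&w)==2)
    {
        w=min(MAXW,w);
        printf("%.6f\n",dp[k][w]);
    }
    return 0;
}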