time limit per test 2 seconds
memory limit per test 256 megabytes
input standard input
output standard output
You received as a gift a very clever robot walking on a rectangular board. Unfortunately, you understood that it is broken and behaves rather strangely (randomly). The board consists of N rows and M columns of cells. The robot is initially at some cell on the i-th row and the j-th column. Then at every step the robot could go to some other cell. The aim is to go to the bottommost (N-th) row. The robot can stay at its current cell, move to the left, move to the right, or move to the cell below the current one. If the robot is in the leftmost column it cannot move to the left, and if it is in the rightmost column it cannot move to the right. At every step all possible moves are equally probable. Return the expected number of steps to reach the bottommost row.
Input
On the first line you will be given two space-separated integers N and M (1 ≤ N, M ≤ 1000). On the second line you will be given another two space-separated integers i and j (1 ≤ i ≤ N, 1 ≤ j ≤ M): the number of the initial row and the number of the initial column. Note that (1, 1) is the upper left corner of the board and (N, M) is the bottom right corner.
Output
Output the expected number of steps on a line by itself with at least 4 digits after the decimal point.
input
10 10
10 4
output
0.0000000000
input
10 14
5 14
output
18.0038068653
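This is a probability DP problem. As a beginner in probability, I could not find any explanation of it online, so I would like to share my own take.
Problem summary: the robot walks from (x, y) to the bottom row, and we want the expected number of steps. At each step the robot randomly moves left, right or down, or stays where it is. It cannot cross the left or right boundary, and all available moves are equally likely.
First, for probability DP problems, I read a rule of thumb on kuanbin's blog: compute probabilities forward, compute expectations backward. This problem asks for an expectation, so we work backward.
So create a 2D array mapp, where mapp[i][j] stores the expected number of steps from row i, column j to the last row. Then fill it in from the last row upward, stopping once row x has been computed.
There is one big pitfall here: staying in place also counts as a step (=_=). My own understanding of "step" was a bit off.
Deriving the formula: the expectation at a fixed position is a fixed value, so let a be the expectation at mapp[i][j]. It is determined by the surrounding mapp values. Take a general interior cell as the example (the final else branch in the code):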
a=mapp[i+1][j]*1/4+mapp[i][j+1]*1/4+mapp[i][j-1]*1/4+a*1/4+1;
Rearranging gives a=(mapp[i+1][j]*1/4+mapp[i][j+1]*1/4+mapp[i][j-1]*1/4+1)/(1-1/4);
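The same rearrangement covers the boundary branches of the code, which the text above does not spell out. As a sketch, write E_{i,j} for mapp[i][j] (this shorthand is mine, not the original's) and a for the unknown E_{i,j}. A cell with k available moves, staying included, satisfies a = (sum of the k targets)/k + 1, which gives:

```latex
\begin{align*}
\text{interior } (1 < j < m):\quad
  & a = \tfrac14\bigl(a + E_{i+1,j} + E_{i,j-1} + E_{i,j+1}\bigr) + 1
  && \Rightarrow\; a = \frac{E_{i+1,j} + E_{i,j-1} + E_{i,j+1} + 4}{3} \\
\text{left edge } (j = 1,\ m > 1):\quad
  & a = \tfrac13\bigl(a + E_{i+1,1} + E_{i,2}\bigr) + 1
  && \Rightarrow\; a = \frac{E_{i+1,1} + E_{i,2} + 3}{2} \\
\text{right edge } (j = m,\ m > 1):\quad
  & a = \tfrac13\bigl(a + E_{i+1,m} + E_{i,m-1}\bigr) + 1
  && \Rightarrow\; a = \frac{E_{i+1,m} + E_{i,m-1} + 3}{2} \\
\text{single column } (m = 1):\quad
  & a = \tfrac12\bigl(a + E_{i+1,1}\bigr) + 1
  && \Rightarrow\; a = E_{i+1,1} + 2
\end{align*}
```

The last line also illustrates the pitfall about staying in place: with one column, the row just above the bottom has expectation 2 rather than 1, because half of the time the robot spends a step standing still.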
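The hardest part to understand is what the t loop does. Roughly speaking, it reduces the error (at t=1 the error is very large).
The t loop acts on a whole row, because a cell's value cannot be settled until its left and right neighbours are. For example, at t=1 the expectation computed for row i, column 1 only accounts for stepping straight down. When row i, column 2 is computed, column 1 has already been filled in, so column 2's value covers two possibilities, going straight down and going left and then down, which makes it more accurate than column 1.
Then at t=2 the value of column 1 in this row is updated, and it now covers three possibilities: down; right, then down; right, left, then down. And so on. The more steps the robot spends bouncing back and forth within a row, the smaller the probability of that path; bouncing for 100 steps has a probability of almost 0, so t=100 is already very accurate. At that point each column of the row holds the probability-weighted sum over all the ways the robot can reach the next row, which is by definition the expectation.
This was tiring to explain. As a beginner in probability DP, I also found it hard to understand while solving the problem.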
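To make the effect of the t loop concrete, here is a minimal sketch that applies the same per-cell update to a single row and counts how many sweeps it takes before no cell changes by more than a tolerance. The names sweepRow and sweepsUntilConverged, the 1-based vectors, and the tolerance are illustrative assumptions of mine and are not part of the original solution.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// One left-to-right sweep over a row, using the same update as the t loop.
// cur:  current estimates for row i   (1-based, index 0 unused)
// down: finished expectations for row i+1 (same layout)
// Returns the largest change applied to any cell during this sweep.
double sweepRow(std::vector<double>& cur, const std::vector<double>& down) {
    int m = static_cast<int>(cur.size()) - 1;
    assert(m >= 1 && down.size() == cur.size());
    double maxChange = 0.0;
    for (int j = 1; j <= m; ++j) {
        double next;
        if (m == 1)      next = down[j] + 2.0;                              // stay, down
        else if (j == 1) next = (down[j] + cur[j + 1] + 3.0) / 2.0;         // stay, right, down
        else if (j == m) next = (down[j] + cur[j - 1] + 3.0) / 2.0;         // stay, left, down
        else             next = (down[j] + cur[j - 1] + cur[j + 1] + 4.0) / 3.0;
        maxChange = std::max(maxChange, std::fabs(next - cur[j]));
        cur[j] = next;
    }
    return maxChange;
}

// Hypothetical helper: sweeps needed on the row just above the bottom
// (row below is all zeros) until the largest change drops to eps or less.
int sweepsUntilConverged(int m, double eps) {
    std::vector<double> cur(m + 1, 0.0), down(m + 1, 0.0);
    int sweeps = 0;
    double change;
    do {
        change = sweepRow(cur, down);
        ++sweeps;
    } while (change > eps && sweeps < 100000);  // cap is a safety net only
    return sweeps;
}
```

Calling sweepsUntilConverged(1000, 1e-9) gives a feel for how much slack the fixed 100 iterations leave. Each sweep shrinks the remaining error by a constant factor, so the count should come out well below 100 even for the widest board allowed.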
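#include<iostream>
#include<cstdio>
using namespace std;
// mapp[i][j]: expected number of steps from row i, column j to the bottom row.
// Globals are zero-initialised, so row n (the goal) is already 0.
double mapp[1005][1005];
int n,m,x,y;
int main()
{
    cin>>n>>m>>x>>y;
    // Work upward from the row just above the bottom, stopping at the start row x.
    for(int i=n-1;i>=x;i--)
    {
        // Sweep the row 100 times so that left/right neighbours settle.
        for(int t=1;t<=100;t++)
        {
            for(int j=1;j<=m;j++)
            {
                if(m==1) // single column: stay or go down
                    mapp[i][j]=(mapp[i+1][j]*1/2.0+1)*2;
                else
                {
                    if(j==1) // leftmost column: stay, right, down
                        mapp[i][j]=(mapp[i+1][j]*1/3.0+mapp[i][j+1]*1/3.0+1)/(1-1/3.0);
                    else if(j==m) // rightmost column: stay, left, down
                        mapp[i][j]=(mapp[i+1][j]*1/3.0+mapp[i][j-1]*1/3.0+1)/(1-1/3.0);
                    else // interior cell: stay, left, right, down
                        mapp[i][j]=(mapp[i+1][j]*1/4.0+mapp[i][j+1]*1/4.0+mapp[i][j-1]*1/4.0+1)/(1-1/4.0);
                }
            }
        }
    }
    // If x==n the loop never runs and the answer is 0.
    printf("%.6f",mapp[x][y]);
    return 0;
}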