Post-contest upsolving: 2022 CCPC Mianyang A. Ban or Pick, What's the Trick

yingjiayu12

Article link: https://blog.csdn.net/yingjiayu12/article/details/128667504

Problem link: CF

Problem statement:

Bobo has recently learned how to play Dota2. In Dota2 competitions, the mechanism of banning/picking 
heroes is introduced, modified and simplified as follows for the sake of the problem:
Suppose a game is played between two teams: Team A and Team B. Each team has a hero pool of $n$ 
heroes with positive utility scores $a_1,\dots,a_n$ and $b_1,\dots,b_n$, respectively. Here we assume all heroes in two 
teams' hero pool are distinct.
The two teams then perform ban/pick operations alternately, with Team A going first. In one team's turn, it 
can either pick a hero for itself, or ban an unselected hero from the opponent's hero pool.
After 2n turns, all heroes are either picked or banned. Each team then needs to choose at most k heroes 
from all heroes it picked to form a warband and the score for the warband is calculated as the sum of 
utility scores over all heroes in it.
Let $s_A, s_B$ be the score of the warband formed by Team A and Team B, respectively. Team A wants to 
maximize the value of $s_A - s_B$ while Team B wants to minimize it.
Bobo wants to know, what should be the final value of $s_A - s_B$, if both teams act optimally? He's not really 
good at calculating this, so he turned to you for help.
Input:
2 1
3 6
2 4
Output:
2

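This is a game-theory DP problem. During the contest I derived the DP state and worked out the transitions, but I did not know that a forward game-theory DP has aftereffects and has to be computed backward, so in the virtual participation (VP) I wrote a forward DP, and then... I think the main issue is that I had never taken game theory seriously, so I had done almost no problems of this kind. That blank spot is why I failed to even recognize this as a game-theory DP, so I am upsolving a few such problems right after the contest.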

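Looking at the constraints, $n$ is on the order of 1e5, while $k$ is at most 10, so the solution must depend on $k$. That led me to DP during the contest. The state is not hard to come up with: $dp[i][j][j_2]$ records the optimal result after the first $i$ turns (each turn is either a ban or a pick), with A having picked $j$ heroes and B having picked $j_2$ heroes. After writing it, though, it turns out not to work: the way earlier choices were made affects later decisions, so the correct approach is to compute backward.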

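We redefine the state: $dp[i][j][j_2]$ is now the optimal result over the remaining $2n - i$ turns, given that after the first $i$ turns A has picked $j$ heroes and B has picked $j_2$ heroes. The state now describes the future of the game rather than its past. With this definition, later decisions no longer influence earlier ones, which satisfies the no-aftereffect property that DP requires, and the transitions are not hard to write.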
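The backward definition is ordinary minimax: the value of a position depends only on what can still happen. As a sanity check for tiny inputs, here is a rough brute-force sketch that tries every legal ban/pick at every turn. The function name `bruteForce`, the bitmask representation, and the size limit are my own assumptions for illustration; they are not part of the original solution, and the search is exponential, so it is only useful for checking small cases.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <functional>
#include <vector>
using namespace std;

// Hypothetical checker: exhaustive minimax over all ban/pick sequences.
// Bit i of maskA / maskB is set while hero i of that team is still available.
long long bruteForce(const vector<int>& a, const vector<int>& b, int k) {
    int n = (int)a.size();
    assert((int)b.size() == n && n <= 8); // exponential search, tiny n only
    vector<int> pickedA, pickedB;
    // Score of a warband: sum of the k largest picked heroes.
    auto topK = [&](vector<int> v) {
        sort(v.begin(), v.end(), greater<int>());
        long long s = 0;
        for (int i = 0; i < (int)v.size() && i < k; i++) s += v[i];
        return s;
    };
    function<long long(int, int, int)> go = [&](int turn, int maskA, int maskB) -> long long {
        if (turn == 2 * n) return topK(pickedA) - topK(pickedB);
        bool aTurn = (turn % 2 == 0); // Team A moves first
        long long best = aTurn ? LLONG_MIN : LLONG_MAX;
        auto upd = [&](long long v) { best = aTurn ? max(best, v) : min(best, v); };
        for (int i = 0; i < n; i++) {
            int own = aTurn ? maskA : maskB;
            int opp = aTurn ? maskB : maskA;
            if ((own >> i) & 1) { // pick own hero i
                if (aTurn) {
                    pickedA.push_back(a[i]);
                    upd(go(turn + 1, maskA ^ (1 << i), maskB));
                    pickedA.pop_back();
                } else {
                    pickedB.push_back(b[i]);
                    upd(go(turn + 1, maskA, maskB ^ (1 << i)));
                    pickedB.pop_back();
                }
            }
            if ((opp >> i) & 1) { // ban the opponent's hero i
                if (aTurn) upd(go(turn + 1, maskA, maskB ^ (1 << i)));
                else       upd(go(turn + 1, maskA ^ (1 << i), maskB));
            }
        }
        return best;
    };
    return go(0, (1 << n) - 1, (1 << n) - 1);
}
```

On the sample (`a = {3, 6}`, `b = {2, 4}`, `k = 1`) this returns 2: A picks 6, and B's best reply is to pick 4.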

For the current state $dp[i][numa][numb]$, if it is A's turn to move, there are two cases:
1. A picks a hero. Then $dp[i][numa][numb] = dp[i+1][numa+1][numb] + a[nowa]$
Here $nowa$ is the index of the largest hero in A's pool that has been neither banned nor picked; it is easy to compute.
2. A bans a hero. Then $dp[i][numa][numb] = dp[i+1][numa][numb]$
Since A wants the value to be as large as possible, we take the max of the two.
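To make the computation of $nowa$ concrete, here is a small sketch. It relies on the assumption, also built into the solution code, that with both pools sorted in descending order every pick takes the largest remaining hero of one's own pool and every ban removes the largest remaining hero of the opponent's pool. The helper name `nextIndexA` is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper. step is 1-indexed and A moves on odd steps.
// Returns the 1-based index (in descending order) of the best hero
// still available to A at the start of A's turn `step`.
int nextIndexA(int step, int numa, int numb) {
    assert(step % 2 == 1);        // must be A's turn
    int movesB = (step - 1) / 2;  // turns B has already taken
    int bansB = movesB - numb;    // B's turns that were not picks were bans on A's pool
    return bansB + numa + 1;      // skip banned and already-picked heroes
}
```

For example, at step 3 with `numa = 1` and `numb = 0`, B has banned one of A's heroes and A has picked one, so the next candidate is the third-largest hero.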

If it is B's turn to move, there are also two cases:
1. B picks a hero. Then $dp[i][numa][numb] = dp[i+1][numa][numb+1] - b[nowb]$
Here $nowb$ is defined in the same way as $nowa$.
2. B bans a hero. Then $dp[i][numa][numb] = dp[i+1][numa][numb]$
Since B wants the value to be as small as possible, we take the min of the two.
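The two sets of transitions can also be evaluated bottom-up, from the last turn back to the first. Since layer $i$ only reads layer $i+1$, two $(k+1) \times (k+1)$ layers are enough. The following is a minimal sketch of that idea, not the code from this post: the name `solveIterative`, the 0-based indexing, and the rolling layers are my own choices. States that cannot actually occur are still filled in, but reachable states never read them.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>
using namespace std;

// Hypothetical iterative variant of the backward DP with two rolling layers.
long long solveIterative(vector<long long> a, vector<long long> b, int k) {
    int n = (int)a.size();
    assert((int)b.size() == n && k >= 1);
    sort(a.begin(), a.end(), greater<long long>());
    sort(b.begin(), b.end(), greater<long long>());
    // nxt[x][y]: value from step+1 onward with x picks by A and y picks by B.
    // The layer after the last turn (step 2n+1) is all zeros.
    vector<vector<long long>> nxt(k + 2, vector<long long>(k + 2, 0)), cur = nxt;
    for (int step = 2 * n; step >= 1; step--) {
        for (int x = 0; x <= k; x++) {
            for (int y = 0; y <= k; y++) {
                long long best = nxt[x][y]; // the mover bans
                if (step & 1) {
                    // A's turn: 0-based index of A's best remaining hero
                    int idx = (step - 1) / 2 - y + x;
                    if (idx >= 0 && idx < n && x < k)
                        best = max(best, nxt[x + 1][y] + a[idx]);
                } else {
                    // B's turn: 0-based index of B's best remaining hero
                    int idx = step / 2 - x + y;
                    if (idx >= 0 && idx < n && y < k)
                        best = min(best, nxt[x][y + 1] - b[idx]);
                }
                cur[x][y] = best;
            }
        }
        swap(cur, nxt);
    }
    return nxt[0][0];
}
```

Compared with a full `dp[2n][k][k]` table this keeps memory at $O(k^2)$ and avoids deep recursion, at the cost of visiting some unreachable states.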

For this kind of backward DP, memoized search is often more comfortable to write than an explicit iterative DP.

The code is as follows:

#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
// Fast input reader for (possibly negative) integers
inline ll read() {
	ll x=0,w=1;char ch=getchar();
	for(;ch>'9'||ch<'0';ch=getchar()) if(ch=='-') w=-1;
	for(;ch>='0'&&ch<='9';ch=getchar()) x=x*10+ch-'0';
	return x*w;
}
#define maxn 1000000
int n,k;
// Utility scores of both teams, 1-indexed and sorted in descending order
int a[maxn];int b[maxn];
bool cmp(int aa,int bb) {
	return aa>bb;
}
// dp[step][numa][numb]: value of sA-sB gained from turn `step` onward under optimal play,
// given that A has already picked numa heroes and B has picked numb heroes.
// vis marks the states that have already been computed.
int dp[200010][12][12];int vis[200010][12][12];
int solve(int step,int numa,int numb) {
	if(step>2*n) return 0; // all 2n turns have been played
	if(vis[step][numa][numb]) return dp[step][numa][numb];
	vis[step][numa][numb]=1;
	if(step&1) {
		// A's turn: B has made (step-1)/2 moves, of which (step-1)/2-numb were bans on A's pool
		int nowa=((step-1)/2-numb)+numa+1;
		dp[step][numa][numb]=solve(step+1,numa,numb); // A bans
		if(nowa<=n&&numa<k) // A picks its best remaining hero
			dp[step][numa][numb]=max(dp[step][numa][numb],solve(step+1,numa+1,numb)+a[nowa]);
		return dp[step][numa][numb];
	}else {
		// B's turn: A has made step/2 moves, of which step/2-numa were bans on B's pool
		int nowb=(step/2-numa)+numb+1;
		dp[step][numa][numb]=solve(step+1,numa,numb); // B bans
		if(nowb<=n&&numb<k) // B picks its best remaining hero
			dp[step][numa][numb]=min(dp[step][numa][numb],solve(step+1,numa,numb+1)-b[nowb]);
		return dp[step][numa][numb];
	}
}
int main() {
	n=read();k=read();
	for(int i=1;i<=n;i++) a[i]=read();
	for(int i=1;i<=n;i++) b[i]=read();
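	// Sort both pools in descending order so that index 1 is the strongest hero
	sort(a+1,a+n+1,cmp);
	sort(b+1,b+n+1,cmp);
	// Turn 1 is A's first move; nobody has picked anything yet
	solve(1,0,0);
	cout<<dp[1][0][0]<<endl;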
	return 0;
}