Post-contest upsolve: 2022 CCPC Mianyang A. Ban or Pick, What's the Trick

Link: CF

Problem statement:

Bobo has recently learned how to play Dota2. In Dota2 competitions, the mechanism of banning/picking 
heroes is introduced, modified and simplified as follows for the sake of the problem:
Suppose a game is played between two teams: Team A and Team B. Each team has a hero pool of n 
heroes with positive utility scores a1,…,an and b1,…,bn, respectively. Here we assume all heroes in two 
teams' hero pool are distinct.
The two teams then perform ban/pick operations alternately, with Team A going first. In one team's turn, it 
can either pick a hero for itself, or ban an unselected hero from the opponent's hero pool.
After 2n turns, all heroes are either picked or banned. Each team then needs to choose at most k heroes 
from all heroes it picked to form a warband and the score for the warband is calculated as the sum of 
utility scores over all heroes in it.
Let sA,sB be the score of the warband formed by Team A and Team B, respectively. Team A wants to 
maximize the value of sA−sB while Team B wants to minimize it.
Bobo wants to know, what should be the final value of sA−sB, if both teams act optimally? He's not really 
good at calculating this, so he turned to you for help.
Input:
2 1
3 6
2 4
Output:
2

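This is a game-theory DP problem. During the contest I worked out the DP state and even the transitions, but I did not know that a forward DP over a game has aftereffects and that the recurrence has to be evaluated backward, so during the VP I wrote a forward DP, and then... I think the real issue is that I never took game theory seriously and have solved almost no problems of this kind. It is a blank spot for me, which is why I failed to even recognize this as a game DP. Time to upsolve a few of them.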

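Looking at the constraints, $n$ is on the order of $10^5$ while $k$ is at most 10, so the solution has to depend on $k$. During the contest I thought of DP, and the state is not hard to find: let $dp[i][j][j2]$ record the optimal result after the first $i$ turns (each turn is either a ban or a pick), where A has picked $j$ heroes and B has picked $j2$ heroes. After writing it, though, you find that this does not work, because the choices made earlier affect the decisions made later. The correct approach is to compute the DP backward.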

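So we change the meaning of the state: $dp[i][j][j2]$ now denotes the optimal result over all ways of playing the remaining $2n-i$ turns, given that after the first $i$ turns A has picked $j$ heroes and B has picked $j2$ heroes. The state now describes the future instead of the past. With this definition, later choices no longer affect earlier ones, so the DP has no aftereffect, and the transitions are not hard to write.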

For the current state $dp[i][numa][numb]$ (in the code, $i$ is the turn about to be played), if it is A's turn there are two cases:
1. A picks a hero: $dp[i][numa][numb] = dp[i+1][numa+1][numb] + a[nowa]$
Here $nowa$ is the index of the largest hero in A's pool that has been neither banned nor picked. The solution relies on picks and bans always taking the best remaining hero of a pool, so with the pool sorted in descending order $nowa$ is just the number of A's heroes already picked or banned, plus one.
2. A bans a hero: $dp[i][numa][numb] = dp[i+1][numa][numb]$
Since A wants the value to be as large as possible, we take the max of the two.

If it is B's turn, there are also two cases:
1. B picks a hero: $dp[i][numa][numb] = dp[i+1][numa][numb+1] - b[nowb]$
Here $nowb$ is defined in the same way as $nowa$.
2. B bans a hero: $dp[i][numa][numb] = dp[i+1][numa][numb]$
Since B wants the value to be as small as possible, we take the min of the two.
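
The index arithmetic behind $nowa$ and $nowb$ can be sketched as two small functions. The names `nextIndexA` and `nextIndexB` are hypothetical and only mirror the expressions used in the full code; they assume 1-based turns with A moving on odd turns, 1-based pools sorted in descending order, and that every turn that was not a pick was a ban.

```cpp
// Hypothetical helpers (not from the original post) that mirror the
// index expressions used inside solve().
// Turns are 1-based: A moves on odd turns, B on even turns.
// Both return a 1-based index into a pool sorted in descending order.

// Called on A's turn `step`: B has had (step-1)/2 turns so far.
int nextIndexA(int step, int numa, int numb) {
    int bansByB = (step - 1) / 2 - numb;  // B's past turns that were not picks
    return numa + bansByB + 1;            // skip A's picked and banned heroes
}

// Called on B's turn `step`: A has had step/2 turns so far.
int nextIndexB(int step, int numa, int numb) {
    int bansByA = step / 2 - numa;        // A's past turns that were not picks
    return numb + bansByA + 1;            // skip B's picked and banned heroes
}
```

If the returned index is larger than $n$, the pool is exhausted and only the ban transition remains, which is what the `nowa<=n` and `nowb<=n` checks in the code guard against.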

For this kind of backward DP, memoized search is often more comfortable to write than an explicit table-filling DP.
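
For comparison, here is a rough sketch of what the explicit backward loop could look like. It is not the code from this post: the function name `banPickBottomUp`, the 0-based vectors, and the two rolling $(k+1)\times(k+1)$ layers are my own assumptions. It fills every $(numa, numb)$ pair at each turn, including pairs that can never occur, so the index is range-checked before use.

```cpp
#include <algorithm>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical bottom-up variant of the same recurrence (not the post's code).
// a and b are the utility scores of the two pools; returns sA - sB.
long long banPickBottomUp(std::vector<long long> a, std::vector<long long> b, int k) {
    const int n = static_cast<int>(a.size());
    std::sort(a.begin(), a.end(), std::greater<long long>());
    std::sort(b.begin(), b.end(), std::greater<long long>());

    // next[x][y]: value of the game from turn step+1 on, with x picks by A
    // and y picks by B. After the last turn everything is 0.
    std::vector<std::vector<long long>> next(k + 1, std::vector<long long>(k + 1, 0));
    std::vector<std::vector<long long>> cur = next;

    for (int step = 2 * n; step >= 1; --step) {
        for (int x = 0; x <= k; ++x) {
            for (int y = 0; y <= k; ++y) {
                long long best = next[x][y];  // ban: pick counts stay the same
                if (step & 1) {
                    // A's turn; 0-based index of A's best remaining hero.
                    int idx = (step - 1) / 2 - y + x;
                    if (x < k && idx >= 0 && idx < n)
                        best = std::max(best, next[x + 1][y] + a[idx]);
                } else {
                    // B's turn; 0-based index of B's best remaining hero.
                    int idx = step / 2 - x + y;
                    if (y < k && idx >= 0 && idx < n)
                        best = std::min(best, next[x][y + 1] - b[idx]);
                }
                cur[x][y] = best;
            }
        }
        std::swap(cur, next);  // the layer just filled becomes "step+1"
    }
    return next[0][0];
}
```

Calling `banPickBottomUp({3, 6}, {2, 4}, 1)` reproduces the sample. Keeping only two layers avoids the large `dp` and `vis` arrays, at the cost of also evaluating unreachable states whose values are never read by reachable ones.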


The full code follows:

#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
#define root 1,n,1
#define lson l,mid,rt<<1
#define rson mid+1,r,rt<<1|1
inline ll read() {
	ll x=0,w=1;char ch=getchar();
	for(;ch>'9'||ch<'0';ch=getchar()) if(ch=='-') w=-1;
	for(;ch>='0'&&ch<='9';ch=getchar()) x=x*10+ch-'0';
	return x*w;
}
#define maxn 1000000
const double eps=1e-8;
#define	int_INF 0x3f3f3f3f
#define ll_INF 0x3f3f3f3f3f3f3f3f
int n,k;
int a[maxn];int b[maxn];
bool cmp(int aa,int bb) {
	return aa>bb;
}
int dp[200010][12][12];int vis[200010][12][12];
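// solve(step,numa,numb): value of sA-sB gained from turn `step` to the end,
// given that A has already picked numa heroes and B has picked numb heroes.
int solve(int step,int numa,int numb) {
	if(step>2*n) return 0;
	if(vis[step][numa][numb]) return dp[step][numa][numb];
	vis[step][numa][numb]=1;
	if(step&1) {
		// A's turn: B has banned (step-1)/2-numb of A's heroes so far
		int nowa=((step-1)/2-numb)+numa+1;
		// option 1: ban
		dp[step][numa][numb]=solve(step+1,numa,numb);
		// option 2: pick A's best remaining hero, if any is left and numa<k
		if(nowa<=n&&numa<k) 
			dp[step][numa][numb]=max(dp[step][numa][numb],solve(step+1,numa+1,numb)+a[nowa]);
		return dp[step][numa][numb];
	}else {
		// B's turn: A has banned step/2-numa of B's heroes so far
		int nowb=(step/2-numa)+numb+1;
		// option 1: ban
		dp[step][numa][numb]=solve(step+1,numa,numb);
		// option 2: pick B's best remaining hero, if any is left and numb<k
		if(nowb<=n&&numb<k) 
			dp[step][numa][numb]=min(dp[step][numa][numb],solve(step+1,numa,numb+1)-b[nowb]);
		return dp[step][numa][numb];
	}
}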
int main() {
	n=read();k=read();
	for(int i=1;i<=n;i++) a[i]=read();
	for(int i=1;i<=n;i++) b[i]=read();
	sort(a+1,a+n+1,cmp);
	sort(b+1,b+n+1,cmp);
	solve(1,0,0);
	cout<<dp[1][0][0]<<endl;
	return 0;
}
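
When debugging a game DP like this one, a brute-force minimax over every legal move sequence is a handy cross-check on tiny inputs. The sketch below is my own addition, not part of the solution above; `bruteTopK`, `brutePlay`, and `banPickBrute` are hypothetical names, turns are 0-based here, and the running time is exponential, so it is only meant for $n$ up to about 4.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical brute-force checker (not from the original post).

// Sum of the k largest values in v (all of them if v has fewer than k).
long long bruteTopK(std::vector<long long> v, int k) {
    std::sort(v.begin(), v.end(), std::greater<long long>());
    long long s = 0;
    for (int i = 0; i < static_cast<int>(v.size()) && i < k; ++i) s += v[i];
    return s;
}

// Minimax over all moves. usedA/usedB are bitmasks of heroes already
// picked or banned; pickA/pickB hold the scores each team has picked.
long long brutePlay(int turn, unsigned usedA, unsigned usedB,
                    std::vector<long long>& pickA, std::vector<long long>& pickB,
                    const std::vector<long long>& a,
                    const std::vector<long long>& b, int k) {
    const int n = static_cast<int>(a.size());
    if (turn == 2 * n) return bruteTopK(pickA, k) - bruteTopK(pickB, k);
    const bool aMoves = (turn % 2 == 0);  // A moves on turns 0, 2, 4, ...
    long long best = 0;
    bool first = true;
    auto consider = [&](long long v) {
        if (first) { best = v; first = false; }
        else best = aMoves ? std::max(best, v) : std::min(best, v);
    };
    for (int i = 0; i < n; ++i) {
        if (!((usedA >> i) & 1u)) {
            // Hero i of A is free: A may pick it, B may ban it.
            if (aMoves) pickA.push_back(a[i]);
            consider(brutePlay(turn + 1, usedA | (1u << i), usedB, pickA, pickB, a, b, k));
            if (aMoves) pickA.pop_back();
        }
        if (!((usedB >> i) & 1u)) {
            // Hero i of B is free: B may pick it, A may ban it.
            if (!aMoves) pickB.push_back(b[i]);
            consider(brutePlay(turn + 1, usedA, usedB | (1u << i), pickA, pickB, a, b, k));
            if (!aMoves) pickB.pop_back();
        }
    }
    return best;
}

long long banPickBrute(const std::vector<long long>& a,
                       const std::vector<long long>& b, int k) {
    std::vector<long long> pickA, pickB;
    return brutePlay(0, 0u, 0u, pickA, pickB, a, b, k);
}
```

Running it on random small cases and comparing against the DP output is a cheap way to catch an off-by-one in the `nowa` / `nowb` indices.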