Also known as Adversarial Search Problems
Deterministic zero-sum games
Components
- Initial state
- Players
- Actions
- Transition model
- Terminal test
- Terminal values (utility)
- State Value
The state value is the best possible outcome (utility) the agent can achieve from that state.
$\forall$ non-terminal states, $V(s) = \max_{s' \in \text{successors}(s)} V(s')$
$\forall$ terminal states, $V(s) = \text{known}$
Minimax
$\forall$ agent-controlled states, $V(s) = \max_{s' \in \text{successors}(s)} V(s')$
$\forall$ opponent-controlled states, $V(s) = \min_{s' \in \text{successors}(s)} V(s')$
$\forall$ terminal states, $V(s) = \text{known}$
In implementation, minimax behaves like a postorder depth-first traversal: each node's value is computed only after all of its children's values are known.
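The postorder-DFS behavior can be sketched as follows. This is a minimal sketch, not a full implementation: the game is represented by hypothetical helpers (`successors`, `is_terminal`, `utility`, `agent_to_move`) defined here over a toy tree.

```python
def minimax(state):
    """Return the minimax value of `state` via postorder DFS."""
    if is_terminal(state):
        return utility(state)
    # Children are fully evaluated before this node's value is chosen.
    values = [minimax(s) for s in successors(state)]
    return max(values) if agent_to_move(state) else min(values)

# Toy game tree: root A is agent-controlled (max); B, C are
# opponent-controlled (min); lowercase nodes are terminal.
TREE = {'A': ['B', 'C'], 'B': ['b1', 'b2'], 'C': ['c1', 'c2']}
UTIL = {'b1': 3, 'b2': 12, 'c1': 2, 'c2': 8}

def successors(s): return TREE.get(s, [])
def is_terminal(s): return s in UTIL
def utility(s): return UTIL[s]
def agent_to_move(s): return s == 'A'

root_value = minimax('A')  # B -> min(3, 12) = 3; C -> min(2, 8) = 2; A -> max(3, 2) = 3
```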
Alpha-Beta Pruning
Let $x$ be the value of a node being looked up. A branch is explored only while its value can still satisfy $\alpha \le x \le \beta$; branches outside this window are pruned.
Alpha-beta pruning still needs to reach the bottom of the tree along the unpruned paths.
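A sketch of the pruning logic, under the same assumptions as before (hypothetical `successors`/`is_terminal`/`utility`/`agent_to_move` helpers over a toy tree). It returns the same value as plain minimax but skips branches that fall outside the $[\alpha, \beta]$ window.

```python
def alphabeta(state, alpha=float('-inf'), beta=float('inf')):
    """Minimax value of `state` with alpha-beta pruning."""
    if is_terminal(state):
        return utility(state)
    if agent_to_move(state):
        v = float('-inf')
        for s in successors(state):
            v = max(v, alphabeta(s, alpha, beta))
            if v >= beta:           # opponent would never let play reach here
                return v
            alpha = max(alpha, v)
        return v
    else:
        v = float('inf')
        for s in successors(state):
            v = min(v, alphabeta(s, alpha, beta))
            if v <= alpha:          # agent already has a better option elsewhere
                return v
            beta = min(beta, v)
        return v

# Same toy tree as the minimax sketch; at node C, c2 is pruned
# because c1's value (2) is already below alpha (3).
TREE = {'A': ['B', 'C'], 'B': ['b1', 'b2'], 'C': ['c1', 'c2']}
UTIL = {'b1': 3, 'b2': 12, 'c1': 2, 'c2': 8}

def successors(s): return TREE.get(s, [])
def is_terminal(s): return s in UTIL
def utility(s): return UTIL[s]
def agent_to_move(s): return s == 'A'

root_value = alphabeta('A')  # same answer as minimax: 3
```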
Evaluation Functions
An evaluation function takes a state and outputs an estimate of that state's value.
Most common form:
$Eval(s) = \sum_i w_i f_i(s)$
where $w_i$ are the weights and $f_i$ are the features.
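A minimal sketch of a linear evaluation function. The two features and their weights are made up for illustration; a real evaluator would use game-specific features.

```python
def evaluate(state, weights, features):
    """Linear evaluation: Eval(s) = sum_i w_i * f_i(s)."""
    return sum(w * f(state) for w, f in zip(weights, features))

# Hypothetical features for an abstract board game.
features = [
    lambda s: s['my_pieces'] - s['opp_pieces'],  # material advantage
    lambda s: s['mobility'],                     # number of legal moves
]
weights = [1.0, 0.5]

state = {'my_pieces': 5, 'opp_pieces': 3, 'mobility': 4}
score = evaluate(state, weights, features)  # (5 - 3) * 1.0 + 4 * 0.5 = 3.0... wait: 2.0 + 2.0 = 4.0
```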
Expectimax
Chance nodes:
Instead of considering the worst case, as minimizer nodes do, chance nodes consider the average (expected) case.
$\forall$ agent-controlled states, $V(s) = \max_{s' \in \text{successors}(s)} V(s')$
$\forall$ chance states, $V(s) = \sum_{s' \in \text{successors}(s)} P(s' \mid s)\, V(s')$
$\forall$ terminal states, $V(s) = \text{known}$
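The recursion above can be sketched like this. As before, `successors`, `is_terminal`, `utility`, `agent_to_move`, and the transition probabilities `prob` are hypothetical helpers defined over a toy tree.

```python
def expectimax(state):
    """Expectimax value: max at agent nodes, expectation at chance nodes."""
    if is_terminal(state):
        return utility(state)
    if agent_to_move(state):
        return max(expectimax(s) for s in successors(state))
    # Chance node: successor values weighted by P(s' | s).
    return sum(prob(state, s) * expectimax(s) for s in successors(state))

# Toy tree: A is agent-controlled; B and C are chance nodes.
TREE = {'A': ['B', 'C'], 'B': ['b1', 'b2'], 'C': ['c1']}
UTIL = {'b1': 0, 'b2': 10, 'c1': 4}
PROB = {('B', 'b1'): 0.5, ('B', 'b2'): 0.5, ('C', 'c1'): 1.0}

def successors(s): return TREE.get(s, [])
def is_terminal(s): return s in UTIL
def utility(s): return UTIL[s]
def agent_to_move(s): return s == 'A'
def prob(s, sp): return PROB[(s, sp)]

value = expectimax('A')  # B -> 0.5*0 + 0.5*10 = 5.0; C -> 4.0; A -> max = 5.0
```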
Mixed Layers Types
Many Players
Monte Carlo Tree Search
For games with a large branching factor
- Evaluation by rollouts
- Selective search
UCB Algorithm:
$UCB1(n) = \frac{U(n)}{N(n)} + C \times \sqrt{\frac{\log N(\text{parent}(n))}{N(n)}}$
As $N(n) \to +\infty$, MCTS approaches minimax.
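The UCB1 formula can be sketched directly; here $U(n)$ is the node's total utility, $N(n)$ its visit count, and the exploration constant `C = 1.41` (roughly $\sqrt{2}$) is an assumed default.

```python
import math

def ucb1(total_utility, visits, parent_visits, C=1.41):
    """UCB1 score: exploitation (average utility) + exploration bonus."""
    if visits == 0:
        return float('inf')  # unvisited children are always tried first
    exploitation = total_utility / visits
    exploration = C * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration
```

During selection, MCTS descends from the root by repeatedly picking the child with the highest UCB1 score; the exploration term shrinks as a node is visited more, shifting effort toward promising branches.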
General Games
Multi-agent utilities: use a tuple to represent each player's utility value.
Summary
- Minimax: when opponents behave optimally
- Expectimax: when opponents behave sub-optimally (randomly)
- Monte Carlo Tree Search: when a large branching factor
- General game: using tuples.