Artificial Intelligence Planning (AIP) and PDDL: exam revision notes

Latest recommended article published on 2024-07-05 01:12:23

WeChat official account: [机器学习炼丹术]

Article link: https://blog.csdn.net/qq_34107425/article/details/103884037

Contents

1 classical planning
2 the big three planning approaches
3 heuristic 5 families
4 temporal planning
5 heuristic guidance (211)
6 preferences

1 classical planning

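kinds of planning

Planning problems are classified along several dimensions. Classical planning is the setting that is:

offline (as opposed to online)
discrete
deterministic
single-agent (the multi-agent case leads to game-theoretic planning)
fully observable
sequential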

Domain-Independent Automated Planning = GPS

GPS = general problem solving
Create one planning algorithm that performs sufficiently well on many application domains.

plan validation

Use VAL, the plan validator, to verify independently that a plan is correct.

What is a plan

A plan is a sequence of actions that leads from the initial state to a state satisfying the goal.

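Optimal and suboptimal plans

Some planning problems are too hard to solve optimally, which gives rise to the distinction between Optimal Classical Planning and Satisficing Classical Planning: optimal planning must return a plan of minimum cost, while satisficing planning accepts any valid plan, preferably a cheap one.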

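1.1 Generations of planning formalisms

transition systems

$< S, s_0, S_*, A, cost, T >$
$< \text{states}, \text{initial state}, \text{goal states}, \text{actions}, \text{cost}, \text{transition relation} >$
It quickly became clear that transition systems do not scale. The problem is that $S$ lists every state explicitly, and the number of states is far too large.

The next generation is therefore STRIPS: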

STRIPS

$< V, I, G, A >$
$< \text{state variables}, \text{initial state}, \text{goal}, \text{actions} >$

This is much better: states are described by variables only, so $n$ variables can represent $2^n$ states, assuming all variables are Boolean.
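
To make the STRIPS tuple concrete, here is a minimal Python sketch, assuming a state is the set of Boolean variables that are currently true and each action has precondition, add and delete sets. The toy domain (`load`, `drive`, `unload`) and all names are made up for illustration and do not come from the course. The `validate` function only simulates a plan step by step; it illustrates the idea of plan validation and is not how VAL is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset      # facts that must be true before the action
    add: frozenset      # facts the action makes true
    delete: frozenset   # facts the action makes false


def applicable(state, action):
    return action.pre <= state


def apply(state, action):
    return (state - action.delete) | action.add


def validate(init, goal, plan):
    """Simulate the plan: every step must be applicable and the goal must hold at the end."""
    state = frozenset(init)
    for action in plan:
        if not applicable(state, action):
            return False
        state = apply(state, action)
    return frozenset(goal) <= state


# Hypothetical toy domain: move a package from A to B with one truck.
load = Action("load", frozenset({"pkg-at-A", "truck-at-A"}),
              frozenset({"pkg-in-truck"}), frozenset({"pkg-at-A"}))
drive = Action("drive-A-B", frozenset({"truck-at-A"}),
               frozenset({"truck-at-B"}), frozenset({"truck-at-A"}))
unload = Action("unload", frozenset({"pkg-in-truck", "truck-at-B"}),
                frozenset({"pkg-at-B"}), frozenset({"pkg-in-truck"}))

init = {"pkg-at-A", "truck-at-A"}
goal = {"pkg-at-B"}
print(validate(init, goal, [load, drive, unload]))  # a valid plan
print(validate(init, goal, [drive, load, unload]))  # load is no longer applicable
```

With the three Boolean facts about the package and the two about the truck, the five variables describe up to $2^5$ states without any of them being listed explicitly.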

SAS+

SAS+ allows non-binary (finite-domain) state variables.

PDDL

PDDL combines STRIPS and ADL.
ADL: Action Description Language. It adds logical constructs from first-order logic to STRIPS.
PDDL describes the world in a schematic way. This makes the encoding much smaller and easier to write.
grounding: the planner translates the schematic input into STRIPS in a preprocessing step.
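
Grounding can be sketched in a few lines of Python, assuming an untyped action schema whose atoms are tuples such as `("at", "?t", "?from")`. The function `ground`, the `drive` schema and the object names are hypothetical. Real planners also use types and reachability analysis to avoid generating useless ground actions, which this sketch does not attempt.

```python
from itertools import product


def ground(name, params, pre, add, delete, objects):
    """Instantiate one action schema with every combination of objects."""
    def substitute(atoms, binding):
        # Replace each variable by its bound object; constants stay as they are.
        return frozenset(
            (a[0],) + tuple(binding.get(x, x) for x in a[1:]) for a in atoms
        )

    grounded = []
    for values in product(objects, repeat=len(params)):
        binding = dict(zip(params, values))
        grounded.append({
            "name": (name,) + values,
            "pre": substitute(pre, binding),
            "add": substitute(add, binding),
            "delete": substitute(delete, binding),
        })
    return grounded


# Hypothetical schema: drive(?t, ?from, ?to)
drive = ground(
    "drive", ["?t", "?from", "?to"],
    pre=[("at", "?t", "?from")],
    add=[("at", "?t", "?to")],
    delete=[("at", "?t", "?from")],
    objects=["truck", "A", "B"],
)
print(len(drive))  # one ground action per binding of the three parameters
```

One schema with three parameters over three untyped objects already yields $3^3$ ground actions, which is why the schematic encoding is so much smaller than the grounded one.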

The following should all be ADL features:

disjunctions
conjunctions
conditions (conditional effects)
negation

PDDL2.1 hierarchy (5 levels)

Level 1, STRIPS/ADL: conjunction, disjunction, negation, conditions
Level 2, numeric: +, -, /, *, increase, decrease
Level 3, durative actions: at start, at end, over all
Level 4, continuous change
Level 5, processes and events

With numeric fluents and durative actions, we can use more complex metric functions to measure plan quality.

PDDL2.2

derived predicates
timed initial literals

PDDL3

hard constraints
preferences
state trajectory constraints

PDDL+

temporal numeric change
processes
events

2 the big three planning approaches

2.1 Graph/SAT planning

planning graph

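odd layers are "state levels"
even layers are "action levels"
$S_i$, $A_i$: only one action can be executed at a time, but every possible action must still be drawn in the action level.

When drawing the graph, the same element of a layer can be used several times (by several actions) without side effects. This only holds within a single layer.
Next, we compute the mutexes.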

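mutex (new topic)

mutex = mutually exclusive
interference (two actions, effect vs. precondition): one action deletes a precondition of the other
competing needs (two actions, precondition vs. precondition): a precondition of one action is mutex with a precondition of the other
inconsistent support (two facts): all ways of creating them are pairwise mutex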
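
The three mutex rules listed above can be sketched as plain set tests. This is a minimal illustration, assuming actions carry `pre`, `add` and `delete` sets and that mutex pairs from the previous level are stored as two-element frozensets. The cake actions are a made-up example, and further mutex rules that planning graphs may use are left out.

```python
from collections import namedtuple

Action = namedtuple("Action", "name pre add delete")


def interference(a, b):
    """One action deletes a precondition of the other."""
    return bool(a.delete & b.pre) or bool(b.delete & a.pre)


def competing_needs(a, b, fact_mutex):
    """A precondition of a is mutex with a precondition of b in the previous state level."""
    return any(frozenset((p, q)) in fact_mutex for p in a.pre for q in b.pre)


def actions_mutex(a, b, fact_mutex):
    return interference(a, b) or competing_needs(a, b, fact_mutex)


def inconsistent_support(p, q, actions, action_mutex):
    """Facts p and q are mutex if every pair of their achievers is mutex."""
    achievers_p = [a for a in actions if p in a.add]
    achievers_q = [a for a in actions if q in a.add]
    return all(
        a is not b and frozenset((a.name, b.name)) in action_mutex
        for a in achievers_p for b in achievers_q
    )


# Hypothetical example: eating the cake vs. keeping it (a no-op).
eat = Action("eat", frozenset({"have-cake"}), frozenset({"eaten"}), frozenset({"have-cake"}))
keep = Action("keep", frozenset({"have-cake"}), frozenset({"have-cake"}), frozenset())

print(interference(eat, keep))
print(inconsistent_support("eaten", "have-cake", [eat, keep], {frozenset(("eat", "keep"))}))
```

Here `eat` deletes `have-cake`, which `keep` needs, so the two actions interfere. Since they are the only achievers of `eaten` and `have-cake`, those two facts are mutex in the next state level.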

PG algorithm

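Grow the PG until all goals are reachable and no two of them are pairwise mutex.
Search the PG for a valid plan; if none can be found, add a level and search again.
The search chains backwards: at each level, achieve all goals of that level.
Select a non-mutex subset of actions that achieves the goals, and make their preconditions the goals of the previous level.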

SAT planning

I do not understand this part yet (76, 77, 78).

2.2 symbolic search planning

BDDs (not understood: 87-90)

binary decision diagrams
exponentially large state sets -> polynomially sized BDDs

ordering

Different variable orderings lead to BDDs of very different sizes.
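
The effect of the variable ordering can be checked on a small example. The sketch below is an assumption-laden toy, not a real BDD library: it builds the reduced ordered BDD by brute-force enumeration of all assignments, which only works for a handful of variables, and counts the internal nodes. The function $(x_1 \wedge y_1) \vee (x_2 \wedge y_2) \vee (x_3 \wedge y_3)$ is a standard textbook example chosen here for illustration.

```python
def robdd_size(f, order):
    """Number of internal nodes of the reduced ordered BDD of f for the given variable order."""
    unique = {}  # (var, low, high) -> node id, so that equal subgraphs are shared

    def build(i, assignment):
        if i == len(order):
            return 1 if f(assignment) else 0      # terminal nodes 0 and 1
        var = order[i]
        low = build(i + 1, {**assignment, var: False})
        high = build(i + 1, {**assignment, var: True})
        if low == high:                            # the test on var is redundant
            return low
        key = (var, low, high)
        if key not in unique:
            unique[key] = len(unique) + 2          # ids 0 and 1 are the terminals
        return unique[key]

    build(0, {})
    return len(unique)


def f(a):
    return (a["x1"] and a["y1"]) or (a["x2"] and a["y2"]) or (a["x3"] and a["y3"])


interleaved = ["x1", "y1", "x2", "y2", "x3", "y3"]
separated = ["x1", "x2", "x3", "y1", "y2", "y3"]
print(robdd_size(f, interleaved), robdd_size(f, separated))
```

With the interleaved order the BDD needs 6 internal nodes, while the separated order needs 14, because the BDD has to remember the values of all the x variables before it sees the first y variable.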

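reduction

Reduction merges isomorphic subgraphs and removes every node whose two successors are identical.

Symbolic planning

Symbolic search works on whole sets of states, each set represented as a BDD, instead of on individual states.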

2.3 Heuristic State-Space Search

search space

The state space is a directed graph that contains cycles, so the question is how to avoid going round in circles.

forward search

Forward search starts from the initial state and, with the help of a closed list, builds a tree: there is only one path into each node and there are no backward edges.
closed list: states already dealt with
open list: states still to deal with
breadth-first and depth-first
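
The open list and closed list can be sketched as follows. This is a minimal illustration, assuming states are hashable and that a hypothetical `successors` function returns `(action, next_state)` pairs. Taking states from the front of the open list gives breadth-first search; taking them from the back gives depth-first search.

```python
from collections import deque


def forward_search(init, is_goal, successors, depth_first=False):
    open_list = deque([(init, [])])   # states still to deal with, with the path that reached them
    closed = set()                    # states already dealt with
    while open_list:
        state, path = open_list.pop() if depth_first else open_list.popleft()
        if state in closed:
            continue                  # reached before by another path: do not expand again
        closed.add(state)
        if is_goal(state):
            return path
        for action, nxt in successors(state):
            if nxt not in closed:
                open_list.append((nxt, path + [action]))
    return None                       # open list exhausted: no plan exists


# Hypothetical state space with a cycle between A and B.
graph = {
    "A": [("a-b", "B")],
    "B": [("b-a", "A"), ("b-c", "C")],
    "C": [],
}
print(forward_search("A", lambda s: s == "C", graph.__getitem__))
print(forward_search("A", lambda s: s == "D", graph.__getitem__))
```

The second call terminates with `None` even though A and B form a cycle, because the closed list prevents any state from being expanded twice.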

Heuristic search planning

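Note the meanings of h and g:
h is the estimated cost from a state to the goal, so it gets smaller as the state gets closer to the goal; g is the cost from the initial state, so it gets larger as the state gets farther from the initial state.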

properties of heuristics

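admissibility: $h(s)$ is a lower bound on the cost of every plan that reaches the goal from $s$, i.e. $h$ never overestimates the cost of reaching the goal.
consistency: $h(s) \le cost(a) + h(s')$ for every transition from $s$ to $s'$ by an action $a$.
additivity: a set of heuristics is additive if their sum is still admissible.

Dijkstra and A* (look them up, e.g. on Baidu)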
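
As a reminder of how the two algorithms relate, here is a minimal A* sketch, assuming a hypothetical `successors` function that returns `(action, next_state, cost)` triples. States are expanded in order of f = g + h; with h = 0 everywhere the same code behaves like Dijkstra's algorithm. The small graph and the heuristic values are made up, and the heuristic was chosen to be admissible and consistent.

```python
import heapq
from itertools import count


def astar(init, is_goal, successors, h):
    tie = count()                                  # tie-breaker so states are never compared
    open_list = [(h(init), next(tie), 0, init, [])]
    best_g = {init: 0}
    while open_list:
        f, _, g, state, path = heapq.heappop(open_list)
        if g > best_g.get(state, float("inf")):
            continue                               # stale entry: a cheaper path was found later
        if is_goal(state):
            return path, g
        for action, nxt, cost in successors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(open_list, (new_g + h(nxt), next(tie), new_g, nxt, path + [action]))
    return None, float("inf")


# Hypothetical weighted graph and heuristic values.
graph = {
    "S": [("S-A", "A", 1), ("S-B", "B", 4)],
    "A": [("A-B", "B", 2), ("A-G", "G", 5)],
    "B": [("B-G", "G", 1)],
    "G": [],
}
h_values = {"S": 3, "A": 3, "B": 1, "G": 0}

print(astar("S", lambda s: s == "G", graph.__getitem__, h_values.__getitem__))
print(astar("S", lambda s: s == "G", graph.__getitem__, lambda s: 0))  # Dijkstra
```

Both calls find the plan S-A, A-B, B-G with cost 4; the heuristic only changes how many states are expanded on the way.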

3 heuristic 5 families

3.1 delete relaxation

Delete relaxation ignores the delete effects of all actions, so a fact stays true once it has been achieved.

3.1.1 RPG

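RPG: relaxed planning graph
Once the goals have been reached in the graph, work backwards: the preconditions of each goal's achiever are added to the goals of the previous layer. This gives the goals of every layer, which makes the search better informed.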
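
The forward and backward passes over the RPG can be sketched as below. This is a simplified illustration, assuming actions are `(name, preconditions, add effects)` triples with delete effects already dropped; the number of actions selected in the backward pass serves as the heuristic value. The toy domain is hypothetical, and the achiever selection is deliberately naive compared with what real planners do.

```python
def relaxed_plan_length(init, goal, actions):
    """Heuristic value from a relaxed planning graph; None if the goal is unreachable."""
    # Forward pass: first layer at which every fact and action appears.
    fact_level = {f: 0 for f in init}
    action_level = {}
    level = 0
    while not all(g in fact_level for g in goal):
        new_facts = {}
        for name, pre, add in actions:
            if name not in action_level and all(p in fact_level for p in pre):
                action_level[name] = level
                for f in add:
                    if f not in fact_level:
                        new_facts.setdefault(f, level + 1)
        if not new_facts:
            return None                     # fixpoint reached without the goals
        fact_level.update(new_facts)
        level += 1

    # Backward pass: preconditions of each chosen achiever become goals of earlier layers.
    goals_at = {i: set() for i in range(level + 1)}
    for g in goal:
        goals_at[fact_level[g]].add(g)
    chosen = set()
    for i in range(level, 0, -1):
        for g in goals_at[i]:
            name, pre, _ = next(a for a in actions
                                if g in a[2] and action_level.get(a[0]) == i - 1)
            if name not in chosen:
                chosen.add(name)
                for p in pre:
                    goals_at[fact_level[p]].add(p)
    return len(chosen)


# Hypothetical toy domain with delete effects omitted.
actions = [
    ("load", {"pkg-at-A", "truck-at-A"}, {"pkg-in-truck"}),
    ("drive", {"truck-at-A"}, {"truck-at-B"}),
    ("unload", {"pkg-in-truck", "truck-at-B"}, {"pkg-at-B"}),
]
print(relaxed_plan_length({"pkg-at-A", "truck-at-A"}, {"pkg-at-B"}, actions))
```

For this problem the relaxed plan contains load, drive and unload, so the heuristic value is 3.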

3.2 abstraction

abstraction heuristic

The abstraction heuristic of a state $s$ is the minimum cost from $\alpha(s)$ to the set of abstract goal states in $S^\alpha$, where $\alpha$ is the abstraction function.

abstraction projection

A projection is an abstraction that keeps only a subset of the state variables and ignores all the others.

pattern database PDB

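The pattern is the set of variables we choose to keep, e.g. package and truckA in the course example (figure not reproduced here).
The optimal distance from every abstract state to an abstract goal state is stored in the database.
PDBs are expensive in time and memory, so we have to choose good patterns.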
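
A pattern database can be sketched as a projection followed by a backward breadth-first search. The code below is a minimal illustration under several assumptions: finite-domain variables, unit action costs, and actions given as hypothetical `(precondition, effect)` dictionaries. The package and truck domain is made up to echo the variable names used above.

```python
from collections import deque
from itertools import product


def build_pdb(domains, pattern, actions, goal):
    """Map every abstract state (a tuple over the pattern) to its abstract goal distance."""
    def project(assignment):
        return {v: x for v, x in assignment.items() if v in pattern}

    def key(state):
        return tuple(state[v] for v in pattern)

    def holds(condition, state):
        return all(state[v] == x for v, x in condition.items())

    abstract_actions = [(project(pre), project(eff)) for pre, eff in actions]
    abstract_goal = project(goal)
    states = [dict(zip(pattern, values))
              for values in product(*(domains[v] for v in pattern))]

    # Abstract transitions, stored backwards for the regression search.
    predecessors = {key(s): [] for s in states}
    for s in states:
        for pre, eff in abstract_actions:
            if holds(pre, s):
                predecessors[key({**s, **eff})].append(key(s))

    # Breadth-first search backwards from all abstract goal states (unit costs).
    pdb = {key(s): 0 for s in states if holds(abstract_goal, s)}
    queue = deque(pdb)
    while queue:
        t = queue.popleft()
        for s in predecessors[t]:
            if s not in pdb:
                pdb[s] = pdb[t] + 1
                queue.append(s)
    return pdb


# Hypothetical domain: the package is at A, at B, or in the truck (T).
domains = {"package": ["A", "B", "T"], "truck": ["A", "B"]}
actions = [
    ({"package": "A", "truck": "A"}, {"package": "T"}),   # load at A
    ({"package": "B", "truck": "B"}, {"package": "T"}),   # load at B
    ({"package": "T", "truck": "A"}, {"package": "A"}),   # unload at A
    ({"package": "T", "truck": "B"}, {"package": "B"}),   # unload at B
    ({"truck": "A"}, {"truck": "B"}),                     # drive A to B
    ({"truck": "B"}, {"truck": "A"}),                     # drive B to A
]
goal = {"package": "B"}

print(build_pdb(domains, ["package"], actions, goal))
print(build_pdb(domains, ["package", "truck"], actions, goal)[("A", "A")])
```

With the pattern `["package"]` the abstract distance from "package at A" is 2, while the true distance with the truck also at A is 3. The projection forgets where the truck is, so the estimate is lower but still admissible.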

additive PDB

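The values of several PDBs can be added up admissibly when no action affects variables of more than one pattern.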

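SCP (not understood)

A cost partition splits one cost function into several, such that for every action the sum of the sub-functions is no greater than the original cost.
Zero-one cost partitioning is a special case: in <c1,c2,…,cn>, each function returns the original cost for the particular actions assigned to it and 0 for all other actions.
SCP = Saturated Cost Partitioning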
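
The definition of a cost partition, and the zero-one special case, can be checked mechanically. This is a small sketch assuming non-negative action costs stored in dictionaries; the function names, the action costs and the grouping of actions are hypothetical, and saturated cost partitioning itself is not implemented here.

```python
def is_cost_partition(cost, parts):
    """True if, for every action, the partitioned costs sum to at most the original cost."""
    return all(sum(p.get(a, 0) for p in parts) <= c for a, c in cost.items())


def zero_one_partition(cost, groups):
    """One cost function per group: full cost for the group's own actions, 0 for all others.

    Assumes that no action appears in more than one group.
    """
    return [{a: (c if a in group else 0) for a, c in cost.items()} for group in groups]


# Hypothetical action costs and grouping.
cost = {"load": 1, "drive": 3, "unload": 1}
parts = zero_one_partition(cost, [{"load", "unload"}, {"drive"}])

print(parts)
print(is_cost_partition(cost, parts))          # a valid partition
print(is_cost_partition(cost, [cost, cost]))   # every action is counted twice
```

Because each action's cost is handed out at most once, admissible heuristics computed under the separate cost functions can be summed and the sum stays admissible.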

3.3 critical paths (not a focus)

3.4 landmarks

Landmarks are used for cost estimation. A landmark is a fact that must be true at some point in every plan.

optimal cost partitioning (LP) (not examined)

linear program

3.5 network flows (not a focus)

4 temporal planning

at start; at end; over all

4.1 snap action

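What if the goal is achieved while a durative action is still executing?
What if an over all condition is violated during execution?
Answer: use imply.
temporally sound and logically sound
Logically sound means that the plan is logically valid when time is ignored; temporally sound means that the temporal constraints (e.g. deadlines) are respected as well.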