CS 188 (4) Uniform Cost Search( 统一代价搜索算法)

    本文要实现 Uniform Cost Search( 统一代价搜索算法) ,首先搜索总成本最小的节点,  统一代价搜索算法搜索到达目标。

    PriorityQueue实现一个优先级队列的数据结构,每个插入元素具有与之关联的优先级,client快速检索队列中优先级最低的元素,以 O(1) 时间复杂度访问最低优先级的元素。   

class PriorityQueue:
    """
      Implements a priority queue data structure. Each inserted item
      has a priority associated with it and the client is usually interested
      in quick retrieval of the lowest-priority item in the queue. This
      data structure allows O(1) access to the lowest-priority item.
    """
    def  __init__(self):
        self.heap = []
        self.count = 0


    def push(self, item, priority):
        entry = (priority, self.count, item)
        heapq.heappush(self.heap, entry)
        self.count += 1


    def pop(self):
        (_, _, item) = heapq.heappop(self.heap)
        return item


    def isEmpty(self):
        return len(self.heap) == 0


    def update(self, item, priority):
        # If item already in priority queue with higher priority, update its priority and rebuild the heap.
        # If item already in priority queue with equal or lower priority, do nothing.
        # If item not in priority queue, do the same thing as self.push.
        for index, (p, c, i) in enumerate(self.heap):
            if i == item:
                if p <= priority:
                    break
                del self.heap[index]
                self.heap.append((priority, c, item))
                heapq.heapify(self.heap)
                break
        else:
            self.push(item, priority)


   Heap queue(heapq 堆队列)中堆是数组,对于所有k,a[k]<=a[2*k+1]和a[k]<=a[2*k+2],从0开始计算元素。为了比较,不存在的元素被认为是无限的,堆a[0]始终是其最小元素。

     Heap queue 堆队列的用法:

heap = []            # 新建一个空堆
heappush(heap, item) # 将一个新的元素压入堆
item = heappop(heap) # 从堆中取出最小的元素
item = heap[0]       #最小是元素是heap【0】,直接获取
heapify(x)           #将列表按线性时间转换为堆
item = heapreplace(heap, item) # 弹出并返回最小的元素,并添加新的元素;堆大小不变

统一代价搜索算法代码:

# search.py
# ---------
# Licensing Information:  You are free to use or extend these projects for
# educational purposes provided that (1) you do not distribute or publish
# solutions, (2) you retain this notice, and (3) you provide clear
# attribution to UC Berkeley, including a link to http://ai.berkeley.edu.
# 
# Attribution Information: The Pacman AI projects were developed at UC Berkeley.
# The core projects and autograders were primarily created by John DeNero
# (denero@cs.berkeley.edu) and Dan Klein (klein@cs.berkeley.edu).
# Student side autograding was added by Brad Miller, Nick Hay, and
# Pieter Abbeel (pabbeel@cs.berkeley.edu).
def uniformCostSearch(problem):
    """Search the node of least total cost first."""
    "*** YOUR CODE HERE ***"
    #util.raiseNotDefined()
    path =Path([problem.getStartState()],[],0)
    if problem.isGoalState(problem.getStartState()):
        return path.directions
    
    #创建优先队列
    queue = util.PriorityQueue()
    # push的第一个参数path是状态项Path,属性(位置、方向、成本)
    #push的第二个参数0是优先级
    #在heapq.heappush中将(优先级priority, 计数索引self.count, 状态项item)三元组
    #根据优先级priority, 计数索引self.count的组合优先级
    #即(优先级priority如一样,按计数索引判断优先级),将状态项item压入队列。
    queue.push(path,0) 
    visited =[problem.getStartState()]
    
    #和广度优先搜索的区别,仅在于queue.push的不同。
    while not queue.isEmpty():
        #如队列不为空,取最先进入队列的元素(List的最后一个元素),获取当前路径
        currentPath = queue.pop()
        currentLocation = currentPath.locations[-1]
         #如果当前位置已经是终点的位置,则返回当前路径的方向列表,用于移动pac man。
        if problem.isGoalState(currentLocation):
            return currentPath.directions
        else:
            #在搜索问题中取得当前位置后继的下一个状态.getSuccessors中for循环遍历北、南、东、西四个方向,
            #directionToVector取得方向到坐标偏移向量的转换值,在当前坐标上加上位移的坐标偏移量值,
            #如果下一步坐标移动的点不是围墙,则在后续状态列表中加入三元组( nextState, action, cost)
            nextSteps = problem.getSuccessors(currentLocation)
            for nextStep in nextSteps:
                #遍历下一步的状态,依次获得位置、方向、成本信息
                nextLocation =nextStep[0]
                nextDirection = nextStep[1]
                nextCost = nextStep[2]
                # 不在当前路径里面而且下一个位置还没被访问(多条路径交叉点)
                if (nextLocation not in currentPath.locations) and (nextLocation not in visited):
                    if not problem.isGoalState(nextLocation):
                        visited.append(nextLocation)
                        print("访问的位置:", visited)
                    #获取当前路径列表集
                    nextLocations =currentPath.locations[:]
                    #将新的位置加入到当前路径的列表里面
                    nextLocations.append(nextLocation)
                    print("当前位置:",currentLocation)
                    print("当前位置下一步可能的移动位置:",nextLocation)
                    print("加到当前位置列表集:",nextLocations)
                    #print()
                    #print()
                    #print(currentLocation,nextLocation,nextLocations)
                    #获取当前的方向集
                    nextDirections = currentPath.directions[:]
                    #将新的方向加入到当前方向集的列表里面
                    nextDirections.append(nextDirection)
                    nextCosts = currentPath.cost +nextCost
                    nextPath =Path(nextLocations,nextDirections,nextCosts)
                    #下一步的状态,入优先级别的队列
                    # push的第一个参数nextPath是状态项Path,属性(位置、方向、成本)
                    #push的第二个参数nextCosts是优先级
                    #在heapq.heappush中将(优先级priority, 计数索引self.count, 状态项item)三元组
                    #根据优先级priority, 计数索引self.count的组合优先级
                    #即(优先级priority如一样,按计数索引判断优先级),将状态项item压入队列。
                    print("优先级:",nextCosts)
                    print()
                    print()
                    queue.push(nextPath, nextCosts)


    #队列为空,仍未到达终点,返回空集
    return []
  

虽然BFS会找到一条通向目标的最少行动路径,但我们可能希望找到其他意义上“最好”的路径。考虑一下mediumDottedMaze 迷宫和mediumScaryMaze迷宫。通过改变成本函数,我们可以鼓励Pacman找到不同的路径。例如,我们可以对在幽灵出入地区的危险步骤收取更高的费用,或者对食物丰富地区的步骤收取更少的费用,而一个理性的Pacman代理应该调整它的行为来做出反应。在search.py中的uniformCostSearch函数中实现了统一成本图搜索算法。可查看util.py,了解一些在实现中可能有用的数据结构。

        观察以下三种布局中的成功行为,其中下面的代理使用不同的成本函数(代理和成本函数已编写),StayEastSearchAgent的成本函数是costFn = lambda pos: .5 ** pos[0];StayWestSearchAgent的成本函数是costFn = lambda pos: 2 ** pos[0],StayEastSearchAgent and StayWestSearchAgen的路径成本应该分别非常低和非常高。

python pacman.py -l mediumMaze -p SearchAgent -a fn=ucs
python pacman.py -l mediumDottedMaze -p StayEastSearchAgent
python pacman.py -l mediumScaryMaze -p StayWestSearchAgent

运行结果分别如下:

E:\2019reforce_learning\cs188\proj1-search-python3>python pacman.py -l mediumMaze -p SearchAgent -a fn=ucs
[SearchAgent] using function ucs
[SearchAgent] using problem type PositionSearchProblem
Path found with total cost of 68 in 0.3 seconds
Search nodes expanded: 269
Pacman emerges victorious! Score: 442
Average Score: 442.0
Scores:        442.0
Win Rate:      1/1 (1.00)
Record:        Win


E:\2019reforce_learning\cs188\proj1-search-python3>python pacman.py -l mediumDottedMaze -p StayEastSearchAgent
Path found with total cost of 1 in 0.2 seconds
Search nodes expanded: 186
Pacman emerges victorious! Score: 646
Average Score: 646.0
Scores:        646.0
Win Rate:      1/1 (1.00)
Record:        Win


E:\2019reforce_learning\cs188\proj1-search-python3>python pacman.py -l mediumScaryMaze -p StayWestSearchAgent
Path found with total cost of 68719479864 in 0.2 seconds
Search nodes expanded: 108
Pacman emerges victorious! Score: 418
Average Score: 418.0
Scores:        418.0
Win Rate:      1/1 (1.00)
Record:        Win

如使用mediumMaze 布局:

E:\2019reforce_learning\cs188\proj1-search-python3>python pacman.py -l mediumMaze -p SearchAgent -a fn=ucs
[SearchAgent] using function ucs
[SearchAgent] using problem type PositionSearchProblem
Path found with total cost of 68 in 0.3 seconds
Search nodes expanded: 269
Pacman emerges victorious! Score: 442
Average Score: 442.0
Scores:        442.0
Win Rate:      1/1 (1.00)
Record:        Win


E:\2019reforce_learning\cs188\proj1-search-python3>python pacman.py -l mediumMaze -p StayEastSearchAgent
Path found with total cost of 1 in 0.2 seconds
Search nodes expanded: 260
Pacman emerges victorious! Score: 436
Average Score: 436.0
Scores:        436.0
Win Rate:      1/1 (1.00)
Record:        Win


E:\2019reforce_learning\cs188\proj1-search-python3>python pacman.py -l mediumMaze -p StayWestSearchAgent
Path found with total cost of 17183280440 in 0.2 seconds
Search nodes expanded: 173
Pacman emerges victorious! Score: 358
Average Score: 358.0
Scores:        358.0
Win Rate:      1/1 (1.00)
Record:        Win


欢迎关注微信公众号:“从零起步学习人工智能”。

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

段智华

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值