Artificial Intelligence - Chapter 2 Search Problems

Search problems are solved by problem-solving agents, a subset of goal-based agents. The goal here is a particular state, so these agents try to reach a goal state through a sequence of actions. To do that, they search through the space of states they can reach.

Note that in this section we treat states as black boxes. We simply assume the problem has states defined, treat each state as a unit, and don't really care what is inside it. The only thing we worry about is how to get from state to state.

Actions: in different states the agent may have different sets of actions, for example Actions(s1) = {a1, a2, a3} and Actions(s2) = {a2, a4}, but the choices are finite, since we defined the task environment as discrete in the previous slide.

Transition model: in the example, if the agent is in state s1 and takes action a1, then its next state is s2. Note that this is the deterministic case! With a stochastic transition model, taking a1 in s1 might lead to s2 with probability 80% and to s3 with probability 20%. That would be more complicated, and we don't consider it here.
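As a concrete illustration, here is one way such a deterministic problem might be encoded in Python (the state and action names, and the ACTIONS/RESULT tables, are made up for this sketch):

# A minimal, hypothetical deterministic search problem:
# ACTIONS maps each state to the finite set of actions available there,
# RESULT maps a (state, action) pair to the single successor state.
ACTIONS = {
    's1': {'a1', 'a2', 'a3'},
    's2': {'a2', 'a4'},
}

RESULT = {
    ('s1', 'a1'): 's2',   # taking a1 in s1 always leads to s2 (deterministic)
    ('s1', 'a2'): 's3',
    ('s2', 'a4'): 's1',
}

def actions(state):
    return ACTIONS.get(state, set())

def result(state, action):
    return RESULT[(state, action)]

A stochastic model would instead map each (state, action) pair to a distribution over successor states, which is exactly the complication we are setting aside.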

In search trees, the same state can occur several times. For example, if we are in the root state and go 'Up', we are in a new state, but if we then choose to go 'Down', we are in a state that is exactly the same as the root state; we say the state recurs. It would obviously be silly to keep alternating 'Up', 'Down', 'Up', 'Down' at every step, but it is valid in the search tree, so to avoid this infinite loop we need some mechanism to detect when such a cycle happens.

This is why we use search trees rather than graphs: trees let us keep track of the search path!

Expand: at the current node, expanding means we take every action available at that node (recall that each state/node has its own action set) and see which new state each action takes us to.

In the case above, we have just expanded Sibiu, so all of its successors are now in the frontier and marked with a white background. Arad and Sibiu are nodes we have already explored, so they are marked with a gray background. We can also see the green nodes: they have not been explored yet and are still unknown to the agent. Nodes in the frontier are the candidates for expansion, so the frontier is a kind of boundary between the part of the tree we have already explored and the part we have yet to explore. We don't actually care about the unexplored part, because the agent knows nothing about it; the main focus is what is in the frontier.

Now the question is: how do we decide which node in the frontier should be expanded first? What priority should we impose on the frontier?

Different priority rules give different implementations of the search process. Let's look at the pseudocode for search first:

 

Best-first search is the general implementation. We keep track of the following quantities:

The current node: we start with the initial node (corresponding to the initial state); the current node then changes as we propagate outward (the expanded circle grows).

The frontier: here it is a priority queue ordered by some function f. We will see what f looks like in different implementations. The priority queue means we never have to worry about choosing among many different nodes. There might be ties (two nodes with equal priority), but the priority queue always knows which node is best to pick for expansion.

The reached table: it keeps track of all the nodes we have seen already. This is what lets us avoid the infinite-loop situation mentioned before. In Python the table can be implemented as a dictionary where the key is a state and the value is a node, {state1: node1, state2: node2, ...} (a node is a structure!). Also note that Reached = Frontier + Expanded.
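A rough Python sketch of these pieces (the field names on the node are my own choice, not from the slides): a node records its state, parent, the action that produced it, and the path cost so far; the frontier is a heap ordered by a priority value; the reached table maps each state to the best node found for it.

import heapq
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass(order=True)
class Node:
    priority: float                       # f(node); heapq pops the smallest value first
    state: Any = field(compare=False)     # the state this node corresponds to
    parent: Optional['Node'] = field(default=None, compare=False)  # lets us recover the path
    action: Any = field(default=None, compare=False)               # action that led to this node
    path_cost: float = field(default=0.0, compare=False)           # g(node), cumulative cost from the root

# The frontier is a priority queue ordered by f; the reached table maps state -> node.
root = Node(priority=0.0, state='initial')
frontier = [root]
heapq.heapify(frontier)
reached = {root.state: root}              # Reached = Frontier + Expanded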

OK, so what priority will we actually use in the queue? Let's see the first type:

Why the negative of the depth? It relies on the assumption that the priority queue pops the lowest-valued node first, so the deeper the node, the earlier it gets popped from the frontier.

DFS does not consider true cost; it only cares about node depth!

Possible optimizations:

Early goal test:

             As soon as we generate a child node, we immediately check whether it is the goal; if it is, we don't even bother putting the node into the frontier, we just return the solution at that point.

  • Why can we use the early goal test in DFS?

             Because DFS is not an optimal search algorithm anyway: it does not consider true costs. Since DFS will not give us the best solution regardless, we just want it to return a goal as soon as possible.

Not using a reached table:

             It's true that if we forego the reached table entirely, we may explore a node more than once and allow loops to occur. That sounds a bit silly, because it is something we definitely don't want to happen, but there is a good reason for dropping the reached table: space complexity. We will come back to this later.

 

The following is the part of the best-first search pseudocode that is modified for DFS.

while not IS-EMPTY(frontier) do
    node <- POP(frontier)
    if problem.IS-GOAL(node.STATE) then return node
    for each child in EXPAND(problem, node) do
        s <- child.STATE
        if problem.IS-GOAL(s) then return child    # early goal test, before adding the child to the frontier
        if s is not in reached then                # DFS ignores cost, so the cost-comparison condition is dropped
            reached[s] <- child                    # (optional: skip the reached table entirely to save space)
            add child to frontier

    

  

 (Compare this with the original best-first pseudocode.)
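The same idea in Python, as a rough sketch: best-first search with f(n) = -depth(n) and an early goal test. The problem interface (initial_state, is_goal, actions, result) is assumed here, not taken from the slides.

import heapq
import itertools

def depth_first_search(problem):
    # Best-first search with f(n) = -depth(n): deeper nodes pop first, which makes it behave like DFS.
    counter = itertools.count()                  # tie-breaker so heapq never has to compare node dicts
    root = {'state': problem.initial_state, 'parent': None, 'action': None, 'depth': 0}
    if problem.is_goal(root['state']):
        return root
    frontier = [(0, next(counter), root)]        # priority is -depth
    reached = {root['state']: root}              # optional: drop this to save space, at the risk of loops
    while frontier:
        _, _, node = heapq.heappop(frontier)
        for action in problem.actions(node['state']):
            s = problem.result(node['state'], action)
            child = {'state': s, 'parent': node, 'action': action, 'depth': node['depth'] + 1}
            if problem.is_goal(s):               # early goal test: check before adding to the frontier
                return child
            if s not in reached:                 # DFS ignores path cost, so there is no cost comparison
                reached[s] = child
                heapq.heappush(frontier, (-child['depth'], next(counter), child))
    return None                                  # failure

In practice a plain LIFO stack gives the same expansion order without the priority-queue machinery; the -depth trick just shows how DFS fits the best-first template.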

 

Applet for self-study: http://www.aispace.org/search/index.shtml  (the early goal test is not implemented in this applet)

Just follow the tutorial to try out the different search methods.

 

Branching factor b: a node can have at most b child nodes.

The time complexity of a search algorithm asks how many nodes we have to explore in the worst case. For DFS the worst case happens when we go down the tree one branch at a time from left to right, but the goal is all the way down in the lower-right corner. In that case we have to explore the entire tree, every node in it, so the time complexity is exponential, O(b^m) where m is the maximum depth.

Space complexity

Assume we do not use a reached table; then the frontier is the main contributor to the space complexity. The worst case occurs when the current node is in the deepest tier: the frontier only has to keep track of the immediate siblings of each of the ancestors we passed through. In other words:

The maximum size of the frontier occurs when DFS is expanding a node in the deepest layer of the search tree (a leaf). To get there, it must have expanded that node's parent and all of its ancestors back up to the root, and there are m such nodes. For each of those ancestors, having been expanded means all of their children have already been added to the frontier. Since the branching factor is b, that gives us O(bm) nodes in the frontier at that point in the search.

Once the leaf node has been processed, DFS starts to "unwind" and work its way back up.

(If we keep the reached table and the goal node is in the lower-right corner, we have to expand all nodes in the tree, which means all of them end up in the reached table, so the space complexity grows to O(b^m).)

DFS is appealing precisely because of its linear space complexity, especially when we don't have much storage. The trade-off is that we might run into loops.

Completeness

Even if no loops occur, without a reached table we are still not guaranteed to find a solution with DFS. If the state space is infinite (with ever-new states), the search can keep going down forever and never return.

Just like the example below: the agent heads down one path (red arrow) with infinitely many distinct states and never turns back or stops, so it never reaches a goal (red points; it is valid to have more than one goal state!).

Optimality

No, since DFS does not consider true cost.


We are guaranteed to find a shallowest solution simply because we search layer by layer.

Time complexity is O(b^d) because even in the worst case we never need to go deeper than the shallowest solution at depth d.

Space complexity is also O(b^d): in the worst case the shallowest goal is the very last node of tier d (note that the tree may be deeper than d; we are only referring to the layer where the shallowest goal sits), so we may have to generate essentially the whole depth-d layer, about b^d nodes, and hold it in the frontier before the goal turns up. (If we also added all of their children before testing, that would be about b^{d+1} nodes, which only shifts the exponent by one.) Either way the frontier is exponential in size, O(b^d).
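A rough Python sketch of BFS under the same assumed problem interface as before: the frontier is a FIFO queue, so nodes are expanded tier by tier, and the early goal test is safe because every step is treated as equally costly.

from collections import deque

def breadth_first_search(problem):
    node = {'state': problem.initial_state, 'parent': None, 'action': None}
    if problem.is_goal(node['state']):
        return node
    frontier = deque([node])               # FIFO queue: the oldest (shallowest) node pops first
    reached = {node['state']}              # BFS keeps a reached set to avoid re-generating states
    while frontier:
        node = frontier.popleft()
        for action in problem.actions(node['state']):
            s = problem.result(node['state'], action)
            child = {'state': s, 'parent': node, 'action': action}
            if problem.is_goal(s):         # early goal test, done when the child is generated
                return child
            if s not in reached:
                reached.add(s)
                frontier.append(child)
    return None                            # failure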


Iterative deepening (depth-limited DFS with an increasing limit) combines the advantages of both DFS and BFS.

  • In each iteration we run a DFS, which means we don't have to keep a reached table, so the space complexity stays linear.
  • We set a depth limit for each DFS iteration, which means we stop searching downward once the current node reaches the limit, so we are guaranteed never to run into infinite loops. We no longer have to worry about this search being incomplete!
  • The iterations end once the limit L grows to the depth of the shallowest solution, at which point it behaves just like BFS.

There is still a trade-off with this method: we redo some work. Imagine the searches with limit = 2 and limit = 3: many of the nodes are the same, yet we check them again. In each iteration we throw away all information about the nodes we explored and start over, and that is exactly what keeps the space complexity linear.

But this wasted effort in the upper levels of the search tree is not a big deal. Why?

Because the tree grows exponentially: once a search tree is big enough, most of the work is in the bottom layer of the tree, where the solution is. So even if you redo the upper part of the search several times, that is still far smaller than the number of new nodes you have to explore.

So iterative deepening does a little extra work, but not enough to change the time complexity.
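A rough Python sketch of iterative deepening, with the same assumed problem interface; the cap max_limit is only there to keep the sketch finite.

def depth_limited_search(problem, state, limit):
    # Plain recursive DFS that refuses to go below the depth limit; no reached table is kept.
    if problem.is_goal(state):
        return [state]
    if limit == 0:
        return None                              # cut off: do not search any deeper
    for action in problem.actions(state):
        child = problem.result(state, action)
        path = depth_limited_search(problem, child, limit - 1)
        if path is not None:
            return [state] + path                # prepend the current state to the found path
    return None

def iterative_deepening_search(problem, max_limit=50):
    # Repeated depth-limited DFS with limits 0, 1, 2, ...; each iteration starts from scratch,
    # which is what keeps the memory usage linear in the current limit.
    for limit in range(max_limit + 1):
        path = depth_limited_search(problem, problem.initial_state, limit)
        if path is not None:
            return path                          # a goal was found at depth <= limit
    return None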


KEY POINTS:

  • We now take COST into account
  • The function f is the CUMULATIVE path cost
  • We MUST use a reached table and we CANNOT use an early goal test

Using the example, assuming we start from S and e is our goal:

(Figures: the search after expanding S, then p, then d.)

  • Note that we must compute cumulative costs: for node p, the cumulative cost is 1 + 15 = 16, and that is what gets recorded in the reached table.
  • We can see that we meet node e twice: when expanding node S the cost of e is 9, and when expanding node d the cost of e is 3 + 2 = 5, which is cheaper than before. The cost of 9 corresponds to the path S -> e, and the cost of 5 corresponds to the path S -> d -> e. We definitely prefer the path S -> d -> e because it is cheaper.
  • Remember that the path information is contained in the node structure: each node records which node is its parent, so the path is not recorded in the reached table or anywhere else.

Note that in the pseudocode : 

When d is expanded and we generate e again, node e is already in the reached table, so we check the second condition and compare: the new cumulative cost of e is indeed smaller than the one recorded in the reached table. The following code therefore updates the cost information for node e in the reached table and re-adds node e to the frontier.

This leads to a situation where there are two nodes with the same state 'e' but different costs. Is that a problem?

The answer is no; let's consider two cases:

  1. Node e leads to the goal (e is on the optimal path). Once we update the cost of e in the reached table and re-add e to the frontier, note that the two nodes are actually different objects. Recall that a node is a structure: each node tracks its state, parent, prior action, and total cost so far. The two e nodes share only the state; their parents and total costs differ. So we simply treat them as independent items in the frontier. Since the node with the lowest cost is popped first, the cheaper e is popped before the more expensive one, and according to the pseudocode every popped node is checked with IS-GOAL. If e is indeed the goal, we return it, a success. So in this case we don't have to worry about the more expensive copy interfering with the optimal path.
  2. Node e does not lead to the goal. We may eventually pop the more expensive copy after the cheaper one, but we don't need to worry about this case either, because e is not on the optimal path anyway, so its cost at popping time doesn't matter.

As a result, we may do some redundant work, but it will never pull us away from an optimal path!
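A rough Python sketch of uniform-cost search along these lines, again with an assumed problem interface (here including a step_cost(s, a, s2) method): the priority is the cumulative cost g(n), the reached table is mandatory, and the goal test happens only when a node is popped.

import heapq
import itertools

def uniform_cost_search(problem):
    counter = itertools.count()
    root = {'state': problem.initial_state, 'parent': None, 'action': None, 'cost': 0.0}
    frontier = [(0.0, next(counter), root)]
    reached = {root['state']: root}
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if problem.is_goal(node['state']):         # late goal test: a popped node has the cheapest known cost
            return node
        for action in problem.actions(node['state']):
            s = problem.result(node['state'], action)
            g = node['cost'] + problem.step_cost(node['state'], action, s)
            if s not in reached or g < reached[s]['cost']:
                child = {'state': s, 'parent': node, 'action': action, 'cost': g}
                reached[s] = child                 # record or update the cheapest cost found for s
                heapq.heappush(frontier, (g, next(counter), child))
                # Any older, more expensive entry for s may still sit in the frontier;
                # as argued above, that duplicate is harmless for optimality.
    return None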

 

Keep in mind the similarity between UCS and BFS. Breadth-first search goes tier by tier down the tree, and uniform-cost search does the same thing, except that instead of tiers it goes by cost contours. As in the graph below, the 'layers' in UCS are somewhat uneven: each layer corresponds to the nodes whose cumulative cost from the start node is \leq 1, 2, 3, ...

C^* / \epsilon indicates how many cost tiers the solution path has to traverse at least; it gives a bound on the depth of the goal. Consider the case where all step costs are 1, so \epsilon = 1: the search is just BFS, and C^* is exactly the depth of the solution.

As for the time/space complexity, we know it is O(b^d) for BFS, so we just replace the depth with the corresponding parameter for UCS:

(The +1 term in the slide can be ignored, since it is a constant and does not change the fact that the time/space complexity is exponential.)
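Written out, the usual textbook bound (assuming every action costs at least \epsilon > 0) is

Time / Space = O( b^{ 1 + \lfloor C^* / \epsilon \rfloor } )

where C^* is the cost of the optimal solution; the exponent counts how many \epsilon-sized cost layers fit below C^*, playing the role that the depth d plays for BFS.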

The completeness of UCS carries over from BFS. Optimality will be discussed later.


Going from DFS and BFS to UCS, we finally have a search algorithm that returns an optimal solution. That's great, but we can still improve on this process. All the search methods mentioned so far are uninformed: during the search we have no idea how far we are from the goal state, or which direction to head in to reach it.

As a result we suffer from exponential time complexity, and when we deal with big problems UCS may not be efficient enough even though it returns optimal solutions. If we think of uniform-cost search as expanding outward, it is uniform not in depth but in cost contours: we expand these circles, or tiers, around the start state until the goal state is hit along one of the radii. But on the way to the goal, UCS explores uniformly and wanders in fairly useless directions; a lot of effort is wasted on parts of the search tree that have no relevance to the goal. That is the shortcoming of UCS.

                          

(Figure: UCS vs. informed search)

Often, in real problems, we do have some knowledge that estimates how close a given state is to the goal. Information like this is called a domain-specific heuristic.

In heuristic search we still expand outward, but in a targeted direction. In the graph on the right you can see ovals rather than circles.

Next we introduce A* search. Again, it basically borrows and replicates everything we said about best-first search; we just replace the priority rule for popping from the frontier.

Now the function f would be the sum of g and h. 

g(n) is the cumulative cost from the initial state to node n, just as in UCS. h(n) is the heuristic, an estimate of how expensive it will be to go from the current node n to the goal. Adding the two together essentially gives us an estimate of the cost of the entire path: from the start state to the current node, and then from the current node all the way to the goal state.

g(n) is sometimes called the backward cost, because it is the cost already incurred from the start to where we are. h(n) is called the forward cost, because it estimates the cost from where we are to the goal. A* search uses the sum of the two as the priority function in the priority queue.
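A rough Python sketch: the only change from the UCS sketch above is that the frontier is ordered by f(n) = g(n) + h(n), with the heuristic h passed in as a function from states to estimated cost-to-goal (an assumed interface, not from the slides).

import heapq
import itertools

def a_star_search(problem, h):
    counter = itertools.count()
    root = {'state': problem.initial_state, 'parent': None, 'action': None, 'cost': 0.0}
    frontier = [(h(root['state']), next(counter), root)]   # f(root) = 0 + h(root)
    reached = {root['state']: root}
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if problem.is_goal(node['state']):
            return node
        for action in problem.actions(node['state']):
            s = problem.result(node['state'], action)
            g = node['cost'] + problem.step_cost(node['state'], action, s)
            if s not in reached or g < reached[s]['cost']:
                child = {'state': s, 'parent': node, 'action': action, 'cost': g}
                reached[s] = child
                heapq.heappush(frontier, (g + h(s), next(counter), child))  # f = backward + forward cost
    return None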

Let's see an example:

(Figures: the successive expansion steps until the goal is reached.)

After we expand node a, we can see that the next node to be expanded will be node f. Compare this with UCS: UCS would look only at the g values, so node b would be the next node expanded, because b has the lower backward cost of 2.

Hopefully the heuristic makes sense here: node f is actually closer to the goal than node b, and the heuristic function assigns f a cheaper forward cost, so f is prioritized over b.

(You can also draw the graph above in the applet and visualize what happens step by step; it is really useful!)


There are infinitely many heuristic functions for any given problem, but not all of them make the A* solution optimal. So we need some restrictions on them to ensure that we actually get the right output solutions.

Considering only true costs (no heuristics), the optimal path is S -> A -> G with total cost 4.

Now consider what A* would do here: 

A* reaches the goal right away, and the path it returns is S -> G.

But we know that the true optimal path is S -> A -> G, so A* has returned a solution that is not optimal.

The problem here is the heuristic function. It is the heuristic value at node A that makes A look more expensive than it should be: the heuristic assigns 6 to A as the estimate from A to the goal, but the actual cost from A to G is 3. So the heuristic over-estimates the true cost; it gives a pessimistic estimate that makes the path look more expensive than it actually is. Once we over-estimate the true cost, optimality is no longer guaranteed.

So a good heuristic function should under-estimate the true cost. It would be even better if they were equal, but that is rarely possible, since we don't know the true cost before the search ends.

What we want is the property called admissibility: the heuristic never over-estimates the true cost, i.e. 0 \leq h(n) \leq h^*(n) for every node n.

Again, the true cost h^* is never known; if we knew the true costs, we would probably have solved the problem already.

In path-finding problems we can use the Euclidean distance as a heuristic function, because the Euclidean distance from city A to city B (the straight line between them) is guaranteed to be the shortest possible distance. The actual route from city to city will not always be a straight line, so the Euclidean distance never over-estimates the true cost in this example.
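A tiny Python sketch of this heuristic, assuming we have a (hypothetical) dictionary of (x, y) coordinates for each city:

import math

def straight_line_distance(coords, goal):
    # Returns h(city) = Euclidean distance from city to the goal city.
    # Admissible because no road between two cities can be shorter than the straight line.
    gx, gy = coords[goal]
    def h(city):
        x, y = coords[city]
        return math.hypot(x - gx, y - gy)
    return h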

Another example:

We've mentioned the misplaced-tiles heuristic before:

  • It's a sliding-tile puzzle (a bit like Klotski): we have to get all the tiles into the right places, with the goal state shown on the left above, so h(left) = 0.
  • When we move a numbered tile onto the blank tile (the black one), we are really swapping their places, so for convenience of coding we can think of the blank tile as the thing that moves at each step.
  • A relaxed problem drops some restriction of the original problem. Here, in the relaxed version, a misplaced numbered tile can move directly (instantaneously teleport) to its correct location in one move, so the heuristic is simply the number of misplaced tiles in a given state. For example, in the state on the upper right, only tile 2 is in its desired place and the remaining tiles are all misplaced, so h(right) = 7.

Another example, on a grid:

Manhattan distance on a grid means you can only move up, down, left, and right, with no diagonal moves. In the context of the relaxed problem on the slide:

Summing the Manhattan distances of tiles 1 through 8 from their goal positions gives h(start) = 18.

 

Comparing misplaced tiles and Manhattan distance on the same game, we can see that Manhattan distance gives higher estimates than misplaced tiles, because it keeps more of the original problem's restrictions. The truth is that we do want heuristics that under-estimate the true costs, but hopefully not by too much: the closer the heuristic is to the true cost, the more efficient the informed search algorithm becomes. A small sketch of both heuristics in code follows below.
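Here is a rough sketch of the two 8-puzzle heuristics, assuming a state is a tuple of 9 entries in row-major order with 0 standing for the blank (the goal layout used here is just one common convention):

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)        # 0 is the blank tile

def h_misplaced(state, goal=GOAL):
    # Number of numbered tiles that are not in their goal position (the blank is excluded).
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def h_manhattan(state, goal=GOAL):
    # Sum over numbered tiles of |row difference| + |column difference| to the goal position.
    goal_pos = {tile: (i // 3, i % 3) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        r, c = i // 3, i % 3
        gr, gc = goal_pos[tile]
        total += abs(r - gr) + abs(c - gc)
    return total

# Each misplaced tile contributes at least 1 to the Manhattan sum, so
# h_manhattan(s) >= h_misplaced(s) for every state s, while both stay below the true cost.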

This gives us the notion of domination:


 

A suboptimal goal is a node whose heuristic value is 0 (it is a goal) but whose backward cost is greater than that of another goal node.

(The graph in the slide shows that this situation cannot happen: B cannot be expanded before A!)
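In symbols, the standard argument runs like this: suppose B is a suboptimal goal sitting in the frontier, and n is a frontier node lying on an optimal path to the optimal goal, whose cost is C^*. With an admissible heuristic,

f(n) = g(n) + h(n) \leq g(n) + h^*(n) = C^* < g(B) = g(B) + h(B) = f(B),

because h(B) = 0 (B is a goal) and g(B) > C^* (B is suboptimal). So n always has a lower f-value than B and is expanded first, which means A* can never return the suboptimal goal B.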

For informed search the theoretical limits have not changed: we can still encounter the worst case of exponential time/space complexity. But in practice we can make the problem much easier depending on what heuristic function we can come up with.