# Dealing with Complexity Through Search

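• Search: find your way with a flashlight or a boat.
• Analyzing the efficiency of algorithms.
• Recurrence relations; matching data types to algorithms.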

## Lesson 4

Lesson 4 - Udacity


### 1. Water Pouring Problem

I’m going to begin this unit with an old problem known as the “water-pouring problem.”

Here’s what we’re given: two glasses and a faucet in a sink, which can be the source of as much water as we want.

Now, these glasses are of different sizes. I haven’t drawn them that much different, but this one is 4 oz, and this one is 9 oz. For those of you in the rest of the world besides the U.S., an ounce is about 30 mL.

(The problem to solve)
Our goal is to measure out a specific amount of water. What we want to have is 6 oz of water measured out. Six ounces won’t fit in the 4 oz glass, so at the end we want the 9 oz glass to hold exactly 6 oz of water.

There are no graduated markings. It’s not like a graduated cylinder or measuring cup where we have the measurements on the glass, and it wouldn’t be accurate enough to just eyeball it. What we’ve got to do is figure out how to get there by measuring out precise amounts into the cups and pouring them off.

For example, if the goal had been 5 oz, then that would have been easy. We’d just fill the 9 oz all the way up to the top, and then pour the 9 oz into the 4 oz until the 4 oz is all the way full, and then what would be remaining here because there’s 9 altogether would be 5 in this glass. Five ounces is easy. Six ounces is not as obvious how to get there.

(What the pouring actions are)
The puzzle is to find a sequence of pouring actions. The pouring can be from one glass to another, or in the other direction. It can go from the faucet into each of the glasses, and it can go from the glasses down the drain. That gives six different actions we can take, and we want to find a sequence of actions that arrives at this goal of 6 oz. Of course, we can generalize the problem and use any numbers in place of 9, 4, and 6.

(Inventory of concepts: a glass, with its capacity and its current level.)

As usual, let’s make our inventory of concepts that we’re going to be dealing with. We have the glass, and the glass has a capacity and a current level. This glass would have capacity 9, current level 5. We’re also going to need a collection of glasses, probably a pair of glasses. I guess we can say that the pair of glasses and their current levels represent a complete state of the world: everything we need to know about where we are in the problem.

(Inventory of concepts, continued. Pouring actions: emptying, filling, and transferring. Transferring can end in two ways. With glasses of capacity 9 and 4, one way is to pour from the 9 into the empty 4 until the 4 is full; the other is to pour from the 4 into the empty 9 until the 4 is empty.)
Then we have a goal that we’re trying to reach. We have the pouring actions: 1, 2, 3, 4, 5, 6. That breaks down into emptying, filling, and transferring. The transferring, I think, is a little bit tricky, because there are two ways it can end. When we were transferring from the 9 oz into the 4 oz (so we transfer from x to y), we can do that until y is full. That’s what happened here: the 4 oz was full. Or we could do it until x is empty. If we were pouring the 4 oz from here back into an empty glass, we could do it until this one was empty.

(A solution is a sequence of pouring steps.)
Anything else in the inventory? Oh, well, we certainly need a notion of a solution. A solution is going to be a sequence of steps: pour from here to here, then from here to the drain, then fill up, then pour again, and so on.

What this unit is really all about is techniques for finding these solutions, which are sequences of steps.

(Managing the complexity that arises when the solution sequence is long)
Again, we’re always talking about managing complexity in this class. The complexity we’re trying to manage here is the complexity that comes when the sequences are long.

### 2. Combinatorial Complexity

There’s a complexity that comes from combinatorial problems. We’ve seen that before.

In the cryptarithmetic problems such as ODD + ODD = EVEN, we had up to 10! different permutations of digits to assign, and it was complex because we had to consider them all. In the zebra puzzle we had (5!)^5 combinations to consider. It was complex because it took a long time to consider them all. We came up with an optimization that considers only a few of them by going one at a time.

For our pouring problem, we know there are 6 actions: 2 empties, 2 fills, and 2 pours. The glasses are of size 4 and 9. The goal is 6 oz. I guess my question for you is how many combinations do we need to consider? For cryptarithmetic it was 10!. For zebra it was (5!)^5. For pouring is it

• 6^4
• 6^(9-4)
• 6^6
• 6^9
• can’t tell
• none of the above?

(I couldn’t work it out, so let’s look at Peter’s answer.)

#### 2. Combinatorial Complexity Solution

(In the earlier ODD + ODD = EVEN and zebra problems, the number of variables was fixed, and so was the number of possible values for each variable.)
The answer is that you can’t tell. This is a different type of combinatorial problem than the previous ones. In the previous ones we had a fixed number of variables, and we knew how many combinations we had for each variable. In the zebra problem, there were 25 variables, and that’s all there was. We could enumerate all the combinations.

(In the pouring problem we put together a sequence of pouring actions that leads from one state to the next. The length of the sequence is unknown, and at each position there are 6 choices.)
For the pouring problem we’re not trying to fill static variables but rather to put together a sequence of actions to go from one state to the next. We don’t know how long that sequence is, and at each point we have 6 different options for which way to go, and from each of those 6 more. We know it’s going to be roughly 6 to the power of something, because we branch 6 ways at each point, but we don’t know what that exponent is, because we don’t know how long the sequence is. So that makes the problem slightly different.
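To get a feel for how quickly "6 to the power of something" grows, here is a small sketch that counts action sequences for a few lengths. The names `BRANCHING` and `sequences_up_to` are made up for this illustration; the count ignores the fact that many sequences lead to the same state.

```python
BRANCHING = 6  # six possible pouring actions at every step


def sequences_up_to(length):
    """Count all action sequences of length 1 through `length`."""
    return sum(BRANCHING ** n for n in range(1, length + 1))


for n in (1, 2, 4, 8):
    # sequences of exactly n actions, and of at most n actions
    print(n, BRANCHING ** n, sequences_up_to(n))
```

Even at length 8 there are already more than a million sequences of exactly that length, which is why blindly enumerating sequences is not attractive.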

If we want to be formal, we call it a combinatorial optimization problem, but usually we just call it a “search” problem.

### 3. Exploring the Space

Now it’s called search traditionally, but I think “exploration” is a better name for it. We start out at home, and in this case our home is where we have two glasses. Zero and zero are the values for how full the glasses are.

Then we start to explore. One way we could explore is to fill one of the glasses. Then we’re at this state (say we’re at 0 and 4), but we know that there are other actions with which we could explore in other directions. Now we could take one of the other states and explore from there in other directions. We have lots of choices going forward in this huge space that we’re exploring.

Now, somewhere out in this space, and we don’t know in which direction, is this goal state, which has 6 in one glass and any amount in the other. We’re trying to reach that, and we’re distinguishing this part of the state space as a goal. I drew this as one state, but really it’s a collection: every state that has 6 on one side and anything on the other should be considered part of this collection of goals. We’re trying to search forward towards that.
(It took me a few viewings to get this. The goal is a set of states: any state where one glass holds 6 and the other holds any amount belongs to the goal set.)
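The idea that the goal is a collection of states can be made concrete with a short sketch. It assumes states are `(x, y)` tuples of fill levels, as in the code later in this lesson; the helper name `is_goal` is made up here, and `all_states` simply lists every pair of levels without checking whether it is actually reachable.

```python
X, Y, goal = 4, 9, 6  # capacities and target amount from the lesson


def is_goal(state):
    """A state (x, y) is a goal if either glass holds exactly `goal`."""
    return goal in state


# Every combination of levels, reachable or not.
all_states = [(x, y) for x in range(X + 1) for y in range(Y + 1)]
goal_states = [s for s in all_states if is_goal(s)]
print(goal_states)  # [(0, 6), (1, 6), (2, 6), (3, 6), (4, 6)]
```

Because the 4 oz glass can never hold 6 oz, all the goal states have the 6 in the second glass.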

One reason I like to call it an exploration problem is because we can think of going forward, exploring a new land, and part of that exploration is that we’ve got a frontier. Here are all the states that are the farthest out that we’ve gone. If we want to make progress towards the goal, then we’re probably going to have to step from one of the frontier nodes farther out. We’ve separated the set of all possible states into the goal states, the frontier states, and the previously explored states.

Then you can see that the way to make progress is to take one of the frontier states and expand it, and here we have an advantage as a computer that an individual explorer doesn’t have. An individual explorer has to take one path, and if they decide they’ve gone in the wrong direction, they have to go all the way back. A computer can store lots of states in memory. Computer exploration is more like a group of explorers collectively expanding the frontier.

(The exploration process: keep expanding the frontier until it overlaps a goal state.)
Our next move can be to take one of these explorers, say the one in this state here, and ask what’s next. You’ve got 6 actions from there. Where do they go? Maybe some of them explore the world and generate new states that we haven’t seen before. Maybe some of them go to a state that we already know is on the frontier. Maybe some of them regress into previously explored territory. But we can keep on going, expanding our frontier. When the frontier overlaps the goal, we’ve got a solution.

(Exploring for the goal this way raises two difficulties:
1. When there is no solution, report back that it is impossible.
2. When a path from the frontier to the goal exists, make sure it is found in a reasonable amount of time. The exploration has to be efficient and must not get stuck in an infinite loop.)
Now, in exploration problems like this, there are two problems that we have to worry about.

One problem is that there may be no solution at all: the goals are not connected to the start state, so there’s no path from here to there. Then what we want to do is the exploration we need and report back that it’s impossible.

The other problem is that if there is some path that eventually makes it to the goal, we want to make sure that we find it in a reasonable amount of time. That means we want to be efficient about the way we explore the space. It also means that we don’t want to get stuck in an infinite loop.

Now, if there is a finite number of states and they are connected, then we should be able to find the path. But if we aren’t clever, we may miss the solution even though it’s possible to find it.

(For example, pouring back and forth between the two glasses traps us on an infinite path without any progress.)
For example, suppose we had a strategy that says first I’m going to explore in this direction, pouring from cup x into cup y, then I go in this direction, pouring from cup y back into cup x, and then I pour the water back again. I’m continually just pouring water back and forth between two cups. Those are all legal steps to take, but I end up with an infinitely long path and I’m not making any progress.

(We need a strategy for exploring.)
We’d like to come up with a strategy for exploration, and the strategy corresponds to deciding which path to expand next. At every point the strategy picks out some path, let’s say this one, and says that’s the one we’re going to explore from next.

(Some possible ways to avoid infinite loops)
To avoid this type of infinite loop, here are some possibilities.

One possibility would be don’t reverse an action. If you come from state A to state B, don’t allow the action that goes immediately back to state A.

Another strategy would be to say always take the shortest path first. Out of all the paths that you’ve built so far, when we go to choose which one we’re going to expand next, always choose one of the shortest ones. That way we might start to build up an infinitely long path, but at least we won’t continue it. First we’ll do another one before we do that one.

Then another strategy would be don’t re-explore. That is, if we’re on the frontier–let’s say we’re here on the frontier– and we have a move that moves us back out of the frontier into the previously explored zone, then we should not allow that path.

My question is: check all the strategies that would eventually lead us to the goal. Don’t worry about the efficiency of getting to the goal, but which ones will eventually get us there and won’t get stuck in an infinite loop?

• don’t reverse
• shortest first
• don’t re-explore

#### 3. Exploring the Space Solution

The answer is that shortest first would work. If there is a path, it’ll eventually find it. It will waste some time repeating itself, and may not be the most efficient, but we’ll get there.

Don’t re-explore seems more efficient, because it cuts off some of these paths.

(Not reversing does not eliminate longer loops.)
Don’t reverse isn’t quite good enough. Suppose we eliminate the steps that go from A to B and then back to A. That doesn’t stop us from going from A to B to C to D and then back to A, and repeating that longer loop forever.
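The A to B to C to D loop can be reproduced on a toy graph. Everything in this sketch is hypothetical: the graph, the two function names, and the simplifying assumption that the "don't reverse" explorer always takes the first allowed neighbor. The second function expands shortest paths first and never re-explores a state.

```python
# Hypothetical toy graph: A, B, C, D form a cycle; the goal G hangs off C.
graph = {
    'A': ['B', 'D'],
    'B': ['C', 'A'],
    'C': ['D', 'B', 'G'],
    'D': ['A', 'C'],
    'G': ['C'],
}


def walk_no_reverse(start, goal, max_steps=12):
    """Always take the first neighbor that does not undo the previous move."""
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        previous = path[-2] if len(path) > 1 else None
        path.append(next(n for n in graph[path[-1]] if n != previous))
    return path


def walk_no_reexplore(start, goal):
    """Expand the shortest paths first and never revisit an explored state."""
    explored, frontier = {start}, [[start]]
    while frontier:
        path = frontier.pop(0)
        for n in graph[path[-1]]:
            if n not in explored:
                explored.add(n)
                if n == goal:
                    return path + [n]
                frontier.append(path + [n])
    return []


print(walk_no_reverse('A', 'G'))    # circles A, B, C, D until max_steps runs out
print(walk_no_reexplore('A', 'G'))  # ['A', 'B', 'C', 'G']
```

The first explorer never reverses a single step, yet it goes around the cycle forever and is only stopped by the `max_steps` cutoff.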

### 4. Pouring Solution

Now let’s get to solving the problem and coding it up.

But before I do that, I want to introduce one more piece of jargon. Say I’m at a particular state, and I decide that it’s the endpoint of the path that I want to expand. Expanding the path gives me the states I can get to from there and the steps it takes to get to those states. I call those the successors of this state.

The successors are a collection of states that you can reach, together with the steps it took to get there.

Here is my solution. It’s a little bit complicated. Let’s go through it step-by-step.

    def pour_problem(X, Y, goal, start = (0, 0)):
        """X and Y are the capacity of glasses; (x, y) is current fill levels and
        represents a state. The goal is a level that can be in either glass. Start at
        start state and follow successors until we reach the goal. Keep track of
        frontier and previously explored; fail when no frontier."""
        if goal in start:
            return [start]
        explored = set()         # set of states we have visited
        frontier = [ [start] ]   # ordered list of paths we have blazed
        while frontier:
            path = frontier.pop(0)
            (x, y) = path[-1]    # last state in the first path of the frontier
            for (state, action) in successors(x, y, X, Y).items():
                if state not in explored:
                    explored.add(state)   # mark as visited so it is never re-explored
                    path2 = path + [action, state]
                    if goal in state:
                        return path2
                    else:
                        frontier.append(path2)
        return Fail

    Fail = []

(My own notes:

X and Y are the capacities of the glasses; goal is an integer, the amount of water I’m trying to get into a glass, and it can be in either of the two glasses; start is the initial state, defaulting to (0, 0), meaning that the current level of both glasses is 0; x and y are the current levels of the two glasses.

    >>> 0 in (0, 0)
    True
    >>> start = (0, 0)
    >>> [start]
    [(0, 0)]

A path is an alternation of states and arrows, where each arrow names a pouring action: a state, then an action, then the state it leads to, and so on.

explored = set() keeps track of the states we have already explored; explored is a set of states.

frontier = [ [start] ] keeps track of the frontier.

    >>> [[ start ]]
    [[(0, 0)]]

The frontier could also be a set, but we take one item off the frontier at a time (path = frontier.pop(0)), so an ordered list is used instead.

The frontier is an ordered list of paths. The only path we have so far is the trivial path that says we’re starting at the start, and we haven’t gone anywhere else yet. That’s what we start our frontier with.

        # While the frontier is not empty:
        while frontier:
            # Take the first path off the frontier.
            path = frontier.pop(0)
            # The current state (x, y), the levels of the two glasses,
            # is the last element of that path.
            (x, y) = path[-1] # Last state in the first path of the frontier
            # Loop over every item of successors(x, y, X, Y).items(), which
            # gives all reachable states and the pouring actions that lead
            # to them (six actions in all).
            for (state, action) in successors(x, y, X, Y).items():
                # If the state is not in the explored set, it is new.
                # (If it has already been explored, there is nothing to do.)
                if state not in explored:
                    # Add the state to the explored set ...
                    explored.add(state)
                    # ... and build a new path2 from the old path
                    # plus [action, state].
                    path2 = path + [action, state]
                    # If the goal is in the state, the pouring has produced
                    # the level we wanted. For example, 6 is in 6 and 3.
                    if goal in state:
                        # Return this path2.
                        return path2
                    # If the goal is not in the state:
                    else:
                        # Otherwise, just add this path onto the frontier,
                        # and we'll pull something off the frontier later.
                        frontier.append(path2)
        # If the frontier runs out and the goal has still not been found,
        # return Fail.
        return Fail
    # Fail could be None, but Peter made it the empty list [],
    # because everything else we return is a list.
    Fail = []

)

I’m saying the inputs to this pour_problem function are X and Y, which are the capacities of the glasses. Then the goal, which is going to be an integer, like 6, to say that’s how much I’m trying to get to. That can be in either one of the glasses. Then the start state, which I’m defaulting to 0 and 0, saying both glasses have current level 0, but if you wanted you could generalize the problem and pass in something else as what we’re starting with. I’m using lowercase x and lowercase y to indicate the current levels of the glasses.

Here I check and see: are we done before we even get going?

Did you give me a start state and say the goal is to have a glass with zero in it? Then we’re done before doing any actions. Go ahead and return that. What I’m going to return is called a “path.”

The path is an alternation of states and arrows, where each arrow gives a name to an action: a state, then the action, then the state it goes to, and so on, alternating state, action, state.

Here, if there’s nothing to do, it’s just a state with no actions. We’re going to keep track of the states that we’ve already explored and that’s going to be a set.

We’re going to keep track of the frontier. Conceptually, that’s a set too, but we’re going to pull the items off of the frontier one at a time, so I’ve made it an ordered list rather than a set.

That way I know which element of the frontier I want to explore first. So explored is a set of states, and the frontier is an ordered list of paths. The only path we have so far is the trivial path that says we’re starting at the start, and we haven’t gone anywhere else yet. That’s what we start our frontier with.

While there is frontier left, that is, while there are still frontier states that we haven’t explored from yet, we pop off the first one. pop(0) says take the 0th element of the list, so we’re going to pull elements off of the front of the list and push them onto the end of the list. Then the current state is the last element of the path: the path goes from one state to the next, and the last element of the path is the current state. Let’s take x and y from there.
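Pulling from the front and pushing onto the end makes the frontier a first-in, first-out queue. The lesson's code uses a plain list; as an optional alternative (not what the lesson does), `collections.deque` supports the same pattern, and its `popleft()` avoids the element shifting that `list.pop(0)` has to do. The paths below are just sample data.

```python
from collections import deque

# Same content as [[start]], stored in a deque instead of a list.
frontier = deque([[(0, 0)]])
frontier.append([(0, 0), 'fill X', (4, 0)])   # push onto the end
frontier.append([(0, 0), 'fill Y', (0, 9)])

first = frontier.popleft()   # plays the role of frontier.pop(0)
print(first)                 # [(0, 0)]
print(len(frontier))         # 2
```

For frontiers as small as the ones in the pouring problem the difference is negligible, so the plain list is perfectly reasonable here.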

Then I’ve defined a successors function that gives me all the successor states and the actions we used to get there. There are six actions, so there are at most six of those.

If the new state has not been explored, then it’s something new. If it was explored, there is nothing left to do: we’ve already explored from there. If it hasn’t been explored yet, then we add it to the explored set and make up a new path, which consists of the old path plus the action we followed and the new state.

If the goal number is somewhere in that state (say the goal is 6 and the state, the two levels of the glasses, is 6 and 3, then yes, 6 is in 6 and 3), then we’re done. Return that path as the winner, the path that reached the goal. Otherwise, just add this path onto the frontier, and we’ll pull something off the frontier later.

If we go all the way through and we run out of frontier to explore from, then we can’t reach the goal and we return Fail. You could have Fail be None. I decided to make it the empty list, because all the other things we’re returning were lists. Either way, None and the empty list are both treated as false in Python if statements, so probably either one would do fine.

Here’s my successor function.

    def successors(x, y, X, Y):
        """Return a dict of {state:action} pairs describing what can be reached from
        the (x, y) state and how."""
        assert x <= X and y <= Y ## (x, y) is glass levels; X and Y are glass sizes
        return {((0, y+x) if y+x <= Y else (x-(Y-y), y+(Y-y))): 'X->Y',
                ((x+y, 0) if x+y <= X else (x+(X-x), y-(X-x))): 'X<-Y',
                (X, y): 'fill X',
                (x, Y): 'fill Y',
                (0, y): 'empty X',
                (x, 0): 'empty Y'
                }

(My own notes:

A state is an (x, y) pair giving the levels the two glasses are going to have, and the action is how you got there. We’re just going to use strings to represent those actions, so it’s just something that we can print out that is otherwise unimportant in the operation of the program.

(X, y): 'fill X' fills glass X from its current level x up to its capacity X, giving (X, y).
(x, Y): 'fill Y' fills glass Y from its current level y up to its capacity Y, giving (x, Y).

(0, y): 'empty X' empties glass X, so its level becomes 0, giving (0, y).
(x, 0): 'empty Y' empties glass Y, so its level becomes 0, giving (x, 0).

((0, y+x) if y+x<=Y else (x-(Y-y), y+(Y-y))): 'X->Y'

((x+y, 0) if x+y<=X else (x+(X-x), y-(X-x))): 'X<-Y' is analogous to the case above, so I won’t go through it again.
)

It takes the current levels of the glasses and the maximum capacity of the glasses. What it’s going to return is a dictionary of state-action pairs. The state is just an x-y pair of what the levels of the glasses are going to be, and the action is how you got there. We’re just going to use strings to represent those actions, so it’s just something that we can print out that is otherwise unimportant in the operation of the program.

First I wanted to check that this is a legal state: that the fill level x is no more than its capacity X, and the same for y. Then I said here are the six possibilities. The pouring is complicated, so let’s do the filling and emptying first.

The filling and emptying say:

• You can fill X up to its capacity–capital X.
• You can fill Y up to its capacity–capital Y.
• You can empty X. That’ll become 0.
• You can empty Y. It will become 0.

Then the pour: there are two cases.

• If the total amount of water is no more than the capacity Y, then you can take all the water in the first glass, which is x, and add it into y, so you get y plus x. Same thing in the other direction.
• But if the total amount of water is more than the capacity of the glass that you’re trying to pour into, then you can only pour as much as will fill up the other glass.

We can see that there is conservation of water here. In the second case the total amount is x + y minus this difference (Y - y) plus the same difference, so it is still x + y.
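The two pour cases and the conservation claim can be checked mechanically. This sketch copies the 'X->Y' expression from the successors function into a helper and compares it with an equivalent formulation that moves `min(x, Y - y)`; both helper names are made up for the illustration.

```python
def pour_x_to_y(x, y, X, Y):
    """The 'X->Y' case, written as in the lesson's successors function."""
    return (0, y + x) if y + x <= Y else (x - (Y - y), y + (Y - y))


def pour_x_to_y_min(x, y, X, Y):
    """Equivalent form: move whichever is smaller, the contents of X
    or the free room left in Y."""
    amount = min(x, Y - y)
    return (x - amount, y + amount)


X, Y = 4, 9
for x in range(X + 1):
    for y in range(Y + 1):
        new_x, new_y = pour_x_to_y(x, y, X, Y)
        assert (new_x, new_y) == pour_x_to_y_min(x, y, X, Y)
        assert new_x + new_y == x + y                    # no water created or lost
        assert 0 <= new_x <= X and 0 <= new_y <= Y       # result is a legal state

print(pour_x_to_y(3, 5, X, Y))  # (0, 8): everything fits into Y
print(pour_x_to_y(3, 7, X, Y))  # (1, 9): Y fills up first
```

The two printed cases are the same states that appear as 'X->Y' successors in the doctests for (3, 5) and (3, 7).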

I got the definition of my program pretty much just by following out the implications of this diagram.

We’re going to keep track of an explored set, never return to a state in it, expand the frontier, pop off one element of the frontier, add in the new elements, and check when we get to the goal. That was all kind of generic for any exploration problem.

Then for the specific water problem, the successor function and the way it was laid out were specific to what we’re doing with the glasses.
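To make the split between the generic part and the problem-specific part visible, here is one possible sketch of the generic exploration loop with the successor function and the goal test passed in as arguments. The names `breadth_first_search`, `is_goal`, and `number_successors`, and the toy number problem itself, are assumptions made for this illustration and are not part of the lesson's code.

```python
def breadth_first_search(start, successors, is_goal):
    """Return the shortest path [state, action, state, ...] from start to a
    state satisfying is_goal, or [] if no such state can be reached."""
    if is_goal(start):
        return [start]
    explored = {start}
    frontier = [[start]]
    while frontier:
        path = frontier.pop(0)
        for state, action in successors(path[-1]).items():
            if state not in explored:
                explored.add(state)
                path2 = path + [action, state]
                if is_goal(state):
                    return path2
                frontier.append(path2)
    return []


# Toy problem: starting from 1, reach 10 using only "double" and "add 1".
def number_successors(n):
    return {n * 2: 'double', n + 1: 'add 1'}


solution = breadth_first_search(1, number_successors, lambda n: n == 10)
print(solution[::2])   # the states visited along the way
print(solution[1::2])  # the actions taken
```

Only `number_successors` and the goal test know anything about the toy problem; swapping in a pouring successor function would reuse the loop unchanged.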

### 5. Doctest

Now that was a lot of code again, so I’m really going to need some tests to make sure I got this right.

Rather than write the types of tests that we had before with the assert statements, I’m going to introduce a new type of test. This comes from the standard Python module called “doctest.” It stands for documentation test.

The idea is that you can write comments, the sort of docstring comments that go with your classes and with your functions, and then automatically have them run as tests. The tests look just like something that you would type into the Python interpreter.

The way doctest knows that you’ve got a test is that you have the three-arrow prompt (>>>), then an expression as input, and the following lines are the output that comes back from that expression. It tests to see if what comes back when you run the test is what was expected.

Here I’ve typed in what I’ve done at an interactive session and what the results should be, and then when I make a change to my program I can run it again and make sure I haven’t messed anything up.

    import doctest

    class Test:
        """
        >>> successors(0, 0, 4, 9)
        {(0, 9): 'fill Y', (0, 0): 'empty Y', (4, 0): 'fill X'}

        >>> successors(3, 5, 4, 9)
        {(4, 5): 'fill X', (4, 4): 'X<-Y', (3, 0): 'empty Y', (3, 9): 'fill Y', (0, 5): 'empty X', (0, 8): 'X->Y'}

        >>> successors(3, 7, 4, 9)
        {(4, 7): 'fill X', (4, 6): 'X<-Y', (3, 0): 'empty Y', (0, 7): 'empty X', (3, 9): 'fill Y', (1, 9): 'X->Y'}

        >>> pour_problem(4, 9, 6)
        [(0, 0), 'fill Y', (0, 9), 'X<-Y', (4, 5), 'empty X', (0, 5), 'X<-Y', (4, 1), 'empty X', (0, 1), 'X<-Y', (1, 0), 'fill Y', (1, 9), 'X<-Y', (4, 6)]

        ## What problem, with X, Y, and goal < 10 has the longest solution?
        ## Answer: pour_problem(7, 9, 8) with 14 steps.

        >>> def num_actions(triplet): X, Y, goal = triplet; return len(pour_problem(X, Y, goal)) / 2

        >>> def hardness(triplet): X, Y, goal = triplet; return num_actions((X, Y, goal)) - max(X, Y)
        >>> max([(X, Y, goal) for X in range(1, 10) for Y in range(1, 10)
        ...                   for goal in range(1, max(X, Y))], key = num_actions)
        (7, 9, 8)

        >>> max([(X, Y, goal) for X in range(1, 10) for Y in range(1, 10)
        ...                   for goal in range(1, max(X, Y))], key = hardness)
        (7, 9, 8)

        >>> pour_problem(7, 9, 8)
        [(0, 0), 'fill Y', (0, 9), 'X<-Y', (7, 2), 'empty X', (0, 2), 'X<-Y', (2, 0), 'fill Y', (2, 9), 'X<-Y', (7, 4), 'empty X', (0, 4), 'X<-Y', (4, 0), 'fill Y', (4, 9), 'X<-Y', (7, 6), 'empty X', (0, 6), 'X<-Y', (6, 0), 'fill Y', (6, 9), 'X<-Y', (7, 8)]
        """

    print(doctest.testmod())
    # TestResults(failed=0, attempted=9)

For example, at the start here I just want to test out what are the successors of the start state with both glasses empty and when one glass has capacity 4 and the other has capacity 9. In general there are six actions but here a lot of them end up being the same, because if you pour zero into zero either way or if you empty out either of them, it all comes out the same.

We only end up with three states, and they happen to have these labels: (0, 9) is filling Y; (0, 0) we called emptying Y, but of course emptying 0 gives you 0. It could have been labeled a no-op, but that’s just the way the successor function works out. Then (4, 0) is filling X.

More interestingly, with 3 and 5 we’re testing the case where a pour doesn’t exceed the capacity, and with 3 and 7 the case where it does exceed the capacity. We can see they work out to the right numbers.

Then we solve a problem and come up with a solution and so on.

Doctest is a nice capability that allows you to write tests this way. You can sprinkle them throughout your program, and then you can run the tests. Just say:

    print(doctest.testmod())

which stands for test module. If you give it no arguments, it tests the current module.

When I run this I get the comforting message that none of the tests failed, and there were 9 that were attempted.
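A doctest compares printed output character by character, so a test that prints a whole dict depends on the order in which the dict happens to print. One way to sidestep that, sketched here with a made-up helper `fill_levels` that covers only the two fill actions, is to compare with `==` or to sort the keys. The sketch uses `doctest.run_docstring_examples` so that it checks just this one function.

```python
import doctest


def fill_levels(x, y, X, Y):
    """Return the states reached by filling either glass.

    Comparing against a dict with == keeps the test independent of the
    order in which the dict happens to print.

    >>> fill_levels(0, 0, 4, 9) == {(4, 0): 'fill X', (0, 9): 'fill Y'}
    True
    >>> sorted(fill_levels(3, 5, 4, 9))
    [(3, 9), (4, 5)]
    """
    return {(X, y): 'fill X', (x, Y): 'fill Y'}


# Runs the examples in the docstring; prints a report only if one fails.
doctest.run_docstring_examples(fill_levels, {'fill_levels': fill_levels})
```

In a module file, calling `doctest.testmod()` as shown above does the same for every docstring in the module at once.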

Let’s go back and look at the solution.

    >>> pour_problem(4, 9, 6)
    [(0, 0), 'fill Y', (0, 9), 'X<-Y', (4, 5), 'empty X', (0, 5), 'X<-Y', (4, 1), 'empty X', (0, 1), 'X<-Y', (1, 0), 'fill Y', (1, 9), 'X<-Y', (4, 6)]

I’m asking, given glasses of capacities 4 and 9, to find the goal 6. This is the shortest solution possible: fill Y, pour from Y into X, empty X, pour from Y into X again, empty X again, pour Y into X again, fill Y, and pour from Y into X, and then we end up with 6 in Y.

We can solve problems more generally.

    >>> def num_actions(triplet): X, Y, goal = triplet; return len(pour_problem(X, Y, goal)) / 2

    >>> def hardness(triplet): X, Y, goal = triplet; return num_actions((X, Y, goal)) - max(X, Y)

    >>> max([(X, Y, goal) for X in range(1, 10) for Y in range(1, 10)
    ...                   for goal in range(1, max(X, Y))], key=num_actions)
    (7, 9, 8)

    >>> max([(X, Y, goal) for X in range(1, 10) for Y in range(1, 10)
    ...                   for goal in range(1, max(X, Y))], key=hardness)
    (7, 9, 8)

    >>> pour_problem(7, 9, 8)
    [(0, 0), 'fill Y', (0, 9), 'X<-Y', (7, 2), 'empty X', (0, 2), 'X<-Y', (2, 0), 'fill Y', (2, 9), 'X<-Y', (7, 4), 'empty X', (0, 4), 'X<-Y', (4, 0), 'fill Y', (4, 9), 'X<-Y', (7, 6), 'empty X', (0, 6), 'X<-Y', (6, 0), 'fill Y', (6, 9), 'X<-Y', (7, 8)]


Here I’ve defined a function num_actions, which says given an X and Y capacity and a goal how long does it take to solve the goal–the total number of steps it’s going to take.

Then I asked here, for all values of X and Y less than 10 (that is, for all capacities less than 10) and for all goals smaller than the larger capacity, what’s the longest? What’s the hardest? Which combination of those takes the most actions? The answer was that if you’re given glasses of size 7 and 9 and asked to pour out 8, that’s the hardest problem within that range.

### 6. Bridge Problem

Now let’s introduce another problem.

We have a cavern here with a rickety bridge across it.

On this side, which we’ll call “here,” we have a collection of 4 people who want to get to the other side, which we’ll call “there.”

Part of the problem is that this is nighttime, and it’s dark. Fortunately, our team has a flashlight, or a torch.

The setup is such that the bridge is so rickety that at most 2 people at a time can cross, so either one or two people can cross. It’s so dark that they need the flashlight with them. For everybody to get across, two people are going to have to go across, and one is going to have to come back with the flashlight. They’ll shuttle back and forth like that.

Now, each of the people has different physical abilities and fear levels, so they each take a different time to cross the bridge.

This person is speedy, takes 1 minute, the next 2 minutes, the next 5 minutes, and the last 10 minutes.

The question is what combinations of actions will get everybody across the bridge the fastest.

### 7. Representing State

Let’s take our usual approach: start making an inventory of concepts and figure out how to represent them.

(Inventory of concepts)
We want to represent a person, a collection of people, and probably it looks like we want to have two collections of people: one, the collection of people on the here side, and one, the collection of people on the there side. We also need to represent the light or the torch. From there it seems like that’s about it, and the other concepts we need are the concepts we already had of states and paths.

Now, how about the representation choices? For a person, well, I hate to reduce people to numbers, but in this case that seems like the perfect thing to do. This person, regardless of all his wonderful individual qualities, we can just represent by the number 5.

How about a collection of people? We could represent a collection as a tuple–1, 2, 5, 10– as a list, as a set. There’s also this data type in Python called a frozen set.

What I want you to tell me is, of these four, which do you think would be okay as representations, just in terms of being able to manipulate them and calculate the successors?

(Why the collections of people, here and there, must be hashable)
Which of these are hashable? Hashable is important, because if we’re going to use the same type of technique we used before for our search, we had our explored set, which was a set of states, and members of a set have to be hashable. That’s a property that we might want to worry about.
(Background: hashable in Python)

Now, I should say one more thing: in the description of the problem it was explicitly stated that each of the people has a different speed. That bothered me a little bit, because I could certainly imagine two people having the same speed. But let’s just solve what we were asked to solve, where every person has a distinct speed.

hashable    OK
□           □   tuple (1, 2, 5, 10)
□           □   list
□           □   set
□           □   frozenset

#### 7. Representing State Solution

The answer is that all four of these representations would be fine. We can generate successors by appending or adding elements to sets, lists, tuples, or frozen sets. None of those is too hard to do. It’s a little bit easier with sets than with the other ones. In terms of hashing, the immutable data types, frozenset and tuple, are hashable, and the mutable types, list and set, are not hashable.
(Immutable types are hashable; mutable types are unhashable.)

hashable    OK
■           ■   tuple (1, 2, 5, 10)
□           ■   list
□           ■   set
■           ■   frozenset
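The hashable column of the table can be reproduced directly with Python's built-in `hash()`, which raises `TypeError` for unhashable objects. The helper name `is_hashable` and the `candidates` dict are made up for this sketch.

```python
people = [1, 2, 5, 10]

candidates = {
    'tuple': tuple(people),
    'list': list(people),
    'set': set(people),
    'frozenset': frozenset(people),
}


def is_hashable(obj):
    """True if obj can be used as a set member or dict key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


for name, collection in candidates.items():
    print(name, is_hashable(collection))

# Only hashable states can go into the explored set.
explored = set()
explored.add((frozenset(people), frozenset(), 0))
```

The last two lines show why this matters for the search: a state built from frozensets can be stored in the explored set, whereas a state containing a list or a set would raise `TypeError` at that point.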

### 8. Bridge Successors

Now, out of those many choices, I made a choice: I’m going to represent the state as a tuple of (here, there, t), where “here” represents everything that’s on this side, “there” represents everything that’s on that side, and “t” is the total elapsed time since the start.
(Note: here is everyone on the here side, there is everyone on the there side, and t is the total time used since the start.)

I’m going to represent here and there with frozensets, because those are hashable. So this collection, here, would be the frozenset consisting of {1, 2, 5, 10}, plus the string 'light', which I’m going to use to represent the flashlight. There would be the empty frozenset.
(Note: frozensets are used for here and there because they are hashable.)

Now, consider this state here representing the start state. What are the successors of that state? Well, any one of the people could go across. They’ve got to bring the light with them. In the successor state, the light will definitely be there, and it will not be here. It can only be in one place. At least one of the people will be over there, and possibly two, so all combinations of sending either one person or two people to the other side will each be distinct successor states. Let’s see: we’ve got 4 x 3 is 12, but order doesn’t matter, so there are 6 of those. Then 4 more, so it looks like there should be 10 successor states.
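The count of 10 can be double-checked with a short sketch. The names `people`, `singles`, `pairs`, and `crossings` are illustrative and not part of the course code.

```python
from itertools import combinations

people = [1, 2, 5, 10]

# One person crossing alone: 4 choices.
singles = [frozenset([p]) for p in people]
# Two people crossing together: unordered pairs, 4 * 3 / 2 = 6 choices.
pairs = [frozenset(pair) for pair in combinations(people, 2)]

crossings = singles + pairs
print('%d singles + %d pairs = %d crossings'
      % (len(singles), len(pairs), len(crossings)))
```

Each crossing carries the light with it, so each one leads to a different successor of the start state.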

    def bsuccessors(state):
        here, there, t = state
        # your code here

What I want you to do is write the successor function for me. We’re calling it bsuccessors because we already had “a” and we’re on to “b”; or b could stand for “bridge.” Remember that the result of the successor function is a dictionary of state-action pairs. A state is this (here, there, t) tuple. Here and there have to be frozensets. The frozensets contain people (1, 2, 5, and 10) and/or the light, indicated by the string 'light'. Show me the function that will generate all the successors. I’ve given you a hint: here’s a way to break up the state into those three variables. Then put your code here.

Oh, one more thing I forgot: what are the actions? Let’s say that an action will be represented by a character string: an arrow going to the right, '->', if we’re moving from here to there, and an arrow going to the left, '<-', if we’re moving from there to here.

Peter’s coding exercise:

    # -----------------
    # User Instructions
    #
    # Write a function, bsuccessors(state), that takes a state as input
    # and returns a dictionary of {state:action} pairs.
    #
    # A state is a (here, there, t) tuple, where here and there are
    # frozensets of people (indicated by their times, e.g. 1, 2, 5, or
    # 10 minutes), and potentially the 'light'; t is a number indicating
    # the elapsed time.
    #
    # An action is a tuple (person1, person2, arrow), where arrow is
    # '->' for here to there or '<-' for there to here. When only one
    # person crosses, person2 will be the same as person1, so the
    # action (2, 2, '->') means that the person with a travel time of
    # 2 crossed from here to there alone.

    def bsuccessors(state):
        """Return a dict of {state:action} pairs. A state is a (here, there, t) tuple,
        where here and there are frozensets of people (indicated by their times) and/or
        the 'light', and t is a number indicating the elapsed time. Action is represented
        as a tuple (person1, person2, arrow), where arrow is '->' for here to there and
        '<-' for there to here."""
        here, there, t = state

    def test():

        assert bsuccessors((frozenset([1, 'light']), frozenset([]), 3)) == {
                    (frozenset([]), frozenset([1, 'light']), 4): (1, 1, '->')}

        assert bsuccessors((frozenset([]), frozenset([2, 'light']), 0)) == {
                    (frozenset([2, 'light']), frozenset([]), 2): (2, 2, '<-')}

        return 'tests pass'

    print(test())

#### 8. Bridge Successors Solution

(I read the comments several times and still had no idea where to start, so let’s look at Peter’s answer.)

    def bsuccessors(state):
        here, there, t = state
        if 'light' in here:
            return dict(((here  - frozenset([a, b, 'light']),
                          there | frozenset([a, b, 'light']),
                          t + max(a, b)),
                         (a, b, '->'))
                        for a in here if a != 'light'
                        for b in here if b != 'light')
        else:
            return dict(((here  | frozenset([a, b, 'light']),
                          there - frozenset([a, b, 'light']),
                          t + max(a, b)),
                         (a, b, '<-'))
                        for a in here if a != 'light'
                        for b in here if b != 'light')

(For now, the code keeps 'light' in the same set as the people; this can be refactored later if needed.)

Here’s my solution. I’ve got to say that my solution came out a little bit more complicated than I expected it to. I think maybe I made a bad choice for the representation. I threw in the flashlight along with the set of people, because I figured you want one set to represent everything that’s on one side. But now, after seeing how this came out, I’m thinking that maybe I should have had the flashlight be a separate part of the state. In other words, have the state be a 4-tuple: not the things that are here and there, but the people that are here, the people that are there, then the time, and then a fourth element saying where the flashlight is. That could be a Boolean, true or false, saying whether it is here; or a character string, saying “there” or “here”; or an integer, 0 or 1. I think it might’ve been easier if I’d chosen one of those representations. But it didn’t bother me enough to go back and make a change. If you want to, you could spend time refactoring and change that. I’m going to just push ahead.
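For comparison, here is a rough sketch of what the 4-tuple alternative might look like. This is not code from the course: the function name `bsuccessors_alt`, the field name `light_here`, and the choice of a Boolean for the flashlight are all assumptions made for illustration.

```python
def bsuccessors_alt(state):
    """Sketch: successors for a hypothetical (here, there, t, light_here) state.
    here and there are frozensets of people only; light_here is True when the
    flashlight is on the 'here' side."""
    here, there, t, light_here = state
    result = {}
    if light_here:
        # The light is here, so people can only cross from here to there.
        for a in here:
            for b in here:
                movers = frozenset([a, b])   # a == b means one person crosses
                new_state = (here - movers, there | movers, t + max(a, b), False)
                result[new_state] = (a, b, '->')
    else:
        # The light is there, so people can only cross from there to here.
        for a in there:
            for b in there:
                movers = frozenset([a, b])
                new_state = (here | movers, there - movers, t + max(a, b), True)
                result[new_state] = (a, b, '<-')
    return result

start = (frozenset([1, 2, 5, 10]), frozenset(), 0, True)
print(len(bsuccessors_alt(start)))
```

With the flashlight held separately, the sets contain only numbers, so the tests that exclude 'light' from the loops are no longer needed.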

(

    # If 'light' is on the here side:
    if 'light' in here:
        return dict(((here  - frozenset([a, b, 'light']),
                      # Line above: after a and b move from here to there,
                      # the here side loses frozenset([a, b, 'light']).
                      # Line below: after a and b move from here to there,
                      # the there side gains frozenset([a, b, 'light']).
                      there | frozenset([a, b, 'light']),
                      # The total elapsed time increases by max(a, b).
                      t + max(a, b)),
                     # '->' means that a and b move from here to there.
                     (a, b, '->'))
                    # a and b are the two people crossing together.
                    # The two for clauses below skip 'light' and iterate over
                    # every person in here, forming every pair (a, b).
                    for a in here if a != 'light'
                    for b in here if b != 'light')

)

Here’s what I did. I said if the light is here, then let’s look at all the people in here. We’ll look at all the pairs of people, a and b. To make sure that they’re people, I have to say that they’re not the light. For all pairs of people a and b, we can generate a successor state. Its first part is the set of things that were here minus the two people and the light, because the light is going to move from here to there. Its second part is everything that was already over on the other side, there, unioned with the things that are coming over, which are people a and b and the light. Then the time is the old time plus the maximum of the times that a and b take to get over. Now, I know it says in the specification here that the action is represented just by an arrow. If I want to get the problem right, I would do that, but I decided later on that maybe the action should be more than just the arrow. Maybe the action should also tell who went across. I have the option of doing that.

    def bsuccessors(state):
        here, there, t = state
        if 'light' in here:
            return dict(((here  - frozenset([a, b, 'light']),
                          there | frozenset([a, b, 'light']),
                          t + max(a, b)),
                         '->')
                        for a in here if a != 'light'
                        for b in here if b != 'light')
        else:
            return dict(((here  | frozenset([a, b, 'light']),
                          there - frozenset([a, b, 'light']),
                          t + max(a, b)),
                         '<-')
                        for a in here if a != 'light'
                        for b in here if b != 'light')

If I want to just solve the problem the way it was specified, then I would return just the arrow to represent the action, and I would do the same thing over here. One subtlety of this worked out well in my favor. It’s a little bit messy dealing with frozensets, and I don’t like that the name is so long, but I didn’t have to consider separately the case of one person going across and the case of two people going across. Because we are dealing with sets, the set of people {a, b} when a is equal to b is a set of 1 person. I get the 1-person crossing for free. That’s one nice thing about my representation.

But notice that everything is in flux here. I’m trying to choose a good representation, and I’m changing my mind as I go along. Should the actions be represented by a single arrow, or by an arrow along with the names of the people that are going? That’s all in flux. I should say that that type of flux is okay as long as it remains contained. If you have uncertainties that are going to cross barriers between lots of different functions, then you probably want to nail them down. If you think that they’re contained, then it’s okay to have some uncertainty and explore the exact options later.

### 9. Exercise: Paths Actions States

    def path_states(path):
        "Return a list of states in this path."
        return ## ???

    def path_actions(path):
        "Return a list of actions in this path."
        return ## ???

Here’s a quick exercise. Why don’t we just define two functions: path_states, which takes a path and returns a list of the states, and path_actions, which takes a path and returns a list of the actions. The path is interleaved, so it contains both, and we want to pull out just the states or just the actions.

Peter’s exercise:

    # ----------------
    # User Instructions
    #
    # Write two functions, path_states and path_actions. Each of these
    # functions should take a path as input. Remember that a path is a
    # list of [state, action, state, action, ... ]
    #
    # path_states should return a list of the states in a path, and
    # path_actions should return a list of the actions.

    def path_states(path):
        "Return a list of states in this path."
        return ## ???

    def path_actions(path):
        "Return a list of actions in this path."
        return ## ???

    def test():
        testpath = [(frozenset([1, 10]), frozenset(['light', 2, 5]), 5), # state 1
                    (5, 2, '->'),                                        # action 1
                    (frozenset([10, 5]), frozenset([1, 2, 'light']), 2), # state 2
                    (2, 1, '->'),                                        # action 2
                    (frozenset([1, 2, 10]), frozenset(['light', 5]), 5),
                    (5, 5, '->'),
                    (frozenset([1, 2]), frozenset(['light', 10, 5]), 10),
                    (5, 10, '->'),
                    (frozenset([1, 10, 5]), frozenset(['light', 2]), 2),
                    (2, 2, '->'),
                    (frozenset([2, 5]), frozenset([1, 10, 'light']), 10),
                    (10, 1, '->'),
                    (frozenset([1, 2, 5]), frozenset(['light', 10]), 10),
                    (10, 10, '->'),
                    (frozenset([1, 5]), frozenset(['light', 2, 10]), 10),
                    (10, 2, '->'),
                    (frozenset([2, 10]), frozenset([1, 5, 'light']), 5),
                    (5, 1, '->'),
                    (frozenset([2, 10, 5]), frozenset([1, 'light']), 1),
                    (1, 1, '->')]
        assert path_states(testpath) == [(frozenset([1, 10]), frozenset(['light', 2, 5]), 5), # state 1
                    (frozenset([10, 5]), frozenset([1, 2, 'light']), 2), # state 2
                    (frozenset([1, 2, 10]), frozenset(['light', 5]), 5),
                    (frozenset([1, 2]), frozenset(['light', 10, 5]), 10),
                    (frozenset([1, 10, 5]), frozenset(['light', 2]), 2),
                    (frozenset([2, 5]), frozenset([1, 10, 'light']), 10),
                    (frozenset([1, 2, 5]), frozenset(['light', 10]), 10),
                    (frozenset([1, 5]), frozenset(['light', 2, 10]), 10),
                    (frozenset([2, 10]), frozenset([1, 5, 'light']), 5),
                    (frozenset([2, 10, 5]), frozenset([1, 'light']), 1)]
        assert path_actions(testpath) == [(5, 2, '->'), # action 1
                                          (2, 1, '->'), # action 2
                                          (5, 5, '->'),
                                          (5, 10, '->'),
                                          (2, 2, '->'),
                                          (10, 1, '->'),
                                          (10, 10, '->'),
                                          (10, 2, '->'),
                                          (5, 1, '->'),
                                          (1, 1, '->')]
        return 'tests pass'

    print(test())

#### 9. Paths Actions States Solution

    def path_states(path):
        return path[0::2]

    def path_actions(path):
        return path[1::2]

Here’s the answer. These are pretty easy functions. We’re returning every other element. The states are at the even-numbered positions, and the actions are at the odd-numbered positions. Starting at 0, going to the end, every other one; starting at 1, going to the end, every other one.
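As a small illustration of the two slices, here is a sketch on a made-up three-element path (the path itself is an assumption for the example, not output from the course code):

```python
# A tiny interleaved path: state, action, state.
path = [
    (frozenset([1, 2, 'light']), frozenset(), 0),   # state at index 0
    (1, 2, '->'),                                   # action at index 1
    (frozenset(), frozenset([1, 2, 'light']), 2),   # state at index 2
]

states = path[0::2]    # start at index 0, take every second element
actions = path[1::2]   # start at index 1, take every second element
print(len(states), len(actions))
```

A path always starts and ends with a state, so it has one more state than it has actions.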

### 10. Bridge Solution

#### Supplementary material below the video: start

The items() method is a built-in dictionary method that returns a list of the (key, value) tuples in a dictionary (in Python 3 it returns a view object instead of a list). You can read more about it in the Python documentation: http://docs.python.org/library/stdtypes.html#dict.items

#### Supplementary material below the video: end

    def bridge_problem(here):
        here = frozenset(here) | frozenset(['light'])
        explored = set() # set of states we have visited
        # State will be a (people-here, people-there, time-elapsed)
        frontier = [ [(here, frozenset(), 0)] ] # ordered list of paths we have blazed
        if not here:
            return frontier[0]
        while frontier:
            path = frontier.pop(0)
            for (state, action) in bsuccessors(path[-1]).items():
                if state not in explored:
                    here, there, t = state
                    explored.add(state) # remember it so it is not queued again
                    path2 = path + [action, state]
                    if not here: ## That is, nobody left here
                        return path2
                    else:
                        frontier.append(path2)
                        frontier.sort(key=elapsed_time)
        return []

    def elapsed_time(path):
        return path[-1][2]

Now I’m going to show you the solution to the search problem rather than make you do it yourself, because there are still a few tricks here that are different from the previous search problem.

I’m going to define bridge_problem, which takes a sequence of elements, here. If you want, you can pass in a frozenset of {1, 2, 5, 10} or whatever, but if you didn’t, I’m going to do that conversion for you: I make it into a frozenset, and I add in the light in case you forgot to specify that. You can just ask for bridge_problem of the list [1, 2, 5, 10]; I’ll take care of it all for you. Like before, the explored set starts off being the empty set. The frontier starts off with the one initial state, which is the frozenset we just made for everything that’s on the here side, an empty frozenset for everything that’s on the there side, and 0 for the elapsed time. The idea is to get everybody away from here onto the other side. If we were given a trivial problem where there was already nobody here, then we’re done and we return that initial state. Otherwise, just like before, we start popping things off the frontier.

Just like before, we’re looking at our successors, and the only difference is down here. Before, we put each new path on the end of the frontier and took the shortest path off first, because in the water-pouring problem the best solution was the shortest one, the one with the smallest number of steps.

In this problem, the best solution is defined as the one with the smallest elapsed time, where the elapsed time of a path is the t element (index 2) of the final state of the path. That is the total elapsed time of the path. So we sort the frontier by total elapsed time.

Now it is a little bit wasteful here that each time through this loop we add only one new element and then sort the whole thing. Python’s actually pretty good at that type of sort. There are other ways to make that more efficient, but conceptually that’s what we’re doing. We always want to have the frontier sorted, so that we’re taking the fastest time first.
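One of the "other ways" could be a priority queue. The sketch below uses the standard library `heapq` module instead of re-sorting a list. It is an assumption about how one might do it, not the course’s code, and `add_to_frontier`, `pop_fastest`, and `tie_breaker` are illustrative names.

```python
import heapq
from itertools import count

def elapsed_time(path):
    "Total elapsed time of a path: the t component of its final state."
    return path[-1][2]

tie_breaker = count()   # insertion counter, so equal times never compare paths
frontier = []           # heap of (elapsed_time, insertion_order, path) entries

def add_to_frontier(path):
    "Push a path onto the heap, keyed by its elapsed time."
    heapq.heappush(frontier, (elapsed_time(path), next(tie_breaker), path))

def pop_fastest():
    "Remove and return the path with the smallest elapsed time."
    return heapq.heappop(frontier)[2]

# Three toy one-state paths whose final states have elapsed times 5, 2, and 10.
for t in (5, 2, 10):
    add_to_frontier([(frozenset(), frozenset(['light']), t)])

print([elapsed_time(pop_fastest()) for _ in range(3)])
```

Each push and pop costs time logarithmic in the size of the frontier, whereas a full sort looks at every element. The counter is there because two paths with equal times would otherwise be compared element by element, and frozensets do not have a total order.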

    print(bridge_problem([1, 2, 5, 10]))

    print(bridge_problem([1, 2, 5, 10])[1::2])
    [(5, 2, '->'), (1, 1, '<-'), (10, 1, '->')]

I typed that program in, and I ran it for the very first time. Bridge_problem([1, 2, 5, 10]). I got an answer back. Remember, the answer is a path, which is an alternation of states and actions.

We can pick out just the actions, like this, by asking for the path and then taking a slice of that path, starting at element number 1, going to the end, and giving us every other element. That’ll be just the actions. Those are these three actions.

That’s the proposed solution that my program came up with. My question is: is that correct? Yes or no?

(I also noticed that something is wrong here: after 5 and 2 move from here to there, 1 moves from there back to here, which makes no sense.)

##      Is that correct?
## 0    Yes
## 0    No

#### 10. Bridge Solution Solution

The answer is no, that’s not correct at all.

I’ve been cheating a little along the way in that I’ve been showing you solutions that I got the second or third time, once I’d debugged them and got them right. This time I wanted to show you a little bit of the debugging process. I got something wrong here. I don’t always get them right the first time. This is so wrong; look at what’s happening. I said the first move is that the 5 and the 2 go across together. It seems like a perfectly reasonable move. They’re going from here to there. The second move was that the 1, by his or herself, comes back from there to here. But 1 isn’t even over there. How could 1 come back? I must have messed up the successor function. Let’s take a look.

### 11. Debugging

    def bsuccessors(state):
        here, there, t = state
        if 'light' in here:
            return dict(((here  - frozenset([a, b, 'light']),
                          there | frozenset([a, b, 'light']),
                          t + max(a, b)),
                         '->')
                        for a in here if a != 'light'
                        for b in here if b != 'light')
        else:
            return dict(((here  | frozenset([a, b, 'light']),
                          there - frozenset([a, b, 'light']),
                          t + max(a, b)),
                         '<-')
                        for a in here if a != 'light'
                        for b in here if b != 'light')

Here’s the problem. I was careful about doing the here case. I made up this nice expression, but then I did a copy and paste, and I edited the expression, and I swapped around the here and the there in this part. When I created the new state, I did that correctly. But down here I’m iterating over the people that were here. I’m trying to have candidates move from there to here, and I’m iterating over people that are here. That doesn’t make any sense at all.
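A bug of this kind can also be caught mechanically. The following is a sketch of a sanity check, assuming (here, there, t) states and (person1, person2, arrow) actions; the helper name `illegal_moves` and the hand-built `buggy_path` are illustrative and not part of the course code.

```python
def illegal_moves(path):
    """Return the actions in a [state, action, state, ...] path whose travelers
    (or the light) were not on the side they supposedly departed from."""
    bad = []
    for i in range(1, len(path), 2):
        here, there, t = path[i - 1]       # the state just before this action
        a, b, arrow = path[i]
        origin = here if arrow == '->' else there
        if not (a in origin and b in origin and 'light' in origin):
            bad.append(path[i])
    return bad

# Hand-built path that mirrors the first two moves of the faulty output.
buggy_path = [
    (frozenset([1, 2, 5, 10, 'light']), frozenset(), 0),
    (5, 2, '->'),
    (frozenset([1, 10]), frozenset([2, 5, 'light']), 5),
    (1, 1, '<-'),
    (frozenset([1, 10, 'light']), frozenset([2, 5]), 6),
]
print(illegal_moves(buggy_path))
```

The check flags the second move: person 1 is asked to come back from there without ever having crossed.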

    def bsuccessors(state):
        here, there, t = state
        if 'light' in here:
            return dict(((here  - frozenset([a, b, 'light']),
                          there | frozenset([a, b, 'light']),
                          t + max(a, b)),
                         '->')
                        for a in here if a != 'light'
                        for b in here if b != 'light')
        else:
            return dict(((here  | frozenset([a, b, 'light']),
                          there - frozenset([a, b, 'light']),
                          t + max(a, b)),
                         '<-')
                        for a in there if a != 'light'
                        for b in there if b != 'light')

I’ve got to fix that. Now the question is: is it going to run correctly this time? I found a bug. I fixed it. Is the program correct now? Yes, no, or not enough information, you can’t tell yet?

#### 11. Debugging Solution

I think the right answer is that you just can’t tell. I’m hopeful that it’s going to work, but I only know that I fixed one bug. I don’t know whether there are other bugs lurking in there.

### 12. Did it work

    print(bridge_problem([1, 2, 5, 10]))

    print(bridge_problem([1, 2, 5, 10])[1::2])
    [(2, 1, '->'), (1, 1, '<-'), (5, 1, '->'), (1, 1, '<-'), (10, 1, '->')]

Now I run it again. This is the path I get, and these are the actions in the path. Let’s see if it makes sense. Now 1 and 2, the two fastest people, go over first. That looks like a pretty good solution. It came up with a total time of 19. The question is: is the program correct now? The choices are: yes, it is; no, this example is wrong (there might be a faster solution that it didn’t find); no, this example is okay (it is the fastest) but other examples are wrong; or you still can’t tell.

#### 12. Did it work Solution

The answer to that is that this example is actually wrong. It does get everybody across, and it gets them across in 19, but there’s another solution that’s faster than that. So let’s look at our program and see what we did wrong and why we missed the fastest solution.
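To make the comparison concrete, here is a sketch that adds up the crossing times of two action lists. The first list is the program’s output shown above. The second is one well-known 17-minute schedule for this puzzle, supplied here as an illustration rather than taken from the lecture, and `total_time` is an illustrative helper name.

```python
def total_time(actions):
    "Sum of crossing times; each crossing takes as long as its slower walker."
    return sum(max(a, b) for (a, b, arrow) in actions)

# The actions the program found.
found = [(2, 1, '->'), (1, 1, '<-'), (5, 1, '->'), (1, 1, '<-'), (10, 1, '->')]

# A faster schedule: the two slowest people cross together in the middle.
faster = [(1, 2, '->'), (1, 1, '<-'), (5, 10, '->'), (2, 2, '<-'), (1, 2, '->')]

print('%d versus %d' % (total_time(found), total_time(faster)))
```

Sending 5 and 10 across together pays the cost of 10 only once, which is where the two minutes are saved.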

### 13 Improving the Solution

Unfortunately, we got the wrong answer. Yes, we got a path that leads to the goal, but we didn’t get the fastest path. Let’s see what went wrong.

We had our start state, and then we started expanding that and moving out. That defined our frontier. Then we were very careful about sorting the elements on the frontier, and we pulled off the very best, the one with the least cost, and expanded out from there. Let’s say the cost of getting to the end of this path is 14, this one 15, and this one 16. The 14 is the lowest-cost path, so we expand that first. Let’s say one of the steps costs 5, so that gets us to this state with a cost of 19. Let’s say that is in fact a goal state. Now we just stopped there. We said we took off the least-cost path, we expanded it, we found a goal, we’re done.

When we were looking for the shortest path in terms of the least number of steps, that was the right approach, but when we’re looking for the least-cost path, it’s not. Even though we pulled off the cheapest path here, the one with the lowest cost, here’s another path that has a higher cost, but if we expand it there might be a step that costs only 2. We get to this state with cost 17, and that’s also a goal. So we made a mistake. We stopped here when we got this result that was 19, when we really wanted this result that was 17. I think the problem was that we were acting prematurely. Just because this was the fastest path on the frontier, we went ahead and took one step away from it and accepted the result, when that might not be the best answer overall.

How can we fix this?

One possibility would be to exhaust the frontier. That is, we’ve got a frontier here. Even though we find a solution from the first element of the frontier, we keep going until we visit everybody on the frontier and give everybody a chance to find a better solution.

Another possibility is to give everybody one more chance. Once we’ve found the first solution, we say, okay, everybody on the frontier gets one more step to see if they can find a solution.

(I did not fully understand the following paragraph.)
The third possibility would be to test later. That is, when we generate this solution, we don’t check right then to see if it is a solution. Rather, we just go ahead and throw it onto the frontier, and we only check whether it’s a solution when we pull it off the frontier. So we make the checks not when we generate new nodes and are about to add them, but later, once we’ve pulled them off the frontier.

Now tell me which, if any, of these will work to give us this fastest solution.

how to fix

□   exhaust frontier
□   one step path
■   test later

#### 13 Improving the Solution Solution

(In this particular problem there is only a finite number of states, but in some problems the number of states may be infinite.)
The answer is that exhausting the frontier won’t work, because the frontier might be infinite. In this particular problem there’s only a finite number of states, but in some problems there might be an infinite number. If we kept on generating new elements onto the frontier, we might never get to the end.

(It might take two steps.)
Doing one step won’t do it either. In this case, if once we found the solution from this 14 we then gave all the other guys one step, it would work. But it might be that it took two steps. Maybe from the 15 there’d be one step that costs 1 and another step that costs 2. It might not be just one step, so that’s not going to work.

(I read this several times and still do not understand it; skipping for now.)
The test-later part will work. The reason it works is that the frontier is kept sorted and we always pull off the cheapest path first. If we put a goal path back onto the frontier rather than recognizing immediately that it’s a goal, then, since we’re pulling paths off in order of increasing cost, we know that the first one we pull off the frontier that is a goal must be the cheapest path to the goal. Any path that could still lead to a cheaper goal would itself be cheaper, since step costs are never negative, so it would have been pulled off and expanded first.
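The difference between testing early and testing late can be reproduced on the toy numbers from the discussion above (paths costing 14, 15, and 16, then a step of 5 and a step of 2). The sketch below is an illustration only: the graph, the state names, and `lowest_cost_search` are assumptions, and the explored set is omitted because this toy graph is a tree.

```python
def lowest_cost_search(start, successors, is_goal, test_late):
    """Toy lowest-cost-first search. successors(state) returns a dict of
    {next_state: step_cost}. Returns (total_cost, path) or None."""
    frontier = [(0, [start])]              # (cost, path) pairs, sorted by cost
    while frontier:
        cost, path = frontier.pop(0)
        if test_late and is_goal(path[-1]):
            return cost, path              # goal test when a path is pulled off
        for state, step in successors(path[-1]).items():
            path2 = path + [state]
            if not test_late and is_goal(state):
                return cost + step, path2  # goal test when a path is generated
            frontier.append((cost + step, path2))
        frontier.sort(key=lambda entry: entry[0])
    return None

graph = {
    'start': {'A': 14, 'B': 15, 'C': 16},
    'A': {'goal1': 5},      # reaches a goal at total cost 19
    'B': {'goal2': 2},      # reaches a goal at total cost 17
    'C': {},
    'goal1': {},
    'goal2': {},
}
successors = lambda state: graph[state]
is_goal = lambda state: state.startswith('goal')

print(lowest_cost_search('start', successors, is_goal, test_late=False))
print(lowest_cost_search('start', successors, is_goal, test_late=True))
```

With the early test, the search stops at the cost-19 goal as soon as it is generated from A. With the late test, that goal waits on the frontier while B is expanded, and the cost-17 goal is pulled off first.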

### 14. Modify Code

(I did not fully understand either the problem or its solution.)
What I want you to do is take this, which is the same version of the bridge problem solver that we saw before, and modify it so that it tests for the goal later: after pulling a state off the frontier, not when we’re about to put it on the frontier.

    # -----------------
    # User Instructions
    #
    # Modify the bridge_problem(here) function so that it
    # tests for goal later: after pulling a state off the
    # frontier, not when we are about to put it on the
    # frontier.

    def bsuccessors(state):
        """Return a dict of {state:action} pairs.  A state is a (here, there, t) tuple,
        where here and there are frozensets of people (indicated by their times) and/or
        the light, and t is a number indicating the elapsed time."""
        here, there, t = state
        if 'light' in here:
            return dict(((here  - frozenset([a, b, 'light']),
                          there | frozenset([a, b, 'light']),
                          t + max(a, b)),
                         (a, b, '->'))
                        for a in here if a != 'light'
                        for b in here if b != 'light')
        else:
            return dict(((here  | frozenset([a, b, 'light']),
                          there - frozenset([a, b, 'light']),
                          t + max(a, b)),
                         (a, b, '<-'))
                        for a in there if a != 'light'
                        for b in there if b != 'light')

    def elapsed_time(path):
        return path[-1][2]

    def bridge_problem(here):
        """Modify this to test for goal later: after pulling a state off frontier,
        not when we are about to put it on the frontier."""
        ## modify code below
        here = frozenset(here) | frozenset(['light'])
        explored = set() # set of states we have visited
        # State will be a (people-here, people-there, time-elapsed)
        frontier = [ [(here, frozenset(), 0)] ] # ordered list of paths we have blazed
        if not here:
            return frontier[0]
        while frontier:
            path = frontier.pop(0)
            for (state, action) in bsuccessors(path[-1]).items():
                if state not in explored:
                    here, there, t = state
                    explored.add(state) # remember it so it is not queued again
                    path2 = path + [action, state]
                    if not here:  ## That is, nobody left here
                        return path2
                    else:
                        frontier.append(path2)
                        frontier.sort(key=elapsed_time)
        return []

    def test():
        assert bridge_problem(frozenset((1, 2),))[-1][-1] == 2 # the [-1][-1] grabs the total elapsed time
        assert bridge_problem(frozenset((1, 2, 5, 10),))[-1][-1] == 17
        return 'tests pass'

    print(test())

#### 14. Modify Code Solution

(I did not fully understand either the problem or its solution.)
Here’s the solution.

    def bridge_problem(here):
        "Find the fastest (least elapsed time) path to the goal in the bridge problem."
        here = frozenset(here) | frozenset(['light'])
        explored = set() # set of states we have visited
        # State will be a (peoplelight_here, peoplelight_there, time_elapsed) tuple
        # E.g. ({1, 2, 5, 10, 'light'}, {}, 0)
        frontier = [ [(here, frozenset(), 0)] ] # ordered list of paths we have blazed
        while frontier:
            path = frontier.pop(0)
            here1, there1, t1 = state1 = path[-1]
            if not here1 or here1 == set(['light']): ## Check for solution when we pull best path
                return path
            for (state, action) in bsuccessors(state1).items():
                if state not in explored:
                    here, there, t = state
                    explored.add(state) # remember it so it is not queued again
                    path2 = path + [action, state]
                    # Don't check for solution when we extend a path
                    frontier.append(path2)
                    frontier.sort(key=elapsed_time)
        return []

Two changes are here and here. We pull up the test to this point where we check for solution when we pulled the best path off, and we check for our goal only there, and we don’t check for the goal when we’re putting something on the frontier.

It looks like this is a tricky problem. There are lots of cases that we have to take care of. It seems like a good idea to write some more tests. I’ve done that here. I’ve written a few tests, and I really should write a lot more. What I want you to do is write at least 3 more tests and run them. I don’t have a way of knowing for sure whether you’ve come up with good ones or not, but go ahead and add at least three more tests to this class of tests.

### 16. Refactoring Paths

Now, mostly we’re looking for correct code. If you wrote some more tests, you may start to have some more confidence in the code that we have. We’re also considering efficiency to some degree, and it seems like there’s a big problem with the efficiency of the program we have so far. Let me show you one of the issues.

We represented states as a (here, there, t) triple. The problem is that there can be two states that have identical here and there but differ in t, and they’re going to be considered different states. Why is that a problem?

Consider this problem. We have two people: one who takes 1 time unit to cross the bridge, and one who takes 1000. It seems pretty clear there is an easy solution: the two of them go across together. It takes 1000, but look how we’re going to explore this space.

(The program may have an efficiency problem.)

We’re going to start out in the initial state at time 0, and then we’re going to start adding things to the frontier. Out of all the ways we could cross, the one that adds the least is for the 1 to go across by himself. Now the 1 is on the other side and the 1000 is on the original side. That took only 1 time unit. Now what’s the fastest thing we can do after that?

We could take 1 more step and go back to the original state. Here we had 1 and K (we’ll write K for the 1000) on the left-hand side. Here K was left behind and 1 went over to the right. Here we took one more time unit, and we had 1 and K on this side again. If we continue taking the fastest step we can, we’ll get to another distinct state where K is on this side and 1 is on the other side. The flashlight is always going with the 1. We keep on going like that. We’ll go out 1000 different steps, and each of these will be a distinct state, because this is the state with time t equals 0, here t equals 1, then t equals 2, t equals 3, and so on.

But really, although it looks like we’re getting different states, in another way of looking at it, we’re always getting the same state. We’re just going back and forth from here to there and back to here and back and back. We’re going around in circles.

In order to recognize that these are in fact the same states, we’re going to have to take t out of our state, and we’re going to have to deal with the t someplace else. We want our representation of a state to be just (here, there).
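The effect of keeping t inside the state can be seen with a small sketch. The variable names `with_time` and `without_time` are illustrative; the example simply counts how many distinct set members the same situation produces.

```python
here = frozenset([1, 1000, 'light'])
there = frozenset()

# Every round trip by the fast walker returns to this situation two time
# units later. With t inside the state, each return is a distinct member.
with_time = set((here, there, t) for t in range(0, 1000, 2))

# Without t, all of those returns collapse into a single state.
without_time = set((here, there) for t in range(0, 1000, 2))

print('%d states with t, %d state without t'
      % (len(with_time), len(without_time)))
```

Only the representation without t lets an explored set recognize that the search is going around in circles.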

We’ve got to figure out someplace else to put the t. I’m not sure what the right way to do it is, but why don’t we do it this way?

We have a path, which is [state, action, state, ..., action, state] (that is, path: [s, a, s, ...]); it keeps on alternating between states and actions.

Let’s change that so that the path is a state followed by a tuple of the action and the total time it took after applying that action, then the next state, then the next action and the total time after applying that, and so on (that is, [s, (a, tot), s, ...]).
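As a sketch of the new layout, the hypothetical helper below (`extend_path` is an illustrative name, not part of the course code) appends an (action, running total) pair and the next state to a path whose states are plain (here, there) tuples.

```python
def extend_path(path, action, state, step_cost):
    "Return a new path with (action, running_total) and state appended."
    # The running total so far is stored with the most recent action, if any.
    previous_total = path[-2][1] if len(path) >= 3 else 0
    return path + [(action, previous_total + step_cost), state]

start = (frozenset([1, 2, 'light']), frozenset())
middle = (frozenset([2]), frozenset([1, 'light']))

path = [start]
path = extend_path(path, (1, 1, '->'), middle, 1)   # 1 crosses alone
path = extend_path(path, (1, 1, '<-'), start, 1)    # 1 comes back
print(path[-2])   # ((1, 1, '<-'), 2)
```

The first and last states of this path are equal even though two time units have passed, because the time now lives in the path rather than in the state.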

That’ll be our new representation. States are going to look like that, and paths are going to look like that. Now, I want you to write the new successor function for the bridge problem.

We’ll call it bsuccessors2, with the “2” just to keep it distinct from the first version. Again it returns a dict of state-action pairs. A state now is just a two-tuple of (here, there), and here and there are still frozensets. It’s pretty much the same, except that we dropped out the time t.

Go ahead and implement that for me.

    # -----------------
    # User Instructions
    #
    # write a function, bsuccessors2 that takes a state as input
    # and returns a dictionary of {state:action} pairs.
    #
    # The new representation for a path should be a list of
    # [state, (action, total time), state, ... , ], though this
    # function will just return {state:action} pairs and will
    # ignore total time.
    #
    # The previous bsuccessors function is included for your reference.

    def bsuccessors2(state):
        """Return a dict of {state:action} pairs. A state is a
        (here, there) tuple, where here and there are frozensets
        of people (indicated by their travel times) and/or the light."""

    def bsuccessors(state):
        """Return a dict of {state:action} pairs.  A state is a (here, there, t) tuple,
        where here and there are frozensets of people (indicated by their times) and/or
        the light, and t is a number indicating the elapsed time."""
        here, there, t = state
        if 'light' in here:
            return dict(((here  - frozenset([a, b, 'light']),
                          there | frozenset([a, b, 'light']),
                          t + max(a, b)),
                         (a, b, '->'))
                        for a in here if a != 'light'
                        for b in here if b != 'light')
        else:
            return dict(((here  | frozenset([a, b, 'light']),
                          there - frozenset([a, b, 'light']),
                          t + max(a, b)),
                         (a, b, '<-'))
                        for a in there if a != 'light'
                        for b in there if b != 'light')

    def test():
        here1 = frozenset([1, 'light'])
        there1 = frozenset([])

        here2 = frozenset([1, 2, 'light'])
        there2 = frozenset([3])

        assert bsuccessors2((here1, there1)) == {
                (frozenset([]), frozenset([1, 'light'])): (1, 1, '->')}
        assert bsuccessors2((here2, there2)) == {
                (frozenset([1]), frozenset(['light', 2, 3])): (2, 2, '->'),
                (frozenset([2]), frozenset([1, 3, 'light'])): (1, 1, '->'),
                (frozenset([]), frozenset([1, 2, 3, 'light'])): (2, 1, '->')}
        return 'tests pass'

    print(test())

#### 16. Refactoring Paths Solution

    def bsuccessors2(state):
        here, there = state
        if 'light' in here:
            return dict(((here  - frozenset([a, b, 'light']),
                          there | frozenset([a, b, 'light'])),
                         (a, b, '->'))
                        for a in here if a != 'light'
                        for b in here if b != 'light')
        else:
            return dict(((here  | frozenset([a, b, 'light']),
                          there - frozenset([a, b, 'light'])),
                         (a, b, '<-'))
                        for a in there if a != 'light'
                        for b in there if b != 'light')

Here it is–pretty straightforward. I just dropped out the time, and I’m just building up these two components.

### 17. Calculating Costs

(Because the time was removed from the successor function, it has to be recorded somewhere else.)
(In other problems the quantity being spent may not be time, so a more general notion, cost, is used here.)
Now, we got rid of the times in the successor function, so we’ve got to put them back in someplace. I’m going to generalize a little bit, and instead of talking about times, I’m going to talk about costs for a path. I’m thinking that we might want to do some other problems that also have paths in them and that aren’t optimizing time but are optimizing some other type of cost. What I want you to do for me is to define this function path_cost, which takes a path as input and returns the total cost of that path. That’s already stored away. We don’t have to compute anything new, because we decided that our convention for paths was that the cost would be stored there. That is, we said that a path is a state, followed by an (action, total cost) pair, followed by another state, and so on. Here I’ve just said: if we don’t have any actions there, or if it’s the empty path, then do one thing. Otherwise do something else.

def path_cost(path):
    "The total cost of a path (which is stored in a tuple with the final action)."
    ## path = [state, (action, total_cost), state, ...]
    if len(path) < 3:
        return # ???
    else:
        # ???

def bcost(action):
    "Returns the cost (a number) of an action in the bridge problem."
    # An action is an (a, b, arrow) tuple; a and b are times; arrow is a string
    a, b, arrow = action
    return # ???

Then I also want you to define the bridge cost; bcost is the abbreviation I’ll use. That’s the cost of an individual action. An action in this domain is something like (2, 5, '->'): the people with times 2 and 5 cross to the right. I want you to figure out the cost of that action.

#### 17. Calculating Costs Solution

def path_cost(path):
    "The total cost of a path (which is stored in a tuple with the final action)."
    ## path = [state, (action, total_cost), state, ...]
    if len(path) < 3:
        return 0
    else:
        action, total_cost = path[-2]
        return total_cost

def bcost(action):
    "Returns the cost (a number) of an action in the bridge problem."
    # An action is an (a, b, arrow) tuple; a and b are times; arrow is a string
    a, b, arrow = action
    return max(a, b)

Pretty straightforward. If we don’t have at least 3 elements in the path, that means we don’t have an action there. It’s just an individual state, so the cost should be 0. Otherwise, we look at the second element from the end. The last element is the final state, and the one before it is the (final action, total cost) tuple, so we just return the total cost. For the bridge cost of an action, it’s just the maximum of the two times.
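
To make the path convention concrete, here is a minimal sketch that builds a two-step bridge path by hand and reads the cost back out. The sample states and the small loop that extends the path are illustrative assumptions, not part of the course code; path_cost and bcost follow the solution described above.

```python
def path_cost(path):
    "Total cost of a path, read from the last (action, total_cost) pair."
    if len(path) < 3:
        return 0
    action, total_cost = path[-2]
    return total_cost

def bcost(action):
    "Cost of one bridge action: the slower of the two walkers."
    a, b, arrow = action
    return max(a, b)

# Hypothetical example: people 1 and 2 start on the left with the light.
start = (frozenset([1, 2, 'light']), frozenset())
both_across = (frozenset(), frozenset([1, 2, 'light']))
one_back = (frozenset([1, 'light']), frozenset([2]))

path = [start]
for action, state in [((1, 2, '->'), both_across), ((1, 1, '<-'), one_back)]:
    # Each new step stores the running total next to its action.
    path = path + [(action, path_cost(path) + bcost(action)), state]

print(path_cost([start]))  # 0
print(path_cost(path))     # 3
```

Because the running total travels with the path, path_cost never has to walk the whole list; it only looks at the second-to-last element.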

### 18. Putting it Together

Now we’ve got our new successor function. We know how to deal with costs. Now it’s time to put it all together. It’s a little bit tricky, so I’m not going to ask you to do this as a quiz. If you want to you can pause the video now and do it on your own. You’re certainly welcome to give it a try.

I’m going to go ahead and show it to you. Okay, here it is.

def bridge_problem2(here):
    here = frozenset(here) | frozenset(['light'])
    explored = set() # set of states we have visited
    # state will be a (peoplelight_here, peoplelight_there) tuple
    # E.g. ({1, 2, 5, 10, 'light'}, {})
    frontier = [ [(here, frozenset())] ] # ordered list of paths we have blazed
    while frontier:
        path = frontier.pop(0)
        here1, there1 = state1 = final_state(path)
        if not here1 or (len(here1) == 1 and 'light' in here1):
            return path
        explored.add(state1) # remember states we have already expanded
        pcost = path_cost(path)
        for (state, action) in bsuccessors2(state1).items():
            if state not in explored:
                total_cost = pcost + bcost(action)
                path2 = path + [(action, total_cost), state]
                add_to_frontier(frontier, path2)
    return Fail

def final_state(path): return path[-1]

def add_to_frontier(frontier, path):
    "Add path to frontier, replacing costlier path if there is one."
    # (This could be done more efficiently.)
    # Find if there is an old path to the final state of this path.
    old = None
    for i,p in enumerate(frontier):
        if final_state(p) == final_state(path):
            old = i
            break
    if old is not None and path_cost(frontier[old]) < path_cost(path):
        return # Old path was better; do nothing
    elif old is not None:
        del frontier[old] # Old path was worse; delete it
    ## Now add the new path and re-sort
    frontier.append(path)
    frontier.sort(key=path_cost)

The tricky part is just keeping track of the costs and putting them in the right location. Just like before we’re popping paths off the frontier. We’re checking to see if we hit a goal. We’re keeping track of states that we’ve already explored.

But now we’re doing something new. We’re computing the cost of the path that we just popped off, and that’s just pulling the cost out, because we’ve already computed it and stored it in the final action. Then for each of the successors, we figure out the total cost is the cost of the path that we already computed so far plus the bridge cost of the individual action. Total cost so far plus cost for one more action, and then we just throw that into the path. The new path is equal to the old path plus the action total cost tuple plus the state that we end up with. Add that to the frontier and we’re done.

def final_state(path): return path[-1]

I just define this simple one-line function here. The final_state of a path is the last element of the path. I use that there.

def add_to_frontier(frontier, path):
    "Add path to frontier, replacing costlier path if there is one."
    # (This could be done more efficiently.)
    # Find if there is an old path to the final state of this path.
    old = None
    for i,p in enumerate(frontier):
        if final_state(p) == final_state(path):
            old = i
            break
    if old is not None and path_cost(frontier[old]) < path_cost(path):
        return # Old path was better; do nothing
    elif old is not None:
        del frontier[old] # Old path was worse; delete it
    ## Now add the new path and re-sort
    frontier.append(path)
    frontier.sort(key=path_cost)

Here is adding to the frontier. Now, it could just throw the path on there the way we did before, but there’s a tricky part here. The complication that we haven’t dealt with before is that there may be two different paths that end up in the same state. If that’s the case, we want to keep the best one. We don’t want to reach a state by a path that’s more expensive. So we look to see: is there a path already on the frontier that gets to the same state? If there is, we check which one has the better path cost and keep that one. Finally, we re-sort the frontier by path cost so that the cheapest path is popped first.
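
The replacement rule is easier to see on a toy frontier. The sketch below reuses add_to_frontier as described above, but the states 'A', 'B', 'C' and the action labels are made-up placeholders, not part of the bridge problem.

```python
def final_state(path):
    return path[-1]

def path_cost(path):
    "Total cost stored with the final action; 0 for a path with no actions."
    if len(path) < 3:
        return 0
    action, total_cost = path[-2]
    return total_cost

def add_to_frontier(frontier, path):
    "Add path to frontier, replacing costlier path if there is one."
    old = None
    for i, p in enumerate(frontier):
        if final_state(p) == final_state(path):
            old = i
            break
    if old is not None and path_cost(frontier[old]) < path_cost(path):
        return  # the old path was better; keep it
    elif old is not None:
        del frontier[old]  # the old path was worse; drop it
    frontier.append(path)
    frontier.sort(key=path_cost)

frontier = []
add_to_frontier(frontier, ['A', ('A->B', 5), 'B'])
add_to_frontier(frontier, ['A', ('A->C', 2), 'C'])
# A cheaper route to B (total cost 4) replaces the direct one (cost 5).
add_to_frontier(frontier, ['A', ('A->C', 2), 'C', ('C->B', 4), 'B'])
# A costlier route to C (total cost 9) is ignored.
add_to_frontier(frontier, ['A', ('A->B', 5), 'B', ('B->C', 9), 'C'])

summary = [(final_state(p), path_cost(p)) for p in frontier]
print(summary)  # [('C', 2), ('B', 4)]
```

The linear scan and full re-sort are what the "could be done more efficiently" comment refers to; for the handful of states in the bridge problem they are cheap enough.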

### 19. Generalizing

The moral of the story is that this is tricky. There are a lot of cases to deal with in getting this kind of search just right, and we made a couple of mistakes along the way. I sort of duplicated the history of the field. There are a couple of tools we can use to avoid mistakes.

One tool is to write lots of tests, and I just didn’t do enough testing. I wanted to go fast. I wanted to be able to show you some of the interesting ideas. I put in a few tests, but I really need more to have confidence that I’ve got this right.

The second thing is to use, or better yet, reuse existing tools. Every time I do a search, I don’t want to be rewriting this search routine from scratch, because it is tricky and I will make mistakes. Rather, I want to write it once, or have somebody else write it once, and then reuse it. In order to do that, we’re going to have to figure out how to generalize. I’ve written a function that’s good only for solving the bridge problem through search. I want to write a search function that can solve a wide variety of problems. Then I want to reuse that so that I’m not repeating mistakes, and I’m not introducing new errors.

### 20. Missionaries and Cannibals

Let’s do an example to figure out how to do generalization. What do we generalize over? Well, we generalize over problems. So we’re going to need another problem. Rather than have a problem dealing with costs, which we saw were complicated, let’s just do a problem where we’re finding the shortest path. That is, the least number of steps to a solution.

I’m going to choose a classic problem called the “missionaries and cannibals” problem.

It works like this:

• There’s a river we have to cross, similar to the bridge, but this time it’s a river, and we’ve got a boat.
• On this side of the river there are 6 people. No flashlight, but a boat and 6 people.
• Three of these people are missionaries, and three are cannibals. The goal is to get everybody over to the other side.

What makes it hard is that there are two rules.

1. At most 2 people fit in the boat. One person can cross alone or two can cross together, but it takes either 1 or 2 people to get the boat from one side to the other and to get it back.
2. We don’t want the cannibals eating the missionaries. If we leave more cannibals than missionaries on either side of the river, whether on this side or on that side, then the cannibals are going to gang up and eat the missionaries, and we won’t be able to get everybody across. We have to shuttle them back and forth in such a way that this never occurs.

Now, let’s try to come up with a good representation for state.

• One possibility would be to have a set of missionaries, a set of cannibals, and a boat (call that a Boolean, yes or no), saying what’s on the starting side and leaving out what’s on the other side, because we can figure that out. Given that we know we have three missionaries, if there’s a set of 2 on one side, then on the other side there must be 1.
• Another possibility is that we have 3 integers: the number of missionaries, the number of cannibals, and the number of boats that are on the starting side. These are all integers.
• Then the third possibility is that we have 6 numbers: the number of missionaries, cannibals, and boats on the first side, and the number of each of those on the other side.

set(M), set(C), B
M, C, B (ints)
M1, C1, B1, M2, C2, B2

It may be subjective which of these is best, but I want you to tell me which of these would be sufficient for representing the state.

#### 20. Missionaries and Cannibals Solution

The answer is that all of them would work. All of them have everything you need to know to solve this specific problem of three missionaries, three cannibals and the boat.

### 21. Generalized State

Now the next question is what representation for states we should use if we want to generalize this problem, so that we’re given an initial state in which there can be any number of missionaries, cannibals, and boats on one side of the river and any number on the other. Which of these representations is sufficient under those conditions?

#### 21. Generalized State Solution

In this case since we don’t know that there’s only three missionaries, we need to have both sets of numbers. We can’t just say there’s two missionaries on the left; therefore, there’s one on the right. We don’t know how many are going to be on the right. So this six-element tuple would do the job where these two wouldn’t.

set(M), set(C), B
M, C, B (ints)
M1, C1, B1, M2, C2, B2  (Right Answer)

### 22. csuccessors

#### Supplementary material below the video (start)

Oops, there should be a tenth string, ‘<-CC’, in line 12.

Note: After we shot this video, we noticed there is a bug in the solution. The successor function correctly checks to make sure that there are not more cannibals than missionaries in a successor state, but it allows states with negative numbers of people. We decided not to re-shoot the video, to remind you that everyone makes mistakes, and you should write more tests. This also gives you a chance to write the correct version of the solution yourself.

#### Supplementary material below the video (end)

def csuccessors(state):
    """Find successors (including ones that result in dining) to this state.
    But a state where the cannibals can dine has no successors."""
    M1, C1, B1, M2, C2, B2 = state

Now I want you to define the successor function for this problem. We’ll give you a hint that a state is of that form. Return all the successors. The successors should be a dictionary as before. We want to include successor states that result in cannibals being able to eat, but such a state should have no successors itself. In other words, we’re free to generate a successor state that has, say, two cannibals and one missionary in one location, but if we’re given such a state then we should return the empty dictionary of successors.

# -----------------
# User Instructions
#
# Write a function, csuccessors, that takes a state (as defined below)
# as input and returns a dictionary of {state:action} pairs.
#
# A state is a tuple with six entries: (M1, C1, B1, M2, C2, B2), where
# M1 means 'number of missionaries on the left side.'
#
# An action is one of the following ten strings:
#
# 'MM->', 'MC->', 'CC->', 'M->', 'C->', '<-MM', '<-MC', '<-M', '<-C', '<-CC'
# where 'MM->' means two missionaries travel to the right side.
#
# We should generate successor states that include more cannibals than
# missionaries, but such a state should generate no successors.

def csuccessors(state):
    """Find successors (including those that result in dining) to this
    state. But a state where the cannibals can dine has no successors."""
    M1, C1, B1, M2, C2, B2 = state

def test():
    assert csuccessors((2, 2, 1, 0, 0, 0)) == {(2, 1, 0, 0, 1, 1): 'C->',
                                               (1, 2, 0, 1, 0, 1): 'M->',
                                               (0, 2, 0, 2, 0, 1): 'MM->',
                                               (1, 1, 0, 1, 1, 1): 'MC->',
                                               (2, 0, 0, 0, 2, 1): 'CC->'}
    assert csuccessors((1, 1, 0, 4, 3, 1)) == {(1, 2, 1, 4, 2, 0): '<-C',
                                               (2, 1, 1, 3, 3, 0): '<-M',
                                               (3, 1, 1, 2, 3, 0): '<-MM',
                                               (1, 3, 1, 4, 1, 0): '<-CC',
                                               (2, 2, 1, 3, 2, 0): '<-MC'}
    assert csuccessors((1, 4, 1, 2, 2, 0)) == {}
    return 'tests pass'

print(test())

#### 22. csuccessors Solution

def csuccessors(state):
    """Find successors (including those that result in dining) to this
    state. But a state where the cannibals can dine has no successors."""
    M1, C1, B1, M2, C2, B2 = state
    ## Check for state with no successors
    if C1 > M1 > 0 or C2 > M2 > 0:
        return {}
    items = []
    if B1 > 0:
        items += [(sub(state, delta), a + '->')
                  for delta, a in deltas.items()]
    if B2 > 0:
        items += [(add(state, delta), '<-' + a)
                  for delta, a in deltas.items()]
    return dict(items)

deltas = {(2, 0, 1,    -2,  0, -1): 'MM',
          (0, 2, 1,     0, -2, -1): 'CC',
          (1, 1, 1,    -1, -1, -1): 'MC',
          (1, 0, 1,    -1,  0, -1): 'M',
          (0, 1, 1,     0, -1, -1): 'C'}

def add(X, Y):
    "add two vectors, X and Y."
    return tuple(x+y for x,y in zip(X, Y))

def sub(X, Y):
    "subtract vector Y from X."
    return tuple(x-y for x,y in zip(X, Y))

Here’s my solution.

deltas = {(2, 0, 1,     -2,  0, -1):'MM',
(0, 2, 1,      0, -2, -1):'CC',
(1, 1, 1,     -1, -1, -1):'MC',
(1, 0, 1,     -1,  0, -1):'M',
(0, 1, 1,      0, -1, -1):'C'}

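(The key is deltas, which holds five key-value pairs. Take the first one, (2, 0, 1, -2, 0, -1): 'MM', as an example: it describes 2 missionaries, 0 cannibals, and 1 boat leaving the bank where the boat is, that is, two missionaries taking the boat across to the other bank.)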

The key to my solution is a list of deltas, of differences in the states that correspond to these moves. What do I mean by that? One thing we can do is send two missionaries from a side with the boat to the other side. That would be a difference of 2 in the missionaries. We would add 2 to one side and subtract 2 from the other side and not change at all the number of cannibals and change the number of boats by 1. Or we could send 2 cannibals, or we could send one of each, or we could send only 1 missionary or cannibal. There are 5 possible moves, basically, depending on where the boat is.

That’s what csuccessors says.

(If the cannibals outnumber the missionaries on a side, return {}.)
First we check for states with no successors. If there are more cannibals than missionaries but there are some missionaries, then they’re going to get eaten, and so we return the empty dictionary as a result.

Otherwise, we’re going to collect up the number of items in our dictionary, and we’re going to do that by going through these deltas and subtracting the deltas from the side where the boat is and adding them in to the other side. We have two directions we can go from left to right, start to the other side, or from the other side back to the original side.

def add(X, Y):
    "add two vectors, X and Y."
    return tuple(x+y for x,y in zip(X, Y))

def sub(X, Y):
    "subtract vector Y from X."
    return tuple(x-y for x,y in zip(X, Y))

(What actually happens inside add(X, Y) and sub(X, Y):

>>> tuple(x+y for x,y in zip((1, 2), (10, 18)))
(11, 20)

)
I made use here of vector addition and subtraction. I take the current state, which is 6 numbers, and I add or subtract these deltas. That’s what these definitions say. Now, it would be nice if this type of vector arithmetic were built into Python, and there are versions called “numeric Python” where you can do that, but here I had to write these functions myself.
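
The note under the video points out that this solution can produce states with negative numbers of people. One possible fix, sketched here as an assumption rather than as the course’s official correction, is to filter the generated successors through a small check. The helper name is_legal is hypothetical.

```python
deltas = {(2, 0, 1, -2,  0, -1): 'MM',
          (0, 2, 1,  0, -2, -1): 'CC',
          (1, 1, 1, -1, -1, -1): 'MC',
          (1, 0, 1, -1,  0, -1): 'M',
          (0, 1, 1,  0, -1, -1): 'C'}

def add(X, Y):
    "Add two vectors, X and Y."
    return tuple(x + y for x, y in zip(X, Y))

def sub(X, Y):
    "Subtract vector Y from X."
    return tuple(x - y for x, y in zip(X, Y))

def is_legal(state):
    "Hypothetical helper: every count in the state must be non-negative."
    return all(n >= 0 for n in state)

def csuccessors(state):
    "Like the solution above, but drops successors with negative counts."
    M1, C1, B1, M2, C2, B2 = state
    if C1 > M1 > 0 or C2 > M2 > 0:
        return {}
    items = []
    if B1 > 0:
        items += [(sub(state, delta), a + '->') for delta, a in deltas.items()]
    if B2 > 0:
        items += [(add(state, delta), '<-' + a) for delta, a in deltas.items()]
    return dict((s, a) for s, a in items if is_legal(s))

# With a single missionary on the left, only 'M->' is physically possible.
print(csuccessors((1, 0, 1, 0, 0, 0)))  # {(0, 0, 0, 1, 0, 1): 'M->'}
```

The quiz’s own test cases still pass with this version, because none of their expected successors contains a negative count.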

### 23. MC Problem

def mc_problem(start=(3, 3, 1, 0, 0, 0), goal=None):
    """Solve the missionaries and cannibals problem.
    State is 6 ints: (M1, C1, B1, M2, C2, B2) on the start (1) and other (2) sides.
    Find a path that goes from the initial state to the goal state (which, if
    not specified, is the state with no people or boats on the start side.)"""

Now let’s write a function to solve the missionaries and cannibals problem. It takes a start state. Here’s the normal problem: 3 missionaries, 3 cannibals, and 1 boat on the start side, and nothing on the other side. It also takes a goal state. If the goal state is not specified, it’s just the opposite of the start: 3, 3, 1 on the other side and nothing on the original side. The state is this 6-tuple, and we’re trying to find a path from the initial state to the goal state. In fact, we’re trying to find the path with the least number of steps.

I’m not going to ask you to do this as a quiz. If you’re enthusiastic, you can stop the video now and go ahead and solve it on your own, but now I’m going to go ahead and show it to you.

Here’s a solution that looks pretty much like the water-pouring problem. We check to see if the goal is None, and if so we fix up a nice goal. We check to see if we’ve accidentally already reached the goal at the start. Then we just search for the shortest path.

def mc_problem(start=(3, 3, 1, 0, 0, 0), goal=None):
    if goal is None:
        goal = (0, 0, 0) + start[:3]
    if start == goal:
        return [start]
    explored = set()
    frontier = [ [start] ]
    while frontier:
        path = frontier.pop(0)
        s = path[-1]
        for (state, action) in csuccessors(s).items():
            if state not in explored:
                explored.add(state) # otherwise explored stays empty and states repeat
                path2 = path + [action, state]
                if state == goal:
                    return path2
                else:
                    frontier.append(path2)
    return Fail

### 24. Shortest Path Search

Now let’s generalize. Let’s take the specific solvers: we had a specific one for the pouring problem and one for the missionaries and cannibals. Let’s generalize them. I’m going to call the generalization “shortest_path_search.” That’s a search for the shortest path that reaches a goal.

Let’s take our inventory. The concepts we have to deal with–we’ve got paths, states, actions, successors. We have a start state. We have a goal. Now let’s figure out how we’re going to represent each of these concepts.

(paths: we are doing shortest_path_search, not best_cost_search, so the paths do not include a time or cost.)
Paths we already had. I don’t see any reason to change. We have [state, action, state...]. Notice we’re just doing shortest_path_search. We’re not doing best_cost_search. We don’t need to put the total cost in here. We can just have the action by itself.

(states)
We have states, and here the states can be atomic. We don’t have to know anything about the states. In other words, a state can be anything that a particular problem wants to deal with. shortest_path_search doesn’t have to know about that. Now, why is that the case? Because shortest_path_search interfaces with states only through the start state and two functions: the successors function and the goal function. What do I mean by that?

The start state is going to be some atomic state. We don’t know anything more about that. Shortest_path doesn’t know anything about that. When we go to use shortest_path_search for a particular problem, then we have to specify what a state looks like, but shortest_path_search itself doesn’t have to know.

(successor: successors is a function that takes a state as input and returns a dictionary of state-action pairs.)
All it has to know is that you can give the start state to the successor function. So successors will be a function which takes a state as input and returns a dictionary of state-action pairs. Now, given that initial state that we passed in, we can generate new states and new actions.

(actions)
So the actions also are atomic. shortest_path_search doesn’t have to know anything about the representation other than that this is where they come from–from the successor function.

(goal)
Now, what about the goal? Well, we could specify an exact state that we’re looking for, but sometimes we’re looking for multiple states. We could specify a set of states, but sometimes the set of states is really big. There are lots of states that satisfy the goal. Instead, let’s have the goal be a function. When you pass it a state, it returns a Boolean, True or False: is that state a goal?
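
As a small illustration of why a goal function is more flexible than a goal state, the sketch below builds three kinds of goal tests. All three helper names (goal_is, goal_in, has_amount) are hypothetical and introduced only for this example.

```python
def goal_is(target):
    "Goal test that matches exactly one state."
    def is_goal(state):
        return state == target
    return is_goal

def goal_in(targets):
    "Goal test that matches any state in a (small) collection."
    targets = frozenset(targets)
    return lambda state: state in targets

def has_amount(amount):
    "Goal test for a pouring-style state tuple: some glass holds the amount."
    return lambda state: amount in state

all_across = goal_is((0, 0, 0, 3, 3, 1))
is_six = has_amount(6)

print(all_across((0, 0, 0, 3, 3, 1)))   # True
print(is_six((4, 6)), is_six((0, 9)))   # True False
```

The last test covers many states at once, such as (0, 6), (4, 6), and (6, 9), without ever listing them, which is the situation where a set of goal states would get too big.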

With that now we’re ready to specify shortest_path_search. shortest_path_search is going to be a function. It’s going to take some inputs, and it’s going to return a path, and return failure as a path if it can’t find a solution. Now the question is out of this inventory, which of these things do we have to pass into shortest_path_search to allow us to solve a problem? Check all those that apply.

#### 24. Shortest Path Search Solution

The answer is what we have to pass in is the start state– you’ve got to know where you’re starting from, a successor function– you have to know where you can get to from the start state, and a goal function–you have to know when you’re done applying successors. That’s it. We don’t need to pass in any other actions or states or paths, because those can all be generated from these three.

### 25. SPS Function

def shortest_path_search(start, successors, is_goal):
    """Find the shortest path from start state to a state
    such that is_goal(state) is true."""

def mc_problem(start=(3, 3, 1, 0, 0, 0), goal=None):
    if goal is None:
        goal = (0, 0, 0) + start[:3]
    if start == goal:
        return [start]
    explored = set() # set of states we have visited
    frontier = [ [start] ] # ordered list of paths we have blazed
    while frontier:
        path = frontier.pop(0)
        s = path[-1]
        for (state, action) in csuccessors(s).items():
            if state not in explored:
                ...
    ...

Let’s see if you can write that function. I’ve left you with the missionary and cannibals problem as sort of a template, but I want you to generalize that to write shortest_path_search, which takes a start state, a successor function, and a is_goal function and returns the shortest path.

# -----------------
# User Instructions
#
# Write a function, shortest_path_search, that generalizes the search algorithm
# that we have been using. This function should have three inputs, a start state,
# a successors function, and an is_goal function.
#
# You can use the solution to mc_problem as a template for constructing your
# shortest_path_search. You can also see the example is_goal and successors
# functions for a simple test problem below.

def shortest_path_search(start, successors, is_goal):
    """Find the shortest path from start state to a state
    such that is_goal(state) is true."""

def mc_problem1(start=(3, 3, 1, 0, 0, 0), goal=None):
    """Solve the missionaries and cannibals problem.
    State is 6 ints: (M1, C1, B1, M2, C2, B2) on the start (1) and other (2) sides.
    Find a path that goes from the initial state to the goal state (which, if
    not specified, is the state with no people or boats on the start side.)"""
    if goal is None:
        goal = (0, 0, 0) + start[:3]
    if start == goal:
        return [start]
    explored = set() # set of states we have visited
    frontier = [ [start] ] # ordered list of paths we have blazed
    while frontier:
        path = frontier.pop(0)
        s = path[-1]
        for (state, action) in csuccessors(s).items():
            if state not in explored:
                explored.add(state) # otherwise explored stays empty and states repeat
                path2 = path + [action, state]
                if state == goal:
                    return path2
                else:
                    frontier.append(path2)
    return Fail

Fail = []

def csuccessors(state):
    """Find successors (including those that result in dining) to this
    state. But a state where the cannibals can dine has no successors."""
    M1, C1, B1, M2, C2, B2 = state
    ## Check for state with no successors
    if C1 > M1 > 0 or C2 > M2 > 0:
        return {}
    items = []
    if B1 > 0:
        items += [(sub(state, delta), a + '->')
                  for delta, a in deltas.items()]
    if B2 > 0:
        items += [(add(state, delta), '<-' + a)
                  for delta, a in deltas.items()]
    return dict(items)

def add(X, Y):
    "add two vectors, X and Y."
    return tuple(x+y for x,y in zip(X, Y))

def sub(X, Y):
    "subtract vector Y from X."
    return tuple(x-y for x,y in zip(X, Y))

deltas = {(2, 0, 1,    -2,  0, -1): 'MM',
          (0, 2, 1,     0, -2, -1): 'CC',
          (1, 1, 1,    -1, -1, -1): 'MC',
          (1, 0, 1,    -1,  0, -1): 'M',
          (0, 1, 1,     0, -1, -1): 'C'}

# --------------
# Example problem
#
# Let's say the states in an optimization problem are given by integers.
# From a state, i, the only possible successors are i+1 and i-1. Given
# a starting integer, find the shortest path to the integer 8.
#
# This is an overly simple example of when we can use the
# shortest_path_search function. We just need to define the appropriate
# is_goal and successors functions.

def is_goal(state):
    if state == 8:
        return True
    else:
        return False

def successors(state):
    successors = {state + 1: '->',
                  state - 1: '<-'}
    return successors

#test
assert shortest_path_search(5, successors, is_goal) == [5, '->', 6, '->', 7, '->', 8]

#### 25. SPS Function Solution

It’s pretty easy. We just took the template that we had for missionaries and cannibals and just replace these general functions–is_goal and successors– put them in here rather than putting in the specific functions for the missionaries and cannibals.

def shortest_path_search(start, successors, is_goal):
    if is_goal(start):
        return [start]
    explored = set() # set of states we have visited
    frontier = [ [start] ] # ordered list of paths we have blazed
    while frontier:
        path = frontier.pop(0)
        s = path[-1]
        for (state, action) in successors(s).items():
            if state not in explored:
                explored.add(state) # otherwise explored stays empty and states repeat
                path2 = path + [action, state]
                if is_goal(state):
                    return path2
                else:
                    frontier.append(path2)
    return Fail
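
To check that the generalization really is reusable, here is a sketch that applies shortest_path_search to the water-pouring problem from the start of the lesson (glasses of 4 oz and 9 oz, goal 6 oz). The factory pour_successors and its action labels are assumptions made for this example, not the course’s own pouring code.

```python
Fail = []

def shortest_path_search(start, successors, is_goal):
    "Breadth-first search; returns [state, action, state, ...] or Fail."
    if is_goal(start):
        return [start]
    explored = set()
    frontier = [[start]]
    while frontier:
        path = frontier.pop(0)
        s = path[-1]
        for (state, action) in successors(s).items():
            if state not in explored:
                explored.add(state)
                path2 = path + [action, state]
                if is_goal(state):
                    return path2
                frontier.append(path2)
    return Fail

def pour_successors(X, Y):
    "Hypothetical factory: successors function for glasses of capacity X and Y."
    def successors(state):
        x, y = state
        return {
            (X, y): 'fill X', (x, Y): 'fill Y',
            (0, y): 'empty X', (x, 0): 'empty Y',
            ((0, x + y) if x + y <= Y else (x - (Y - y), Y)): 'X->Y',
            ((x + y, 0) if x + y <= X else (X, y - (X - x))): 'Y->X',
        }
    return successors

path = shortest_path_search((0, 0), pour_successors(4, 9), lambda s: 6 in s)
actions = path[1::2]
print(len(actions))  # 8
print(path[-1])      # (4, 6)
```

Nothing in shortest_path_search changed between the integer example, the missionaries and cannibals, and the glasses; only the three arguments did.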

### 26. Cleaning up MC Problem

def mc_problem2(start=(3, 3, 1, 0, 0, 0), goal=None):
    # your code here if needed
    return shortest_path_search() # <== insert arguments here

Now let’s complete the generalization. I’m going to define missionaries and cannibals problem, and we’ll give it a 2 just so we can tell the two versions apart. It takes the same arguments as before. You may need some initialization code to get going. Then I want the body of the function, the main part, to just be a call to shortest_path_search with the appropriate arguments inserted. If you need to you can define other functions outside of here if that’s necessary.

Peter’s exercise:

# -----------------
# User Instructions
#
# Write a function, mc_problem2, that solves the missionary and cannibal
# problem by making a call to shortest_path_search. Add any code below
# and change the arguments in the return statement's call to the
# shortest_path_search function.

def mc_problem2(start=(3, 3, 1, 0, 0, 0), goal=None):
    # your code here if necessary
    return shortest_path_search() # <== insert arguments here

def shortest_path_search(start, successors, is_goal):
    """Find the shortest path from start state to a state
    such that is_goal(state) is true."""
    if is_goal(start):
        return [start]
    explored = set()
    frontier = [ [start] ]
    while frontier:
        path = frontier.pop(0)
        s = path[-1]
        for (state, action) in successors(s).items():
            if state not in explored:
                explored.add(state) # otherwise explored stays empty and states repeat
                path2 = path + [action, state]
                if is_goal(state):
                    return path2
                else:
                    frontier.append(path2)
    return Fail
Fail = []

def csuccessors(state):
    """Find successors (including those that result in dining) to this
    state. But a state where the cannibals can dine has no successors."""
    M1, C1, B1, M2, C2, B2 = state
    ## Check for state with no successors
    if C1 > M1 > 0 or C2 > M2 > 0:
        return {}
    items = []
    if B1 > 0:
        items += [(sub(state, delta), a + '->')
                  for delta, a in deltas.items()]
    if B2 > 0:
        items += [(add(state, delta), '<-' + a)
                  for delta, a in deltas.items()]
    return dict(items)

def add(X, Y):
    "add two vectors, X and Y."
    return tuple(x+y for x,y in zip(X, Y))

def sub(X, Y):
    "subtract vector Y from X."
    return tuple(x-y for x,y in zip(X, Y))

deltas = {(2, 0, 1,    -2,  0, -1): 'MM',
          (0, 2, 1,     0, -2, -1): 'CC',
          (1, 1, 1,    -1, -1, -1): 'MC',
          (1, 0, 1,    -1,  0, -1): 'M',
          (0, 1, 1,     0, -1, -1): 'C'}

def test():
    assert mc_problem2(start=(1, 1, 1, 0, 0, 0)) == [
        (1, 1, 1, 0, 0, 0), 'MC->',
        (0, 0, 0, 1, 1, 1)]
    assert mc_problem2() == [(3, 3, 1, 0, 0, 0), 'CC->',
                             (3, 1, 0, 0, 2, 1), '<-C',
                             (3, 2, 1, 0, 1, 0), 'CC->',
                             (3, 0, 0, 0, 3, 1), '<-C',
                             (3, 1, 1, 0, 2, 0), 'MM->',
                             (1, 1, 0, 2, 2, 1), '<-MC',
                             (2, 2, 1, 1, 1, 0), 'MM->',
                             (0, 2, 0, 3, 1, 1), '<-C',
                             (0, 3, 1, 3, 0, 0), 'CC->',
                             (0, 1, 0, 3, 2, 1), '<-C',
                             (0, 2, 1, 3, 1, 0), 'CC->',
                             (0, 0, 0, 3, 3, 1)]
    return 'tests pass'

print(test())

#### Supplementary material below the video (start)

I think this answer is wrong (as pointed out in the forum by August Hörandl).

One possible interpretation of the problem is: “I’m going to solve the problem of no people on the start side.” I think that would count as a reasonable interpretation, but then I shouldn’t offer the possibility of having a different goal. Since I do offer a goal parameter, I should honor it. The solution should be:

def mc_problem2(start=(3, 3, 1, 0, 0, 0), goal=None):
    """Solve the missionaries and cannibals problem.
    State is 6 ints: (M1, C1, B1, M2, C2, B2) on the start (1) and other (2) sides.
    Find a path that goes from the initial state to the goal state (which, if
    not specified, is the state with no people or boats on the start side.)"""
    if goal is None:
        def goal_fn(state): return state[:3] == (0, 0, 0)
    else:
        def goal_fn(state): return state == goal
    return shortest_path_search(start, csuccessors, goal_fn)

#### 26. Cleaning up MC Problem Solution

Here’s my solution. I had to write some code to fix up the goal if it wasn’t specified. Then it’s just a single call. We call shortest_path_search with the start state we were given, with the csuccessors function that we’ve already defined, and then with a goal test. The goal test is that everybody is gone from the start side of the river. That we define this way.

def mc_problem2(start=(3, 3, 1, 0, 0, 0), goal=None):
    if goal is None:
        goal = (0, 0, 0) + start[:3]
    return shortest_path_search(start, csuccessors, all_gone)

def all_gone(state): return state[:3] == (0, 0, 0)

Once again, generalize. This time I want to go back to the bridge problem and generalize that. What we’re going to come up with is lowest_cost_search, and that’ll take some arguments and again return a path, but let’s figure out what we need. Yes, we’re going to need the start state just like before. We’re going to need a successor function, and we’re going to need a goal function. In addition, we’re going to need one more thing: the cost of an action. That’s necessary, so it has to be a parameter to the function. We’ll have the start, the successors, the goal, and the action cost, and return from that a path. There’s a notion of action_cost, and as part of our inventory of concepts there’s also the notion of path cost, but that won’t have to be passed in as a parameter.

def lowest_cost_search(start, successors, is_goal, action_cost):
    ## your code here

Let’s see if you can define for me lowest_cost_search, which takes these four parameters and should perform the same type of search as we saw previously with the bridge problem.

# -----------------
# User Instructions
#
# Define a function, lowest_cost_search, that is similar to
# shortest_path_search, but also takes into account the cost
# of an action, as defined by the function action_cost(action)
#
# Since we are using this function as a generalized version
# of the bridge problem, all the code necessary to solve that
# problem is included below for your reference.
#
# This code will not run yet. Click submit to see if your code
# is correct.

def lowest_cost_search(start, successors, is_goal, action_cost):
    """Return the lowest cost path, starting from start state,
    and considering successors(state) => {state:action,...},
    that ends in a state for which is_goal(state) is true,
    where the cost of a path is the sum of action costs,
    which are given by action_cost(action)."""

def bsuccessors2(state):
    """Return a dict of {state:action} pairs.  A state is a (here, there) tuple,
    where here and there are frozensets of people (indicated by their times) and/or
    the light."""
    here, there = state
    if 'light' in here:
        return dict(((here  - frozenset([a, b, 'light']),
                      there | frozenset([a, b, 'light'])),
                     (a, b, '->'))
                    for a in here if a != 'light'
                    for b in here if b != 'light')
    else:
        return dict(((here  | frozenset([a, b, 'light']),
                      there - frozenset([a, b, 'light'])),
                     (a, b, '<-'))
                    for a in there if a != 'light'
                    for b in there if b != 'light')

def path_cost(path):
    "The total cost of a path (which is stored in a tuple with the final action)."
    if len(path) < 3:
        return 0
    else:
        action, total_cost = path[-2]
        return total_cost

def bcost(action):
    "Returns the cost (a number) of an action in the bridge problem."
    # An action is an (a, b, arrow) tuple; a and b are times; arrow is a string
    a, b, arrow = action
    return max(a, b)

def add_to_frontier(frontier, path):
    "Add path to frontier, replacing costlier path if there is one."
    # (This could be done more efficiently.)
    # Find if there is an old path to the final state of this path.
    old = None
    for i,p in enumerate(frontier):
        if final_state(p) == final_state(path):
            old = i
            break
    if old is not None and path_cost(frontier[old]) < path_cost(path):
        return # Old path was better; do nothing
    elif old is not None:
        del frontier[old] # Old path was worse; delete it
    ## Now add the new path and re-sort
    frontier.append(path)
    frontier.sort(key=path_cost)
## Now there is still a problem to deal with.
def bridge_problem2(here):
    Fail = []
    here = frozenset(here) | frozenset(['light'])
    explored = set() # set of states we have visited
    # State will be a (peoplelight_here, peoplelight_there) tuple
    # E.g. ({1, 2, 5, 10, 'light'}, {})
    frontier = [ [(here, frozenset())] ] # ordered list of paths we have blazed
    while frontier:
        path = frontier.pop(0)
        here1, there1 = state1 = final_state(path)
        if not here1 or (len(here1)==1 and 'light' in here1):
            return path
        explored.add(state1) # mark the state as visited once it is expanded
        pcost = path_cost(path)
        for (state, action) in bsuccessors2(state1).items():
            if state not in explored:
                total_cost = pcost + bcost(action)
                path2 = path + [(action, total_cost), state]
                add_to_frontier(frontier, path2)
    return Fail

def final_state(path): return path[-1]

#### 27. Lowest Cost Search Solution

def lowest_cost_search(start, successors, is_goal, action_cost):
    Fail = []
    explored = set() # set of states we have visited
    frontier = [ [start] ] # ordered list of paths we have blazed
    while frontier:
        path = frontier.pop(0)
        state1 = final_state(path)
        if is_goal(state1):
            return path
        explored.add(state1) # mark the state as visited once it is expanded
        pcost = path_cost(path)
        for (state, action) in successors(state1).items():
            if state not in explored:
                total_cost = pcost + action_cost(action)
                path2 = path + [(action, total_cost), state]
                add_to_frontier(frontier, path2)
    return Fail

Here is my solution, and I got it by copying the code from the bridge problem and just generalizing it: replacing bsuccessors2 with successors, bcost with action_cost, and the hard-coded goal check with is_goal.
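
Because the generalized function no longer mentions the bridge problem, it can be tried on any problem that supplies the four arguments. The following is a self-contained sketch under stated assumptions: the helpers (final_state, path_cost, add_to_frontier) follow the bridge-problem code, while the toy graph `EDGES` and the functions `toy_successors` and `toy_cost` are hypothetical and exist only for this illustration.

```python
def final_state(path):
    return path[-1]

def path_cost(path):
    """Total cost of a path laid out as [state, (action, total_cost), state, ...]."""
    if len(path) < 3:
        return 0
    action, total_cost = path[-2]
    return total_cost

def add_to_frontier(frontier, path):
    """Add path to frontier, replacing a costlier path to the same final state."""
    old = None
    for i, p in enumerate(frontier):
        if final_state(p) == final_state(path):
            old = i
            break
    if old is not None and path_cost(frontier[old]) < path_cost(path):
        return  # the old path was cheaper; keep it
    elif old is not None:
        del frontier[old]  # the old path was not cheaper; drop it
    frontier.append(path)
    frontier.sort(key=path_cost)

def lowest_cost_search(start, successors, is_goal, action_cost):
    explored = set()      # states that have already been expanded
    frontier = [[start]]  # paths, kept sorted by total cost
    while frontier:
        path = frontier.pop(0)
        state1 = final_state(path)
        if is_goal(state1):
            return path
        explored.add(state1)
        pcost = path_cost(path)
        for (state, action) in successors(state1).items():
            if state not in explored:
                total_cost = pcost + action_cost(action)
                add_to_frontier(frontier, path + [(action, total_cost), state])
    return []  # no path found

# Hypothetical toy graph: EDGES[state] maps each neighbor to the cost of moving there.
EDGES = {'A': {'B': 1, 'C': 5}, 'B': {'C': 1, 'D': 7}, 'C': {'D': 2}, 'D': {}}

def toy_successors(state):
    """Return {next_state: action}, where an action is a (from, to, cost) tuple."""
    return dict((nxt, (state, nxt, cost)) for nxt, cost in EDGES[state].items())

def toy_cost(action):
    return action[2]

path = lowest_cost_search('A', toy_successors, lambda s: s == 'D', toy_cost)
print(path[::2])        # ['A', 'B', 'C', 'D']
print(path_cost(path))  # 4
```

The search prefers the three-step route of cost 4 over the two-step routes A, B, D (cost 8) and A, C, D (cost 7), which is the behavior that separates lowest-cost search from shortest-path search.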

### 28. Back to Bridge Problem

def bridge_problem3(here):
    return lowest_cost_search()

Now let’s go ahead and redefine the bridge problem in terms of lowest_cost_search, thereby generalizing it. Apart from some initialization code, all you might need here is a single call to lowest_cost_search, plus any other functions you need to define.

# -----------------
# User Instructions
#
# In this problem, you will generalize the bridge problem
# by writing a function bridge_problem3, that makes a call
# to lowest_cost_search.

def bridge_problem3(here):
    """Find the fastest (least elapsed time) path to
    the goal in the bridge problem."""
    return lowest_cost_search() # <== your arguments here

# your code here if necessary

def lowest_cost_search(start, successors, is_goal, action_cost):
    """Return the lowest cost path, starting from start state,
    and considering successors(state) => {state:action,...},
    that ends in a state for which is_goal(state) is true,
    where the cost of a path is the sum of action costs,
    which are given by action_cost(action)."""
    Fail = []
    explored = set() # set of states we have visited
    frontier = [ [start] ] # ordered list of paths we have blazed
    while frontier:
        path = frontier.pop(0)
        state1 = final_state(path)
        if is_goal(state1):
            return path
        explored.add(state1) # mark the state as visited once it is expanded
        pcost = path_cost(path)
        for (state, action) in successors(state1).items():
            if state not in explored:
                total_cost = pcost + action_cost(action)
                path2 = path + [(action, total_cost), state]
                add_to_frontier(frontier, path2)
    return Fail

def final_state(path): return path[-1]

def path_cost(path):
    "The total cost of a path (which is stored in a tuple with the final action)."
    if len(path) < 3:
        return 0
    else:
        action, total_cost = path[-2]
        return total_cost

def add_to_frontier(frontier, path):
    "Add path to frontier, replacing costlier path if there is one."
    # (This could be done more efficiently.)
    # Find if there is an old path to the final state of this path.
    old = None
    for i,p in enumerate(frontier):
        if final_state(p) == final_state(path):
            old = i
            break
    if old is not None and path_cost(frontier[old]) < path_cost(path):
        return # Old path was better; do nothing
    elif old is not None:
        del frontier[old] # Old path was worse; delete it
    ## Now add the new path and re-sort
    frontier.append(path)
    frontier.sort(key=path_cost)

def bsuccessors2(state):
    """Return a dict of {state:action} pairs.  A state is a (here, there) tuple,
    where here and there are frozensets of people (indicated by their times) and/or
    the light."""
    here, there = state
    if 'light' in here:
        return dict(((here  - frozenset([a, b, 'light']),
                      there | frozenset([a, b, 'light'])),
                     (a, b, '->'))
                    for a in here if a != 'light'
                    for b in here if b != 'light')
    else:
        return dict(((here  | frozenset([a, b, 'light']),
                      there - frozenset([a, b, 'light'])),
                     (a, b, '<-'))
                    for a in there if a != 'light'
                    for b in there if b != 'light')

def bcost(action):
    "Returns the cost (a number) of an action in the bridge problem."
    # An action is an (a, b, arrow) tuple; a and b are times; arrow is a string
    a, b, arrow = action
    return max(a, b)

def test():
    here = [1, 2, 5, 10]
    assert bridge_problem3(here) == [
            (frozenset([1, 2, 'light', 10, 5]), frozenset([])),
            ((2, 1, '->'), 2),
            (frozenset([10, 5]), frozenset([1, 2, 'light'])),
            ((2, 2, '<-'), 4),
            (frozenset(['light', 10, 2, 5]), frozenset([1])),
            ((5, 10, '->'), 14),
            (frozenset([2]), frozenset([1, 10, 5, 'light'])),
            ((1, 1, '<-'), 15),
            (frozenset([1, 2, 'light']), frozenset([10, 5])),
            ((2, 1, '->'), 17),
            (frozenset([]), frozenset([1, 10, 2, 5, 'light']))]
    return 'test passes'

print(test())

#### Supplementary notes below the video

Oops! The correct definition for all_over should be:

def all_over(state):
    here, there = state
    return not here or here == set(['light'])
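
A quick sketch of why the string has to be wrapped in a list: `set` iterates over its argument, so passing a bare string produces a set of single characters, not a set containing the whole string. The example states below are made up for illustration.

```python
# set() iterates over its argument, so a bare string is split into characters.
chars = set('light')
print(chars == {'l', 'i', 'g', 'h', 't'})        # True

light_only = frozenset(['light'])
print(light_only == set('light'))                # False: compares against characters
print(light_only == set(['light']))              # True: a set holding one string

def all_over(state):
    here, there = state
    return not here or here == set(['light'])

# Made-up example states, each a (here, there) tuple of frozensets.
print(all_over((frozenset(), frozenset([1, 2, 'light']))))   # True: nobody left here
print(all_over((frozenset(['light']), frozenset())))         # True: only the light is here
print(all_over((frozenset([1, 'light']), frozenset([2]))))   # False: person 1 is still here
```

Note that a frozenset and a set compare equal when they hold the same elements, so comparing `here` against `set(['light'])` works even though `here` is a frozenset.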

#### 28. Back to Bridge Problem Solution

Here’s my solution. I have to define the start state given the set of people on the here side: I build the here side and make sure we throw in the flashlight there, and on the other side there’s nobody. Then I call lowest_cost_search, starting from the start state, with the successor function we’ve already defined (bsuccessors2), a new function to test for the goal (all_over), and the cost function we already defined (bcost). The goal test says the problem is solved if not here, in other words if there’s nobody here at all and it’s the empty set, or if here is only the set containing the flashlight. That normally wouldn’t happen, but I guess it could happen if the initial problem had no people and just a flashlight. Then you’ve got a solution by doing nothing at all. I just wanted to make sure I covered that trivial case.

def bridge_problem3(here):
    "Find the fastest (least elapsed time) path to the goal in the bridge problem."
    start = (frozenset(here) | frozenset(['light']), frozenset())
    return lowest_cost_search(start, bsuccessors2, all_over, bcost)

def all_over(state):
    here, there = state
    return not here or here == set(['light'])

### 29. Summary

Congratulations. You made it to the end of the unit. What have we learned? Well, first of all, some problems require search. What I mean by search is you need to put together a sequence of steps, starting from a start state and continuing until you reach a goal. You don’t know how many steps it’s going to take, and you’re trying to optimize some factor. Second, there are different kinds of search. We just scratched the surface, believe me. It’s a gigantic field with all sorts of different algorithms and different types of applicability for these different algorithms.

There are many complications we didn’t cover, but we covered two kinds of search: shortest_path_search and lowest_cost_search. These are two of the most useful. Third, search is really subtle. There are lots of possible problems lurking in there, and many that we didn’t even cover. What that means is that where there is subtlety, there are likely to be bugs, and there are even some bugs where there is no subtlety. That means we have to be careful.

We have two tools for combating bugs. One is lots of tests, and the second is standardized tools. That is, we work really hard to make a tool that we know works and has got all the bugs out of it, and then we reuse that tool.

Part of that reuse is generalization: to look at a specific problem and say, “Here we solved this specific problem this way,” and then to generalize it, to say here’s a part of that which I think we’re going to use over and over again. Let’s break that out, and now we’ll have two parts to the solution. We want to be thinking about the specific problem, and we want to be thinking about the more general problem. We want to be allocating our work to one or the other appropriately.
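
As one illustration of this kind of generalization, a fewest-steps search can be treated as a lowest-cost search in which every action costs 1. The sketch below is not the lesson's implementation: it swaps the sorted-list frontier for a heap from Python's `heapq` module and returns a (cost, path) pair with a simplified path layout. The toy graph `EDGES` and the `successors` function are hypothetical.

```python
import heapq
from itertools import count

def lowest_cost_search(start, successors, is_goal, action_cost):
    """Heap-based lowest-cost search; returns (total_cost, [state, action, state, ...])."""
    tie = count()  # tie-breaker so the heap never has to compare two paths
    frontier = [(0, next(tie), [start])]
    explored = set()
    while frontier:
        cost, _, path = heapq.heappop(frontier)
        state1 = path[-1]
        if is_goal(state1):
            return cost, path
        if state1 in explored:
            continue  # this state was already expanded via a path at least as cheap
        explored.add(state1)
        for state, action in successors(state1).items():
            if state not in explored:
                entry = (cost + action_cost(action), next(tie), path + [action, state])
                heapq.heappush(frontier, entry)
    return None  # no path found

# Hypothetical toy graph: one expensive direct edge and one cheap three-step route.
EDGES = {'A': {'D': 10, 'B': 1}, 'B': {'C': 1}, 'C': {'D': 1}, 'D': {}}

def successors(state):
    """Return {next_state: action}, where an action is a (from, to, cost) tuple."""
    return {nxt: (state, nxt, w) for nxt, w in EDGES[state].items()}

def is_goal(state):
    return state == 'D'

cheapest = lowest_cost_search('A', successors, is_goal, lambda action: action[2])
fewest = lowest_cost_search('A', successors, is_goal, lambda action: 1)
print(cheapest[0], cheapest[1][::2])  # 3 ['A', 'B', 'C', 'D']
print(fewest[0], fewest[1][::2])      # 1 ['A', 'D']
```

With real edge costs the search takes the three-step route; with unit costs it takes the single direct step. Only the action_cost argument changes, while the general tool stays the same.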

Congratulations again. You learned a lot of important concepts. You did a great job in writing some very complex programs.
