Understanding State Machines

by Mark Shead

An intro to Computer Science concepts

Computer science enables us to program, but it is possible to do a lot of programming without understanding the underlying computer science concepts.

This isn’t always a bad thing. When we program, we work at a much higher level of abstraction.

When we drive a car, we only concern ourselves with two or three pedals, a gearshift, and a steering wheel. You can safely operate a car without having any clear idea of how it works.

However, if you want to operate a car at the very limits of its capabilities, you need to know a lot more about automobiles than just the three pedals, gearshift and steering wheel.

The same is true of programming. A lot of everyday work can be accomplished with little or no understanding of computer science. You don’t need to understand computational theory to build a “Contact Us” form in PHP.

However, if you plan to write code that requires serious computation, you will need to understand a bit more about how computation works under the hood.

The purpose of this article is to provide some fundamental background for computation. If there is interest, I may follow up with some more advanced topics, but right now I want to look at the logic behind one of the simplest abstract computational devices — a finite state machine.

Finite State Machines

A finite state machine is a mathematical abstraction used to design algorithms.

In simpler terms, a state machine will read a series of inputs. When it reads an input, it will switch to a different state. Each state specifies which state to switch to, for a given input. This sounds complicated but it is really quite simple.

Imagine a device that reads a long piece of paper. For every inch of paper there is a single letter printed on it–either the letter ‘a’ or the letter ‘b’.

As the state machine reads each letter, it changes state. Here is a very simple state machine:

The circles are “states” that the machine can be in. The arrows are the transitions. So, if you are in state s and read an ‘a’, you’ll transition to state q. If you read a ‘b’, you’ll stay in state s.

So if we start on s and read the paper tape above from left to right, we will read the ‘a’ and move to state q.

Then, we’ll read ‘b’ and move back to state s.

Another ‘b’ will keep us on s, followed by an ‘a’ — which moves us back to the q state. Simple enough, but what’s the point?

Well, it turns out that you can run a tape through the state machine and tell something about the sequence of letters, by examining the state you end up on.

In our simple state machine above, if we end in state s, the tape must end with a letter ‘b’. If we end in state q, the tape ends with the letter ‘a’.
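
The two-state machine described above can be sketched as a small transition table. This is only an illustration: the names TRANSITIONS and run are made up for this example, the transition from q back to q on ‘a’ is inferred from the statement that ending in q means the tape ends with ‘a’, and the tape "abba" is assumed from the four letters read in the walkthrough.

```python
# Transition table for the two-state machine described above.
# Keys are (current_state, input_letter); values are the next state.
# The ("q", "a") entry is an assumption inferred from the text.
TRANSITIONS = {
    ("s", "a"): "q",
    ("s", "b"): "s",
    ("q", "a"): "q",
    ("q", "b"): "s",
}


def run(tape, start="s"):
    """Feed each letter of the tape to the machine and return the final state."""
    state = start
    for letter in tape:
        state = TRANSITIONS[(state, letter)]
    return state


print(run("abba"))  # q: the tape ends with 'a'
print(run("abb"))   # s: the tape ends with 'b'
```

The machine keeps no history. Everything it "knows" about the tape is captured by the single state it is in.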

This may sound pointless, but there are an awful lot of problems that can be solved with this type of approach. A very simple example would be to determine if a page of HTML contains these tags in this order:

<html>   <head> </head>   <body> </body> </html>

The state machine can move to a state that shows it has read the html tag, loop until it gets to the head tag, loop until it gets to the head close tag, and so on.

If it successfully makes it to the final state, then you have those particular tags in the correct order.
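
As a rough sketch of that idea, the state can simply be the number of expected tags seen so far. The names EXPECTED_TAGS and has_tags_in_order are hypothetical, and the sketch assumes the tags carry no attributes and that any other tag is skipped (the "loop" in the description above).

```python
import re

# The six tags from the example, in the order they must appear.
EXPECTED_TAGS = ["<html>", "<head>", "</head>", "<body>", "</body>", "</html>"]


def has_tags_in_order(page):
    """Return True if the six expected tags appear in the page in order."""
    state = 0  # the state is the number of expected tags seen so far
    for tag in re.findall(r"</?[a-zA-Z][^>]*>", page):
        if state < len(EXPECTED_TAGS) and tag.lower() == EXPECTED_TAGS[state]:
            state += 1  # move on to the next state
        # any other tag leaves the machine in the same state
    return state == len(EXPECTED_TAGS)  # did we reach the final state?


good = "<html><head><title>x</title></head><body><p>hi</p></body></html>"
bad = "<html><body></body><head></head></html>"
print(has_tags_in_order(good))  # True
print(has_tags_in_order(bad))   # False
```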

Finite state machines can also be used to represent many other systems — such as the mechanics of a parking meter, pop machine, automated gas pump, and all kinds of other things.

Deterministic Finite State Machines

The state machines we’ve looked at so far are all deterministic state machines. From any state, there is only one transition for any allowed input. In other words, there can’t be two paths leading out of a state when you read the letter ‘a’. At first, it sounds silly to even make this distinction.

What good is a set of decisions if the same input can result in moving to more than one state? You can’t tell a computer, if x == true then execute doSomethingBig or execute doSomethingSmall, can you?

Well, you kind of can with a state machine.

The output of a state machine is its final state. It goes through all its processing, and then the final state is read, and then an action is taken. A state machine doesn’t do anything as it moves from state to state.

It processes, and when it gets to the end, the state is read and something external triggers the desired action (for example, dispensing a soda can). This is an important concept when it comes to non-deterministic finite state machines.

Non-deterministic Finite State Machines

Non-deterministic finite state machines are finite state machines where a given input from a particular state can lead to more than one different state.

For example, let’s say we want to build a finite state machine that can recognize strings of letters that:

  • Start with the letter ‘a’
  • and are then followed by zero or more occurrences of the letter ‘b’
  • or, zero or more occurrences of the letter ‘c’
  • and are terminated by the next letter of the alphabet (‘c’ after the ‘b’s, or ‘d’ after the ‘c’s).

Valid strings would be:

  • abbbbbbbbbc
  • abbbc
  • acccd
  • acccccd
  • ac (zero occurrences of b)
  • ad (zero occurrences of c)

So it will recognize the letter ‘a’, followed by zero or more repetitions of a single letter (either ‘b’ or ‘c’), followed by the next letter of the alphabet.

A very simple way to represent this is with a state machine that looks like the one below, where a final state of t means that the string was accepted and matches the pattern.

Do you see the problem? From starting point s, we don’t know which path to take. If we read the letter ‘a’, we don’t know whether to go to the state q or r.

There are a few ways to solve this problem. One is by backtracking. You simply take all the possible paths, and ignore or back out of the ones where you get stuck.
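
One way to "take all the possible paths" in code is to track every state the machine could be in at once, instead of backing out of one path at a time. The sketch below swaps in that set-based simulation. The transition table is reconstructed from the description (four states s, q, r, t and six transitions), and the names NFA and nfa_accepts are made up for this example.

```python
# Non-deterministic transitions: each (state, letter) maps to a SET of next states.
# Reconstructed from the description: four states, six transitions.
NFA = {
    ("s", "a"): {"q", "r"},  # the ambiguous step: 'a' leads to both q and r
    ("q", "b"): {"q"},
    ("q", "c"): {"t"},
    ("r", "c"): {"r"},
    ("r", "d"): {"t"},
}


def nfa_accepts(text, start="s", accepting=frozenset({"t"})):
    """Follow every possible path at once; stuck paths simply drop out."""
    current = {start}
    for letter in text:
        following = set()
        for state in current:
            following |= NFA.get((state, letter), set())
        current = following
    return bool(current & accepting)


for text in ["abbbc", "acccd", "ac", "ad", "ab", "abcd"]:
    print(text, nfa_accepts(text))
```

The first four strings are accepted and the last two are rejected, because no path for them ends in state t.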

This is basically how most chess-playing computers work. They look at all the possibilities, and all the possibilities of those possibilities, and choose the path that gives them the greatest advantage over their opponent.

The other option is to convert the non-deterministic machine into a deterministic machine.

One of the interesting attributes of a non-deterministic machine is that there exists an algorithm to turn any non-deterministic machine into a deterministic one. However, the resulting deterministic machine is often much more complicated.

Fortunately for us, the example above is only slightly more complicated. In fact, this one is simple enough that we can transform it into a deterministic machine in our head, without the aid of a formal algorithm.

The machine below is a deterministic version of the non-deterministic machine above. In the machine below, a final state of t or v is reached by any string that is accepted by the machine.

The non-deterministic model has four states and six transitions. The deterministic model has six states, ten transitions and two possible final states.
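
Those counts can be checked with a small sketch of the standard subset construction, where each deterministic state is a set of non-deterministic states. This is not necessarily how the diagram was drawn. The NFA table is reconstructed from the description, and determinize and ALPHABET are hypothetical names.

```python
from collections import deque

# The non-deterministic machine, reconstructed from the description.
NFA = {
    ("s", "a"): {"q", "r"},
    ("q", "b"): {"q"},
    ("q", "c"): {"t"},
    ("r", "c"): {"r"},
    ("r", "d"): {"t"},
}
ALPHABET = "abcd"


def determinize(nfa, start, accepting, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start_set = frozenset({start})
    dfa = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for letter in alphabet:
            target = frozenset(
                nxt for state in current for nxt in nfa.get((state, letter), ())
            )
            if not target:
                continue  # no transition: the machine gets stuck and rejects
            dfa[(current, letter)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    finals = {states for states in seen if states & accepting}
    return dfa, seen, finals


dfa, states, finals = determinize(NFA, "s", {"t"}, ALPHABET)
print(len(states), len(dfa), len(finals))  # 6 10 2
```

The two accepting sets are {t} and {r, t}, which line up with the two final states of the deterministic machine.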

That isn’t that much more, but in the worst case the number of states can grow exponentially. A moderately sized non-deterministic machine can produce an absolutely huge deterministic machine.

Regular Expressions

If you have done any type of programming, you’ve probably encountered regular expressions. Regular expressions (in the formal sense, without extensions such as backreferences) and finite state machines are functionally equivalent. Anything you can accept or match with a regular expression can be accepted or matched with a state machine, and vice versa.

For example, the pattern described above could be matched with the regular expression: a(b*c|c*d)
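
As a quick check, here is that expression tried with Python’s re module. The helper name matches is made up for this example. fullmatch is used so that the whole string has to fit the pattern, not just a prefix of it.

```python
import re

# The regular expression from the text.
PATTERN = re.compile(r"a(b*c|c*d)")


def matches(text):
    """Return True if the entire string fits the pattern."""
    return PATTERN.fullmatch(text) is not None


for text in ["abbbc", "acccd", "ac", "ad", "ab", "abcd"]:
    print(text, matches(text))
```

The first four strings match. "ab" and "abcd" do not.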

Regular expressions and finite state machines also have the same limitations. In particular, they both can only match or accept patterns that can be handled with finite memory.

So what type of patterns can’t they match? Let’s say you want to only match strings of ‘a’s and ‘b’s where some number of ‘a’s is followed by an equal number of ‘b’s. That is, n ‘a’s followed by n ‘b’s, where n is some number.

Examples would be:

  • ab
  • aabb
  • aaaaaabbbbbb
  • aaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbb

At first, this looks like an easy job for a finite state machine. The problem is that you’ll quickly run out of states, or you’ll have to assume an infinite number of states — at which point it is no longer a finite state machine.

Let’s say you create a finite state machine that can accept up to 20 ‘a’s followed by 20 ‘b’s. That works fine, until you get a string of 21 ‘a’s followed by 21 ‘b’s — at which point you will need to rewrite your machine to handle a longer string.

For any string you can recognize, there is one just a little bit longer that your machine can’t recognize because it runs out of memory.
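
To make the memory issue concrete, here is a sketch of a recognizer that uses a counter, which can grow without bound and is therefore not a finite state machine. Next to it is a hypothetical bounded version that stands in for a machine built with a fixed number of states. Both function names and the limit of 20 are assumptions for illustration.

```python
def equal_as_then_bs(text):
    """Accept n 'a's followed by exactly n 'b's (n >= 1) using an unbounded counter."""
    count = 0
    i = 0
    while i < len(text) and text[i] == "a":
        count += 1  # remember how many 'a's we have seen
        i += 1
    if count == 0:
        return False
    while i < len(text) and text[i] == "b":
        count -= 1  # cancel one 'a' for every 'b'
        i += 1
    return i == len(text) and count == 0


def bounded_recognizer(text, limit=20):
    """Stand-in for a finite machine that only has states for up to `limit` 'a's."""
    return text.count("a") <= limit and equal_as_then_bs(text)


print(equal_as_then_bs("a" * 21 + "b" * 21))    # True
print(bounded_recognizer("a" * 21 + "b" * 21))  # False: out of states
```

Raising the limit only moves the problem. Whatever limit is chosen, there is a slightly longer string that the bounded version rejects.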

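This argument is formalized by the Pumping Lemma, which basically says: “in a regular pattern, every sufficiently long matching string has a section that can be repeated any number of times, and the result still matches”. Repeating a section of the strings above breaks the balance between ‘a’s and ‘b’s, so the pattern is not regular.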

In other words, neither a regular expression nor a finite state machine can be constructed that will recognize all the strings that match the pattern and reject all the ones that don’t.

If you look carefully, you’ll notice that this type of pattern where every ‘a’ has a matching ‘b’, looks very similar to HTML. Within any pair of tags, you may have any number of other matching pairs of tags.

So, while you may be able to use a regular expression or finite state machine to recognize if a page of HTML has the <html>, <head>, and <body> elements in the correct order, you can’t use a regular expression to tell if an entire HTML page is valid or not, because HTML is not a regular pattern.

Turing Machines

So how do you recognize non-regular patterns?

There is a theoretical device that is similar to a state machine, called a Turing Machine. It is similar to a finite state machine in that it has a paper strip which it reads. But, a Turing Machine can erase and write on the paper tape.

Explaining a Turing Machine will take more space than we have here, but there are a few important points relevant to our discussion of finite state machines and regular expressions.

Turing Machines are computationally complete — meaning anything that can be computed, can be computed on a Turing Machine.

Since a Turing Machine can write as well as read from the paper tape, it is not limited to a finite amount of memory. The paper tape can be assumed to be infinite in length. Of course, actual computers don’t have an infinite amount of memory. But, they usually do contain enough memory so you don’t hit the limit for the type of problems they process.

Turing Machines give us an imaginary mechanical device that lets us visualize and understand how the computational process works. It is particularly useful in understanding the limits of computation. If there is interest I’ll do another article on Turing Machines in the future.

Why does this matter?

So, what’s the point? How is this going to help you create that next PHP form?

Regardless of their limitations, state machines are a central concept in computing. In particular, it is significant that for any non-deterministic state machine you can design, there exists a deterministic state machine that does the same thing.

This is a key point, because it means you can design your algorithm in whichever way is the easiest to think about. Once you have a working algorithm, you can convert it into whatever form is most efficient.

The understanding that finite state machines and regular expressions are functionally equivalent opens up some incredibly interesting uses for regular expression engines — particularly when it comes to creating business rules that can be changed without recompiling a system.

A foundation in Computer Science allows you to take a problem you don’t know how to solve and reason: “I don’t know how to solve X, but I do know how to solve Y. And I know how to convert a solution for Y into a solution for X. Therefore, I now know how to solve X.”

If you like this article, you might enjoy my YouTube channel where I create an occasional video or cartoon about some aspect of creating software. I also have a mailing list for people who would like an occasional email when I produce something new.

Originally published at blog.markshead.com on February 11, 2018.

Translated from: https://www.freecodecamp.org/news/state-machines-basics-of-computer-science-d42855debc66/
