by Shalvah
Parsing math expressions with JavaScript
A while ago, I wrote about tokenizing a math expression, with JavaScript as the language of choice. The tokenizer I built in that article was the first component of my quest to render and solve math expressions using JavaScript, or any other language. In this article, I’ll walk through how to build the next component: the parser.
What is the job of the parser? Quite simple. It parses the expression. (Duh.) Okay, actually, Wikipedia has a good answer:
A parser is a software component that takes input data (frequently text) and builds a data structure — often some kind of parse tree, abstract syntax tree or other hierarchical structure — giving a structural representation of the input, checking for correct syntax in the process. The parsing may be preceded or followed by other steps, or these may be combined into a single step. The parser is often preceded by a separate lexical analyser, which creates tokens from the sequence of input characters
So, in essence, this is what we’re trying to achieve:
math expression => [parser] => some data structure (we'll get to this in a bit)
Let’s skip ahead a bit: “… The parser is often preceded by a separate lexical analyser, which creates tokens from the sequence of input characters”. This is talking about the tokenizer we built earlier. So our parser won’t be receiving the raw math expression, but rather an array of tokens. Now we have:
math expression => [tokenizer] => list of tokens => [parser] => some data structure
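To make "list of tokens" concrete, here is a rough sketch of what the parser's input might look like for 3 + 4. The `{ type, value }` shape, the type labels, and the `describeTokens` helper are assumptions made for illustration; the tokenizer from the earlier article may name things differently.

```javascript
// Hypothetical token shape: { type, value }. The tokenizer from the earlier
// article may use different field names or type labels.
const tokens = [
  { type: "Literal", value: "3" },
  { type: "Operator", value: "+" },
  { type: "Literal", value: "4" },
];

// Illustrative helper: the parser works on tokens like these, one at a time,
// and never looks at the raw characters of the expression.
function describeTokens(tokenList) {
  return tokenList.map((token) => `${token.type}(${token.value})`).join(" ");
}

console.log(describeTokens(tokens)); // Literal(3) Operator(+) Literal(4)
```

The point is only that the parser can rely on each token already being classified, so it can concentrate on structure rather than on characters.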
For the tokenizer, we had to come up with the algorithm manually. For the parser, we’ll be implementing an already existing algorithm, the Shunting-yard algorithm. Remember the “some data structure” above? With this algorithm, our parser can give us a data structure called an Abstract Syntax Tree (AST) or an alternative representation of the expression, known as Reverse Polish Notation (RPN).
Reverse Polish Notation
I’ll start with RPN. Again from Wikipedia, RPN is “a mathematical notation in which every operator follows all of its operands”. Instead of having, say, 3+4, RPN would be 3 4 +. Weird, I know. But the rule is that the operator has to come after all its operands.
Keep that rule in mind as we take a look at some more complex examples. Also remember that an operand for one operation can be the result of an earlier operation.
Algebraic: 3 - 4                    RPN: 3 4 -
Algebraic: 3 - 4 + 5                RPN: 3 4 - 5 +
Algebraic: 2^3                      RPN: 2 3 ^
Algebraic: 5 + ((1 + 2) × 4) − 3    RPN: 5 1 2 + 4 * + 3 -
Algebraic: sin(45)                  RPN: 45 sin
Algebraic: tan(x^2 + 2*x + 6)       RPN: x 2 ^ 2 x * + 6 + tan
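One way to convince yourself that these RPN strings mean the same thing as their algebraic forms is to evaluate them with a stack. The sketch below is not part of the parser; `evaluateRPN` is a hypothetical helper that assumes a space-separated string containing only numbers and the binary operators + - * / ^ (no functions or variables, so the sin and tan examples are out of scope).

```javascript
// Minimal RPN evaluator (illustrative helper, not the article's parser).
// Assumes space-separated input with numeric literals and binary operators only.
function evaluateRPN(expression) {
  const operators = {
    "+": (a, b) => a + b,
    "-": (a, b) => a - b,
    "*": (a, b) => a * b,
    "/": (a, b) => a / b,
    "^": (a, b) => Math.pow(a, b),
  };
  const stack = [];

  for (const symbol of expression.trim().split(/\s+/)) {
    if (Object.prototype.hasOwnProperty.call(operators, symbol)) {
      // An operator comes after its operands, so both are already on the stack.
      if (stack.length < 2) {
        throw new Error(`Not enough operands for "${symbol}"`);
      }
      const right = stack.pop();
      const left = stack.pop();
      // The result goes back on the stack and can serve as an operand later.
      stack.push(operators[symbol](left, right));
    } else {
      const value = Number(symbol);
      if (symbol === "" || Number.isNaN(value)) {
        throw new Error(`Unexpected symbol "${symbol}"`);
      }
      stack.push(value);
    }
  }

  if (stack.length !== 1) {
    throw new Error("Malformed RPN expression");
  }
  return stack[0];
}

console.log(evaluateRPN("3 4 -"));             // -1
console.log(evaluateRPN("3 4 - 5 +"));         // 4
console.log(evaluateRPN("2 3 ^"));             // 8
console.log(evaluateRPN("5 1 2 + 4 * + 3 -")); // 14
```

Notice that no parentheses or precedence rules are needed: the order of the symbols alone decides what gets computed first.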
Because the operator has to come after its operands, RPN is also known as postfix notation, and our “regular” algebraic notation is call