Scala Tutorial, Chinese-English Parallel Text: Case Classes and Pattern Matching

jinxfei

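Previous: Scala Tutorial, Chinese-English Parallel Text: Interacting with Java, Everything Is an Object

This document is a translation of the official Scala Tutorial. The original English text is retained so that readers can check it against my reading.

Because of length limits, it is split into three parts.

Formatting is hard to control on the blog; interested readers can download the PDF version: http://download.csdn.net/source/1742639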

－－－

6 Case classes and pattern matching

A kind of data structure that often appears in programs is the tree. For example, interpreters and compilers usually represent programs internally as trees; XML documents are trees; and several kinds of containers are based on trees, such as red-black trees (a kind of self-balancing binary search tree).

We will now examine how such trees are represented and manipulated in Scala through a small calculator program. The aim of this program is to manipulate very simple arithmetic expressions composed of sums, integer constants and variables. Two examples of such expressions are 1 + 2 and (x + x) + (7 + y).

We first have to decide on a representation for such expressions. The most natural one is the tree, where inner nodes are operations (here, only addition) and leaves are values (here, constants or variables).

In Java, such a tree would be represented using an abstract super-class for the trees, and one concrete sub-class per node or leaf. In a functional programming language, one would use an algebraic data-type for the same purpose. Scala provides the concept of case classes, which is somewhat in between the two. Here is how they can be used to define the type of the trees for our example:

abstract class Tree

case class Sum(l: Tree, r: Tree) extends Tree

case class Var(n: String) extends Tree

case class Const(v: Int) extends Tree

The fact that classes Sum, Var and Const are declared as case classes means that they differ from standard classes in several respects:

· the new keyword is not mandatory to create instances of these classes (i.e. one can write Const(5) instead of new Const(5)),

· getter functions are automatically defined for the constructor parameters (i.e. it is possible to get the value of the v constructor parameter of some instance c of class Const just by writing c.v),

· default definitions for methods equals and hashCode are provided, which work on the structure of the instances and not on their identity (unlike the default implementations inherited from Java's Object, which compare by identity),

· a default definition for method toString is provided, and prints the value in a "source form" (e.g. the tree for expression x+1 prints as Sum(Var(x),Const(1)), which closely mirrors the source code that builds the tree),

· instances of these classes can be decomposed through pattern matching as we will see below.
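The properties listed above can be checked with a short sketch. The wrapper object CaseClassDemo and the value names inside it are illustrative assumptions, not part of the tutorial; the class definitions are the ones given earlier.

```scala
object CaseClassDemo {
  abstract class Tree
  case class Sum(l: Tree, r: Tree) extends Tree
  case class Var(n: String) extends Tree
  case class Const(v: Int) extends Tree

  // No `new` needed: Const(5) builds an instance directly.
  val c: Const = Const(5)

  // The constructor parameter v is readable as c.v.
  val value: Int = c.v

  // equals is structural: two separately built trees of the same shape are equal.
  val sameStructure: Boolean =
    Sum(Var("x"), Const(1)) == Sum(Var("x"), Const(1))

  // toString prints the tree in "source form": Sum(Var(x),Const(1))
  val shown: String = Sum(Var("x"), Const(1)).toString

  // Pattern matching takes an instance apart again.
  val leftName: String = Sum(Var("x"), Const(1)) match {
    case Sum(Var(name), _) => name
    case _                 => "?"
  }
}
```

With ordinary (non-case) classes, the equality check would compare references and evaluate to false unless equals were overridden by hand.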

Now that we have defined the data-type to represent our arithmetic expressions, we can start defining operations to manipulate them. We will start with a function to evaluate an expression in some environment. The aim of the environment is to give values to variables. For example, the expression x + 1 evaluated in an environment which associates the value 5 to variable x, written {x → 5}, gives 6 as result.

We therefore have to find a way to represent environments. We could of course use some associative data-structure like a hash table, but we can also directly use functions! An environment is really nothing more than a function which associates a value to a (variable) name. The environment {x → 5} given above can simply be written as follows in Scala:

{ case "x" => 5 }

This notation defines a function which, when given the string "x" as argument, returns the integer 5, and fails with an exception otherwise.
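A minimal sketch of that behaviour, assuming the literal is given the expected type String => Int. The object name EnvDemo and the use of scala.util.Try to capture the failure are choices made for this illustration only.

```scala
import scala.util.Try

object EnvDemo {
  // The expected type String => Int lets the compiler turn the
  // case clause into a function value.
  val env: String => Int = { case "x" => 5 }

  // The "x" case applies, so this is 5.
  val known: Int = env("x")

  // No case applies to "y": the call throws scala.MatchError,
  // which Try captures as a Failure.
  val unknown: Try[Int] = Try(env("y"))
}
```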

Before writing the evaluation function, let us give a name to the type of the environments. We could of course always use the type String => Int for environments, but it simplifies the program if we introduce a name for this type, and makes future changes easier. This is accomplished in Scala with the following notation:

type Environment = String => Int

From then on, the type Environment can be used as an alias of the type of functions from String to Int.

We can now give the definition of the evaluation function. Conceptually, it is very simple: the value of a sum of two expressions is simply the sum of the values of these expressions; the value of a variable is obtained directly from the environment; and the value of a constant is the constant itself. Expressing this in Scala is not more difficult:

    def eval(t: Tree, env: Environment): Int = t match {
      case Sum(l, r) => eval(l, env) + eval(r, env)
      case Var(n)    => env(n)
      case Const(v)  => v
    }

This evaluation function works by performing pattern matching on the tree t, recursing into the sub-trees. Intuitively, the meaning of the above definition should be clear:

1. it first checks if the tree t is a Sum, and if it is, it binds the left sub-tree to a new variable called l and the right sub-tree to a variable called r, and then proceeds with the evaluation of the expression following the arrow (here, recursively evaluating both sub-trees and adding the results); this expression can (and does) make use of the variables bound by the pattern appearing on the left of the arrow, i.e. l and r,

2. if the first check does not succeed, that is if the tree is not a Sum, it goes on and checks if t is a Var; if it is, it binds the name contained in the Var node to a variable n and proceeds with the right-hand expression,

3. if the second check also fails, that is if t is neither a Sum nor a Var, it checks if it is a Const, and if it is, it binds the value contained in the Const node to a variable v and proceeds with the right-hand side,

4. finally, if all checks fail, an exception is raised to signal the failure of the pattern matching expression; this could happen here only if more sub-classes of Tree were declared without adding matching cases for them.

We see that the basic idea of pattern matching is to attempt to match a value to a series of patterns, and as soon as a pattern matches, extract and name various parts of the value, to finally evaluate some code which typically makes use of these named parts.
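The matching steps described above can be traced on the earlier example, x + 1 in the environment {x → 5}. This is a small self-contained sketch; the object name EvalDemo and the value names are assumptions for illustration, while Tree, Environment and eval are the tutorial's own definitions.

```scala
object EvalDemo {
  abstract class Tree
  case class Sum(l: Tree, r: Tree) extends Tree
  case class Var(n: String) extends Tree
  case class Const(v: Int) extends Tree

  type Environment = String => Int

  def eval(t: Tree, env: Environment): Int = t match {
    case Sum(l, r) => eval(l, env) + eval(r, env)
    case Var(n)    => env(n)
    case Const(v)  => v
  }

  val env: Environment = { case "x" => 5 }

  // Step 1 matches Sum(l, r) with l = Var("x") and r = Const(1);
  // the recursive calls then hit the Var case (env("x") = 5)
  // and the Const case (1), so the result is 5 + 1 = 6.
  val result: Int = eval(Sum(Var("x"), Const(1)), env)
}
```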

A seasoned object-oriented programmer might wonder why we did not define eval as a method of class Tree and its subclasses. We could have done it actually, since Scala allows method definitions in case classes just like in normal classes. Deciding whether to use pattern matching or methods is therefore a matter of taste, but it also has important implications on extensibility:

· when using methods, it is easy to add a new kind of node as this can be done just by defining the sub-class of Tree for it; on the other hand, adding a new operation to manipulate the tree is tedious, as it requires modifications to all sub-classes of Tree,

· when using pattern matching, the situation is reversed: adding a new kind of node requires the modification of all functions which do pattern matching on the tree, to take the new node into account; on the other hand, adding a new operation is easy, by just defining it as an independent function.
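For comparison, the method-based alternative discussed above might look like the following sketch. It is one possible arrangement written for this illustration (the object name MethodStyleDemo is an assumption), not code from the tutorial.

```scala
object MethodStyleDemo {
  type Environment = String => Int

  // Each operation is an abstract method on Tree ...
  abstract class Tree {
    def eval(env: Environment): Int
  }

  // ... and every sub-class supplies its own implementation.
  case class Sum(l: Tree, r: Tree) extends Tree {
    def eval(env: Environment): Int = l.eval(env) + r.eval(env)
  }
  case class Var(n: String) extends Tree {
    def eval(env: Environment): Int = env(n)
  }
  case class Const(v: Int) extends Tree {
    def eval(env: Environment): Int = v
  }

  val env: Environment = { case "x" => 5 }
  val result: Int = Sum(Var("x"), Const(1)).eval(env)
}
```

Adding a new node type here only needs one more sub-class, but adding a second operation means touching Tree and all three sub-classes.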

To explore pattern matching further, let us define another operation on arithmetic expressions: symbolic derivation. The reader might remember the following rules regarding this operation:

1. the derivative of a sum is the sum of the derivatives,

2. the derivative of some variable v is one if v is the variable relative to which the derivation takes place, and zero otherwise,

3. the derivative of a constant is zero.

These rules can be translated almost literally into Scala code, to obtain the following definition:

    def derive(t: Tree, v: String): Tree = t match {
      case Sum(l, r)          => Sum(derive(l, v), derive(r, v))
      case Var(n) if (v == n) => Const(1)
      case _                  => Const(0)
    }

This function introduces two new concepts related to pattern matching. First of all, the case expression for variables has a guard, an expression following the if keyword. This guard prevents pattern matching from succeeding unless its expression is true. Here it is used to make sure that we return the constant 1 only if the name n of the variable being derived is the same as the derivation variable v. The second new feature of pattern matching used here is the wild-card, written _, which is a pattern matching any value, without giving it a name (comparable to the default branch of a Java switch statement).
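How the guard and the wild-card interact can be seen on single-node trees. The sketch below repeats the tutorial's definitions so that it is self-contained; the object name GuardDemo and the three value names are assumptions for illustration.

```scala
object GuardDemo {
  abstract class Tree
  case class Sum(l: Tree, r: Tree) extends Tree
  case class Var(n: String) extends Tree
  case class Const(v: Int) extends Tree

  def derive(t: Tree, v: String): Tree = t match {
    case Sum(l, r)          => Sum(derive(l, v), derive(r, v))
    case Var(n) if (v == n) => Const(1)
    case _                  => Const(0)
  }

  // The Var pattern matches and the guard holds: Const(1).
  val sameVariable: Tree = derive(Var("x"), "x")

  // The Var pattern matches but the guard is false, so matching
  // continues and the wild-card case yields Const(0).
  val otherVariable: Tree = derive(Var("y"), "x")

  // A constant matches neither of the first two cases: Const(0).
  val constant: Tree = derive(Const(7), "x")
}
```

A failed guard does not abort the match; it only rejects that one case, which is why a variable other than v still ends up in the wild-card branch.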

We did not explore the whole power of pattern matching yet, but we will stop here in order to keep this document short. We still want to see how the two functions above perform on a real example. For that purpose, let's write a simple main function which performs several operations on the expression (x + x) + (7 + y): it first computes its value in the environment {x → 5, y → 7}, then computes its derivative relative to x and then y.

    def main(args: Array[String]): Unit = {
      // (x + x) + (7 + y)
      val exp: Tree = Sum(Sum(Var("x"),Var("x")),Sum(Const(7),Var("y")))
      // the environment {x → 5, y → 7}
      val env: Environment = { case "x" => 5 case "y" => 7 }
      println("Expression: " + exp)
      println("Evaluation with x=5, y=7: " + eval(exp, env))
      println("Derivative relative to x:\n " + derive(exp, "x"))
      println("Derivative relative to y:\n " + derive(exp, "y"))
    }

Executing this program, we get the expected output:

    Expression: Sum(Sum(Var(x),Var(x)),Sum(Const(7),Var(y)))
    Evaluation with x=5, y=7: 24
    Derivative relative to x:
     Sum(Sum(Const(1),Const(1)),Sum(Const(0),Const(0)))
    Derivative relative to y:
     Sum(Sum(Const(0),Const(0)),Sum(Const(0),Const(1)))

By examining the output, we see that the result of the derivative should be simplified before being presented to the user. Defining a basic simplification function using pattern matching is an interesting (but surprisingly tricky) problem, left as an exercise for the reader.
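As a starting point for that exercise, here is one possible, deliberately incomplete sketch. The function simplify, the object name SimplifyDemo and the value names are assumptions made for this illustration, not part of the tutorial. It only folds sums of two constants and drops additions of zero, so it does not, for example, combine constants separated by a variable.

```scala
object SimplifyDemo {
  abstract class Tree
  case class Sum(l: Tree, r: Tree) extends Tree
  case class Var(n: String) extends Tree
  case class Const(v: Int) extends Tree

  def derive(t: Tree, v: String): Tree = t match {
    case Sum(l, r)          => Sum(derive(l, v), derive(r, v))
    case Var(n) if (v == n) => Const(1)
    case _                  => Const(0)
  }

  // Hypothetical helper: simplify the sub-trees first, then look at
  // the pair of simplified results.
  def simplify(t: Tree): Tree = t match {
    case Sum(l, r) =>
      (simplify(l), simplify(r)) match {
        case (Const(a), Const(b)) => Const(a + b) // constant folding
        case (Const(0), e)        => e            // 0 + e = e
        case (e, Const(0))        => e            // e + 0 = e
        case (sl, sr)             => Sum(sl, sr)
      }
    case other => other // leaves are already as simple as possible
  }

  val exp: Tree = Sum(Sum(Var("x"), Var("x")), Sum(Const(7), Var("y")))

  val simplifiedDx: Tree = simplify(derive(exp, "x")) // Const(2)
  val simplifiedDy: Tree = simplify(derive(exp, "y")) // Const(1)
}
```

Matching on the pair of already simplified sub-trees keeps the rules flat; a fuller solution would also need to normalise the order of operands before folding.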
