4 Pattern Matching
4.1 A Simple Match
val bools = Seq(true,false)
for (bool <- bools) {
bool match {
case true => println("Got heads",bool)
case false => println("Got tails",bool)
}
}
// Alternative: an if expression instead of match
for (bool <- bools) {
val which = if(bool) "heads" else "tails"
println("Got " + which,bool)
}
/*
output:
[root@master scalar]# scala match-boolean.sc
(Got heads,true)
(Got tails,false)
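If you comment out the clause case false => println("Got tails",bool), the compiler warns that the match may not be exhaustive, and the script fails at runtime with scala.MatchError as soon as it reaches false. bools holds two values but the match would then handle only true; a match has to cover every value it can receive.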
scala> val bool = Seq(true,false)
bool: Seq[Boolean] = List(true, false)
*/
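The runtime failure described above can be reproduced in isolation. The sketch below is not part of the original script: describe is a hypothetical helper that deliberately handles only true, and the @unchecked annotation is assumed here purely to silence the compiler's non-exhaustive-match warning.

```scala
import scala.util.Try

// Hypothetical helper (not in the original notes): handles only `true`.
// @unchecked suppresses the "match may not be exhaustive" warning.
def describe(b: Boolean): String = (b: @unchecked) match {
  case true => "heads"
}

val ok  = Try(describe(true))   // the match succeeds
val bad = Try(describe(false))  // no clause applies, so a MatchError is captured

println(ok)
println(bad.isFailure)
```

Wrapping the call in Try is only a way to observe the error without stopping the script; in real code the better fix is to add the missing clause.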
4.2 Values, Variables, and Types in Matches
for {
x <- Seq(1,2,2.7,"one","two",'four)
} {
val str = x match {
case 1 => "int 1"
case i: Int => "other int: " + i
case d: Double => "a double: " + x
case "one" => "string one"
case s: String => "other string: " + s
case unexpected => "unexpected value: " + unexpected
}
println(str)
}
/*
output:
int 1
other int: 2
a double: 2.7
string one
other string: two
unexpected value: 'four
*/
// A modified version of the code above
for {
x <- Seq(1,2,2.7,"one","two",'four)
} {
val str = x match {
case 1 => "int 1"
// case i: Int => "other int: " + i
case s: String => "other string: " + s
case d: Double => "a double: " + x
// case "one" => "string one"
case i: Int => "other int: " + i
// case s: String => "other string: " + s
case unexpected => "unexpected value: " + unexpected
}
println(str)
}
/*
output:
int 1
other int: 2
a double: 2.7
other string: one
other string: two
unexpected value: 'four
*/
/*
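In the modified version, x <- Seq(1,2,2.7,"one","two",'four) yields the six elements in their original order, so six matches are performed. Six elements do not require six case clauses, because one clause can match several elements.
Match 1: element 1 is tested against the clauses from top to bottom, and matching stops at the first success. case 1 => "int 1" succeeds, so case i: Int => "other int: " + i is never tried.
Match 2: element 2 falls through to case i: Int => "other int: " + i, which succeeds.
Match 3: element 2.7 is matched by case d: Double => "a double: " + x.
Matches 4 and 5: "one" and "two" are both matched by case s: String => "other string: " + s. A single clause can therefore handle more than one element, so the number of clauses need not equal the number of elements.
Match 6: 'four is a symbol literal and only matches the last clause, case unexpected => "unexpected value: " + unexpected. That clause is a catch-all: it accepts anything the earlier clauses did not.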
*/
// Another variation: use "_" in place of the variables i, s, d, and unexpected
// File: match-variable2.sc
for {
x <- Seq(1,2,2.7,"one","two",'four)
} {
val str = x match {
case 1 => "int 1"
case _: Int => "other int: " + x
case _: Double => "a double: " + x
case "one" => "string one"
case _: String => "other string: " + x
case _ => "unexpected value: " + x
}
println(str)
}
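- Note: except in partial functions, a match must cover every possible input. When the input type is Any, end with case _ or case some_name as the default clause
- Clauses are tried in order, so specific clauses must come before general ones; otherwise the specific clause never gets a chance to match. The default clause therefore has to be the last one
- In a case clause, the compiler assumes that a name starting with an uppercase letter is a type name and that a name starting with a lowercase letter is a variable name. This is a pitfall of case clauses
- To refer to a previously defined variable inside a case clause, enclose it in backticks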
def checkY(y: Int) = {
for { x <- Seq(99,100,101)
} {
val str = x match {
// case y => "found y!" would compile, but y would be a new variable that matches everything, making the next clause unreachable
case `y` => "found y!"
case i: Int => "int: " + i
}
println(str)
}
}
checkY(100)
/*
output:
[root@master scalar]# scala match-surprise.sc
int: 99
found y!
int: 101
*/
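Why case y => "found y!" does not behave as intended:
case y actually means: match any input (there is no type annotation) and bind it to a new variable named y. This y is not resolved to the method parameter y, so in effect a default, match-everything clause has been written as the first clause.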
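The naming rule and the backtick workaround can be seen side by side in a small sketch. Limit, limit, viaUppercase and viaBackticks are hypothetical names introduced only for this illustration; the point is that an uppercase name is looked up as an existing value, while a lowercase name needs backticks to avoid being treated as a fresh variable.

```scala
// Hypothetical names, chosen for illustration only.
val Limit = 100   // uppercase: usable directly as a constant pattern
val limit = 100   // lowercase: needs backticks inside a pattern

def viaUppercase(x: Int): String = x match {
  case Limit => "at the limit"
  case _     => "somewhere else"
}

def viaBackticks(x: Int): String = x match {
  case `limit` => "at the limit"
  case _       => "somewhere else"
}

println(viaUppercase(100))
println(viaBackticks(99))
```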
for {
x <- Seq(1,2,2.7,"one","two",'four)
} {
val str = x match {
case _: Int | _: Double => "a number: " + x
case "one" => "string one"
case _: String => "other string: " + x
case _ => "unexpected value: " + x
}
println(str)
}
/*
output:
a number: 1
a number: 2
a number: 2.7
string one
other string: two
unexpected value: 'four
*/
// case clauses support alternatives: separate the patterns with "|". Here both Int and Double values match the first clause
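The example above falls back on x because the alternatives themselves bind nothing. As a sketch of one way around that, the whole alternative can be bound with @; kind is a hypothetical function name used only here.

```scala
// Hypothetical function: binds the matched value of either alternative to n.
def kind(x: Any): String = x match {
  case n @ (_: Int | _: Double) => s"a number: $n"
  case s: String                => s"a string: $s"
  case other                    => s"unexpected value: $other"
}

println(kind(1))
println(kind(2.7))
println(kind("one"))
```

Note that n is still typed as Any here, since the two alternatives have no more specific common type in this match.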
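4.3 Matching on Sequences
- Seq is the parent type of the concrete collection types that support traversal in a deterministic order, such as List and Vector
- Seq behaves much like a linked list: each head node holds its own value and also points to the tail (the rest of the list). This gives a nested structure like the following four-node sequence, where tail plays the role of the pointer to the rest of the list.
(node1,(node2,(node3,(node4,(end)))))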
val nonEmptySeq = Seq(1,2,3,4,5)
val emptySeq = Seq.empty[Int]
val nonEmptyList = List(1,2,3,4,5)
val emptyList = Nil
val nonEmptyVector = Vector(1,2,3,4,5)
val emptyVector = Vector.empty[Int]
val nonEmptyMap = Map("one" -> 1,"two" -> 2,"three" -> 3)
val emptyMap = Map.empty[String,Int]
def seqToString[T](seq: Seq[T]): String = seq match {
// head and tail are two variable names bound by the pattern
case head +: tail => s"$head +: " + seqToString(tail)
case Nil => "Nil"
}
for (seq <- Seq (
nonEmptySeq,emptySeq,nonEmptyList,emptyList,nonEmptyVector,emptyVector,nonEmptyMap.toSeq,emptyMap.toSeq)) {
println(seqToString(seq))
}
/*
output:
[root@master scalar]# scala match-seq.sc
1 +: 2 +: 3 +: 4 +: 5 +: Nil
Nil
1 +: 2 +: 3 +: 4 +: 5 +: Nil
Nil
1 +: 2 +: 3 +: 4 +: 5 +: Nil
Nil
(one,1) +: (two,2) +: (three,3) +: Nil
Nil
scala> val l = List(1,2,3,4,5)
l: List[Int] = List(1, 2, 3, 4, 5)
head: the first element of the sequence
scala> l.head
res7: Int = 1
tail: every element of the sequence except the first
scala> l.tail
res8: List[Int] = List(2, 3, 4, 5)
Map.toSeq: note that a Map is constructed with (), not {}, which differs from Python
scala> val nonEmptyMap = Map("one" -> 1,"two" -> 2,"three" -> 3)
nonEmptyMap: scala.collection.immutable.Map[String,Int] = Map(one -> 1, two -> 2, three -> 3)
scala> nonEmptyMap.toSeq
res0: Seq[(String, Int)] = ArrayBuffer((one,1), (two,2), (three,3))
How the recursion unfolds
Seq(1,2,3,4,5)  case head +: tail => s"$head +: " + seqToString(tail)
tail = List(2,3,4,5) ---> "1 +: " + seqToString(List(2,3,4,5))
Seq(2,3,4,5)  case head +: tail => s"$head +: " + seqToString(tail)
tail = List(3,4,5) ---> "1 +: 2 +: " + seqToString(List(3,4,5))
Seq(3,4,5)  case head +: tail => s"$head +: " + seqToString(tail)
tail = List(4,5) ---> "1 +: 2 +: 3 +: " + seqToString(List(4,5))
Seq(4,5)  case head +: tail => s"$head +: " + seqToString(tail)
tail = List(5) ---> "1 +: 2 +: 3 +: 4 +: " + seqToString(List(5))
Seq(5)  case head +: tail => s"$head +: " + seqToString(tail)
tail = List() ---> "1 +: 2 +: 3 +: 4 +: 5 +: " + seqToString(List())
List()  case Nil => "Nil"   once the tail is empty the second case matches, and every element of the Seq has been handled
*/
Modify the code so that the output shows the nested structure
val nonEmptySeq = Seq(1,2,3,4,5)
val emptySeq = Seq.empty[Int]
val nonEmptyMap = Map("one" -> 1,"two" -> 2,"three" -> 3)
def seqToString2[T](seq: Seq[T]): String = seq match {
case head +: tail => s"($head +: ${seqToString2(tail)})"
case Nil => "(Nil)"
}
for (seq <- Seq(nonEmptySeq,emptySeq,nonEmptyMap.toSeq)) {
println(seqToString2(seq))
}
/*
output:
[root@spark1 scala]# scala match-seq-paren.sc
(1 +: (2 +: (3 +: (4 +: (5 +: (Nil))))))
(Nil)
((one,1) +: ((two,2) +: ((three,3) +: (Nil))))
*/
// Another way to process a List, the usual one before Scala 2.10. It still works on Scala 2.11.8.
val nonEmptyList = List(1,2,3,4,5)
val emptyList = Nil
def listToString[T](list: List[T]): String = list match {
case head :: tail => s"($head :: ${listToString(tail)})"
case Nil => "(Nil)"
}
for (l <- List(nonEmptyList,emptyList)) { println(listToString(l)) }
// The only change: +: is replaced with ::
/*
output:
(1 :: (2 :: (3 :: (4 :: (5 :: (Nil))))))
(Nil)
*/
// The printed output can be used to construct the collection again
scala> val s1 = (1 :: (2 :: (3 :: (4 :: (5 :: (Nil))))))
s1: List[Int] = List(1, 2, 3, 4, 5)
scala> val l = (1 :: (2 :: (3 :: (4 :: (5 :: (Nil))))))
l: List[Int] = List(1, 2, 3, 4, 5)
scala> val s2 = (("one",1) +: (("two",2) +: (("three",3) +: Nil)))
s2: List[(String, Int)] = List((one,1), (two,2), (three,3))
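// The Map.apply factory method takes a variable argument list whose arguments are two-element tuples.
// To build a Map from the sequence s2, we therefore use the :_* idiom to turn s2 into a variable argument list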
scala> val m = Map(s2 :_*)
m: scala.collection.immutable.Map[String,Int] = Map(one -> 1, two -> 2, three -> 3)
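The :_* idiom is not specific to Map. A minimal sketch, assuming a hypothetical varargs method total defined only for this illustration, shows the same expansion and a toMap alternative for the Map case:

```scala
// Hypothetical varargs method, for illustration only.
def total(xs: Int*): Int = xs.sum

val nums = Seq(1, 2, 3)
val sum  = total(nums: _*)      // expands the sequence into individual arguments

val pairs    = Seq("one" -> 1, "two" -> 2)
val viaApply = Map(pairs: _*)   // same idiom as Map(s2 :_*)
val viaToMap = pairs.toMap      // an alternative that avoids the varargs call

println(sum)
println(viaApply == viaToMap)
```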
4.4 Matching on Tuples
val langs = Seq(
("Scala","Martin","Odersky"),
("Clojure","Rich","Hickey"),
("Lisp","John","McCarthy"))
for (tuple <- langs) {
tuple match {
// ignore the values that are not needed
case ("Scala",_,_) => println("Found Scala")
// lang, first, and last are just three variables; any valid identifiers would do.
// Here lang = "Clojure" or "Lisp", first = "Rich" or "John", last = "Hickey" or "McCarthy"
case (lang,first,last) =>
println(s"Found other language : $lang ($first,$last)")
}
}
/*
output:
[root@master scalar]# scala match-tuple.sc
Found Scala
Found other language : Clojure (Rich,Hickey)
Found other language : Lisp (John,McCarthy)
*/
4.5 Guards in case Clauses
for ( i <- Seq(1,2,3,4)) {
i match {
// the guard expression after if does not need parentheses
case _ if i %2 ==0 => println(s"even: $i")
case _ => println(s"odd: $i")
}
}
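A guard can also refer to a variable bound by its own pattern, not only to the value being matched. The following is a small sketch with a hypothetical function sign; clause order matters because the last clause would otherwise also accept negative numbers and zero.

```scala
// Hypothetical function: the guard tests the variable n bound by the pattern.
def sign(i: Int): String = i match {
  case n if n < 0 => s"$n is negative"
  case 0          => "zero"
  case n          => s"$n is positive"
}

for (i <- Seq(-3, 0, 7)) println(sign(i))
```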
4.6 Matching on case Classes
case class Address(street: String,city: String,country: String)
case class Person(name: String,age: Int,address: Address)
val alice = Person("Alice",25,Address("1 Scala Lane","Chicago","USA"))
val bob = Person("Bob",29,Address("2 Java Ave","Miami","USA"))
val charlie = Person("Charlie",32,Address("3 Python Ct","Boston","USA"))
for (person <- Seq(alice,bob,charlie)) {
person match {
case Person("Alice",25,Address(_,"Chicago",_)) => println("Hi Alice!")
case Person("Bob",29,Address("2 Java Ave","Miami","USA")) => println("Hi Bob")
case Person(name,age,_) => println(s"Who are you,$age year-old person named $name?")
}
}
/*
output:
[root@master scalar]# scala match-deep.sc
Hi Alice!
Hi Bob
Who are you,32 year-old person named Charlie?
*/
val itemsCosts = Seq(("Pencil",0.52),("Paper",1.35),("Notebook",2.43))
// zipWithIndex returns tuples of the form ((name,cost),index)
val itemsCostsIndices = itemsCosts.zipWithIndex
for (itemCostIndex <- itemsCostsIndices) {
itemCostIndex match {
case ((item,cost),index) => println(s"$index: $item costs $cost each")
}
}
/*
output:
[root@master scalar]# scala match-deep-tuple.sc
0: Pencil costs 0.52 each
1: Paper costs 1.35 each
2: Notebook costs 2.43 each
In the interactive shell, :load file shows the types of the variables along with the program output
scala> :load /data/project/scalar/match-deep-tuple.sc
Loading /data/project/scalar/match-deep-tuple.sc...
itemsCosts: Seq[(String, Double)] = List((Pencil,0.52), (Paper,1.35), (Notebook,2.43))
itemsCostsIndices: Seq[((String, Double), Int)] = List(((Pencil,0.52),0), ((Paper,1.35),1), ((Notebook,2.43),2))
0: Pencil costs 0.52 each
1: Paper costs 1.35 each
2: Notebook costs 2.43 each
*/
scala> val days = Array("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday")
scala> for (day <- days.zipWithIndex) { day match { case (day,index) => println(s"$index is: $day")}}
0 is: Sunday
1 is: Monday
2 is: Tuesday
3 is: Wednesday
4 is: Thursday
5 is: Friday
6 is: Saturday
// Another, more concise way to write it
scala> days.zipWithIndex.foreach { case (day,index) => println(s"$index is: $day")}
0 is: Sunday
1 is: Monday
2 is: Tuesday
3 is: Wednesday
4 is: Thursday
5 is: Friday
6 is: Saturday
// Breaking the statement above into steps
scala> days.zipWithIndex.foreach(println(_))
(Sunday,0)
(Monday,1)
(Tuesday,2)
(Wednesday,3)
(Thursday,4)
(Friday,5)
(Saturday,6)
scala> days.zipWithIndex.foreach(x=> println(x))
(Sunday,0)
(Monday,1)
(Tuesday,2)
(Wednesday,3)
(Thursday,4)
(Friday,5)
(Saturday,6)
// Note: without case, the compiler reports a missing parameter type for x and y
scala> days.zipWithIndex.foreach{case (x,y)=> println(x,y)}
(Sunday,0)
(Monday,1)
(Tuesday,2)
(Wednesday,3)
(Thursday,4)
(Friday,5)
(Saturday,6)
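4.6.1 unapply Methods
- A case class has a companion object in which the compiler generates several methods: a factory method named apply, used to construct instances, and an extractor method named unapply, used for extraction and "deconstruction"
// A pattern like this invokes unapply
person match {
case Person("Alice",25,Address(_,"Chicago",_)) => ...
}
- The generated unapply methods all return Option[TupleN[...]], where N is the number of values that can be extracted from the object. For the Person case class N is 3. The extracted values have the types of the corresponding tuple elements; for Person these are String, Int, and Address. The companion object the compiler generates for Person therefore looks like this: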
object Person {
def apply(name: String,age: Int, address: Address) =
new Person(name,age,address)
def unapply(p: Person): Option[Tuple3[String,Int,Address]] =
Some((p.name,p.age,p.address))
}
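A natural question: since the compiler already knows the object is a Person, why is the return type an Option? Because Scala allows an unapply method to veto the match by returning None, in which case Scala moves on to the next case clause. A hand-written unapply can also hide attributes that should not be exposed, such as age. (unapplySeq, covered in 4.6.2, is the variant for a variable number of extracted values.)
- For performance, Scala 2.11.1 relaxed the requirement that unapply return Option[T]. unapply may now return any type that has the following methods: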
def isEmpty: Boolean
def get: T
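- unapply is invoked recursively when necessary. In this example Person contains a nested Address object. Likewise, unapply was invoked recursively in the tuple example
- The case keyword both declares a "special" kind of class and introduces the clauses of a match expression. This is no coincidence: case classes were designed to make pattern matching more convenient
- The return type Option[Tuple3[String,Int,Address]] is verbose; Scala allows the tuple literal syntax for such types
val t1: Option[Tuple3[String,Int,Address]] = ...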
val t2: Option[ (String,Int,Address) ] = ...
val t3: Option[ (String, Int, Address) ] = ...
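To make the "veto" behaviour of unapply concrete, here is a minimal sketch of a hand-written extractor. Even and describeInt are hypothetical names that do not appear in the notes; the extractor returns None for odd numbers, which sends the match on to the next clause.

```scala
// Hypothetical extractor: succeeds only for even numbers and extracts half the value.
object Even {
  def unapply(i: Int): Option[Int] =
    if (i % 2 == 0) Some(i / 2) else None   // None vetoes the match
}

def describeInt(i: Int): String = i match {
  case Even(half) => s"$i is even, half is $half"
  case _          => s"$i is odd"
}

println(describeInt(10))
println(describeInt(7))
```

Unlike a compiler-generated unapply, this one extracts a derived value rather than a constructor field, which is also how an extractor can keep some fields hidden.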
- The "+:" operator can construct a sequence
scala> val list1 = 1 +: 2 +: 3 +: 4 +: Nil
list1: List[Int] = List(1, 2, 3, 4)
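// +: is right-associative, so 4 is first prepended to Nil, then 3 is prepended to the resulting sequence, and so on
- Scala aims to support a uniform syntax for construction and for deconstruction/extraction; the two operations come in pairs and are inverses of each other
- For patterns, "+:" is a singleton object. Just as with method names, Scala allows a wide range of characters in the names of types and objects, so the same "+:" syntax that constructs any non-empty collection can also take it apart
- The "+:" object has only one method: unapply
def unapply[T,Coll](collection: Coll): Option[(T,Coll)]
// The head is inferred to have type T and the tail type Coll, some collection type (such as List or Vector). Coll is also the type of the input collection, so the method returns an Option holding a two-element tuple of the input collection's head and tail
- For case head +: tail => ..., the compiler calls +:.unapply(collection). It can also be written case +:(head,tail) => ... ; case head +: tail is the infix form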
scala> def processSeq2[T](l: Seq[T]): Unit = l match {
| case +:(head,tail) =>
| printf("%s +: ",head)
| processSeq2(tail)
| case Nil => println("Nil")
| }
scala> processSeq2(List(1,2,3,4,5))
1 +: 2 +: 3 +: 4 +: 5 +: Nil
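- Infix notation in case clauses, an example:
// With is not defined in these notes; a two-parameter case class like this one is assumed
case class With[A,B](a: A,b: B)
// With[String,Int] and String With Int are two ways of writing the same type signature
val With1: With[String,Int] = With("Foo",1)
val With2: String With Int = With("Bar",2)
Seq(With1,With2) foreach {
w => w match {
// infix pattern
case s With i => println(s"$s with $i")
// Equivalent: case With(s,i) => println(s"$s with $i"), which arguably reads better
case _ => println(s"Unknown: $w")
}
}
// Note: infix notation cannot be used to construct a value; this is an error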
scala> val w = "one" With 2
<console>:11: error: value With is not a member of String
val w = "one" With 2
- Traversing the elements of a sequence in reverse order
val nonEmptyList = List(1,2,3,4,5)
val nonEmptyVector = Vector(1,2,3,4,5)
val nonEmptyMap = Map("one" -> 1,"two" -> 2,"three" -> 3)
def reverseSeqToString[T](l: Seq[T]): String = l match {
// Note that this is ":+", where the earlier examples used "+:"
case prefix :+ end => reverseSeqToString(prefix) + s" :+ $end"
case Nil => "Nil"
}
for (seq <- Seq(nonEmptyList,nonEmptyVector,nonEmptyMap.toSeq)) {
println(reverseSeqToString(seq))
}
/*
output:
[root@master scalar]# scala match-reverse-seq.sc
Nil :+ 1 :+ 2 :+ 3 :+ 4 :+ 5
Nil :+ 1 :+ 2 :+ 3 :+ 4 :+ 5
Nil :+ (one,1) :+ (two,2) :+ (three,3)
*/
Again, the printed output can be used to construct the collection
scala> Nil :+ 1 :+ 2 :+ 3 :+ 4 :+ 5
res37: List[Int] = List(1, 2, 3, 4, 5)
scala> Nil :+ ("one",1) :+ ("two",2) :+ ("three",3)
res39: List[(String, Int)] = List((one,1), (two,2), (three,3))
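- For List, both the ":+" method that appends an element and the ":+" used in pattern matching take O(n) time, because they have to traverse the list from its head. For some other sequences, such as Vector, the cost is effectively O(1)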
4.6.2 unapplySeq Methods
val nonEmptyList = List(1,2,3,4,5)
val emptyList = Nil
val nonEmptyMap = Map("one" -> 1,"two" -> 2,"three" -> 3)
def windows[T](seq: Seq[T]): String = seq match {
case Seq(head1,head2,_*) => s"($head1,$head2), " + windows(seq.tail)
case Seq(head,_*) => s"($head,_), " + windows(seq.tail)
case Nil => "Nil"
}
for (seq <- Seq(nonEmptyList,emptyList,nonEmptyMap.toSeq)) {
println(windows(seq))
}
/*
output:
[root@spark1 scala]# scala match-seq-unapplySeq.sc
(1,2), (2,3), (3,4), (4,5), (5,_), Nil
Nil
((one,1),(two,2)), ((two,2),(three,3)), ((three,3),_), Nil
*/
/*
How the recursion unfolds
List(1,2,3,4,5) matches clause 1: case Seq(head1,head2,_*) => s"($head1,$head2), " + windows(seq.tail)
seq.tail = List(2,3,4,5)   "(1,2), " + windows(List(2,3,4,5))
seq.tail = List(3,4,5)     "(1,2), (2,3), " + windows(List(3,4,5))
seq.tail = List(4,5)       "(1,2), (2,3), (3,4), " + windows(List(4,5))
seq.tail = List(5)         "(1,2), (2,3), (3,4), (4,5), " + windows(List(5))
List(5) matches clause 2; every call above matched clause 1
seq.tail = List()          "(1,2), (2,3), (3,4), (4,5), (5,_), " + windows(List())
List() matches the last clause and yields "Nil"
nonEmptyMap.toSeq
res0: Seq[(String, Int)] = ArrayBuffer((one,1), (two,2), (three,3))
The first two calls match clause 1
seq.tail = ArrayBuffer((two,2), (three,3))   "((one,1),(two,2)), " + windows(ArrayBuffer((two,2), (three,3)))
seq.tail = ArrayBuffer((three,3))            "((one,1),(two,2)), ((two,2),(three,3)), " + windows(ArrayBuffer((three,3)))
The third call matches clause 2
seq.tail = ArrayBuffer()                     "((one,1),(two,2)), ((two,2),(three,3)), ((three,3),_), " + windows(ArrayBuffer())
The empty sequence matches the last clause and yields "Nil"
*/
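In the match expression it looks as if Seq.apply() were being called, but Seq.unapplySeq is what is actually invoked. The pattern extracts the first two elements, and "_*" stands for zero or more remaining elements, similar to "*" in regular expressions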
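A user-defined object can offer the same kind of variable-length extraction by defining unapplySeq itself. The sketch below assumes a hypothetical extractor Words that splits a string on whitespace; neither Words nor firstTwo comes from the notes.

```scala
// Hypothetical extractor: exposes the whitespace-separated words of a string as a sequence.
object Words {
  def unapplySeq(s: String): Option[Seq[String]] =
    Some(s.trim.split("\\s+").toSeq)
}

def firstTwo(s: String): String = s match {
  case Words(first, second, _*) => s"first=$first, second=$second" // two or more words
  case Words(only)              => s"only=$only"                   // exactly one word
  case _                        => "no match"
}

println(firstTwo("hello brave new world"))
println(firstTwo("solo"))
```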
Using "+:" in the case clauses makes the code more elegant
val nonEmptyList = List(1,2,3,4,5)
val emptyList = Nil
val nonEmptyMap = Map("one" -> 1,"two" -> 2,"three" -> 3)
def windows2[T](seq: Seq[T]): String = seq match {
// case Seq(head1,head2,_*) => s"($head1,$head2), " + windows(seq.tail)
case head1 +: head2 +: tail => s"($head1,$head2), " + windows2(seq.tail)
//case Seq(head,_*) => s"($head,_), " + windows2(seq.tail)
case head +: tail => s"($head,_), " + windows2(seq.tail)
case Nil => "Nil"
}
for (seq <- Seq(nonEmptyList,emptyList,nonEmptyMap.toSeq)) {
println(windows2(seq))
}
/*
output:
scala match-seq-without-unapplySeq.sc
(1,2), (2,3), (3,4), (4,5), (5,_), Nil
Nil
((one,1),(two,2)), ((two,2),(three,3)), ((three,3),_), Nil
*/
Seq provides two methods for creating sliding windows
scala> val seq = Seq(1,2,3,4,5)
seq: Seq[Int] = List(1, 2, 3, 4, 5)
scala> val slide2 = seq.sliding(2)
slide2: Iterator[Seq[Int]] = non-empty iterator
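// slide2.toSeq is a lazy list (a Stream): only the head is built up front, and the tail elements are evaluated when they are needed.
// Because slide2 is an iterator, each call to toSeq consumes one more window; after enough calls nothing is left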
scala> slide2.toSeq
res4: Seq[Seq[Int]] = Stream(List(1, 2), ?)
scala> slide2.toSeq
res5: Seq[Seq[Int]] = Stream(List(2, 3), ?)
scala> slide2.toSeq
res6: Seq[Seq[Int]] = Stream(List(3, 4), ?)
scala> slide2.toSeq
res7: Seq[Seq[Int]] = Stream(List(4, 5), ?)
scala> slide2.toSeq
res8: Seq[Seq[Int]] = Stream()
// toList builds every element at once (shown here for a fresh seq.sliding(2) iterator), which can be too expensive for a large sequence
scala> slide2.toList
res10: List[Seq[Int]] = List(List(1, 2), List(2, 3), List(3, 4), List(4, 5))
scala> Seq(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).sliding(3,1).toList
res20: List[Seq[Int]] = List(List(1, 2, 3), List(2, 3, 4), List(3, 4, 5), List(4, 5, 6), List(5, 6, 7), List(6, 7, 8), List(7, 8, 9), List(8, 9, 10))
scala> Seq(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).sliding(3,2).toList
res18: List[Seq[Int]] = List(List(1, 2, 3), List(3, 4, 5), List(5, 6, 7), List(7, 8, 9), List(9, 10))
scala> Seq(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).sliding(3,3).toList
res19: List[Seq[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9), List(10))
// sliding(x,y): x is the number of elements in each window, y is the step by which the window advances
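sliding combines naturally with the Seq(...) patterns from this section. As a sketch (readings and diffs are hypothetical names), consecutive differences can be computed by matching each two-element window; collect silently skips any window that does not have exactly two elements.

```scala
// Hypothetical data, for illustration only.
val readings = Seq(1, 4, 9, 16)

// Each window of size 2 is deconstructed with a Seq pattern.
val diffs = readings.sliding(2).collect { case Seq(a, b) => b - a }.toList

// A one-element input yields a single short window, which the pattern does not match.
val none = Seq(42).sliding(2).collect { case Seq(a, b) => b - a }.toList

println(diffs)
println(none)
```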
4.7 Matching on Variable Argument Lists
// Define an enumeration for SQL comparison operators; each operator has a name, which is a string
object Op extends Enumeration {
type Op = Value
val EQ = Value("=")
// println("EQ: ",EQ)
val NE = Value("!=")
val LTGT = Value("<>")
val LT = Value("<")
val LE = Value("<=")
val GT = Value(">")
val GE = Value(">=")
}
import Op._
// case class representing WHERE x OP y
case class WhereOp[T](columnName: String,op: Op,value: T)
// case class representing WHERE x IN (val1,val2,...)
case class WhereIn[T](columnName: String,val1: T,vals: T*)
// Sample objects to parse
val Wheres = Seq(
WhereIn("state","IL","CA","VA"),
WhereOp("state",EQ,"IL"),
WhereOp("name",EQ,"Buck Trends"),
WhereOp("age",GT,"29")
)
// Syntax for matching a variable argument list: name @ _*
for (where <- Wheres) {
where match {
case WhereIn(col,val1,vals @ _*) =>
// val s1 = val1 +: vals = ArrayBuffer(IL, CA, VA)
// s1.mkString = IL, CA, VA
val valStr = (val1 +: vals).mkString(", ")
println(s"WHERE $col IN ($valStr)")
case WhereOp(col,op,value) => println(s"WHERE $col $op $value")
case _ => println(s"ERROR: Unknown expression: $where")
}
}
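4.8 Matching on Regular Expressions
/*
\s matches any single whitespace character (space, \t, \n, \r, \f, \v)
a+ matches one or more a, at least one; equivalent to a{1,}
[^,] matches any single character except a comma
a* matches zero or more a
*/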
//val BookExtractorRE = """Book: title=([^,]+),\s+author=(.+)""".r
val BookExtractorRE = """Book: title=([^,]+),\s*author=(.+)""".r
val MagazineExtractorRE = """Magazine: title=([^,]+),\s+issue=(.+)""".r
val catalog = Seq(
// "Book: title=Programming Scala Second Edition, author=Dean Wampler",
"Book: title=Programming Scala Second Edition,author=Dean Wampler",
"Magazine: title=The New Yorker, issue=January 2014",
"Unknown text=Who put this here??"
)
for (item <- catalog) {
item match {
case BookExtractorRE(title,author) =>
println(s"""Book "$title",written by $author""")
case MagazineExtractorRE(title,issue) =>
println(s"""Magazine "$title",issue $issue""")
case entry => println(s"Unrecognized entry: $entry")
}
}
/*
output:
[root@master scalar]# scala match-regex.sc
Book "Programming Scala Second Edition",written by Dean Wampler
Magazine "The New Yorker",issue January 2014
Unrecognized entry: Unknown text=Who put this here??
*/
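- Inside a triple-quoted string that uses variable interpolation, regex escape sequences do not pass through unchanged; the backslash still has to be escaped.
s"""$first\\s+$second""".r
s"""$first\s+$second""".r // this one is wrong
// The two lines differ only in whether \s is escaped
- If the regular expression is not written in triple quotes, \s must likewise be escaped as \\s
- scala.util.matching.Regex defines further methods for other regex operations, such as find and replace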
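A short sketch of those find and replace methods follows. DateRE and the sample text are hypothetical; findFirstIn, findAllMatchIn and replaceAllIn are methods of scala.util.matching.Regex.

```scala
// Hypothetical pattern and input, for illustration only.
val DateRE = """(\d{4})-(\d{2})-(\d{2})""".r
val text   = "released 2014-12-01, revised 2015-03-15"

val first  = DateRE.findFirstIn(text)                           // first matching substring, if any
val years  = DateRE.findAllMatchIn(text).map(_.group(1)).toList // first capture group of every match
val masked = DateRE.replaceAllIn(text, "<date>")                // every match replaced

println(first)
println(years)
println(masked)
```

Unlike a case clause with a Regex extractor, which has to match the whole string, these methods search for matches anywhere in the input.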
4.9 More on Binding Variables in case Clauses
case class Address(street: String,city: String,country: String)
case class Person(name: String,age: Int,address: Address)
val alice = Person("Alice",25,Address("1 Scala Lane", "Chicago", "USA"))
val bob = Person("Bob",29,Address("2 Java Ave.","Miami","USA"))
val charlie = Person("Charlie",32,Address("3 Python Ct.","Boston","USA"))
for (person <- Seq(alice,bob,charlie)) {
person match {
// p @ ... binds the whole matched object to the variable p
case p @ Person("Alice",25,address) => println(s"Hi Alice! $p")
case p @ Person("Bob",29,a @ Address(street,city,country)) =>
println(s"Hi ${p.name}! age ${p.age}, in ${a.city}")
case p @ Person(name,age,_) =>
println(s"Who are you,$age year-old person name $name? $p")
}
}
/*
output:
[root@master scalar]# scala match-deep2.sc
Hi Alice! Person(Alice,25,Address(1 Scala Lane,Chicago,USA))
Hi Bob! age 29, in Miami
Who are you,32 year-old person name Charlie? Person(Charlie,32,Address(3 Python Ct.,Boston,USA))
*/
4.10 More on Type Matching
for { x <- Seq(List(5.5,5.6,5.7),List("a","b"),List(1,2,3,4),List())
} yield (x match {
//case seqd: Seq[Double] => println("seq double",seqd)
//case seqs: Seq[String] => println("seq string",seqs)
case seq: Seq[_] => println(s"Seq ${doSeqMatch(seq)}",seq)
case _ => ("unknown!",x)
})
def doSeqMatch[T](seq: Seq[T]): String = seq match {
case Nil => "Nothing"
case head +: _ => head match {
case _ : Double => "Double"
case _ : String => "String"
case _ => "Unmatched seq element"
}
}
/*
output:
(Seq Double,List(5.5, 5.6, 5.7))
(Seq String,List(a, b))
(Seq Unmatched seq element,List(1, 2, 3, 4))
(Seq Nothing,List())
*/
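/*
The two commented-out clauses at the top of the match, repeated here, produce compiler warnings:
case seqd: Seq[Double] => println("seq double",seqd)
case seqs: Seq[String] => println("seq string",seqs)
The warnings come from type erasure on the JVM, a legacy of the introduction of generics in Java 5.
To stay compatible with older code, JVM bytecode does not retain the actual type arguments of a generic instance such as List.
At runtime the input can only be recognized as a List, not as a List[Double] or a List[String], so the compiler issues a warning.
In fact the compiler treats the second clause, for Seq[String], as unreachable code, because the first clause, for Seq[Double], matches any Seq.
Both inputs would therefore print "seq double".
*/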
4.11 Sealed Hierarchies and Exhaustive Matches
// A sealed abstract base class. Because it is declared sealed, its subtypes must be defined in this same file
sealed abstract class HttpMethod() {
def body: String
def bodyLength = body.length
}
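// Eight case classes extend HttpMethod, and each declares the constructor parameter body: String.
// Because each one is a case class, that parameter is a val, and it implements the abstract method def body of HttpMethod.
// When matching on an instance of a sealed base class, the match is exhaustive if the case clauses cover every subtype defined in this file. Since no user-defined subtypes can be added elsewhere, the match stays exhaustive as the project evolves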
case class Connect(body: String) extends HttpMethod
case class Delete (body: String) extends HttpMethod
case class Get (body: String) extends HttpMethod
case class Head (body: String) extends HttpMethod
case class Options(body: String) extends HttpMethod
case class Post (body: String) extends HttpMethod
case class Put (body: String) extends HttpMethod
case class Trace (body: String) extends HttpMethod
def handle (method: HttpMethod) = method match {
case Connect (body) => s"connect: (length: ${method.bodyLength}) $body"
case Delete (body) => s"delete:(length: ${method.bodyLength}) $body"
case Get (body) => s"get: (length: ${method.bodyLength}) $body"
case Head (body) => s"head: (length: ${method.bodyLength}) $body"
case Options (body) => s"options: (length: ${method.bodyLength}) $body"
case Post (body) => s"post: (length: ${method.bodyLength}) $body"
case Put (body) => s"put: (length: ${method.bodyLength}) $body"
case Trace (body) => s"trace: (length: ${method.bodyLength}) $body"
}
val methods = Seq(
Connect ("connect body..."),
Delete ("delete body..."),
Get ("get body..."),
Head ("head body..."),
Options ("option body..."),
Post ("post body..."),
Put ("put body..."),
Trace ("trace body..."))
methods foreach (method => println(handle(method)))
/*
output:
[root@master scalar]# scala http.sc
connect: (length: 15) connect body...
delete:(length: 14) delete body...
get: (length: 11) get body...
head: (length: 12) head body...
options: (length: 14) option body...
post: (length: 12) post body...
put: (length: 11) put body...
trace: (length: 13) trace body...
*/
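- The HttpMethod case classes are small, so in principle an Enumeration could replace them. That has one big drawback: the compiler cannot tell whether a match on an Enumeration is exhaustive.
- With an Enumeration, if the clause for Trace were forgotten in the match, the mistake would only surface at runtime, when a MatchError is thrown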
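One common way to keep enumeration-like values while retaining the exhaustiveness check is a sealed trait with case objects. This is only a sketch under that assumption; Cmp, its case objects and symbol are hypothetical names, not taken from the notes.

```scala
// Hypothetical sealed hierarchy standing in for an Enumeration.
sealed trait Cmp
case object Eq extends Cmp
case object Ne extends Cmp
case object Lt extends Cmp

// If one of the three clauses were removed, the compiler would warn
// that the match may not be exhaustive.
def symbol(op: Cmp): String = op match {
  case Eq => "="
  case Ne => "!="
  case Lt => "<"
}

for (op <- Seq(Eq, Ne, Lt)) println(symbol(op))
```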
4.12 Other Uses of Pattern Matching
// Pattern matching when defining variables
scala> case class Address(street: String,city: String,country: String)
defined class Address
scala> case class Person(name: String,age: Int,address: Address)
defined class Person
// A single step extracts the fields of Person that are wanted and skips the ones that are not
scala> val Person(name,age,Address(_,state,_)) =
| Person("Dean",29,Address("1 Scala Way","CA","USA"))
name: String = Dean
age: Int = 29
state: String = CA
// The same approach pulls elements out of a List, again skipping any that are not needed
scala> val head +: tail = List(1,2,3)
head: Int = 1
tail: List[Int] = List(2, 3)
scala> val head1 +: head2 +: tail = Vector(1,2,3)
head1: Int = 1
head2: Int = 2
tail: scala.collection.immutable.Vector[Int] = Vector(3)
scala> val head3 +: _ +: tail = List(4,5,6)
head3: Int = 4
tail: List[Int] = List(6)
scala> val Seq(a,b,c) = List(1,2,3)
a: Int = 1
b: Int = 2
c: Int = 3
// The number of variables must match the number of elements in the List
scala> val Seq(a,b,c) = List(1,2,3,4)
scala.MatchError: List(1, 2, 3, 4) (of class scala.collection.immutable.$colon$colon)
... 32 elided
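// Something similar works in an if expression, although this is an == comparison against a constructed value rather than a true pattern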
scala> val p = Person("Dean",29,Address("1 Scala Way","CA","USA"))
p: Person = Person(Dean,29,Address(1 Scala Way,CA,USA))
scala> if (p == Person("Dean",29,Address("1 Scala Way","CA","USA"))) "yes" else "no"
res0: String = yes
scala> if (p == Person("Dean",29,Address("2 Scala Way","CA","USA"))) "yes" else "no"
res1: String = no
// but the placeholder "_" cannot be used here
scala> if ( p== Person(_,29,Address(_,_,"USA"))) "yes" else "no"
<console>:17: error: missing parameter type for expanded function ((x$1) => p.$eq$eq(Person(x$1, 29, ((x$2, x$3) => Address(x$2, x$3, "USA")))))
if ( p== Person(_,29,Address(_,_,"USA"))) "yes" else "no"
^
<console>:17: error: missing parameter type for expanded function ((x$2, x$3) => Address(x$2, x$3, "USA"))
if ( p== Person(_,29,Address(_,_,"USA"))) "yes" else "no"
^
<console>:17: error: missing parameter type for expanded function ((x$2: <error>, x$3) => Address(x$2, x$3, "USA"))
if ( p== Person(_,29,Address(_,_,"USA"))) "yes" else "no"
scala> def sum_count(ints: Seq[Int]) = (ints.sum,ints.size)
sum_count: (ints: Seq[Int])(Int, Int)
scala> val (sum,count) = sum_count(List(1,2,3,4,5))
sum: Int = 15
count: Int = 5
Pattern matching in for loops
val dogBreeds = Seq(Some("Doberman"),
None,Some("Yorkshire Terrier"),
Some("Dachshund"),
None,Some("Scottish Terrier"),
None,Some("Great Dane"),Some("Portuguese Water Dog"))
println("second pass: ")
for {
Some(breed) <- dogBreeds
upcasedBreed = breed.toUpperCase()
} println(upcasedBreed)
/*
output:
[root@spark1 scala]# scala scoped-option-for.sc
second pass:
DOBERMAN
YORKSHIRE TERRIER
DACHSHUND
SCOTTISH TERRIER
GREAT DANE
PORTUGUESE WATER DOG
*/
Pattern matching in function literals
case class Person(name: String,age: Int)
val as = Seq(
Address("1 Scala Lance","Anytown","USA"),
Address("2 Clojure Lane","Othertown","USA"))
val ps = Seq(
Person("Buck Trends",29),
Person("Clo Jure",28))
val pas = ps zip as
// A less elegant approach
pas map { tup =>
val Person(name,age) = tup._1
val Address(street,city,country) = tup._2
println(s"name: $name (age: $age) lives at $street,$city,in $country")
}
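// A nicer approach: a partial-function literal is more concise, especially when extracting values from tuples and more complex structures. The catch is that the case expressions must match every input exactly, otherwise a MatchError is thrown at runtime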
pas map {
case (Person(name,age),Address(street,city,country)) =>
println(s"name: $name (age: $age) lives at $street,$city,in $country")
}
/*
output:
[root@spark1 scala]# scala match-fun-args.sc
name: Buck Trends (age: 29) lives at 1 Scala Lance,Anytown,in USA
name: Clo Jure (age: 28) lives at 2 Clojure Lane,Othertown,in USA
name: Buck Trends (age: 29) lives at 1 Scala Lance,Anytown,in USA
name: Clo Jure (age: 28) lives at 2 Clojure Lane,Othertown,in USA
*/
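The MatchError risk mentioned for partial-function literals can be checked or avoided. In the sketch below, halveEven is a hypothetical partial function defined only for even numbers; isDefinedAt tests an input first, and collect applies the function only where it is defined, whereas map fails on the first unmatched input.

```scala
import scala.util.Try

// Hypothetical partial function: defined for even numbers only.
val halveEven: PartialFunction[Int, Int] = { case n if n % 2 == 0 => n / 2 }

val definedAt3 = halveEven.isDefinedAt(3)              // check before applying
val collected  = Seq(1, 2, 3, 4).collect(halveEven)    // undefined inputs are skipped
val mapped     = Try(Seq(1, 2, 3, 4).map(halveEven))   // fails on the first odd number

println(definedAt3)
println(collected)
println(mapped.isFailure)
```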
Regular expressions can use pattern matching to deconstruct strings. The example here is a simple SQL parsing program
scala> val cols = """\*|[\w,]+"""
cols: String = \*|[\w,]+
scala> val table = """\w+"""
table: String = \w+
scala> val tail = """.*"""
tail: String = .*
// Because variable interpolation is used, \s has to be escaped; without interpolation, a regular expression in triple quotes needs no escaping
scala> val selectRE = s"""SELECT \\s*(DISTINCT)?\\s+($cols)\\s*FROM\\s+($table)\\s*($tail)?;""".r
selectRE: scala.util.matching.Regex = SELECT \s*(DISTINCT)?\s+(\*|[\w,]+)\s*FROM\s+(\w+)\s*(.*)?;
scala> val selectRE(distincts,cols1,table1,otherClauses) = "SELECT DISTINCT * FROM atable;"
distincts: String = DISTINCT
cols1: String = *
table1: String = atable
otherClauses: String = ""
scala> val selectRE(distinct3,cols3,table3,otherClauses) = "SELECT DISTINCT col1,col2 FROM atable;"
distinct3: String = DISTINCT
cols3: String = col1,col2
table3: String = atable
otherClauses: String = ""
scala> val selectRE(distinct4,cols4,table4,otherClauses) = "SELECT DISTINCT col1,col2 FROM atable WHERE col1 = 'foo';"
distinct4: String = DISTINCT
cols4: String = col1,col2
table4: String = atable
otherClauses: String = WHERE col1 = 'foo'
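A val definition with a regex pattern throws a MatchError when the string does not match. A sketch of a safer variant, assuming a hypothetical KeyValue pattern and parse function, puts the same extractor in a match with a fallback clause and returns an Option:

```scala
// Hypothetical pattern and function, for illustration only.
val KeyValue = """(\w+)=(\w+)""".r

def parse(s: String): Option[(String, String)] = s match {
  case KeyValue(k, v) => Some((k, v)) // the whole string must match the pattern
  case _              => None         // fallback instead of a MatchError
}

println(parse("state=IL"))
println(parse("not a pair"))
```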
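4.13 Concluding Remarks on Pattern Matching
- Pattern matching is a powerful "protocol" for extracting data from data structures.
- The JavaBeans model encouraged developers to expose object fields through getters and setters. That practice overlooks the point that state should be encapsulated and exposed only where appropriate, especially mutable state. Access to state, including what is exposed for pattern matching, should be designed carefully so that it reflects the abstraction being exposed
- When designing a match, treat the default case clause with caution. Under what circumstances would "none of the above" actually occur? A default clause may be a sign that the design should be refined, so that you know more precisely every match that can occur in the program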