Programming Scala, Chapter 4: Pattern Matching

Chapter 4 Contents: Pattern Matching

4.1 A Simple Match

val bools = Seq(true,false)

for (bool <- bools) {
    bool match {
        case true => println("Got heads",bool)
        case false => println("Got tails",bool)
    }
}
// Alternative: use an if expression instead of match
for (bool <- bools) {
    val which = if (bool) "heads" else "tails"
    println("Got " + which, bool)
}

/*
output:
[root@master scalar]# scala match-boolean.sc 
(Got heads,true)
(Got tails,false)
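If you comment out the clause case false => println("Got tails",bool), the compiler warns that the match is not exhaustive, and a scala.MatchError is thrown at runtime. bools holds two values, but the match now handles only true and not false. A match must cover every value that can occur in bools.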

scala> val bool = Seq(true,false)
bool: Seq[Boolean] = List(true, false)
*/
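
The MatchError described above can be reproduced in isolation. The following is a minimal sketch rather than part of the original script: the names MatchErrorDemo and describe are hypothetical, and @unchecked is used only to silence the exhaustiveness warning so that the runtime failure can be observed.

```scala
import scala.util.Try

object MatchErrorDemo {
  // Deliberately non-exhaustive: there is no clause for false.
  // @unchecked suppresses the compiler's exhaustiveness warning.
  def describe(b: Boolean): String = (b: @unchecked) match {
    case true => "Got heads"
  }

  val ok = Try(describe(true))      // the only clause matches
  val failed = Try(describe(false)) // no clause matches, so scala.MatchError is thrown
}
```

Wrapping the call in Try keeps the script running, so the failure can be inspected as a value instead of aborting the program.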

4.2 Values, Variables, and Types in Matches

for {
    x <- Seq(1,2,2.7,"one","two",'four)
} {
    val str = x match {
        case 1 => "int 1"
        case i: Int => "other int: " + i
        case d: Double => "a double: " + x
        case "one" => "string one"
        case s: String => "other string: " + s
        case unexpected => "unexpected value: " + unexpected
    }
    println(str)
}
/*
output:
int 1
other int: 2
a double: 2.7
string one
other string: two
unexpected value: 'four
*/

// A modified version of the code above
for {
    x <- Seq(1,2,2.7,"one","two",'four)
} {
    val str = x match {
        case 1 => "int 1"
        // case i: Int => "other int: " + i
        case s: String => "other string: " + s
        case d: Double => "a double: " + x
        // case "one" => "string one"
        case i: Int => "other int: " + i
        // case s: String => "other string: " + s
        case unexpected => "unexpected value: " + unexpected
    }
    println(str)
}
/*
output:
int 1
other int: 2
a double: 2.7
other string: one
other string: two
unexpected value: 'four
*/

/* 
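In the modified version, x takes the values 1, 2, 2.7, "one", "two", 'four in their original order. There are 6 elements, so 6 matches are performed, but 6 case clauses are not needed, because a single case clause can match several elements.
Match 1: element 1 is tested against the case clauses in order, and testing stops at the first clause that matches. case 1 => "int 1" matches, so case i: Int => "other int: " + i is never tried.
Match 2: element 2 matches case i: Int => "other int: " + i.
Match 3: element 2.7 matches case d: Double => "a double: " + x.
Matches 4 and 5: "one" and "two" both match case s: String => "other string: " + s. One case clause handles two elements, which shows that the number of case clauses need not equal the number of elements to be matched.
Match 6: 'four is a symbol literal and matches only the last clause, case unexpected => "unexpected value: " + unexpected. This is a catch-all clause: anything not matched by an earlier clause matches it.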
*/

// Another revision of the code above, using "_" in place of the variables i, s, d, and unexpected
// File: match-variable2.sc
for {
    x <- Seq(1,2,2.7,"one","two",'four)
} {
    val str = x match {
        case 1 => "int 1"
        case _: Int => "other int: " + x
        case _: Double => "a double: " + x
        case "one" => "string one"
        case _: String => "other string: " + x
        case _ => "unexpected value: " + x
    }
    println(str)
}
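  1. Note: except for partial functions, a match must cover every possible input. When the input type is Any, end with a default clause such as case _ or case some_name.
  2. Matching proceeds in order, so more specific clauses must come before more general ones. Otherwise the specific clauses can never match. For the same reason, the default clause must be the last clause.
  3. In a case clause, the compiler assumes that a term starting with an uppercase letter is a type name and that a term starting with a lowercase letter is a variable name. This is a common pitfall of case clauses.
  4. To refer to a previously defined variable in a case clause, enclose it in backticks.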
def checkY(y: Int) = {
    for { x <- Seq(99,100,101)
    } {
        val str = x match {

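            // case y => "found y!" would not do what it appears to: y becomes a new variable that matches every input (the compiler warns about unreachable code)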
            case `y` => "found y!"
            case i: Int => "int: " + i
        }
        println(str)
    }
}

checkY(100)

/*
output:
[root@master scalar]# scala match-surprise.sc 
int: 99
found y!
int: 101
*/

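Why case y => "found y!" goes wrong:
case y really means "match any input (there is no type annotation here) and bind it to a new variable named y". This y is not resolved to the method parameter y, so in effect a default, match-everything clause has been written as the first clause.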
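
The two ways of referring to an existing value inside a pattern can be sketched as follows. This is an illustrative example, not code from the original notes: StableIdDemo, Target, withBackticks, and withUppercase are hypothetical names.

```scala
object StableIdDemo {
  // An uppercase name is treated as an existing constant inside a pattern.
  val Target = 100

  // A lowercase name must be wrapped in backticks to refer to the parameter y.
  def withBackticks(y: Int, x: Int): String = x match {
    case `y` => "found y!"
    case i   => s"int: $i"
  }

  def withUppercase(x: Int): String = x match {
    case Target => "found Target!"
    case i      => s"int: $i"
  }
}
```

Both forms compare the input with the existing value using ==, instead of binding a fresh variable.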

for {
    x <- Seq(1,2,2.7,"one","two",'four)
} {
    val str = x match {
        case _: Int | _: Double => "a number: " + x
        case "one" => "string one"
        case _: String => "other string: " + x
        case _ => "unexpected value: " + x
    }
    println(str)
}

/*
output:
a number: 1
a number: 2
a number: 2.7
string one
other string: two
unexpected value: 'four
*/
// Case clauses support a logical "or" through "|", so both Int and Double values match the first clause

See also: the Scala Language Specification

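4.3 Matching on Sequences

  1. Seq is the parent type of the concrete collection types that support traversing their elements in a deterministic order, such as List and Vector.
  2. Seq behaves much like a linked list: each head node holds its own value and also points to the tail (the rest of the list), which creates a hierarchical structure like the four-node sequence below. tail acts like a pointer to the remainder of the list.

(node1,(node2,(node3,(node4,(end)))))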

val nonEmptySeq = Seq(1,2,3,4,5)
val emptySeq = Seq.empty[Int]
val nonEmptyList = List(1,2,3,4,5)
val emptyList = Nil
val nonEmptyVector = Vector(1,2,3,4,5)
val emptyVector = Vector.empty[Int]
val nonEmptyMap = Map("one" -> 1,"two" -> 2,"three" -> 3)
val emptyMap = Map.empty[String,Int]

def seqToString[T](seq: Seq[T]): String = seq match {
    // head and tail are two variable names
    case head +: tail => s"$head +: " + seqToString(tail)
    case Nil => "Nil"

}

for (seq <- Seq (
    nonEmptySeq,emptySeq,nonEmptyList,emptyList,nonEmptyVector,emptyVector,nonEmptyMap.toSeq,emptyMap.toSeq)) {
    println(seqToString(seq))
}

/*
output:
[root@master scalar]# scala match-seq.sc 
1 +: 2 +: 3 +: 4 +: 5 +: Nil
Nil
1 +: 2 +: 3 +: 4 +: 5 +: Nil
Nil
1 +: 2 +: 3 +: 4 +: 5 +: Nil
Nil
(one,1) +: (two,2) +: (three,3) +: Nil
Nil


scala> val l = List(1,2,3,4,5)
l: List[Int] = List(1, 2, 3, 4, 5)
head: the first element of the sequence
scala> l.head
res7: Int = 1
tail: all the elements of the sequence except the first
scala> l.tail
res8: List[Int] = List(2, 3, 4, 5)

Map.toSeq: note that a Map is created with (), not {}, which differs from Python
scala> val nonEmptyMap = Map("one" -> 1,"two" -> 2,"three" -> 3)
nonEmptyMap: scala.collection.immutable.Map[String,Int] = Map(one -> 1, two -> 2, three -> 3)

scala> nonEmptyMap.toSeq
res0: Seq[(String, Int)] = ArrayBuffer((one,1), (two,2), (three,3))

Tracing the recursion
seqToString(Seq(1,2,3,4,5)) matches case head +: tail with head = 1, tail = List(2,3,4,5)
    ---> "1 +: " + seqToString(List(2,3,4,5))

seqToString(List(2,3,4,5)) matches case head +: tail with head = 2, tail = List(3,4,5)
    ---> "1 +: 2 +: " + seqToString(List(3,4,5))

seqToString(List(3,4,5)) matches case head +: tail with head = 3, tail = List(4,5)
    ---> "1 +: 2 +: 3 +: " + seqToString(List(4,5))

seqToString(List(4,5)) matches case head +: tail with head = 4, tail = List(5)
    ---> "1 +: 2 +: 3 +: 4 +: " + seqToString(List(5))

seqToString(List(5)) matches case head +: tail with head = 5, tail = List()
    ---> "1 +: 2 +: 3 +: 4 +: 5 +: " + seqToString(List())

seqToString(List()) matches case Nil => "Nil". Once the tail is empty the second case applies, and every element of the Seq has been processed
    ---> "1 +: 2 +: 3 +: 4 +: 5 +: Nil"

*/

Modifying the code so that the output shows the nested structure

val nonEmptySeq = Seq(1,2,3,4,5)
val emptySeq = Seq.empty[Int]
val nonEmptyMap = Map("one" -> 1,"two" -> 2,"three" -> 3)

def seqToString2[T](seq: Seq[T]): String = seq match {
    case head +: tail => s"($head +: ${seqToString2(tail)})"
    case Nil => "(Nil)"
}

for (seq <- Seq(nonEmptySeq,emptySeq,nonEmptyMap.toSeq)) {
    println(seqToString2(seq))
}

/*
output:
[root@spark1 scala]# scala match-seq-paren.sc 
(1 +: (2 +: (3 +: (4 +: (5 +: (Nil))))))
(Nil)
((one,1) +: ((two,2) +: ((three,3) +: (Nil))))
*/

// Another way to process a List, the one used before Scala 2.10. It still works in Scala 2.11.8.

val nonEmptyList = List(1,2,3,4,5)
val emptyList = Nil

def listToString[T](list: List[T]): String = list match {
    case head :: tail => s"($head :: ${listToString(tail)})"
    case Nil => "(Nil)"
}


for (l <- List(nonEmptyList,emptyList)) { println(listToString(l)) }

// Here +: has been replaced with ::
/*
output:
(1 :: (2 :: (3 :: (4 :: (5 :: (Nil))))))
(Nil)
*/
// The printed output can be used to reconstruct a collection
scala> val s1 = (1 :: (2 :: (3 :: (4 :: (5 :: (Nil))))))
s1: List[Int] = List(1, 2, 3, 4, 5)

scala> val l = (1 :: (2 :: (3 :: (4 :: (5 :: (Nil))))))
l: List[Int] = List(1, 2, 3, 4, 5)

scala> val s2 = (("one",1) +: (("two",2) +: (("three",3) +: Nil))) 
s2: List[(String, Int)] = List((one,1), (two,2), (three,3))

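// The Map.apply factory method takes a variable argument list whose arguments are two-element tuples.
// So to build a Map from the sequence s2, we have to use the :_* idiom to turn s2 into a variable argument list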
scala> val m = Map(s2 :_*)
m: scala.collection.immutable.Map[String,Int] = Map(one -> 1, two -> 2, three -> 3)
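
The :_* idiom is not specific to Map.apply. As a small sketch under assumed names (VarargsDemo and total are hypothetical), any method declared with a repeated parameter accepts a sequence in the same way:

```scala
object VarargsDemo {
  // A method with a variable argument list (repeated parameter).
  def total(xs: Int*): Int = xs.sum

  val nums = Seq(1, 2, 3)

  // total(nums) would not compile: a Seq[Int] is not an Int.
  // nums: _* expands the sequence into individual arguments.
  val viaSplat = total(nums: _*)

  // The same idiom rebuilds a Map from a sequence of pairs.
  val pairs = Seq("one" -> 1, "two" -> 2)
  val m = Map(pairs: _*)
}
```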

4.4 Matching on Tuples

val langs = Seq(
    ("Scala","Martin","Odersky"),
    ("Clojure","Rich","Hickey"),
    ("Lisp","John","McCarthy"))

for (tuple <- langs) {
    tuple match {
        // Ignore the values we don't need
        case ("Scala",_,_) => println("Found Scala")
        // lang, first, and last are just three variable names; any identifiers would do.
        // Here lang = "Clojure" or "Lisp", first = "Rich" or "John", last = "Hickey" or "McCarthy"
        case (lang,first,last) =>
            println(s"Found other language : $lang ($first,$last)")
    }
}

/*
output:
[root@master scalar]# scala match-tuple.sc 
Found Scala
Found other language : Clojure (Rich,Hickey)
Found other language : Lisp (John,McCarthy)
*/

4.5 Guards in case Clauses

for ( i <- Seq(1,2,3,4)) {
    i match {
        // The expression after if does not need to be wrapped in parentheses
        case _ if i % 2 == 0 => println(s"even: $i")
        case _ => println(s"odd: $i")
    }
}
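
A guard can also test a variable bound by the pattern itself. The following is a minimal sketch with hypothetical names (GuardDemo, classify), showing that guarded clauses are still tried in order:

```scala
object GuardDemo {
  def classify(n: Int): String = n match {
    case 0               => "zero"
    case x if x < 0      => s"negative: $x" // tried before the parity checks
    case x if x % 2 == 0 => s"even: $x"
    case x               => s"odd: $x"      // default clause, no guard
  }
}
```

Because the clause for negative numbers comes first, classify(-4) reports "negative: -4" even though -4 is also even.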

4.6 Matching on case Classes

case class Address(street: String,city: String,country: String)
case class Person(name: String,age: Int,address: Address)

val alice = Person("Alice",25,Address("1 Scala Lane","Chicago","USA"))
val bob = Person("Bob",29,Address("2 Java Ave","Miami","USA"))
val charlie = Person("Charlie",32,Address("3 Python Ct","Boston","USA"))

for (person <- Seq(alice,bob,charlie)) {
    person match {
        case Person("Alice",25,Address(_,"Chicago",_)) => println("Hi Alice!")
        case Person("Bob",29,Address("2 Java Ave","Miami","USA")) => println("Hi Bob")
        case Person(name,age,_) => println(s"Who are you,$age year-old person named $name?")
    }

}

/*
output:
[root@master scalar]# vim match-deep.sc
[root@master scalar]# scala match-deep.sc 
Hi Alice!
Hi Bob
Who are you,32 year-old person named Charlie?
*/
val itemsCosts = Seq(("Pencil",0.52),("Paper",1.35),("Notebook",2.43))
// zipWithIndex returns tuples of the form ((name,cost),index)
val itemsCostsIndices = itemsCosts.zipWithIndex
for (itemCostIndex <- itemsCostsIndices) {
    itemCostIndex match {
        case ((item,cost),index) => println(s"$index: $item costs $cost each")
    }
}

/*
output:
[root@master scalar]# scala match-deep-tuple.sc 
0: Pencil costs 0.52 each
1: Paper costs 1.35 each
2: Notebook costs 2.43 each

In the interactive REPL, :load file shows the types of the variables as well as the program output
scala> :load /data/project/scalar/match-deep-tuple.sc
Loading /data/project/scalar/match-deep-tuple.sc...
itemsCosts: Seq[(String, Double)] = List((Pencil,0.52), (Paper,1.35), (Notebook,2.43))
itemsCostsIndices: Seq[((String, Double), Int)] = List(((Pencil,0.52),0), ((Paper,1.35),1), ((Notebook,2.43),2))
0: Pencil costs 0.52 each
1: Paper costs 1.35 each
2: Notebook costs 2.43 each

*/
val days = Array("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday")
scala> for (day <- days.zipWithIndex) { day match { case (day,index) => println(s"$index is: $day")}}
0 is: Sunday
1 is: Monday
2 is: Tuesday
3 is: Wednesday
4 is: Thursday
5 is: Friday
6 is: Saturday

// Another, more concise way to write it
scala> days.zipWithIndex.foreach { case (day,index) => println(s"$index is: $day")}
0 is: Sunday
1 is: Monday
2 is: Tuesday
3 is: Wednesday
4 is: Thursday
5 is: Friday
6 is: Saturday

// Expanding the statement above step by step
scala> days.zipWithIndex.foreach(println(_))
(Sunday,0)
(Monday,1)
(Tuesday,2)
(Wednesday,3)
(Thursday,4)
(Friday,5)
(Saturday,6)

scala> days.zipWithIndex.foreach(x=> println(x))
(Sunday,0)
(Monday,1)
(Tuesday,2)
(Wednesday,3)
(Thursday,4)
(Friday,5)
(Saturday,6)

// Note: without the case keyword, the compiler reports a missing parameter type for x and y
scala> days.zipWithIndex.foreach{case (x,y)=> println(x,y)}
(Sunday,0)
(Monday,1)
(Tuesday,2)
(Wednesday,3)
(Thursday,4)
(Friday,5)
(Saturday,6)

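4.6.1 The unapply Method

  • A case class has a companion object with several automatically generated methods. One is the factory method apply, used for construction. Its counterpart, unapply, is used for extraction and "deconstruction".
// A pattern like this invokes unapply
person match {
    case Person("Alice",25,Address(_,"Chicago",_)) => ...
}
  • An unapply method is expected to return Option[TupleN[…]], where N is the number of values that can be extracted from the object. For the Person case class, N is 3. The types of the extracted values match the corresponding tuple elements; for Person they are String, Int, and Address. So the companion object that the compiler generates for Person looks like this: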
object Person {
    def apply(name: String,age: Int, address: Address) = 
        new Person(name,age,address)

    def unapply(p: Person): Option[Tuple3[String,Int,Address]] = 
        Some((p.name,p.age,p.address))
}

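One question arises: since the compiler already knows the object is a Person, why is the return value an Option? Because Scala allows an unapply method to veto the match by returning None, in which case Scala tries the next case clause. A hand-written unapply can also hide attributes that should not be exposed, such as age. For extracting a variable number of values, see unapplySeq.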
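
Both points (vetoing a match and hiding an attribute) can be sketched with a hand-written extractor. The names ExtractorDemo, Adult, and greet are hypothetical, and this Person is a simplified two-field version defined only for the example:

```scala
object ExtractorDemo {
  case class Person(name: String, age: Int)

  // Hypothetical extractor: matches only adults and exposes only the name.
  object Adult {
    def unapply(p: Person): Option[String] =
      if (p.age >= 18) Some(p.name) else None // None vetoes the match
  }

  def greet(p: Person): String = p match {
    case Adult(name)     => s"Hello, $name"
    case Person(name, _) => s"Hi, $name (minor)" // tried when Adult returns None
  }
}
```

When Adult.unapply returns None, matching simply continues with the next clause, which is the behavior described above.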

  • To gain a performance advantage, Scala 2.11.1 relaxed the requirement that unapply return Option[T]. Now unapply can return any type, as long as that type has the following methods:
def isEmpty: Boolean
def get: T
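  • When necessary, unapply is invoked recursively. In this example Person contains a nested Address object. Likewise, unapply was invoked recursively in the tuple example.
  • The case keyword is used both to declare a "special" kind of class and for the case expressions in a match expression. This is no coincidence: the features of case classes were designed to make pattern matching convenient.
  • The return type Option[Tuple3[String,Int,Address]] is verbose, so Scala lets us use the tuple literal syntax for such types:
val t1: Option[Tuple3[String,Int,Address]] = ...
val t2: Option[ (String,Int,Address) ] = ...
val t3: Option[ (String, Int, Address) ] = ...
  • The +: operator can construct a sequence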
scala> val list1 = 1 +: 2 +: 3 +: 4 +: Nil
list1: List[Int] = List(1, 2, 3, 4)
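// +: is right-associative, so 4 is first prepended to Nil, then 3 is prepended to the resulting sequence, and so on
  • Scala tries to support a uniform syntax for construction and destructuring/extraction wherever possible; the two operations come in pairs and are inverses of each other
  • There is also a singleton object named +:. As with method names, type names in Scala can use a wide range of characters, which is what allows +: to be used in patterns to deconstruct any nonempty collection
  • The +: object has only one method: unapply
def unapply[T, Coll](collection: Coll): Option[(T, Coll)]
// The head is inferred to have type T and the tail type Coll, some collection type (such as List or Vector). Coll is also the type of the input collection, so the method returns an Option containing a two-element tuple of the head and tail of the input collection
  • For case head +: tail => …, the compiler calls +:.unapply(collection). It can also be written as case +:(head,tail) => …. The form case head +: tail uses infix notation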
scala> def processSeq2[T](l: Seq[T]): Unit = l match {
     | case +:(head,tail) =>
     | printf("%s +: ",head)
     | processSeq2(tail)
     | case Nil => println("Nil")
     | }

scala> processSeq2(List(1,2,3,4,5))
1 +: 2 +: 3 +: 4 +: 5 +: Nil
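  • An example of infix notation in a case clause:
// With is a two-parameter case class (its definition is added here so that the example is complete)
case class With[A,B](a: A, b: B)
// With[String,Int] and String With Int are two ways to write the same type signature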
val With1: With[String,Int] = With("Foo",1)
val With2: String With Int = With("Bar",2)
Seq(With1,With2) foreach { 
    w => w match {
        // Infix notation
        case s With i => println(s"$s with $i")
        // This can also be written as: case With(s,i) => println(s"$s with $i"), which arguably reads better
        case _ => println(s"Unknown: $w")
    }    
}

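// Note: this form is an error. Infix notation works for the type and for the pattern, but not for constructing a value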
scala> val w = "one" With 2
<console>:11: error: value With is not a member of String
       val w = "one" With 2
  • Traversing a List in reverse order
val nonEmptyList = List(1,2,3,4,5)
val nonEmptyVector = Vector(1,2,3,4,5)
val nonEmptyMap = Map("one" -> 1,"two" -> 2,"three" -> 3)

def reverseSeqToString[T](l: Seq[T]): String = l match {
    // Note: this is :+ here, where earlier it was +:
    case prefix :+ end => reverseSeqToString(prefix) + s" :+ $end"
    case Nil => "Nil"
}

for (seq <- Seq(nonEmptyList,nonEmptyVector,nonEmptyMap.toSeq)) {
    println(reverseSeqToString(seq))
}

/*
output:
[root@master scalar]# scala match-reverse-seq.sc 
Nil :+ 1 :+ 2 :+ 3 :+ 4 :+ 5
Nil :+ 1 :+ 2 :+ 3 :+ 4 :+ 5
Nil :+ (one,1) :+ (two,2) :+ (three,3)

*/

Again, the output can be used to reconstruct a collection

scala> Nil :+ 1 :+ 2 :+ 3 :+ 4 :+ 5
res37: List[Int] = List(1, 2, 3, 4, 5)

scala> Nil :+ ("one",1) :+ ("two",2) :+ ("three",3)
res39: List[(String, Int)] = List((one,1), (two,2), (three,3))
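  • For List, both the :+ method that appends an element and the :+ used in pattern matching take O(n) time, because both must traverse the list from its head. For some other sequences, such as Vector, the cost is effectively O(1).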

4.6.2 The unapplySeq Method

val nonEmptyList = List(1,2,3,4,5)
val emptyList = Nil
val nonEmptyMap = Map("one" -> 1,"two" -> 2,"three" -> 3)

def windows[T](seq: Seq[T]): String = seq match {
    case Seq(head1,head2,_*) => s"($head1,$head2), " + windows(seq.tail)
    case Seq(head,_*) => s"($head,_), " + windows(seq.tail)
    case Nil => "Nil"
}

for (seq <- Seq(nonEmptyList,emptyList,nonEmptyMap.toSeq)) {
    println(windows(seq))
}
/*
output:
[root@spark1 scala]# scala match-seq-unapplySeq.sc 
(1,2), (2,3), (3,4), (4,5), (5,_), Nil
Nil
((one,1),(two,2)), ((two,2),(three,3)), ((three,3),_), Nil
*/
/*
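Tracing the recursion
windows(List(1,2,3,4,5)) matches clause 1, case Seq(head1,head2,_*), with head1 = 1, head2 = 2
    ---> "(1,2), " + windows(List(2,3,4,5))
windows(List(2,3,4,5)) matches clause 1
    ---> "(1,2), (2,3), " + windows(List(3,4,5))
windows(List(3,4,5)) matches clause 1
    ---> "(1,2), (2,3), (3,4), " + windows(List(4,5))
windows(List(4,5)) matches clause 1
    ---> "(1,2), (2,3), (3,4), (4,5), " + windows(List(5))
windows(List(5)) has only one element, so clause 2, case Seq(head,_*), matches
    ---> "(1,2), (2,3), (3,4), (4,5), (5,_), " + windows(List())
windows(List()) matches the last clause, case Nil
    ---> "(1,2), (2,3), (3,4), (4,5), (5,_), Nil"

nonEmptyMap.toSeq
res0: Seq[(String, Int)] = ArrayBuffer((one,1), (two,2), (three,3))
windows(ArrayBuffer((one,1), (two,2), (three,3))) matches clause 1
    ---> "((one,1),(two,2)), " + windows(ArrayBuffer((two,2), (three,3)))
windows(ArrayBuffer((two,2), (three,3))) matches clause 1
    ---> "((one,1),(two,2)), ((two,2),(three,3)), " + windows(ArrayBuffer((three,3)))
windows(ArrayBuffer((three,3))) matches clause 2
    ---> "((one,1),(two,2)), ((two,2),(three,3)), ((three,3),_), " + windows(ArrayBuffer())
windows(ArrayBuffer()) matches the last clause, case Nil
    ---> "((one,1),(two,2)), ((two,2),(three,3)), ((three,3),_), Nil"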
*/

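In the match expression it looks as if Seq.apply() is being called, but what is actually called is Seq.unapplySeq. The pattern extracts the first two elements, and "_*" stands for the rest: one element, many elements, or none at all, much like "*" in a regular expression.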
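
unapplySeq can also be written by hand for types other than Seq. The sketch below is illustrative only: UnapplySeqDemo, Words, and firstTwo are hypothetical names, and the extractor splits a string on whitespace so that its words can be matched with the same "_*" syntax.

```scala
object UnapplySeqDemo {
  // Hypothetical extractor: exposes the whitespace-separated words of a string.
  object Words {
    def unapplySeq(s: String): Option[Seq[String]] = {
      val trimmed = s.trim
      if (trimmed.isEmpty) None else Some(trimmed.split("\\s+").toSeq)
    }
  }

  def firstTwo(s: String): String = s match {
    case Words(a, b, _*) => s"$a and $b" // two or more words
    case Words(a)        => s"only $a"   // exactly one word
    case _               => "no words"   // unapplySeq returned None
  }
}
```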

Using +: in the case clauses makes the code more elegant

val nonEmptyList = List(1,2,3,4,5)
val emptyList = Nil
val nonEmptyMap = Map("one" -> 1,"two" -> 2,"three" -> 3)

def windows2[T](seq: Seq[T]): String = seq match {
    // case Seq(head1,head2,_*) => s"($head1,$head2), " + windows(seq.tail)
    case head1 +: head2 +: tail => s"($head1,$head2), " + windows2(seq.tail)
    //case Seq(head,_*) => s"($head,_), " + windows2(seq.tail)
    case head +: tail => s"($head,_), " + windows2(seq.tail)
    case Nil => "Nil"
}

for (seq <- Seq(nonEmptyList,emptyList,nonEmptyMap.toSeq)) {
    println(windows2(seq))
}
/*
output:
scala match-seq-without-unapplySeq.sc
(1,2), (2,3), (3,4), (4,5), (5,_), Nil
Nil
((one,1),(two,2)), ((two,2),(three,3)), ((three,3),_), Nil
*/

Seq provides two sliding methods for creating sliding windows

scala> val seq = Seq(1,2,3,4,5)
seq: Seq[Int] = List(1, 2, 3, 4, 5)

scala> val slide2 = seq.sliding(2)
slide2: Iterator[Seq[Int]] = non-empty iterator
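// slide2.toSeq is a lazy list (a Stream): its head is created immediately, and the remaining elements are evaluated only when they are needed.
// slide2 is an iterator, so each call below consumes another element and the values produced earlier are gone. After enough calls, nothing is left.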
scala> slide2.toSeq
res4: Seq[Seq[Int]] = Stream(List(1, 2), ?)

scala> slide2.toSeq
res5: Seq[Seq[Int]] = Stream(List(2, 3), ?)

scala> slide2.toSeq
res6: Seq[Seq[Int]] = Stream(List(3, 4), ?)

scala> slide2.toSeq
res7: Seq[Seq[Int]] = Stream(List(4, 5), ?)
scala> slide2.toSeq
res8: Seq[Seq[Int]] = Stream()

// toList creates all the elements of the list at once, which can be too expensive for a large sequence
scala> slide2.toList
res10: List[Seq[Int]] = List(List(1, 2), List(2, 3), List(3, 4), List(4, 5))


scala> Seq(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).sliding(3,1).toList
res20: List[Seq[Int]] = List(List(1, 2, 3), List(2, 3, 4), List(3, 4, 5), List(4, 5, 6), List(5, 6, 7), List(6, 7, 8), List(7, 8, 9), List(8, 9, 10))
scala> Seq(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).sliding(3,2).toList
res18: List[Seq[Int]] = List(List(1, 2, 3), List(3, 4, 5), List(5, 6, 7), List(7, 8, 9), List(9, 10))

scala> Seq(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).sliding(3,3).toList
res19: List[Seq[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9), List(10))

// sliding(x,y): x is the number of elements in each window (the window size); y is the step, that is, how far the window advances each time
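
sliding combines naturally with the sequence patterns from this section. As a minimal sketch with hypothetical names (SlidingDemo, readings, deltas, chunks), each two-element window is destructured with Seq(a, b) to compute the difference between neighbors:

```scala
object SlidingDemo {
  val readings = Seq(3, 5, 9, 10)

  // sliding(2) yields the windows (3,5), (5,9), (9,10);
  // the pattern Seq(a, b) destructures each two-element window.
  val deltas = readings.sliding(2).collect { case Seq(a, b) => b - a }.toList

  // Window size 3 with step 3: the last window may hold fewer than 3 elements.
  val chunks = Seq(1, 2, 3, 4, 5, 6, 7).sliding(3, 3).toList
}
```

Here deltas is List(2, 4, 1), and chunks ends with the short window List(7).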

4.7 Matching on Variable Argument Lists

// An enumeration for the SQL comparison operators; each operator has a name, which is a string
object Op extends Enumeration {
    type Op = Value

    val EQ = Value("=")
    // println("EQ: ",EQ)
    val NE = Value("!=")
    val LTGT = Value("<>")
    val LT = Value("<")
    val LE = Value("<=")
    val GT = Value(">")
    val GE = Value(">=")
}

import Op._

// A case class representing WHERE x OP y
case class WhereOp[T](columnName: String,op: Op,value: T)

// A case class representing WHERE x IN (val1,val2,...)
case class WhereIn[T](columnName: String,val1: T,vals: T*)

// Sample objects to parse
val Wheres = Seq(
    WhereIn("state","IL","CA","VA"),
    WhereOp("state",EQ,"IL"),
    WhereOp("name",EQ,"Buck Trends"),
    WhereOp("age",GT,"29")
)

// Syntax for matching variable arguments: name @ _*
for (where <- Wheres) {
    where match {
        case WhereIn(col,val1,vals @ _*) =>
            // val1 +: vals puts the first value back in front of the rest: (IL, CA, VA)
            // mkString(", ") then produces "IL, CA, VA"
            val valStr = (val1 +: vals).mkString(", ")
            println(s"WHERE $col IN ($valStr)")
        case WhereOp(col,op,value) => println(s"WHERE $col $op $value")
        case _ => println(s"ERROR: Unknown expression: $where")
    }
}

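4.8 Matching on Regular Expressions

/* 
\s matches a single whitespace character (space, \t, \n, \r, \f, \v)
a+ matches one or more a's (at least one), equivalent to a{1,}
[^,] matches any single character other than a comma
a* matches zero or more a's
*/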
//val BookExtractorRE = """Book: title=([^,]+),\s+author=(.+)""".r
val BookExtractorRE = """Book: title=([^,]+),\s*author=(.+)""".r
val MagazineExtractorRE = """Magazine: title=([^,]+),\s+issue=(.+)""".r

val catalog = Seq(
    // "Book: title=Programming Scala Second Edition, author=Dean Wampler",
    "Book: title=Programming Scala Second Edition,author=Dean Wampler",
    "Magazine: title=The New Yorker, issue=January 2014",
    "Unknown text=Who put this here??"
)

for (item <- catalog) {
    item match {
        case BookExtractorRE(title,author) =>
            println(s"""Book "$title",written by $author""")
        case MagazineExtractorRE(title,issue) =>
            println(s"""Magazine "$title",issue $issue""")
        case entry => println(s"Unrecognized entry: $entry")
    }
}
/*
output:
[root@master scalar]# scala match-regex.sc 
Book "Programming Scala Second Edition",written by Dean Wampler
Magazine "The New Yorker",issue January 2014
Unrecognized entry: Unknown text=Who put this here??
*/
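  • String interpolation does not mix cleanly with regex escape sequences inside triple quotes: in an interpolated string the escape sequences must still be escaped.
s"""$first\\s+$second""".r

s"""$first\s+$second""".r // this one is wrong
// The only difference between the two is whether \s is escaped
  • If the regex is not written in triple quotes, \s must be escaped as \\s
  • scala.util.matching.Regex defines several other methods for working with regular expressions, such as find and replace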
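
A few of those find and replace methods, sketched on made-up input. RegexOpsDemo, numRE, and text are hypothetical names; findFirstIn, findAllIn, and replaceAllIn are methods of scala.util.matching.Regex.

```scala
object RegexOpsDemo {
  // Triple quotes without interpolation: \d needs no extra escaping.
  val numRE = """\d+""".r
  val text = "order 66 shipped 3 items"

  val first = numRE.findFirstIn(text)        // first match, wrapped in an Option
  val all = numRE.findAllIn(text).toList     // every match, in order
  val masked = numRE.replaceAllIn(text, "#") // each match replaced by "#"
}
```

findFirstIn returns an Option, so the result can itself be pattern matched with Some(...) and None.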

4.9 More on Variable Binding in case Clauses

case class Address(street: String,city: String,country: String)

case class Person(name: String,age: Int,address: Address)

val alice = Person("Alice",25,Address("1 Scala Lane", "Chicago", "USA"))
val bob = Person("Bob",29,Address("2 Java Ave.","Miami","USA"))
val charlie = Person("Charlie",32,Address("3 Python Ct.","Boston","USA"))

for (person <- Seq(alice,bob,charlie)) {
    person match {
        // Use @ to bind the whole matched value to a variable
        case p @ Person("Alice",25,address) => println(s"Hi Alice! $p")
        case p @ Person("Bob",29,a @ Address(street,city,country)) =>
            println(s"Hi ${p.name}! age ${p.age}, in ${a.city}")
        case p @ Person(name,age,_) =>
            println(s"Who are you,$age year-old person name $name? $p")
    }
}
/*
output:
[root@master scalar]# scala match-deep2.sc 
Hi Alice! Person(Alice,25,Address(1 Scala Lane,Chicago,USA))
Hi Bob! age 29, in Miami
Who are you,32 year-old person name Charlie? Person(Charlie,32,Address(3 Python Ct.,Boston,USA))
*/

4.10 More on Type Matching

for { x <- Seq(List(5.5,5.6,5.7),List("a","b"),List(1,2,3,4),List())
} yield (x match {
    //case seqd: Seq[Double] => println("seq double",seqd)
    //case seqs: Seq[String] => println("seq string",seqs)
    case seq: Seq[_] => println(s"Seq ${doSeqMatch(seq)}",seq)
    case _ => ("unknown!",x)
})

def doSeqMatch[T](seq: Seq[T]): String = seq match {
    case Nil => "Nothing"
    case head +: _ => head match {
        case _ : Double => "Double"
        case _ : String => "String"
        case _ => "Unmatched seq element"
    }
}

/*
output:
(Seq Double,List(5.5, 5.6, 5.7))
(Seq String,List(a, b))
(Seq Unmatched seq element,List(1, 2, 3, 4))
(Seq Nothing,List())

*/
/* 
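The two clauses below produce warnings. The warnings stem from JVM type erasure, a historical legacy of the introduction of generics in Java 5.
To avoid breaking compatibility with older code, JVM bytecode does not retain the actual type arguments of a generic instance (such as List).
At runtime an input can only be recognized as a Seq, not as Seq[Double] or Seq[String], so the compiler issues a warning.
In fact, the compiler considers the second clause, the one for Seq[String], to be unreachable code, which means the first clause, for Seq[Double], would match any Seq.
For both inputs, "seq double" is printed.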
*/
case seqd: Seq[Double] => println("seq double",seqd)
case seqs: Seq[String] => println("seq string",seqs)

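4.11 Sealed Hierarchies and Exhaustive Matches

// A sealed abstract base class. Because the class is declared sealed, its subtypes must be defined in this same file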
sealed abstract class HttpMethod() {
    def body: String
    def bodyLength = body.length
}

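// These 8 case classes extend HttpMethod, and each declares the parameter body: String in its constructor.
// Because each is a case class, that parameter is a val, and it implements the abstract method def body of HttpMethod.
// When pattern matching on instances of a sealed base class, the match is exhaustive if the case clauses cover all the types defined in this file. Since no user-defined subtypes are allowed elsewhere, the match stays exhaustive as the project evolves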
case class Connect(body: String) extends HttpMethod
case class Delete (body: String) extends HttpMethod
case class Get (body: String) extends HttpMethod
case class Head (body: String) extends HttpMethod
case class Options(body: String) extends HttpMethod
case class Post (body: String) extends HttpMethod
case class Put (body: String) extends HttpMethod
case class Trace (body: String) extends HttpMethod

def handle (method: HttpMethod) = method match {
    case Connect (body) => s"connect: (length: ${method.bodyLength}) $body"
    case Delete (body) => s"delete:(length: ${method.bodyLength}) $body"
    case Get (body) => s"get: (length: ${method.bodyLength}) $body"
    case Head (body) => s"head: (length: ${method.bodyLength}) $body"
    case Options (body) => s"options: (length: ${method.bodyLength}) $body"
    case Post (body) => s"post: (length: ${method.bodyLength}) $body"
    case Put (body) => s"put: (length: ${method.bodyLength}) $body"
    case Trace (body) => s"trace: (length: ${method.bodyLength}) $body"
}

val methods = Seq(
    Connect ("connect body..."),
    Delete ("delete body..."),
    Get ("get body..."),
    Head ("head body..."),
    Options ("option body..."),
    Post ("post body..."),
    Put ("put body..."),
    Trace ("trace body..."))

methods foreach (method => println(handle(method)))
/*
output:
[root@master scalar]# scala http.sc
connect: (length: 15) connect body...
delete:(length: 14) delete body...
get: (length: 11) get body...
head: (length: 12) head body...
options: (length: 14) option body...
post: (length: 12) post body...
put: (length: 11) put body...
trace: (length: 13) trace body...
*/
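  • The HttpMethod case classes are small, so in principle an Enumeration could replace them. That has a serious drawback, though: the compiler cannot tell whether a match on an Enumeration is exhaustive.
  • If we used an Enumeration and forgot the clause for Trace in the match, we would learn of the mistake only at runtime, when a MatchError is thrown.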
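
When the cases carry no data, a sealed trait with case objects is a common way to keep the exhaustiveness check while staying close to the size of an Enumeration. The following is only a sketch with hypothetical names (SealedDemo, Color, hex), not code from the original notes:

```scala
object SealedDemo {
  // All subtypes of a sealed trait must be declared in this file,
  // so the compiler knows the complete set of cases.
  sealed trait Color
  case object Red extends Color
  case object Green extends Color
  case object Blue extends Color

  // Removing any clause here makes the compiler warn that the match
  // may not be exhaustive, instead of failing only at runtime.
  def hex(c: Color): String = c match {
    case Red   => "#ff0000"
    case Green => "#00ff00"
    case Blue  => "#0000ff"
  }
}
```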

4.12 Other Uses of Pattern Matching

// Pattern matching when defining variables
scala> case class Address(street: String,city: String,country: String)
defined class Address

scala> case class Person(name: String,age: Int,address: Address)
defined class Person

// A single step extracts all the fields we want from the Person while skipping the ones we don't need
scala> val Person(name,age,Address(_,state,_)) = 
     | Person("Dean",29,Address("1 Scala Way","CA","USA"))
name: String = Dean
age: Int = 29
state: String = CA

// The elements of a List can be extracted the same way, again skipping the ones we don't need
scala> val head +: tail = List(1,2,3)
head: Int = 1
tail: List[Int] = List(2, 3)

scala> val head1 +: head2 +: tail = Vector(1,2,3)
head1: Int = 1
head2: Int = 2
tail: scala.collection.immutable.Vector[Int] = Vector(3)

scala> val head3 +: _  +: tail = List(4,5,6)
head3: Int = 4
tail: List[Int] = List(6)

scala> val Seq(a,b,c) = List(1,2,3)
a: Int = 1
b: Int = 2
c: Int = 3

// The number of variables must match the number of elements in the List
scala> val Seq(a,b,c) = List(1,2,3,4)
scala.MatchError: List(1, 2, 3, 4) (of class scala.collection.immutable.$colon$colon)
  ... 32 elided

// The same notation works in if expressions (here == compares p with a newly constructed value)
scala> val p = Person("Dean",29,Address("1 Scala Way","CA","USA"))
p: Person = Person(Dean,29,Address(1 Scala Way,CA,USA))

scala> if (p == Person("Dean",29,Address("1 Scala Way","CA","USA"))) "yes" else "no"
res0: String = yes

scala> if (p == Person("Dean",29,Address("2 Scala Way","CA","USA"))) "yes" else "no"
res1: String = no

// But the placeholder "_" cannot be used here
scala> if ( p== Person(_,29,Address(_,_,"USA"))) "yes" else "no"
<console>:17: error: missing parameter type for expanded function ((x$1) => p.$eq$eq(Person(x$1, 29, ((x$2, x$3) => Address(x$2, x$3, "USA")))))
       if ( p== Person(_,29,Address(_,_,"USA"))) "yes" else "no"
                       ^
<console>:17: error: missing parameter type for expanded function ((x$2, x$3) => Address(x$2, x$3, "USA"))
       if ( p== Person(_,29,Address(_,_,"USA"))) "yes" else "no"
                                    ^
<console>:17: error: missing parameter type for expanded function ((x$2: <error>, x$3) => Address(x$2, x$3, "USA"))
       if ( p== Person(_,29,Address(_,_,"USA"))) "yes" else "no"
scala> def sum_count(ints: Seq[Int]) = (ints.sum,ints.size)
sum_count: (ints: Seq[Int])(Int, Int)

scala> val (sum,count) = sum_count(List(1,2,3,4,5))
sum: Int = 15
count: Int = 5

Pattern matching in for comprehensions

val dogBreeds = Seq(Some("Doberman"),
                None,Some("Yorkshire Terrier"),
                Some("Dachshund"),
                None,Some("Scottish Terrier"),
                None,Some("Great Dane"),Some("Portuguese Water Dog"))

println("second pass: ")

for {
    Some(breed) <- dogBreeds
    upcasedBreed = breed.toUpperCase()

} println(upcasedBreed)
/*
output:
[root@spark1 scala]# scala scoped-option-for.sc 
second pass: 
DOBERMAN
YORKSHIRE TERRIER
DACHSHUND
SCOTTISH TERRIER
GREAT DANE
PORTUGUESE WATER DOG
*/

Pattern matching in function literals

case class Person(name: String,age: Int)

val as = Seq(
    Address("1 Scala Lance","Anytown","USA"),
    Address("2 Clojure Lane","Othertown","USA"))

val ps = Seq(
    Person("Buck Trends",29),
    Person("Clo Jure",28))

val pas = ps zip as

// The less elegant way
pas map { tup =>
    val Person(name,age) = tup._1
    val Address(street,city,country) = tup._2
    println(s"name: $name (age: $age) lives at $street,$city,in $country")
}

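// The nicer way. A function literal written with case is a partial function; it is more concise, especially when extracting values from tuples and more complex structures. The caveat is that the case expressions must match every input exactly, or a MatchError is thrown at runtime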
pas map {
    case (Person(name,age),Address(street,city,country)) =>
        println(s"name: $name  (age: $age) lives at $street,$city,in $country")
}
/*
output:
[root@spark1 scala]# scala match-fun-args.sc 
name: Buck Trends (age: 29) lives at 1 Scala Lance,Anytown,in USA
name: Clo Jure (age: 28) lives at 2 Clojure Lane,Othertown,in USA
name: Buck Trends  (age: 29) lives at 1 Scala Lance,Anytown,in USA
name: Clo Jure  (age: 28) lives at 2 Clojure Lane,Othertown,in USA
*/
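
The MatchError risk mentioned in the comment above can be avoided with collect, which applies a partial function only where it is defined. This is a small sketch with hypothetical names (CollectDemo, mixed, ints, attempt):

```scala
import scala.util.Try

object CollectDemo {
  val mixed: Seq[Any] = Seq(1, "two", 3.0, 4)

  // collect skips the elements for which no case clause matches.
  val ints = mixed.collect { case i: Int => i * 10 }

  // map calls the function on every element, so the first element
  // that no clause matches ("two") raises a MatchError.
  val attempt = Try(mixed.map { case i: Int => i * 10 })
}
```

With map the clauses have to cover every input; with collect, non-matching elements are silently dropped, which is only appropriate when dropping them is the intended behavior.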

Regular expressions can use pattern matching to destructure strings. The example here is a simple SQL parsing program.

scala> val cols = """\*|[\w,]+"""
cols: String = \*|[\w,]+

scala> val table = """\w+"""
table: String = \w+

scala> val tail = """.*"""
tail: String = .*

// Because string interpolation is used, \s must be escaped. Without interpolation, a regex in triple quotes needs no escaping
scala> val selectRE = s"""SELECT \\s*(DISTINCT)?\\s+($cols)\\s*FROM\\s+($table)\\s*($tail)?;""".r
selectRE: scala.util.matching.Regex = SELECT \s*(DISTINCT)?\s+(\*|[\w,]+)\s*FROM\s+(\w+)\s*(.*)?;

scala> val selectRE(distincts,cols1,table1,otherClauses) = "SELECT DISTINCT * FROM atable;
<console>:1: error: unclosed string literal
val selectRE(distincts,cols1,table1,otherClauses) = "SELECT DISTINCT * FROM atable;
                                                    ^

scala> val selectRE(distincts,cols1,table1,otherClauses) = "SELECT DISTINCT * FROM atable;"
distincts: String = DISTINCT
cols1: String = *
table1: String = atable
otherClauses: String = ""

scala> val selectRE(distinct3,cols3,table3,otherClauses) = "SELECT DISTINCT col1,col2 FROM atable;"
distinct3: String = DISTINCT
cols3: String = col1,col2
table3: String = atable
otherClauses: String = ""


scala> val selectRE(distinct4,cols4,table4,otherClauses) = "SELECT DISTINCT col1,col2 FROM atable WHERE col1 = 'foo';"
distinct4: String = DISTINCT
cols4: String = col1,col2
table4: String = atable
otherClauses: String = WHERE col1 = 'foo'

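4.13 Concluding Remarks on Pattern Matching

  • Pattern matching is a powerful "protocol" for extracting data from data structures.
  • Contrast this with the JavaBeans model, which encouraged developers to expose object fields through getters and setters. That practice overlooks the point that state should be encapsulated and exposed only when appropriate, especially for mutable fields. Access to state information should be designed carefully to reflect the abstraction being exposed.
  • When designing pattern-matching statements, be wary of default case clauses. Under what circumstances would "none of the above" actually occur? A default case clause may be a sign that the design should be refined, so that you know more precisely all the possible matches that can occur.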