Scala Computation Operators

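Collection transformation operators

Scala collections provide a rich set of operators for computing over collections and arrays. Most of them apply to List, Array, Set, Map, Range, Vector, Iterator, and similar types.

Sorting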

  • sorted

    def sorted[B >: String](implicit ord: scala.math.Ordering[B]): List[String]
    
    scala> var list = List("a","c","d","b")
    list: List[String] = List(a, c, d, b)
    
    scala> list.sorted
    res0: List[String] = List(a, b, c, d)
    
    scala> list
    res1: List[String] = List(a, c, d, b)
    

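    Scala already provides an implicit Ordering[String], so you do not need to pass one. To sort by a different rule, define your own Ordering and pass it explicitly instead of relying on the default.

    # Custom sort rule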
    scala> var order = new Ordering[String]{
         |       override def compare(x: String, y: String): Int = {
         |          x.compareTo(y) * -1
         |       }
         |     }
    order: Ordering[String] = $anon$1@80b70fd
    
    scala> list.sorted(order)
    res2: List[String] = List(d, c, b, a)
    
    # Sorting a custom type
    scala> case class User(id:Int,name:String,salary:Double)
    defined class User
    
    scala> var users=Array(User(1,"张 三",1000.0),User(2,"lisi",1500.0),User(3,"wangwu",800.0))
    users: Array[User] = Array(User(1,张 三,1000.0), User(2,lisi,1500.0), User(3,wangwu,800.0))
    
    scala> users.sorted
    <console>:13: error: No implicit Ordering defined for User.
           users.sorted
                 ^
    # The error above occurs because no implicit Ordering[User] is in scope; defining one fixes it
    scala> implicit var order = new Ordering[User]{
         |       override def compare(x: User, y: User): Int = {
         |         x.salary.compareTo(y.salary) * -1
         |       }
         |     }
    order: Ordering[User] = $anon$1@2aaf7354
    
    scala> users.sorted
    res4: Array[User] = Array(User(2,lisi,1500.0), User(1,张 三,1000.0), User(3,wangwu,800.0))
    
    
  • sortBy: sort by a single attribute

    def sortBy[B](f: String => B)(implicit ord: scala.math.Ordering[B]): List[String]
    
    scala> users.sortBy(u=>u.salary)
    res6: Array[User] = Array(User(3,wangwu,800.0), User(1,张 三,1000.0), User(2,lisi,1500.0))
    
  • sortWith: sort with a custom comparison, here with a secondary sort rule

    scala> var users=Array(User(1,"张 三",1000.0),User(2,"lisi",1500.0),User(3,"wangwu",800.0),User(4,"张 三",1000.0))
    users: Array[User] = Array(User(1,张 三,1000.0), User(2,lisi,1500.0), User(3,wangwu,800.0), User(4,张 三,1000.0))
    
    scala> users.sortWith((u1,u2)=>{
         |       if(u1.salary!=u2.salary){
         |         (u1.salary.compareTo(u2.salary)) > 0
         |       }else{
         |         (u1.id.compareTo(u2.id)) > 0
         |       }
         |     })
    res9: Array[User] = Array(User(2,lisi,1500.0), User(4,张 三,1000.0), User(1,张 三,1000.0), User(3,wangwu,800.0))
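
The three sorting operators above can also be combined with Ordering.by and tuple keys, which avoids writing compare by hand. The following is a minimal sketch, not the only way to do it; the object name SortingSketch and the sample users are made up for illustration, and negating the fields to get descending order is a choice that assumes numeric keys.

```scala
object SortingSketch {
  case class User(id: Int, name: String, salary: Double)

  // Sample data (hypothetical), with two users sharing the same salary.
  val users = List(
    User(1, "zhangsan", 1000.0),
    User(2, "lisi", 1500.0),
    User(3, "wangwu", 800.0),
    User(4, "zhangsan", 1000.0)
  )

  // Ordering.by derives an Ordering[User] from a key; reverse flips it to descending.
  val bySalaryDesc: Ordering[User] = Ordering.by[User, Double](_.salary).reverse

  // sorted is stable, so users with equal salaries keep their original order.
  val sortedDesc: List[User] = users.sorted(bySalaryDesc)

  // A tuple key gives a secondary rule: salary descending, then id descending.
  val salaryThenId: List[User] = users.sortBy(u => (-u.salary, -u.id))
}
```

SortingSketch.salaryThenId yields the ids in the order 2, 4, 1, 3, the same order the sortWith example produces.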
    

flatten

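Expands the nested elements of a collection. Its main use is removing one level of nesting, for example turning a List[List[String]] into a List[String].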

def flatten[B](implicit asTraversable: String => scala.collection.GenTraversableOnce[B]): List[B]

First form:

scala> var list=List(List("a","b","c"),List("d","e"))
list: List[List[String]] = List(List(a, b, c), List(d, e))

scala> list.flatten
res0: List[String] = List(a, b, c, d, e)

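Called without arguments, flatten unpacks each element into its own elements. A List[List[String]] becomes a List[String], and a List[String] becomes a List[Char].

Second form: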

scala> var lines=List("hello word","ni hao")
lines: List[String] = List(hello word, ni hao)

scala> lines.flatten
res1: List[Char] = List(h, e, l, l, o,  , w, o, r, d, n, i,  , h, a, o)

scala> lines.flatten(l=>l.split("\\s+"))
res2: List[String] = List(hello, word, ni, hao)

scala> lines.flatten(_.split("\\s+")) // placeholder-syntax variant
res6: List[String] = List(hello, word, ni, hao)
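
flatten is not limited to lists of lists or strings. As a small sketch (the object name FlattenSketch and the sample values are made up for illustration), it also unwraps a list of Option values, and each call removes exactly one level of nesting:

```scala
object FlattenSketch {
  // Hypothetical lookup results: some values were found, one was not.
  val found: List[Option[Int]] = List(Some(1), None, Some(3))

  // flatten drops None and unwraps Some.
  val present: List[Int] = found.flatten

  // Each flatten call removes exactly one level of nesting.
  val nested: List[List[List[Int]]] = List(List(List(1, 2), List(3)), List(List(4)))
  val oneLevel: List[List[Int]] = nested.flatten
  val twoLevels: List[Int] = nested.flatten.flatten
}
```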

map

Applies a function to every element of the collection and returns a new collection holding the results (a mapping, or transformation).

scala> var list=List(1,2,4,5)
list: List[Int] = List(1, 2, 4, 5)

scala> list.map(item => item *2 )
res35: List[Int] = List(2, 4, 8, 10)

scala> list.map(_ * 2)
res36: List[Int] = List(2, 4, 8, 10)

scala> var lines=List("Hello World","good good study")
lines: List[String] = List(Hello World, good good study)

scala> lines.flatten(_.split("\\s+")).map(w=>(w.toLowerCase,1))
res47: List[(String, Int)] = List((hello,1), (world,1), (good,1), (good,1), (study,1))

flatMap

Transforms each element first, then flattens the result by one level.

scala> var lines=List("Hello World","good good study")
lines: List[String] = List(Hello World, good good study)

// Conceptually this is map followed by flatten: split each string into an array of words, then flatten the arrays
scala> lines.map(line=> line.split(" ")).flatten
res55: List[String] = List(Hello, World, good, good, study)

scala> lines.flatMap(line=> line.split("\\s+")) // equivalent to lines.flatten(line=>line.split("\\s+"))
res56: List[String] = List(Hello, World, good, good, study)
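
The equivalence described above can be checked directly. A for-comprehension with two generators is a third spelling of the same thing, since the compiler rewrites it into flatMap and map calls. This is a minimal sketch; the object name FlatMapSketch is made up, and .toList is added after split only to keep the element types explicit.

```scala
object FlatMapSketch {
  val lines = List("Hello World", "good good study")

  // One step: transform and flatten together.
  val viaFlatMap: List[String] = lines.flatMap(line => line.split("\\s+").toList)

  // Two steps: map to List[List[String]], then flatten.
  val viaMapFlatten: List[String] = lines.map(line => line.split("\\s+").toList).flatten

  // Two generators in a for-comprehension desugar to flatMap plus map.
  val viaFor: List[String] =
    for {
      line <- lines
      word <- line.split("\\s+").toList
    } yield word
}
```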

filter/filterNot

filter keeps the elements that satisfy the predicate and drops the rest; filterNot does the opposite.

def filter(p: ((Int, String, Double)) => Boolean): List[(Int, String, Double)]
scala> var list=List((1,"zhangsan",1000.0),(2,"lisi",800.0))
list: List[(Int, String, Double)] = List((1,zhangsan,1000.0), (2,lisi,800.0))

scala> list.filter(t=>t._3 >= 1000)
res19: List[(Int, String, Double)] = List((1,zhangsan,1000.0))

// placeholder-syntax variant
scala> list.filter(_._3 >= 1000)
res20: List[(Int, String, Double)] = List((1,zhangsan,1000.0))

scala> list.filterNot(_._3 >= 1000)
res21: List[(Int, String, Double)] = List((2,lisi,800.0))

distinct

Removes duplicate elements, keeping the first occurrence of each.

scala> val list=List(1,2,2,3)
list: List[Int] = List(1, 2, 2, 3)

scala> list.distinct
res91: List[Int] = List(1, 2, 3)

groupBy

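Commonly used for statistical analysis. It turns a List or Array into a Map whose keys are the values returned by the grouping function and whose values are the elements that share each key.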

def groupBy[K](f: String => K): scala.collection.immutable.Map[K,List[String]]
scala> var list=List("a","b","a","c")

scala> list.groupBy(w=>w)
res61: scala.collection.immutable.Map[String,List[String]] = Map(b -> List(b), a -> List(a, a), c -> List(c))

scala> list.groupBy(w=>w).map(t=>(t._1,t._2.size)) // count the occurrences of each letter
res63: scala.collection.immutable.Map[String,Int] = Map(b -> 1, a -> 2, c -> 1)

scala> var emps=List("1,001,zhangsan,1000.0","2,002,lisi,1000.0","3,001,王五,800.0")
emps: List[String] = List(1,001,zhangsan,1000.0, 2,002,lisi,1000.0, 3,001,王五,800.0)

scala> case class Employee(id:Int,deptNo:String,name:String,salary:Double)
defined class Employee

scala> emps.map(_.split(",")).map(ts=>Employee(ts(0).toInt,ts(1),ts(2),ts(3).toDouble)).groupBy(emp=>emp.deptNo).map(t=>(t._1,t._2.map(e=>e.salary).sum))
res0: scala.collection.immutable.Map[String,Double] = Map(002 -> 1000.0, 001 -> 1800.0)

scala> emps.map(_.split(",")).map(ts=>Employee(ts(0).toInt,ts(1),ts(2),ts(3).toDouble)).groupBy(emp=>emp.deptNo).map(t=>(t._1,(for(e<- t._2) yield e.salary).sum)).toList.sortBy(t=>t._2).reverse
res1: List[(String, Double)] = List((001,1800.0), (002,1000.0))
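
The per-department total above is dense when written with t._1 and t._2. The sketch below does the same aggregation with a pattern-matching function literal so that each part of the Map entry gets a name. The object name GroupBySketch is made up, and the employees are built directly rather than parsed from strings to keep the example short.

```scala
object GroupBySketch {
  case class Employee(id: Int, deptNo: String, name: String, salary: Double)

  val emps = List(
    Employee(1, "001", "zhangsan", 1000.0),
    Employee(2, "002", "lisi", 1000.0),
    Employee(3, "001", "wangwu", 800.0)
  )

  // groupBy yields Map[deptNo, List[Employee]]; the case pattern names both parts of each entry.
  val totalByDept: Map[String, Double] =
    emps.groupBy(_.deptNo).map { case (dept, members) => dept -> members.map(_.salary).sum }

  // A Map has no defined order, so convert to a List before sorting by total, descending.
  val ranked: List[(String, Double)] =
    totalByDept.toList.sortBy { case (_, total) => -total }
}
```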

max|min

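Computes the largest and smallest elements.

def max[B >: Int](implicit cmp: Ordering[B]): Int
def min[B >: Int](implicit cmp: Ordering[B]): Int

// list is assumed here to be a List[Int] holding 1 through 5, for example List(1,5,3,4,2)
scala> list.max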
res5: Int = 5

scala> list.min
res6: Int = 1

scala> list.sorted
res7: List[Int] = List(1, 2, 3, 4, 5)

scala> list.sorted.head
res8: Int = 1

scala> list.sorted.last
res9: Int = 5

maxBy|minBy

Returns the whole record whose key, as computed by the given function, is the largest or smallest.

def maxBy[B](f: ((Int, String, Int)) => B)(implicit cmp: Ordering[B]): (Int, String, Int)
def minBy[B](f: ((Int, String, Int)) => B)(implicit cmp: Ordering[B]): (Int, String, Int)
scala> var list=List((1,"zhangsan",28),(2,"lisi",20),(3,"wangwu",18))
list: List[(Int, String, Int)] = List((1,zhangsan,28), (2,lisi,20), (3,wangwu,18))

scala> list.maxBy(t=>t._3)
res12: (Int, String, Int) = (1,zhangsan,28)

scala> list.maxBy(t=>t._1)
res13: (Int, String, Int) = (3,wangwu,18)
scala> var emps=List("1,001,zhangsan,1000.0","2,002,lisi,1000.0","3,001,王五,800.0")
emps: List[String] = List(1,001,zhangsan,1000.0, 2,002,lisi,1000.0, 3,001,王五,800.0)

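// Pitfall: the salary is still a String here, so maxBy compares it lexicographically and "800.0" beats "1000.0"
scala> emps.map(_.split(",")).map(w=>(w(1),w(3))).groupBy(_._1).map(t=>t._2.maxBy(i=>i._2))
res2: scala.collection.immutable.Map[String,String] = Map(002 -> 1000.0, 001 -> 800.0)

// Converting the salary with toDouble gives the numeric maximum per department
scala> emps.map(line=>line.split(",")).map(ts=>(ts(1),ts(3).toDouble)).groupBy(_._1).map(_._2).map(_.maxBy(_._2))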
res12: scala.collection.immutable.Iterable[(String, Double)] = List((002,1000.0), (001,1000.0))
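
The two results above disagree for department 001 because the first one compares salaries as text. A small sketch of the difference follows; the object name MaxBySketch and the two sample values are made up for illustration.

```scala
object MaxBySketch {
  val salaries = List("1000.0", "800.0")

  // Strings compare character by character, and '8' sorts after '1'.
  val maxAsString: String = salaries.max

  // Parsing first gives the numeric maximum.
  val maxAsNumber: Double = salaries.map(_.toDouble).max

  // maxBy returns the original element while comparing by the parsed key.
  val maxByParsed: String = salaries.maxBy(_.toDouble)
}
```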

reduce|reduceLeft|reduceRight

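reduce combines the elements two at a time with a binary operator until a single value remains. reduceLeft works from the left and reduceRight from the right.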

 def reduce[A1 >: Int](op: (A1, A1) => A1): A1
scala> var list=List(1,5,3,4,2)
list: List[Int] = List(1, 5, 3, 4, 2)

scala> list.reduce((v1,v2)=>v1+v2)
res7: Int = 15

scala> list.reduceLeft((v1,v2)=>v1+v2)
res8: Int = 15

scala> list.reduceRight((v1,v2)=>v1+v2)
res9: Int = 15

scala> list.reduceRight(_+_)
res17: Int = 15

If the collection is empty, reduce throws an exception:

scala> var list=List[Int]()
list: List[Int] = List()

scala> list.reduce((v1,v2)=>v1+v2)
java.lang.UnsupportedOperationException: empty.reduceLeft
  at scala.collection.LinearSeqOptimized$class.reduceLeft(LinearSeqOptimized.scala:137)
  at scala.collection.immutable.List.reduceLeft(List.scala:84)
  at scala.collection.TraversableOnce$class.reduce(TraversableOnce.scala:208)
  at scala.collection.AbstractTraversable.reduce(Traversable.scala:104)
  ... 32 elided
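
With addition, reduceLeft and reduceRight give the same answer, so the examples above do not show how they differ. The sketch below uses subtraction, which is not associative, to make the direction visible, and uses reduceOption as one way to avoid the exception on an empty collection. The object name ReduceSketch is made up for illustration.

```scala
object ReduceSketch {
  val list = List(1, 5, 3, 4, 2)

  // Evaluates ((((1 - 5) - 3) - 4) - 2)
  val left: Int = list.reduceLeft(_ - _)

  // Evaluates 1 - (5 - (3 - (4 - 2)))
  val right: Int = list.reduceRight(_ - _)

  // reduceOption returns None for an empty collection instead of throwing.
  val empty: Option[Int] = List.empty[Int].reduceOption(_ + _)
  val nonEmpty: Option[Int] = list.reduceOption(_ + _)
}
```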

fold|foldLeft|foldRight

fold works like reduce but starts from an initial value z, so it also works on an empty collection.

def fold[A1 >: Int](z: A1)(op: (A1, A1) => A1): A1
scala> var list=List(1,5,3,4,2)
list: List[Int] = List(1, 5, 3, 4, 2)

scala> list.fold(0)((z,v)=> z+v)
res12: Int = 15

scala> list.fold(0)(_+_)
res19: Int = 15

scala> var list=List[Int]()
list: List[Int] = List()

scala> list.fold(0)((z,v)=> z+v)
res13: Int = 0
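
The section heading also lists foldLeft and foldRight, which the transcript does not exercise. As a sketch (the object name FoldSketch is made up), building a string instead of a sum shows the order in which each one visits the elements and which argument position holds the accumulator:

```scala
object FoldSketch {
  val list = List(1, 2, 3)

  // foldLeft passes (accumulator, element) and nests to the left.
  val left: String = list.foldLeft("z")((acc, v) => s"($acc+$v)")

  // foldRight passes (element, accumulator) and nests to the right.
  val right: String = list.foldRight("z")((v, acc) => s"($v+$acc)")
}
```

Here left is "(((z+1)+2)+3)" and right is "(1+(2+(3+z)))".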

aggregate

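aggregate takes an initial value z, a seqop that folds each element into an accumulator, and a combop that merges two accumulators.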

def aggregate[B](z: => B)(seqop: (B, Int) => B,combop: (B, B) => B): B
scala> var list=List(1,5,3,4,2)
list: List[Int] = List(1, 5, 3, 4, 2)

scala> list.aggregate(0)((z,v)=>z+v,(b1,b2)=>b1+b2)
res29: Int = 15

scala> list.aggregate(0)(_+_,_+_)
res33: Int = 15

scala> var list=List[Int]()
list: List[Int] = List()

scala> list.aggregate(0)((z,v)=>z+v,(b1,b2)=>b1+b2)
res31: Int = 0

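reduce and fold require the result type to be the element type (or a supertype of it), so they are mostly used for sum-like computations. aggregate places no such restriction on the result type, so it can express more complex logic, such as computing a mean: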

scala> var list=List(1,5,3,4,2)
list: List[Int] = List(1, 5, 3, 4, 2)

scala> list.aggregate((0,0.0))((z,v)=>(z._1+1,z._2+v),(b1,b2)=>(b1._1+b2._1,b1._2+b2._2))
res14: (Int, Double) = (5,15.0)
// The initial value is the tuple (0,0.0); z is the running (count, sum) accumulator and v is the current element
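
foldLeft also allows a result type that differs from the element type, so the same (count, sum) accumulation can be written without a combop. To my knowledge aggregate is deprecated on sequential collections from Scala 2.13 onward in favor of foldLeft, so this form may be the safer one on newer versions. The sketch below is an illustration only; the object name MeanSketch is made up, and returning None for an empty list is a design choice.

```scala
object MeanSketch {
  val list = List(1, 5, 3, 4, 2)

  // Accumulate (count, sum) in a single left-to-right pass.
  val (count, sum) = list.foldLeft((0, 0.0)) { case ((n, s), v) => (n + 1, s + v) }

  // Guard against division by zero when the list is empty.
  val mean: Option[Double] = if (count == 0) None else Some(sum / count)
}
```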

Average salary per department:

scala> var emps=List("1,001,zhangsan,1000.0","2,002,lisi,1000.0","3,001,王五,800.0")
emps: List[String] = List(1,001,zhangsan,1000.0, 2,002,lisi,1000.0, 3,001,王五,800.0)

scala> emps.map(line=>line.split(",")).map(ts=>(ts(1),ts(3).toDouble)).groupBy(_._1).map(t=>(t._1,t._2.map(_._2))).map(t=>(t._1,t._2.aggregate((0,0.0))((z,v)=>(z._1+1,z._2+v),(b1,b2)=>(b1._1+b2._1,b1._2+b2._2)))).map(t=>(t._1,t._2._2/t._2._1))
res34: scala.collection.immutable.Map[String,Double] = Map(002 -> 1000.0, 001 -> 900.0)

grouped

Splits a flat collection into fixed-size chunks, adding one level of nesting. The last chunk may be shorter.

def grouped(size: Int): Iterator[List[Int]]
scala> var list=List(1,5,3,4,2)
list: List[Int] = List(1, 5, 3, 4, 2)

scala> list.grouped(2) // chunks of two elements each
res73: Iterator[List[Int]] = non-empty iterator

scala> list.grouped(2).toList
res74: List[List[Int]] = List(List(1, 5), List(3, 4), List(2))

zip

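Pairs up the elements of two collections by position, producing a single collection of tuples. The result is as long as the shorter input.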

scala> var list1=List(1,5,3,4,2)
list1: List[Int] = List(1, 5, 3, 4, 2)

scala> var list2=List("a","b","c")
list2: List[String] = List(a, b, c)

scala> list2.zip(list1)
res75: List[(String, Int)] = List((a,1), (b,5), (c,3))

scala> list1.zip(list2)
res76: List[(Int, String)] = List((1,a), (5,b), (3,c))

unzip

Splits a collection of tuples into a tuple of collections.

scala> var v=List(("a",1),("b",2),("c",3))
v: List[(String, Int)] = List((a,1), (b,2), (c,3))

scala> v.unzip
res90: (List[String], List[Int]) = (List(a, b, c),List(1, 2, 3))
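
zip and unzip are inverses when both inputs have the same length. The sketch below shows the round trip using zipWithIndex, a standard-library relative of zip that pairs each element with its position, and repeats the truncation behaviour for inputs of unequal length. The object name ZipSketch is made up for illustration.

```scala
object ZipSketch {
  val names = List("a", "b", "c")

  // zipWithIndex pairs each element with its 0-based position.
  val indexed: List[(String, Int)] = names.zipWithIndex

  // unzip splits the pairs back into the two original sequences.
  val (back, positions) = indexed.unzip

  // zip stops at the shorter input, so the extra numbers are dropped.
  val truncated: List[(String, Int)] = names.zip(List(1, 5, 3, 4, 2))
}
```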

diff|intersect|union

Computes the difference, intersection, and union. On a List, union simply concatenates the two lists, so duplicates are kept.

scala> var v=List(1,2,3)
v: List[Int] = List(1, 2, 3)

scala> v.diff(List(2,3,5)) // difference
res54: List[Int] = List(1)

scala> var v=List(1,2,3,5)
v: List[Int] = List(1, 2, 3, 5)

scala> v.intersect(List(2,4,6)) // intersection
res55: List[Int] = List(2)

scala> var v=List(1,2,3,5)
v: List[Int] = List(1, 2, 3, 5)

scala> v.union(List(2,4,6)) // union
res56: List[Int] = List(1, 2, 3, 5, 2, 4, 6)
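
The union result above still contains 2 twice, which is not a set union in the mathematical sense. Two possible ways to get one are sketched below, along with a reminder that diff on a List removes only one occurrence per match. The object name SetOpsSketch is made up for illustration.

```scala
object SetOpsSketch {
  val a = List(1, 2, 3, 5)
  val b = List(2, 4, 6)

  // Concatenate, then drop repeats while keeping first-seen order.
  val unionNoDup: List[Int] = (a ++ b).distinct

  // Converting to a Set gives set semantics: no duplicates and no guaranteed order.
  val asSet: Set[Int] = (a ++ b).toSet

  // diff removes one occurrence per matching element, so one 2 remains.
  val oneTwoLeft: List[Int] = List(1, 2, 2, 3).diff(List(2))
}
```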

sliding

Slides a fixed-size window over the collection and yields each window as a new sub-collection.

scala> val list=List(1,2,3,4,5,6)
list: List[Int] = List(1, 2, 3, 4, 5, 6)

scala> list.sliding(3,3)  // sliding(window size, step)
res0: Iterator[List[Int]] = non-empty iterator

scala> list.sliding(3,3).toList
res1: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6))

scala> list.sliding(3,1).toList
res2: List[List[Int]] = List(List(1, 2, 3), List(2, 3, 4), List(3, 4, 5), List(4, 5, 6))
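
A typical use of a window with step 1 is a moving average. The sketch below computes one and also shows what happens when the step does not divide the length evenly. The object name SlidingSketch is made up for illustration.

```scala
object SlidingSketch {
  val list = List(1, 2, 3, 4, 5, 6)

  // Moving average over windows of 3 elements; the step defaults to 1.
  val movingAvg: List[Double] = list.sliding(3).map(w => w.sum / 3.0).toList

  // With window 4 and step 4, the final window holds only the 2 leftover elements.
  val chunks: List[List[Int]] = list.sliding(4, 4).toList
}
```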

slice

Extracts a sub-range of the collection.

scala> val list=List(1,2,3,4,5,6)
list: List[Int] = List(1, 2, 3, 4, 5, 6)

scala> list.slice(0,3) // from index 0 (inclusive) up to index 3 (exclusive)
res3: List[Int] = List(1, 2, 3)

scala> list.slice(3,5)
res5: List[Int] = List(4, 5)

Worked examples

  • Count how many times each word occurs and sort by count in descending order
var arrs=Array("this is a demo","good good study","day day up")
scala> arrs.flatMap(_.split(" ")).groupBy(w=>w).map(t=>(t._1,t._2.size)).toList.sortBy(t=>t._2).reverse
res11: List[(String, Int)] = List((day,2), (good,2), (study,1), (a,1), (up,1), (is,1),(demo,1), (this,1))
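  • Read a text file and count how many times each word occurs

    import scala.io.Source
    import scala.collection.mutable.ListBuffer

    // "/path/to/file" is a placeholder; substitute the real file path
    var source=Source.fromFile("/path/to/file")
    var array=ListBuffer[String]()
    val reader = source.bufferedReader()
    var line = reader.readLine()
    // read line by line until readLine returns null at end of file
    while(line!=null){
     array+=line
     line = reader.readLine()
    }
    array.flatMap(_.split(" "))   // split each line into words
     .map((_,1))
     .groupBy(_._1)               // group the (word,1) pairs by word
     .map(x=> (x._1,x._2.size))   // word -> number of occurrences
     .toList
     .sortBy(_._2)
     .reverse                     // descending by count
     .foreach(println)
    reader.close()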
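
The file example mixes reading and counting in one script. One possible restructuring, sketched below, separates the counting step into a pure function that can be tested without a file, and wraps the file access in try/finally so the source is closed even if counting fails. The names WordCountSketch, countWords, and countWordsInFile are made up for illustration. Sorting ties alphabetically is an added assumption that makes the output deterministic, and splitting on \\s+ with an empty-string filter is a choice that tolerates repeated or leading spaces.

```scala
import scala.io.Source

object WordCountSketch {
  // Pure counting step, kept separate from file handling so it can be tested directly.
  def countWords(lines: Iterator[String]): List[(String, Int)] =
    lines
      .flatMap(_.split("\\s+").toList)  // split each line into words
      .filter(_.nonEmpty)               // a leading space would produce an empty token
      .toList
      .groupBy(identity)
      .map { case (word, hits) => (word, hits.size) }
      .toList
      .sortBy { case (word, n) => (-n, word) } // count descending, then word ascending

  // Hypothetical helper: reads the file at `path` and always closes it.
  def countWordsInFile(path: String): List[(String, Int)] = {
    val source = Source.fromFile(path)
    try countWords(source.getLines())
    finally source.close()
  }
}
```

For example, WordCountSketch.countWords(Iterator("this is a demo", "good good study", "day day up")) puts (day,2) and (good,2) first, followed by the six words that occur once in alphabetical order.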
    