Efficient List traversal in Scala using ListBuffer

w_j_w2010

Category: scala. Posted 2015-08-16 10:49.
Tags: scala, spark, ListBuffer


We compare the following four programs to bring out the strengths and weaknesses of each approach.
Group 1: Recursion

Code:

def main(args: Array[String]): Unit = {

    val data = 1 to 20000

    val startTime = System.currentTimeMillis()
    increase(data.toList)
    println("used time=" + (System.currentTimeMillis() - startTime))

}

// Recursive version: the call is not in tail position, so every element adds a stack frame
def increase(list: List[Int]): List[Int] = list match {
    case List() => List()
    case head :: tail => (head + 1) :: increase(tail)
}

Result:

    Exception in thread "main" java.lang.StackOverflowError
    at scala.collection.LinearSeqOptimized$class.lengthCompare(LinearSeqOptimized.scala:261)
    at scala.collection.immutable.List.lengthCompare(List.scala:84)
    at com.ifly.edu.scala.list.ListBuffer_Internals$.increase(ListBuffer_Internals.scala:19)
    at com.ifly.edu.scala.list.ListBuffer_Internals$.increase(ListBuffer_Internals.scala:20)
    at com.ifly.edu.scala.list.ListBuffer_Internals$.increase(ListBuffer_Internals.scala:20)
    at com.ifly.edu.scala.list.ListBuffer_Internals$.increase(ListBuffer_Internals.scala:20)
    at com.ifly.edu.scala.list.ListBuffer_Internals$.increase(ListBuffer_Internals.scala:20)
    at

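Pros: simple.
Cons: the recursive call is not a tail call, so every element adds a stack frame. With a large list the stack keeps growing until it overflows.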
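
For comparison, the recursion can be kept while avoiding the stack growth by making it tail-recursive. The sketch below is not from the original post: `TailRecIncrease`, `increaseTailRec` and `loop` are hypothetical names, and the accumulator-plus-reverse technique is swapped in for the direct recursion above.

```scala
import scala.annotation.tailrec

object TailRecIncrease {
  // Hypothetical helper: tail-recursive variant of increase.
  // Results are prepended to an accumulator (constant time each)
  // and the accumulator is reversed once at the end.
  def increaseTailRec(list: List[Int]): List[Int] = {
    @tailrec
    def loop(rest: List[Int], acc: List[Int]): List[Int] = rest match {
      case Nil          => acc.reverse
      case head :: tail => loop(tail, (head + 1) :: acc)
    }
    loop(list, Nil)
  }
}
```

The `@tailrec` annotation makes the compiler reject the method if the call is not actually in tail position, so the loop is guaranteed to run in constant stack space.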

Group 2: Loop
Code:

def main(args: Array[String]): Unit = {
    val data = 1 to 20000
    val startTime = System.currentTimeMillis()
    increase_for(data.toList)
    println("used time=" + (System.currentTimeMillis() - startTime))

}
// Loop version: appends one element at a time with :::
def increase_for(list: List[Int]): List[Int] = {
    var result = List[Int]()
    for (element <- list) {
      // ::: copies the whole accumulated list on every iteration
      result = result ::: List(element + 1)
    }
    result
}

Result
Data size: 20000

    used time=2611
    Process finished with exit code 0

Data size: 2000000

    used time= NIL (ran for a very long time without producing a result; unbearably slow)
    Process finished with exit code 0

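Pros: avoids recursion, so the size of the list cannot overflow the stack.
Cons: every `:::` copies the accumulated list and produces a temporary List, so the total work grows quadratically and performance collapses on large inputs.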
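
The quadratic cost comes from appending to the end of an immutable List. A minimal sketch of a linear alternative, assuming the order of the result must be preserved: prepend with `::` and reverse once at the end. `FoldIncrease` and `increaseFold` are hypothetical names, not part of the original code.

```scala
object FoldIncrease {
  // Hypothetical helper: same loop idea as increase_for, but each step
  // prepends in constant time instead of copying the accumulated list.
  def increaseFold(list: List[Int]): List[Int] =
    list.foldLeft(List.empty[Int])((acc, element) => (element + 1) :: acc).reverse
}
```

Each element is touched a constant number of times here (once when prepended, once in the final `reverse`), compared with one full copy of the accumulated list per element in the `:::` version.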

Group 3: Replacing the for loop with map
Code:

// List's built-in map function
def increase_for2(list: List[Int]): List[Int] = {
    println("list map ")
    list.map(el => el + 1)
}

Result
Data size: 2000000

    list map
    used time=2268
    Process finished with exit code 0

Data size: 2000000

    used time=48356

Process finished with exit code 0

Pros: no intermediate Lists are created, so it is faster than appending with List's `:::` method.
Cons: none noted.
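
Every measurement in this post wraps the call between two clock reads. That pattern can be factored into a small helper, sketched here under the assumption that a single wall-clock run is precise enough for a rough comparison. `Timing` and `timed` are hypothetical names, not from the original code.

```scala
object Timing {
  // Hypothetical helper: evaluates `body` once and returns its result
  // together with the elapsed wall-clock time in milliseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    val elapsedMs = (System.nanoTime() - start) / 1000000L
    (result, elapsedMs)
  }
}
```

A call such as `val (out, ms) = Timing.timed(data.toList.map(_ + 1))` then yields both the new list and the time taken. A single run includes JIT warm-up and garbage collection pauses, so the numbers are only indicative.
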
Group 4: Using ListBuffer

Code:

def main(args: Array[String]): Unit = {
    val data = 1 to 2000000
    val startTime = System.currentTimeMillis()
    increase_ListBuffer(data.toList)
    println("used time=" + (System.currentTimeMillis() - startTime))

}

// ListBuffer version: appends in constant time, then converts to a List
def increase_ListBuffer(list: List[Int]): List[Int] = {
    import scala.collection.mutable.ListBuffer
    val result = ListBuffer[Int]()
    for (element <- list) {
      result += element + 1
    }
    result.toList
}

Result
Data size: 2000000

    used time=2284
    Process finished with exit code 0

Data size: 20000000

    Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at scala.collection.mutable.ListBuffer.$plus$eq(ListBuffer.scala:168)
    at scala.collection.mutable.ListBuffer.$plus$eq(ListBuffer.scala:45)
    at scala.collection.generic.Growable$$anonfun$$plus$plus$eq$1.apply(Growable.scala:48)
    at scala.collection.generic.Growable$$anonfun$$plus$plus$eq$1.apply(Growable.scala:48)
    at scala.collection.immutable.Range.foreach(Range.scala:141)
    at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
    at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:176)
    at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:45)
    at scala.collection.TraversableLike$class.to(TraversableLike.scala:629)
    at scala.collection.AbstractTraversable.to(Traversable.scala:105)
    at scala.collection.TraversableOnce$class.toList(TraversableOnce.scala:257)
    at scala.collection.AbstractTraversable.toList(Traversable.scala:105)
    at com.ifly.edu.scala.list.ListBuffer_Internals$.main(ListBuffer_Internals.scala:11)
    at com.ifly.edu.scala.list.ListBuffer_Internals.main(ListBuffer_Internals.scala)

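Pros: very efficient up to a certain data size.
Cons: with 20000000 elements the heap runs out. The trace above is raised from `data.toList` in `main`, while the input List is still being built, so the limit at this size is memory rather than the traversal itself.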
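
When the incremented values only need to be consumed once (an assumption about the workload, not something the original post states), the List does not have to be materialized at all. A minimal sketch using an iterator over the Range; `LazyIncrease` and `sumIncremented` are hypothetical names.

```scala
object LazyIncrease {
  // Hypothetical helper: adds 1 to each value of 1..n and sums the results,
  // producing elements one at a time instead of building a List in memory.
  def sumIncremented(n: Int): Long =
    (1 to n).iterator.map(i => (i + 1).toLong).sum
}
```

This trades the ability to traverse the result repeatedly for a memory footprint that does not depend on `n`.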
Summary

ListBuffer avoids both recursion and the creation of intermediate results, so its performance is dependable.