Scala 2.8 breakOut

最新推荐文章于 2020-03-30 21:22:56 发布

hankl1990

最新推荐文章于 2020-03-30 21:22:56 发布

阅读量473

点赞数

分类专栏： Scala

Scala 专栏收录该内容

24 篇文章 0 订阅

订阅专栏

转：https://stackoverflow.com/questions/1715681/scala-2-8-breakout/1716558

之前代码里用到，但是不是太明白是怎么回事儿，找了貌似只有这里有比较不错的解释。直接都给粘过来了。

In Scala 2.8, there is an object in scala.collection.package.scala:

def breakOut[From, T, To](implicit b : CanBuildFrom[Nothing, T, To]) =
    new CanBuildFrom[From, T, To] {
        def apply(from: From) = b.apply() ; def apply() = b.apply()
 }

I have been told that this results in:

> import scala.collection.breakOut
> val map : Map[Int,String] = List("London", "Paris").map(x => (x.length, x))(breakOut)

map: Map[Int,String] = Map(6 -> London, 5 -> Paris)

What is going on here? Why is breakOut being called as an argument to my List?

edited May 8 '15 at 21:50

DNA

31.7k965110

asked Nov 11 '09 at 14:53

oxbow_lakes

99k40274419

The trivial answer being, it is not an argument to List, but to map. – Daniel C. Sobral Nov 11 '09 at 18:47

add a comment

4 Answers

active oldest votes

up vote 302 down vote accepted

The answer is found on the definition of map:

def map[B, That](f : (A) => B)(implicit bf : CanBuildFrom[Repr, B, That]) : That

Note that it has two parameters. The first is your function and the second is an implicit. If you do not provide that implicit, Scala will choose the most specific one available.

About breakOut

So, what's the purpose of breakOut? Consider the example given for the question, You take a list of strings, transform each string into a tuple (Int, String), and then produce a Map out of it. The most obvious way to do that would produce an intermediary List[(Int, String)] collection, and then convert it.

Given that map uses a Builder to produce the resulting collection, wouldn't it be possible to skip the intermediary List and collect the results directly into a Map? Evidently, yes, it is. To do so, however, we need to pass a proper CanBuildFrom to map, and that is exactly what breakOut does.

Let's look, then, at the definition of breakOut:

def breakOut[From, T, To](implicit b : CanBuildFrom[Nothing, T, To]) =
  new CanBuildFrom[From, T, To] {
    def apply(from: From) = b.apply() ; def apply() = b.apply()
  }

Note that breakOut is parameterized, and that it returns an instance of CanBuildFrom. As it happens, the types From, T and To have already been inferred, because we know that map is expecting CanBuildFrom[List[String], (Int, String), Map[Int, String]]. Therefore:

From = List[String]
T = (Int, String)
To = Map[Int, String]

To conclude let's examine the implicit received by breakOut itself. It is of type CanBuildFrom[Nothing,T,To]. We already know all these types, so we can determine that we need an implicit of type CanBuildFrom[Nothing,(Int,String),Map[Int,String]]. But is there such a definition?

Let's look at CanBuildFrom's definition:

trait CanBuildFrom[-From, -Elem, +To] 
extends AnyRef

So CanBuildFrom is contra-variant on its first type parameter. Because Nothing is a bottom class (ie, it is a subclass of everything), that means any class can be used in place of Nothing.

Since such a builder exists, Scala can use it to produce the desired output.

About Builders

A lot of methods from Scala's collections library consists of taking the original collection, processing it somehow (in the case of map, transforming each element), and storing the results in a new collection.

To maximize code reuse, this storing of results is done through a builder (scala.collection.mutable.Builder), which basically supports two operations: appending elements, and returning the resulting collection. The type of this resulting collection will depend on the type of the builder. Thus, a List builder will return a List, a Map builder will return a Map, and so on. The implementation of the map method need not concern itself with the type of the result: the builder takes care of it.

On the other hand, that means that map needs to receive this builder somehow. The problem faced when designing Scala 2.8 Collections was how to choose the best builder possible. For example, if I were to write Map('a' -> 1).map(_.swap), I'd like to get a Map(1 -> 'a') back. On the other hand, a Map('a' -> 1).map(_._1) can't return a Map (it returns an Iterable).

The magic of producing the best possible Builder from the known types of the expression is performed through this CanBuildFrom implicit.

About CanBuildFrom

To better explain what's going on, I'll give an example where the collection being mapped is a Map instead of a List. I'll go back to List later. For now, consider these two expressions:

Map(1 -> "one", 2 -> "two") map Function.tupled(_ -> _.length)
Map(1 -> "one", 2 -> "two") map (_._2)

The first returns a Map and the second returns an Iterable. The magic of returning a fitting collection is the work of CanBuildFrom. Let's consider the definition of map again to understand it.

The method map is inherited from TraversableLike. It is parameterized on B and That, and makes use of the type parameters A and Repr, which parameterize the class. Let's see both definitions together:

The class TraversableLike is defined as:

trait TraversableLike[+A, +Repr] 
extends HasNewBuilder[A, Repr] with AnyRef

def map[B, That](f : (A) => B)(implicit bf : CanBuildFrom[Repr, B, That]) : That

To understand where A and Repr come from, let's consider the definition of Map itself:

trait Map[A, +B] 
extends Iterable[(A, B)] with Map[A, B] with MapLike[A, B, Map[A, B]]

Because TraversableLike is inherited by all traits which extend Map, A and Repr could be inherited from any of them. The last one gets the preference, though. So, following the definition of the immutable Map and all the traits that connect it to TraversableLike, we have:

trait Map[A, +B] 
extends Iterable[(A, B)] with Map[A, B] with MapLike[A, B, Map[A, B]]

trait MapLike[A, +B, +This <: MapLike[A, B, This] with Map[A, B]] 
extends MapLike[A, B, This]

trait MapLike[A, +B, +This <: MapLike[A, B, This] with Map[A, B]] 
extends PartialFunction[A, B] with IterableLike[(A, B), This] with Subtractable[A, This]

trait IterableLike[+A, +Repr] 
extends Equals with TraversableLike[A, Repr]

trait TraversableLike[+A, +Repr] 
extends HasNewBuilder[A, Repr] with AnyRef

If you pass the type parameters of Map[Int, String] all the way down the chain, we find that the types passed to TraversableLike, and, thus, used by map, are:

A = (Int,String)
Repr = Map[Int, String]

Going back to the example, the first map is receiving a function of type ((Int, String)) => (Int, Int) and the second map is receiving a function of type ((Int, String)) => String. I use the double parenthesis to emphasize it is a tuple being received, as that's the type of A as we saw.

With that information, let's consider the other types.

map Function.tupled(_ -> _.length):
B = (Int, Int)

map (_._2):
B = String

We can see that the type returned by the first map is Map[Int,Int], and the second is Iterable[String]. Looking at map's definition, it is easy to see that these are the values of That. But where do they come from?

If we look inside the companion objects of the classes involved, we see some implicit declarations providing them. On object Map:

implicit def  canBuildFrom [A, B] : CanBuildFrom[Map, (A, B), Map[A, B]]

And on object Iterable, whose class is extended by Map:

implicit def  canBuildFrom [A] : CanBuildFrom[Iterable, A, Iterable[A]]

These definitions provide factories for parameterized CanBuildFrom.

Scala will choose the most specific implicit available. In the first case, it was the first CanBuildFrom. In the second case, as the first did not match, it chose the second CanBuildFrom.

Back to the Question

Let's see the code for the question, List's and map's definition (again) to see how the types are inferred:

val map : Map[Int,String] = List("London", "Paris").map(x => (x.length, x))(breakOut)

sealed abstract class List[+A] 
extends LinearSeq[A] with Product with GenericTraversableTemplate[A, List] with LinearSeqLike[A, List[A]]

trait LinearSeqLike[+A, +Repr <: LinearSeqLike[A, Repr]] 
extends SeqLike[A, Repr]

trait SeqLike[+A, +Repr] 
extends IterableLike[A, Repr]

trait IterableLike[+A, +Repr] 
extends Equals with TraversableLike[A, Repr]

trait TraversableLike[+A, +Repr] 
extends HasNewBuilder[A, Repr] with AnyRef

def map[B, That](f : (A) => B)(implicit bf : CanBuildFrom[Repr, B, That]) : That

The type of List("London", "Paris") is List[String], so the types A and Repr defined on TraversableLike are:

A = String
Repr = List[String]

The type for (x => (x.length, x)) is (String) => (Int, String), so the type of B is:

B = (Int, String)

The last unknown type, That is the type of the result of map, and we already have that as well:

val map : Map[Int,String] =

So,

That = Map[Int, String]

That means breakOut must, necessarily, return a type or subtype of CanBuildFrom[List[String], (Int, String), Map[Int, String]].

edited Nov 27 '15 at 15:36

203

370415

answered Nov 11 '09 at 16:53

Daniel C. Sobral

237k70419621

Daniel, I can grovel through the types in your answer, but once I get to the end, I feel like I haven't gained any high level understanding. What is breakOut? Where does the name "breakOut" come from (what am I breaking out of)? Why is it needed in this case in order to get a Map out? Surely is there is some way to briefly answer these questions? (even if the lengthy type-groveling remains necessary in order to grasp every detail) – Seth Tisue May 8 '10 at 12:57

@Seth That's a valid concern, but I'm not sure I'm up to the task. The origin of this can be found in here: article.gmane.org/gmane.comp.lang.scala.internals/1812/…. I'll think about it, but, right now, I can't think of much of a way to improve it. – Daniel C. Sobral May 8 '10 at 14:33

Is there a way to avoid specifying the entire result type of Map[Int, String] and instead being able to write something like: 'val map = List("London", "Paris").map(x => (x.length, x))(breakOut[...Map])' – IttayD May 10 '10 at 10:48

A statement, not a question. – retronym Jun 20 '10 at 7:25

@SethTisue From my reading of this explanation, it seems that breakOut is necessary to "break out" of the requirement that your builder needs to build from a List[String]. The compiler wants a CanBuildFrom[List[String], (Int,String), Map[Int,String]], which you cannot provide. The breakOut function does this by clobbering the first type parameter in CanBuildFrom by setting it to Nothing. Now you only have to provide a CanBuildFrom[Nothing, (Int,String), Map[Int,String]]. This is easy because it's provided by the Map class. – Mark Dec 21 '12 at 5:29

show 12 more comments

up vote 82 down vote

I'd like to build upon Daniel's answer. It was very thorough, but as noted in the comments, it doesn't explain what breakout does.

Taken from Re: Support for explicit Builders (2009-10-23), here is what I believe breakout does:

It gives the compiler a suggestion as to which Builder to choose implicitly (essentially it allows the compiler to choose which factory it thinks fits the situation best.)

For example, see the following:

scala> import scala.collection.generic._
import scala.collection.generic._

scala> import scala.collection._
import scala.collection._

scala> import scala.collection.mutable._
import scala.collection.mutable._

scala>

scala> def breakOut[From, T, To](implicit b : CanBuildFrom[Nothing, T, To]) =
     |    new CanBuildFrom[From, T, To] {
     |       def apply(from: From) = b.apply() ; def apply() = b.apply()
     |    }
breakOut: [From, T, To]
     |    (implicit b: scala.collection.generic.CanBuildFrom[Nothing,T,To])
     |    java.lang.Object with
     |    scala.collection.generic.CanBuildFrom[From,T,To]

scala> val l = List(1, 2, 3)
l: List[Int] = List(1, 2, 3)

scala> val imp = l.map(_ + 1)(breakOut)
imp: scala.collection.immutable.IndexedSeq[Int] = Vector(2, 3, 4)

scala> val arr: Array[Int] = l.map(_ + 1)(breakOut)
imp: Array[Int] = Array(2, 3, 4)

scala> val stream: Stream[Int] = l.map(_ + 1)(breakOut)
stream: Stream[Int] = Stream(2, ?)

scala> val seq: Seq[Int] = l.map(_ + 1)(breakOut)
seq: scala.collection.mutable.Seq[Int] = ArrayBuffer(2, 3, 4)

scala> val set: Set[Int] = l.map(_ + 1)(breakOut)
seq: scala.collection.mutable.Set[Int] = Set(2, 4, 3)

scala> val hashSet: HashSet[Int] = l.map(_ + 1)(breakOut)
seq: scala.collection.mutable.HashSet[Int] = Set(2, 4, 3)

You can see the return type is implicitly chosen by the compiler to best match the expected type. Depending on how you declare the receiving variable, you get different results.

The following would be an equivalent way to specify a builder. Note in this case, the compiler will infer the expected type based on the builder's type:

scala> def buildWith[From, T, To](b : Builder[T, To]) =
     |    new CanBuildFrom[From, T, To] {
     |      def apply(from: From) = b ; def apply() = b
     |    }
buildWith: [From, T, To]
     |    (b: scala.collection.mutable.Builder[T,To])
     |    java.lang.Object with
     |    scala.collection.generic.CanBuildFrom[From,T,To]

scala> val a = l.map(_ + 1)(buildWith(Array.newBuilder[Int]))
a: Array[Int] = Array(2, 3, 4)

edited May 6 '13 at 17:34

Peter Mortensen

11.4k1679106

answered Sep 13 '11 at 15:35

Austen Holmes

1,486118

Amazing answer, thanks! – Sergey Kostrukov Aug 9 '15 at 14:44

I wonder why it's named "breakOut"? I'm thinking something like convert or buildADifferentTypeOfCollection (but shorter) might have been easier to remember. – KajMagnus Mar 1 at 8:05

add a comment

up vote 5 down vote

Daniel Sobral's answer is great, and should be read together with Architecture of Scala Collections (Chapter 25 of Programming in Scala).

I just wanted to elaborate on why it is called breakOut:

Why is it called `breakOut`?

Because we want to break out of one type and into another:

Break out of what type into what type? Lets look at the map function on Seq as an example:

Seq.map[B, That](f: (A) -> B)(implicit bf: CanBuildFrom[Seq[A], B, That]): That

If we wanted to build a Map directly from mapping over the elements of a sequence such as:

val x: Map[String, Int] = Seq("A", "BB", "CCC").map(s => (s, s.length))

The compiler would complain:

error: type mismatch;
found   : Seq[(String, Int)]
required: Map[String,Int]

The reason being that Seq only knows how to build another Seq (i.e. there is an implicit CanBuildFrom[Seq[_], B, Seq[B]] builder factory available, but there is NO builder factory from Seq to Map).

In order to compile, we need to somehow breakOut of the type requirement, and be able to construct a builder that produces a Map for the map function to use.

As Daniel has explained, breakOut has the following signature:

def breakOut[From, T, To](implicit b: CanBuildFrom[Nothing, T, To]): CanBuildFrom[From, T, To] =
    // can't just return b because the argument to apply could be cast to From in b
    new CanBuildFrom[From, T, To] {
      def apply(from: From) = b.apply()
      def apply()           = b.apply()
    }

Nothing is a subclass of all classes, so any builder factory can be substituted in place of implicit b: CanBuildFrom[Nothing, T, To]. If we used the breakOut function to provide the implicit parameter:

val x: Map[String, Int] = Seq("A", "BB", "CCC").map(s => (s, s.length))(collection.breakOut)

It would compile, because breakOut is able to provide the required type of CanBuildFrom[Seq[(String, Int)], (String, Int), Map[String, Int]], while the compiler is able to find an implicit builder factory of type CanBuildFrom[Map[_, _], (A, B), Map[A, B]], in place of CanBuildFrom[Nothing, T, To], for breakOut to use to create the actual builder.

Note that CanBuildFrom[Map[_, _], (A, B), Map[A, B]] is defined in Map, and simply initiates a MapBuilder which uses an underlying Map.

Hope this clears things up.

answered Apr 30 at 14:26

Dzhu

2,38632844

add a comment

up vote 2 down vote

A simple example to understand what breakOut does:

scala> import collection.breakOut
import collection.breakOut

scala> val set = Set(1, 2, 3, 4)
set: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 4)

scala> set.map(_ % 2)
res0: scala.collection.immutable.Set[Int] = Set(1, 0)

scala> val seq:Seq[Int] = set.map(_ % 2)(breakOut)
seq: Seq[Int] = Vector(1, 0, 1, 0) // map created a Seq[Int] instead of the default Set[Int]

edited Apr 29 at 22:00

answered Oct 17 '16 at 21:09

man

323315

Thanks for the example! Also val seq:Seq[Int] = set.map(_ % 2).toVector will not give you the repeated values as the Set was preserved for the map. – Matthew Pickering Sep 11 at 12:58

@MatthewPickering correct! set.map(_ % 2) creates a Set(1, 0) first, which then gets converted to a Vector(1, 0). – man Sep 12 at 8:45

hankl1990

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
Scala 2.8 breakOut

转：https://stackoverflow.com/questions/1715681/scala-2-8-breakout/1716558之前代码里用到，但是不是太明白是怎么回事儿，找了貌似只有这里有比较不错的解释。直接都给粘过来了。In Scala 2.8, there is an object inscala.collection.packag
复制链接

扫一扫