Put the input text into a List
- scala> val lines = List("hello tom hello jerry", "helo jerry", "hello kitty")
lines: List[String] = List(hello tom hello jerry, helo jerry, hello kitty)
Trim each line and split it on whitespace - scala> val n1 = lines.flatMap((x) => {x.trim.split("\\s+")})
n1: List[String] = List(hello, tom, hello, jerry, helo, jerry, hello, kitty)
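The `trim` call matters when a line has leading whitespace: without it, `split` returns an empty first token, which would later be counted as a word. A minimal sketch that can be pasted into the REPL; the input string and the names `padded`, `raw`, and `cleaned` are illustrative and do not come from the session above.

```scala
// Hypothetical input with leading and trailing spaces (not part of the original session).
val padded = "  hello tom  "

// Without trim, the leading whitespace yields an empty first token.
// Trailing empty tokens are dropped by split, so only the leading one survives.
val raw = padded.split("\\s+").toList      // List("", "hello", "tom")

// With trim, only the real words remain.
val cleaned = padded.trim.split("\\s+").toList   // List("hello", "tom")
```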
Pair each word with 1 (key is the word, value is 1) - scala> val n3=n1.map((x)=>{(x,1)})
n3: List[(String, Int)] = List((hello,1), (tom,1), (hello,1), (jerry,1), (helo,1), (jerry,1), (hello,1), (kitty,1))
Group the pairs by key - scala> val n4=n3.groupBy((x)=>{x._1})
n4: scala.collection.immutable.Map[String,List[(String, Int)]] = Map(kitty -> List((kitty,1)), tom -> List((tom,1)), helo -> List((helo,1)), hello -> List((hello,1), (hello,1), (hello,1)), jerry -> List((jerry,1), (jerry,1)))
Count the occurrences of each word (the size of each group) - scala> n4.map((x)=>{(x._1,x._2.size)})
res9: scala.collection.immutable.Map[String,Int] = Map(kitty -> 1, tom -> 1, helo -> 1, hello -> 3, jerry -> 2)
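The steps above can also be combined into a single function. This is only a sketch, not the code from the session: the name `wordCount` is made up for illustration, `groupBy(identity)` stands in for the explicit `(word, 1)` pairs because the size of each group already gives the count, and the `filter` step is an added assumption to handle blank lines.

```scala
// Hypothetical helper that chains the steps of the session into one function.
def wordCount(lines: List[String]): Map[String, Int] =
  lines
    .flatMap(_.trim.split("\\s+"))   // trim each line, then split on whitespace
    .filter(_.nonEmpty)              // drop the empty token that a blank line produces
    .groupBy(identity)               // group identical words together
    .map { case (word, occurrences) => word -> occurrences.size }

// Same input as the session, including the "helo" entry.
val counts = wordCount(List("hello tom hello jerry", "helo jerry", "hello kitty"))
```

Since `counts` is an unordered Map, something like `counts.toList.sortBy(-_._2)` can be used to list the most frequent words first.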