Configuring sbt with akka and spark, and creating an Eclipse project

1. Install the software


jdk-8u92-windows-x64.exe

sbt-0.13.13.1.msi

scala-SDK-4.4.1-vfinal-2.11-win32.win32.x86_64.zip


2. Set the download path for dependency libraries:


By default, the sbt root directory is ~/.sbt

By default, sbt's working directory is ~/.sbt/boot

By default, dependencies are downloaded to ~/.ivy2

We can point these at paths of our own, so that reinstalling the OS does not force re-downloading every dependency; the paths can even be placed on a USB drive and shared across machines:

Edit the sbt configuration file [sbt install dir]\conf\sbtconfig.txt and add:

-Dsbt.global.base=D:/sbt
-Dsbt.boot.directory=D:/sbt/boot/
-Dsbt.ivy.home=D:/sbt/ivy/

3. At run time sbt often needs to download a large number of jars; by default it connects to the official Maven repositories, which is usually slow.


Add a `repositories` file under the sbt root directory (~/.sbt, or D:/sbt if you changed it above) with the following content:

[repositories]
local
repox-maven: http://repox.gtan.com:8078/
repox-ivy: http://repox.gtan.com:8078/, [organization]/[module]/(scala_[scalaVersion]/)(sbt_[sbtVersion]/)[revision]/[type]s/[artifact](-[classifier]).[ext]
aliyun: http://maven.aliyun.com/nexus/content/groups/public/
typesafe:http://dl.bintray.com/typesafe/ivy-releases/ , [organization]/[module]/(scala_[scalaVersion]/)(sbt_[sbtVersion]/)[revision]/[type]s/[artifact](-[classifier]).[ext], bootOnly
ivy-sbt-plugin:http://dl.bintray.com/sbt/sbt-plugin-releases/, [organization]/[module]/(scala_[scalaVersion]/)(sbt_[sbtVersion]/)[revision]/[type]s/[artifact](-[classifier]).[ext]
sonatype-oss-releases
maven-central


4. Create the Scala project directory structure:


├── src
│  ├── main
│  │  ├── java
│  │  ├── resources
│  │  └── scala
├── build.sbt
├── project
│  ├── build.properties
│  ├── plugins.sbt


sbt uses a directory layout similar to Maven's: Scala code goes under src/main/scala, and configuration files under src/main/resources.
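The project/build.properties file in the layout above pins the sbt version used for the build. A minimal sketch (assuming the sbt 0.13.13 installer from step 1; adjust the version to match your installation):

```
sbt.version=0.13.13
```

project/plugins.sbt is populated in step 6 below.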


5. Configure the project's Scala version and the akka / spark dependencies:

Define the library dependencies in path_to_project/build.sbt:

lazy val root = (project in file(".")).  
  settings(  
    name := "My Project",  
    version := "1.0" ,  
    scalaVersion := "2.11.8",  
    libraryDependencies += "com.typesafe.akka" %% "akka-actor" % "2.4.17",  
    libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.0",
    resolvers += "Akka 2.4.17 Repository" at "http://repo.akka.io/2.4.17/"  
  )  

When you need to add a new dependency, the Maven Central Repository Search is a convenient way to look up its coordinates.
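Maven Central lists artifacts as groupId:artifactId:version coordinates, which map directly onto sbt's `%` / `%%` syntax. A sketch using akka-remote as a hypothetical extra dependency (not part of this project's build):

```scala
// Maven coordinates com.typesafe.akka:akka-remote_2.11:2.4.17 become:
libraryDependencies += "com.typesafe.akka" % "akka-remote_2.11" % "2.4.17"
// or equivalently, %% appends the project's Scala binary version (_2.11):
libraryDependencies += "com.typesafe.akka" %% "akka-remote" % "2.4.17"
```

Using `%%` is generally preferred, since the artifact suffix then follows scalaVersion automatically.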


6. Add the sbteclipse plugin to the project:

Add the following line to path_to_project/project/plugins.sbt:
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "5.0.1")


7. Configure the Spark runtime environment


Download winutils.exe from https://github.com/steveloughran/winutils .
Save the file to a directory, e.g. c:\hadoop\bin .
Set the HADOOP_HOME environment variable to the parent of the directory containing winutils.exe, e.g. c:\hadoop in the example above.
Add the directory containing winutils.exe (c:\hadoop\bin) to the system PATH environment variable.
Create the directory c:\tmp\hive .
Open a command prompt and run winutils.exe chmod -R 777 \tmp\hive to grant permissions on the directory.
Run winutils.exe ls \tmp\hive to verify that the permissions were set successfully.


8. Create the Eclipse project:

Open a system cmd window and cd into path_to_project.
Type sbt to open the sbt console.
In the sbt console, run the update command to download the required libraries into the local library repository.
In the sbt console, run the eclipse command to create the Eclipse project and update the classpath settings.


9. Write a SparkWordCount example:


Open the project in Scala IDE.

Create sparkWordCount.scala under src/main/scala:

import org.apache.spark.SparkConf  
import org.apache.spark.SparkContext  
import org.apache.spark.SparkContext._  
  
object sparkWordCount {  
  def main(args: Array[String]){    
  
    val conf = new SparkConf().setAppName("myapp").setMaster("local[2]")  
    val sc = new SparkContext(conf)  
    val lines = sc.parallelize( List("Hello World" , "Hello")  )
    lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_+_).collect().foreach(println)  
    sc.stop()  
  }    
}  

The example uses setMaster("local[2]") so that it can run locally inside Eclipse, which makes local debugging easy.

Select sparkWordCount.scala, then Run As -> Scala Application to run the project in Eclipse and check the output in the Console.
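The Spark pipeline above is easy to trace on plain Scala collections: flatMap splits each line into words, map pairs every word with 1, and reduceByKey sums the counts per word. A minimal sketch of the same computation without Spark (LocalWordCount is a hypothetical helper, not part of the project):

```scala
object LocalWordCount {
  // Mirrors lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_+_)
  def count(lines: List[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))                                 // split lines into words
      .map((_, 1))                                           // pair each word with 1
      .groupBy(_._1)                                         // reduceByKey = group by word...
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // ...then sum the counts

  def main(args: Array[String]): Unit = {
    // Same input as the Spark example
    println(count(List("Hello World", "Hello")))
  }
}
```

The difference is that Spark evaluates the same transformations lazily and in parallel across partitions, while this version runs eagerly on one collection.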


10. Write an AkkaWordCount example:


Open the project in Scala IDE.

Create akkaWordCount.scala under src/main/scala:

import scala.language.postfixOps
import akka.actor._
import akka.routing._
import scala.collection.mutable.ArrayBuffer
import scala.collection.mutable.IndexedSeq
import scala.collection.mutable.HashMap
import akka.util.Timeout
import scala.concurrent.duration._
import scala.concurrent.Await
import akka.pattern.ask

// define messages
sealed trait MapReduceMessage
case class WordCount(word: String, count: Int) extends MapReduceMessage
case class MapData(dataList: ArrayBuffer[WordCount]) extends MapReduceMessage
case class ReduceData(reduceDataMap: Map[String, Int]) extends MapReduceMessage
case object Result extends MapReduceMessage

//mapActor
class MapActor extends Actor{
  
  def receive : Receive = {
    case message :String => 
       sender ! evaluateExpression(message)
  }
  
  val STOP_WORDS_LIST = List("a", "am", "an", "and", "are", "as", "at",
"be","do", "go", "if", "in", "is", "it", "of", "on", "the", "to")
  
  def evaluateExpression(line : String) : MapData = MapData {
      line.split("""\s+""").foldLeft(ArrayBuffer.empty[WordCount]){
        (index , word)=>
            if (!STOP_WORDS_LIST.contains(word.toLowerCase))
              index += WordCount(word.toLowerCase , 1)
            else
              index
      }
  }
}

//reduceActor 
class ReduceActor extends Actor{
  
  def receive : Receive = {
    case MapData(dataList) => 
      sender ! reduce(dataList)
  }
  
  def reduce(words : IndexedSeq[WordCount]) : ReduceData = ReduceData {
    words.foldLeft(Map.empty[String, Int]) {
      (index, wc) =>
        if (index contains wc.word)
          index + (wc.word -> (index(wc.word) + wc.count))
        else
          index + (wc.word -> wc.count)
    }
  }
}

//AggregateActor
class AggregateActor extends Actor{
  val finalReduceMap = new HashMap[String , Int]
  
  def receive : Receive = {
    case ReduceData(reduceDataMap) =>
      aggregateInMemoryReduce(reduceDataMap)
    case Result =>
      sender ! finalReduceMap.toString()
  }
  
  def aggregateInMemoryReduce(reduceList:Map[String,Int]) : Unit = {
    
    for( (key,value) <- reduceList){
      if (finalReduceMap contains key)
        finalReduceMap(key) = value + finalReduceMap(key)
      else
        finalReduceMap += (key -> value)
    }
  }
}

//MasterActor
class MasterActor extends Actor{
  val mapActor = context.actorOf(Props(new MapActor), name = "map")
  val reduceActor:ActorRef = context.actorOf(Props(new ReduceActor),name="reduce")
  val aggregateActor = context.actorOf(Props(new AggregateActor),name = "aggregate")
  
  def receive : Receive = {
    case line : String => mapActor ! line
    case mapData : MapData => reduceActor ! mapData
    case reduceData : ReduceData => aggregateActor ! reduceData
    case Result => aggregateActor forward Result
  }
  
}

object akkaWordCount {
  
  def main(args:Array[String]){

    val _system = ActorSystem("MapReduceApp")
    val master = _system.actorOf(Props(new MasterActor) , name = "master")
    implicit val timeout = Timeout(5 seconds)
    
    master ! "The quick brown fox tried to jump over the lazy dog and fell on the dog"
    master ! "Dog is man's best friend"
    master ! "Dog and Fox belong to the same family"
    
    Thread.sleep(500)
    
    val future = (master ? Result).mapTo[String]
    val result = Await.result(future, timeout.duration)
    println(result)
    _system.terminate()
  }
}

Select akkaWordCount.scala, then Run As -> Scala Application to run the project in Eclipse and check the output in the Console.
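The actor pipeline above is, at its core, a fold: MapActor filters stop words and emits (word, 1) pairs, and ReduceActor / AggregateActor fold those pairs into a count map. The same counting logic can be checked in plain Scala (LocalMapReduce is a hypothetical helper for illustration, not part of the project):

```scala
object LocalMapReduce {
  // Same stop-word list as MapActor's STOP_WORDS_LIST
  val stopWords = Set("a", "am", "an", "and", "are", "as", "at",
    "be", "do", "go", "if", "in", "is", "it", "of", "on", "the", "to")

  // MapActor + ReduceActor + AggregateActor collapsed into one pass:
  // split, lowercase, drop stop words, then fold into a count map.
  def count(lines: List[String]): Map[String, Int] =
    lines
      .flatMap(_.split("""\s+"""))
      .map(_.toLowerCase)
      .filterNot(stopWords.contains)
      .foldLeft(Map.empty[String, Int]) { (acc, word) =>
        acc + (word -> (acc.getOrElse(word, 0) + 1))
      }
}
```

This sequential version makes it easy to verify the expected counts before debugging the concurrent, message-driven variant in Eclipse.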



