How to Run a Spark Application on CDH
There are plenty of tutorials online about running a Spark application on CDH, but few explain how to actually compile the application against CDH, or how its dependencies get resolved.
Take SBT as an example. Suppose my application was originally built against vanilla Spark 1.6.0:
libraryDependencies ++= {
  val sparkV = "1.6.0"
  Seq(
    "org.apache.spark" %% "spark-core" % sparkV withSources() withJavadoc(),
    "org.apache.spark" %% "spark-catalyst" % sparkV withSources() withJavadoc(),
    "org.apache.spark" %% "spark-sql" % sparkV withSources() withJavadoc(),
    "org.apache.spark" %% "spark-hive" % sparkV withSources() withJavadoc(),
    "org.apache.spark" %% "spark-repl" % sparkV withSources() withJavadoc()
  )
}
To switch to CDH 5.7, which ships Spark 1.6.0, how should the dependencies be written?
// Maven Central does not host the CDH artifacts, so Cloudera's own repository must be added as a resolver
resolvers += "cloudera" at "https://repository.cloudera.com/artifactory/cloudera-repos"
libraryDependencies ++= {
  val sparkV = "1.6.0-cdh5.7.0"
  Seq(
    "org.apache.spark" %% "spark-core" % sparkV withSources() withJavadoc(),
    "org.apache.spark" %% "spark-catalyst" % sparkV withSources() withJavadoc(),
    "org.apache.spark" %% "spark-sql" % sparkV withSources() withJavadoc(),
    "org.apache.spark" %% "spark-hive" % sparkV withSources() withJavadoc(),
    "org.apache.spark" %% "spark-repl" % sparkV withSources() withJavadoc()
  )
}
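One related point worth noting when the goal is to run on the cluster rather than just compile: a CDH cluster already provides the Spark jars at runtime, so the Spark dependencies are typically scoped as "provided" to keep them out of the packaged jar. The sketch below is an assumption-laden variant of the build above (the "provided" scoping is my addition, not part of the original setup), trimmed to the three modules most applications need:

```scala
// build.sbt sketch: resolve against the CDH artifacts, but mark them
// "provided" so the packaged jar does not bundle Spark itself --
// the CDH cluster supplies those jars when the job runs.
resolvers += "cloudera" at "https://repository.cloudera.com/artifactory/cloudera-repos"

libraryDependencies ++= {
  val sparkV = "1.6.0-cdh5.7.0"
  Seq(
    "org.apache.spark" %% "spark-core" % sparkV % "provided",
    "org.apache.spark" %% "spark-sql"  % sparkV % "provided",
    "org.apache.spark" %% "spark-hive" % sparkV % "provided"
  )
}
```

The packaged jar would then be submitted with something like `spark-submit --master yarn --class com.example.Main target/scala-2.10/my-app.jar`, where the class and jar names are placeholders for your own project.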