Recently I tested building Spark 0.9.1 against hadoop 0.20.2-cdh3u5.
Run sbt/sbt assembly to build the release.
You have to change the Hadoop version number, otherwise Spark cannot talk to HDFS.
object SparkBuild extends Build {
  // Hadoop version to build against. For example, "1.0.4" for Apache releases, or
  // "2.0.0-mr1-cdh4.2.0" for Cloudera Hadoop. Note that these variables can be set
  // through the environment variables SPARK_HADOOP_VERSION and SPARK_YARN.
  val DEFAULT_HADOOP_VERSION = "0.20.2-cdh3u5"
  // ...
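The comment in that snippet says the version can also come from the SPARK_HADOOP_VERSION environment variable. As a rough sketch of how that override works (the object name and exact code here are illustrative, not the real Spark source):

```scala
// Minimal sketch (assumption, not Spark's actual SparkBuild.scala):
// the environment variable, when set, overrides the default version string.
object HadoopVersionSketch {
  val DEFAULT_HADOOP_VERSION = "0.20.2-cdh3u5"

  // Fall back to the hard-coded default when SPARK_HADOOP_VERSION is unset.
  val hadoopVersion: String =
    sys.env.getOrElse("SPARK_HADOOP_VERSION", DEFAULT_HADOOP_VERSION)

  def main(args: Array[String]): Unit =
    println("Building against Hadoop " + hadoopVersion)
}
```

If Spark's build script behaves this way, an invocation like SPARK_HADOOP_VERSION=0.20.2-cdh3u5 sbt/sbt assembly should select the CDH artifacts without editing the file at all.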
Unresolved Dependency errors often show up during the build, similar to this:
[error] (examples/*:update) sbt.ResolveException: unresolved dependency: commons-lang#commons-lang;2.6: configuration not found in commons-lang#commons-lang;2.6: 'compile'. It was required from org.apache.cassandra#cassandra-all;1.2.6 compile
Searching online suggested this is a dependency-resolution problem, so in $SPARK_HOME/project/SparkBuild.scala