Build prerequisites:
1. Scala: download and install
2. Maven: download and install, then change Maven's repository address
3. Git: install with yum
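Before starting the build, it can help to confirm that the three tools are actually on the PATH. The following is a minimal sketch; `check_tools` is a hypothetical helper name introduced here for illustration, not part of Spark or Maven.

```shell
# check_tools (hypothetical helper): report which of the given commands are on PATH.
# Returns non-zero if at least one of them is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# The three prerequisites listed above.
check_tools scala mvn git || echo "install the missing tools before building"
```

If any line reports `missing`, install that tool first; the Spark build itself will otherwise fail much later and with a less obvious message.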
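Building with Maven directly does not produce a tar.gz package at the end, so we use the make-distribution.sh script instead:
Build command: (--name is a label for the resulting package, here set to your Hadoop version; -Dhadoop.version is the Hadoop version to build against; this method uses the default Scala version, 2.11.8 in this setup)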
./dev/make-distribution.sh --name 2.6.0-cdh5.7.0 --tgz -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.7.0 -Phive -Phive-thriftserver -Pyarn -Pkubernetes
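When the build succeeds, the script should leave a tarball in the root of the Spark source tree, named after the Spark version and the --name value. A small sketch for checking it, assuming Spark 2.4.4 (the version that appears in the build log of this post) and the usual `spark-<version>-bin-<name>.tgz` naming; adjust both variables to your own build:

```shell
# Assumptions: Spark 2.4.4 sources, and the same --name value as in the command above.
SPARK_VERSION="2.4.4"
DIST_NAME="2.6.0-cdh5.7.0"

# Expected tarball name: spark-<version>-bin-<name>.tgz
TARBALL="spark-${SPARK_VERSION}-bin-${DIST_NAME}.tgz"
echo "expected package: $TARBALL"

# Run this from the Spark source root after the build finishes.
if [ -f "$TARBALL" ]; then
  ls -lh "$TARBALL"
else
  echo "not built yet: $TARBALL"
fi
```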
With only the default repositories configured, the build fails with this error:
[ERROR] Failed to execute goal on project spark-launcher_2.11: Could not resolve dependencies for project org.apache.spark:spark-launcher_2.11:jar:2.4.4: Could not find artifact org.apache.hadoop:hadoop-client:jar:2.6.0-cdh5.7.0 in central (https://repo.maven.apache.org/maven2) -> [Help 1]
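Fix: by default, Spark's pom.xml resolves Hadoop dependencies from Maven Central, which only hosts the stock Apache Hadoop releases. The CDH build org.apache.hadoop:hadoop-client:jar:2.6.0-cdh5.7.0 is not published there, so the spark-launcher_2.11 module (the first one that needs it) cannot resolve it. Add the Cloudera repository to the <repositories> section of Spark's pom.xml: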
<repository>
  <id>cloudera</id>
  <name>cloudera Repository</name>
  <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
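After editing the pom, a quick check that the repository entry really landed in the file can save a long failed build. This is only a sketch: `has_cloudera_repo` is a hypothetical helper, and the `POM` variable assumes you run it from the Spark source root, where the top-level pom.xml lives.

```shell
# has_cloudera_repo (hypothetical helper): succeed if the given pom file
# mentions the Cloudera repository URL.
has_cloudera_repo() {
  grep -q "repository.cloudera.com/artifactory/cloudera-repos" "$1"
}

# Assumption: run from the Spark source root; override POM to check another file.
POM="${POM:-pom.xml}"

if [ -f "$POM" ] && has_cloudera_repo "$POM"; then
  echo "cloudera repository configured in $POM"
else
  echo "cloudera repository not found in $POM"
fi
```

Once the entry is reported as configured, rerun the same make-distribution.sh command as before.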