Compiling the Spark source with Maven on Linux

Copyright notice: this is an original article by the author and may not be reproduced without the author's permission. https://blog.csdn.net/yuzongtao/article/details/80728913

1. Install Maven


1) Extract the installation package to the target directory:

[root@master apache-maven-3.5.3]# tar -zxf /opt/maven/apache-maven-3.5.3-bin.tar.gz  -C /usr/local/

2) Configure the Maven environment variables and verify that Maven is installed correctly:

[root@master apache-maven-3.5.3]# vi /etc/profile
# Append the following lines to /etc/profile:
#maven
export MAVEN_HOME=/usr/local/apache-maven-3.5.3
export PATH=$PATH:$MAVEN_HOME/bin
export MAVEN_OPTS="-Xmx2048m -XX:MetaspaceSize=1024m -XX:MaxMetaspaceSize=1524m -Xss2m"
[root@master apache-maven-3.5.3]# source /etc/profile
[root@master apache-maven-3.5.3]# mvn -version
Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T11:49:05-08:00)
Maven home: /usr/local/apache-maven-3.5.3
Java version: 1.8.0_171, vendor: Oracle Corporation
Java home: /usr/local/jdk1.8.0_171/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-123.el7.x86_64", arch: "amd64", family: "unix"
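
The version check above can also be scripted. The following is a minimal sketch, not part of the original setup: the helper names maven_version and version_at_least are hypothetical, sort -V assumes GNU coreutils, and the minimum version 3.3.9 is my reading of the Spark 2.3 build documentation, so treat it as an assumption and confirm it for your release.

```shell
#!/bin/sh
# Hypothetical helper: read `mvn -version` output on stdin and print the Maven version.
maven_version() {
    # The first line looks like: "Apache Maven 3.5.3 (<hash>; <date>)"
    sed -n 's/^Apache Maven \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed if dotted version $1 is greater than or equal to $2.
version_at_least() {
    # sort -V (GNU coreutils) orders version strings; the smaller one comes first.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum Maven version for Spark 2.3.x (check the build docs).
required=3.3.9

if command -v mvn >/dev/null 2>&1; then
    found=$(mvn -version 2>/dev/null | maven_version)
    if version_at_least "$found" "$required"; then
        echo "Maven $found is new enough (>= $required)"
    else
        echo "Maven $found is older than $required"
    fi
else
    echo "mvn is not on PATH; check MAVEN_HOME and re-run: source /etc/profile"
fi
```

If the script reports that mvn is not on PATH, the usual cause is that /etc/profile has not been sourced in the current shell.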

2. Download the Spark source


1) Mount the downloaded source package under /opt


2) Extract it to the working directory:

[root@master home]# tar -zxf /opt/spark/spark-2.3.1.tgz  -C /home/andy/work
[root@master home]# cd /home/andy/work
[root@master work]# ll
total 4
drwxrwxr-x. 29 andy andy 4096 Jun  1 13:34 spark-2.3.1
[root@master work]# cd spark-2.3.1/
[root@master spark-2.3.1]# ll
total 228
-rw-rw-r--.  1 andy andy   2318 Jun  1 13:34 appveyor.yml
drwxrwxr-x.  3 andy andy     43 Jun  1 13:34 assembly
drwxrwxr-x.  2 andy andy   4096 Jun  1 13:34 bin
drwxrwxr-x.  2 andy andy     75 Jun  1 13:34 build
drwxrwxr-x.  9 andy andy   4096 Jun  1 13:34 common
drwxrwxr-x.  2 andy andy   4096 Jun  1 13:34 conf
-rw-rw-r--.  1 andy andy    995 Jun  1 13:34 CONTRIBUTING.md
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 core
drwxrwxr-x.  5 andy andy     47 Jun  1 13:34 data
drwxrwxr-x.  6 andy andy   4096 Jun  1 13:34 dev
drwxrwxr-x.  9 andy andy   4096 Jun  1 13:34 docs
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 examples
drwxrwxr-x. 15 andy andy   4096 Jun  1 13:34 external
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 graphx
drwxrwxr-x.  2 andy andy     20 Jun  1 13:34 hadoop-cloud
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 launcher
-rw-rw-r--.  1 andy andy  18045 Jun  1 13:34 LICENSE
drwxrwxr-x.  2 andy andy   4096 Jun  1 13:34 licenses
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 mllib
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 mllib-local
-rw-rw-r--.  1 andy andy  24913 Jun  1 13:34 NOTICE
-rw-rw-r--.  1 andy andy 101718 Jun  1 13:34 pom.xml
drwxrwxr-x.  2 andy andy   4096 Jun  1 13:34 project
drwxrwxr-x.  6 andy andy   4096 Jun  1 13:34 python
drwxrwxr-x.  3 andy andy   4096 Jun  1 13:34 R
-rw-rw-r--.  1 andy andy   3809 Jun  1 13:34 README.md
drwxrwxr-x.  5 andy andy     64 Jun  1 13:34 repl
drwxrwxr-x.  5 andy andy     46 Jun  1 13:34 resource-managers
drwxrwxr-x.  2 andy andy   4096 Jun  1 13:34 sbin
-rw-rw-r--.  1 andy andy  17624 Jun  1 13:34 scalastyle-config.xml
drwxrwxr-x.  6 andy andy   4096 Jun  1 13:34 sql
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 streaming
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 tools
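
Before starting a long build it can help to confirm that the extraction produced a complete source tree. The sketch below is an illustration only: the helper name is_spark_source_root is hypothetical, and the three entries it checks (pom.xml, core, build) are simply taken from the directory listing above. Note also that tar -C does not create the target directory, so /home/andy/work must exist before extracting (for example via mkdir -p).

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 looks like the root of a Spark source tree.
is_spark_source_root() {
    # pom.xml, core/ and build/ all appear in the listing of spark-2.3.1 above.
    [ -f "$1/pom.xml" ] && [ -d "$1/core" ] && [ -d "$1/build" ]
}

# Path used in this article; adjust it to your own working directory.
src=/home/andy/work/spark-2.3.1

if is_spark_source_root "$src"; then
    echo "$src looks like a Spark source tree"
else
    echo "$src is missing pom.xml, core/ or build/; re-check the extraction"
fi
```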

3. Compile the Spark source

This build continues from the previous post, "Installing a Spark 2.0 cluster on CentOS 7", so the tools below are already configured:

#scala
export SCALA_HOME=/usr/local/scala-2.12.6
export PATH=$PATH:$SCALA_HOME/bin

#jdk
export JAVA_HOME=/usr/local/jdk1.8.0_171
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin

#spark
export SPARK_HOME=/usr/local/spark-2.3.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
export SPARK_EXAMPLES_JAR=$SPARK_HOME/examples/jars/spark-examples_2.11-2.3.1.jar

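1) Set Maven's memory usage. Maven's memory limits are configured through MAVEN_OPTS, and the Spark build documentation asks for more memory than Maven's defaults. The settings used here are:

export MAVEN_OPTS="-Xmx2048m -XX:MetaspaceSize=1024m -XX:MaxMetaspaceSize=1524m -Xss2m"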

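A virtual machine with 4 GB of memory is recommended; its memory must exceed the maximum set in MAVEN_OPTS. I initially gave the VM only 1 GB, and the build process kept hanging.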
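
The memory requirement can be checked before launching the build. This is a rough sketch under stated assumptions: the helper name xmx_mib is hypothetical, it reads only the -Xmx heap limit (metaspace and thread stacks come on top of that), it assumes the options are separated by spaces, and it reads total memory from /proc/meminfo, which exists on Linux.

```shell
#!/bin/sh
# Hypothetical helper: print the -Xmx value found in the option string $1, in MiB.
# Returns non-zero if the string contains no -Xmx option.
xmx_mib() {
    # Split on spaces and keep the last -Xmx occurrence (an assumption about
    # which one takes effect when the option is repeated).
    opt=$(printf '%s\n' "$1" | tr ' ' '\n' \
        | sed -n 's/^-Xmx\([0-9][0-9]*[kKmMgG]\{0,1\}\)$/\1/p' | tail -n 1)
    [ -n "$opt" ] || return 1
    num=${opt%[kKmMgG]}          # strip the unit suffix, if any
    case "$opt" in
        *[gG]) echo $((num * 1024)) ;;
        *[mM]) echo "$num" ;;
        *[kK]) echo $((num / 1024)) ;;
        *)     echo $((num / 1024 / 1024)) ;;   # no suffix means bytes
    esac
}

# Fall back to the settings from this article if MAVEN_OPTS is not set.
opts=${MAVEN_OPTS:-"-Xmx2048m -XX:MetaspaceSize=1024m -XX:MaxMetaspaceSize=1524m -Xss2m"}

if heap=$(xmx_mib "$opts") && [ -r /proc/meminfo ]; then
    # MemTotal is reported in kB; convert to MiB.
    total=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
    if [ "$total" -le "$heap" ]; then
        echo "warning: machine has ${total} MiB, Maven heap limit is ${heap} MiB"
    else
        echo "ok: machine has ${total} MiB, Maven heap limit is ${heap} MiB"
    fi
fi
```

On the 1 GB VM described above, this check would print the warning, since the heap limit alone is 2048 MiB.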

2) Compile:

[root@master spark-2.3.1]# mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -Phadoop-provided -Phive -Phive-thriftserver -Pnetlib-lgpl -DskipTests clean package
[INFO] Scanning for projects...
Downloading from central: https://repo.maven.apache.org/maven2/org/apache/apache/18/apache-18.pom
Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/apache/18/apache-18.pom (16 kB at 4.8 kB/s)
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO] 
[INFO] Spark Project Parent POM                                           [pom]
[INFO] Spark Project Tags                                                 [jar]
[INFO] Spark Project Sketch                                               [jar]
[INFO] Spark Project Local DB                                             [jar]
[INFO] Spark Project Networking                                           [jar]
[INFO] Spark Project Shuffle Streaming Service                            [jar]
[INFO] Spark Project Unsafe                                               [jar]
[INFO] Spark Project Launcher                                             [jar]
[INFO] Spark Project Core                                                 [jar]
[INFO] Spark Project ML Local Library                                     [jar]
[INFO] Spark Project GraphX                                               [jar]
[INFO] Spark Project Streaming                                            [jar]
[INFO] Spark Project Catalyst                                             [jar]
[INFO] Spark Project SQL                                                  [jar]
[INFO] Spark Project ML Library                                           [jar]
[INFO] Spark Project Tools                                                [jar]
[INFO] Spark Project Hive                                                 [jar]
[INFO] Spark Project REPL                                                 [jar]
[INFO] Spark Project YARN Shuffle Service                                 [jar]
[INFO] Spark Project YARN                                                 [jar]
[INFO] Spark Project Hive Thrift Server                                   [jar]
[INFO] Spark Project Assembly                                             [pom]
[INFO] Spark Integration for Kafka 0.10                                   [jar]
[INFO] Kafka 0.10 Source for Structured Streaming                         [jar]
[INFO] Spark Project Examples                                             [jar]
[INFO] Spark Integration for Kafka 0.10 Assembly                          [jar]
[INFO] 
[INFO] -----------------< org.apache.spark:spark-parent_2.11 >-----------------

3) Build succeeded
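
Because the build runs for a long time, one option is to save the Maven output to a file and check the result afterwards instead of watching the console. The sketch below assumes the output was captured to a file named build.log (a name chosen here for illustration); the helper name build_succeeded is hypothetical. It relies on the "BUILD SUCCESS" line that Maven prints at the end of a successful run.

```shell
#!/bin/sh
# Capture the build output first, for example (same command as in step 2):
#   mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -Phadoop-provided -Phive \
#       -Phive-thriftserver -Pnetlib-lgpl -DskipTests clean package 2>&1 | tee build.log

# Hypothetical helper: succeed if the Maven log file $1 reports a successful build.
build_succeeded() {
    grep -q '^\[INFO\] BUILD SUCCESS' "$1"
}

log=build.log   # assumed log file name

if [ -f "$log" ]; then
    if build_succeeded "$log"; then
        echo "build succeeded"
    else
        echo "build did not succeed; first errors:"
        grep '^\[ERROR\]' "$log" | head -n 20
    fi
fi
```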

