Operating system
VMware Workstation 12.0 Pro
CentOS 6.5
Your operating system does not have to match this one.
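Environment setup
Maven 3.3.9
Scala 2.11.8
JDK 1.8
Set the environment variables
jdk1.8.0_161
scala-2.11.8
apache-maven-3.3.9
Download the corresponding tar packages, place them in a directory on the Linux machine, and extract them: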
tar -zxf .tar -C /opt/modules/
export JAVA_HOME=/opt/modules/jdk1.8.0_161
export PATH=$JAVA_HOME/bin:$PATH
export SCALA_HOME=/opt/modules/scala-2.11.8
export PATH=$SCALA_HOME/bin:$PATH
export MAVEN_HOME=/opt/modules/apache-maven-3.3.9
export PATH=$MAVEN_HOME/bin:$PATH
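The three exports above can be collected into one snippet that is safe to source more than once. This is a minimal sketch, assuming the lines go into a shell profile such as ~/.bashrc (the original does not say which file). prepend_path is a hypothetical helper, not part of any of these tools.

```shell
# Hypothetical helper: add a directory to the front of PATH only if it is
# not already there, so sourcing the profile twice does not duplicate entries.
prepend_path() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already present, nothing to do
    *) PATH="$1:$PATH" ;;
  esac
}

# Install locations taken from the steps above
export JAVA_HOME=/opt/modules/jdk1.8.0_161
export SCALA_HOME=/opt/modules/scala-2.11.8
export MAVEN_HOME=/opt/modules/apache-maven-3.3.9

for dir in "$JAVA_HOME/bin" "$SCALA_HOME/bin" "$MAVEN_HOME/bin"; do
  prepend_path "$dir"
done
export PATH

# Report which of the three tools now resolve on PATH
for tool in java scala mvn; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool -> $(command -v "$tool")"
  else
    echo "$tool not found on PATH"
  fi
done
```

If a tool is reported as not found, check that the tar package was really extracted under /opt/modules.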
Source code
Download location
http://archive.apache.org/dist/
Download the source for the version you need
spark-2.2.0.tgz
Extract it to the target directory
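The download and extraction could be scripted roughly as follows. This is a sketch under assumptions: it uses wget (any downloader would do), stages the file in /tmp, and assumes the source tarball sits under spark/spark-2.2.0/ on the archive site. fetch_and_unpack is a hypothetical helper name.

```shell
SPARK_VERSION=2.2.0
SRC_TARBALL="spark-${SPARK_VERSION}.tgz"
# Assumed layout of the Apache archive for Spark source releases
SRC_URL="http://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${SRC_TARBALL}"
DEST=/opt/modules

# Hypothetical helper: download the source tarball and unpack it under $DEST
fetch_and_unpack() {
  wget -O "/tmp/$SRC_TARBALL" "$SRC_URL" &&
    tar -zxf "/tmp/$SRC_TARBALL" -C "$DEST"
}

echo "Source URL: $SRC_URL"
# fetch_and_unpack    # uncomment to actually download and extract
```

After extraction the sources should sit in /opt/modules/spark-2.2.0, which is the path the following steps use.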
Edit the script
/opt/modules/spark-2.2.0/dev/make-distribution.sh
Delete the following lines:
VERSION=$("$MVN" help:evaluate -Dexpression=project.version $@ 2>/dev/null | grep -v "INFO" | tail -n 1)
SCALA_VERSION=$("$MVN" help:evaluate -Dexpression=scala.binary.version $@ 2>/dev/null\
    | grep -v "INFO"\
    | tail -n 1)
SPARK_HADOOP_VERSION=$("$MVN" help:evaluate -Dexpression=hadoop.version $@ 2>/dev/null\
    | grep -v "INFO"\
    | tail -n 1)
SPARK_HIVE=$("$MVN" help:evaluate -Dexpression=project.activeProfiles -pl sql/hive $@ 2>/dev/null\
    | grep -v "INFO"\
    | fgrep --count "<id>hive</id>";\
    # Reset exit status to 0, otherwise the script stops here if the last grep finds nothing\
    # because we use "set -o pipefail"
    echo -n)
Add the following lines in their place:
VERSION=2.2.0
SCALA_VERSION=2.11.8
SPARK_HADOOP_VERSION=2.5.0
SPARK_HIVE=1
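The manual edit can also be done with a small script. The sketch below assumes the block to remove starts at the line beginning with `VERSION=$(` and ends at the line containing `echo -n)`, as in the Spark 2.2.0 script. patch_versions is a hypothetical helper, and it keeps a .bak copy of the original file.

```shell
# Hypothetical helper: swap the Maven-probing block for fixed version values.
# Usage: patch_versions /path/to/make-distribution.sh
patch_versions() {
  cp "$1" "$1.bak"                  # keep a backup of the original script
  awk '
    /^VERSION=\$\(/ {               # first line of the block to drop
      skipping = 1
      print "VERSION=2.2.0"
      print "SCALA_VERSION=2.11.8"
      print "SPARK_HADOOP_VERSION=2.5.0"
      print "SPARK_HIVE=1"
    }
    !skipping { print }             # copy every line outside the block
    skipping && /echo -n\)/ { skipping = 0 }   # last line of the block
  ' "$1.bak" > "$1"
}

script=/opt/modules/spark-2.2.0/dev/make-distribution.sh
if [ -f "$script" ]; then
  patch_versions "$script"
fi
```

Hard-coding these values means the script no longer runs Maven four times to look them up, which is the point of this edit. If the result looks wrong, restore make-distribution.sh.bak and edit by hand.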
Start the build
In the Spark source directory, run this command:
./dev/make-distribution.sh --name custom-spark --tgz -Phadoop-2.5 -Phive -Phive-thriftserver -Pyarn
Wait for the build to finish.
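Once the build ends, a quick check for the packaged tarball confirms that it succeeded. With --tgz, make-distribution.sh writes spark-<version>-bin-<name>.tgz to the source root. The sketch below assumes the version and name used above and the source directory /opt/modules/spark-2.2.0.

```shell
VERSION=2.2.0
NAME=custom-spark                     # value passed to --name
SRC_DIR=/opt/modules/spark-2.2.0      # assumed source directory
tarball="spark-${VERSION}-bin-${NAME}.tgz"

if [ -f "$SRC_DIR/$tarball" ]; then
  echo "Build artifact found: $SRC_DIR/$tarball"
else
  echo "Not found: $SRC_DIR/$tarball (the build may still be running or may have failed)"
fi
```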