1. Add the following to Spark's spark-env.sh:
export HADOOP_CONF_DIR=/home/hadoop/hadoop/hadoop-2.7.6/etc/hadoop/
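Spark reads HADOOP_CONF_DIR at submit time to find the YARN ResourceManager and HDFS NameNode addresses, so without it `--master yarn` cannot work. A minimal sketch of applying the setting, assuming `$SPARK_HOME` points at the Spark install and the Hadoop path from this post:

```bash
# Append the Hadoop client-config path to spark-env.sh
# (path is the one used throughout this post)
echo 'export HADOOP_CONF_DIR=/home/hadoop/hadoop/hadoop-2.7.6/etc/hadoop/' >> $SPARK_HOME/conf/spark-env.sh
```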
2. Copy the yarn-site.xml, hdfs-site.xml, and core-site.xml configuration files into $SPARK_HOME/conf. The important one is yarn-site.xml, because core-site.xml and hdfs-site.xml were already placed in this directory when the Spark HA cluster was set up, as sketched below.
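A hedged sketch of the copy step, assuming the Hadoop and Spark layouts used in this post; only yarn-site.xml actually needs copying here, since the other two files are already in place:

```bash
# Copy the YARN client config into Spark's conf directory
# (core-site.xml and hdfs-site.xml were already copied there during the HA setup)
cp /home/hadoop/hadoop/hadoop-2.7.6/etc/hadoop/yarn-site.xml $SPARK_HOME/conf/
```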
So the $SPARK_HOME/conf directory now holds the three Hadoop configuration files: core-site.xml, hdfs-site.xml, and yarn-site.xml.
![Hadoop configuration files under $SPARK_HOME/conf](https://img-blog.csdn.net/20180929152752806?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl80MzI4Mzc0OA==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)
3. Verify:
spark-shell --master yarn --executor-memory 512m --total-executor-cores 1
4. Error encountered:
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
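The spark-shell output itself rarely shows the root cause of this exception; the YARN application logs do. In this scenario they often show a container killed for running beyond virtual memory limits, which is exactly what the yarn-site.xml change below addresses. A hedged sketch of pulling the logs (the application ID is whatever YARN assigned to the failed attempt, so `<applicationId>` is a placeholder):

```bash
# List recent failed/killed YARN applications to find the ID
yarn application -list -appStates FAILED,KILLED
# Fetch the aggregated logs; substitute the real ID for <applicationId>
yarn logs -applicationId <applicationId>
```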
5. In the $HADOOP_HOME/etc/hadoop directory, modify yarn-site.xml and add the following:
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
    <description>Whether virtual memory limits will be enforced for containers</description>
</property>
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
    <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>
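Note that the change has to reach every NodeManager, not just the node where the file was edited. A hedged sketch, assuming an identical Hadoop layout on all nodes; the hostnames slave1 and slave2 are hypothetical placeholders, not from this post:

```bash
# Push the updated yarn-site.xml to each NodeManager host
# slave1/slave2 are assumed hostnames; substitute your own
for host in slave1 slave2; do
  scp $HADOOP_HOME/etc/hadoop/yarn-site.xml $host:$HADOOP_HOME/etc/hadoop/
done
```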
6. Restart the Hadoop cluster and the Spark cluster.
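A hedged sketch of the restart sequence, using the stock scripts shipped in the Hadoop and Spark sbin directories (assuming the Hadoop sbin directory is on PATH, as is common in this kind of setup):

```bash
# Stop and restart HDFS and YARN so the new yarn-site.xml is picked up
stop-yarn.sh && stop-dfs.sh
start-dfs.sh && start-yarn.sh
# Restart the Spark standalone daemons
$SPARK_HOME/sbin/stop-all.sh
$SPARK_HOME/sbin/start-all.sh
```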
7. On success, it displays:
8. Another exception:
/home/hadoop/spark/spark-2.3.1-bin-hadoop2.7/bin/spark-shell: line 44: 6590 Killed
This is related to the export HADOOP_CONF_DIR=/home/hadoop/hadoop/hadoop-2.7.6/etc/hadoop/ line in spark-env.sh under $SPARK_HOME/conf.
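A hedged way to rule out a stale or mistyped path: confirm that the value in spark-env.sh points at a directory that really holds the Hadoop configs:

```bash
# Show the configured value
grep HADOOP_CONF_DIR $SPARK_HOME/conf/spark-env.sh
# The directory it names must exist and contain the *-site.xml files
ls /home/hadoop/hadoop/hadoop-2.7.6/etc/hadoop/*-site.xml
```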
9. Using spark-submit:
~/spark/spark-2.3.1-bin-hadoop2.7/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode client \
  --executor-memory 512m \
  --total-executor-cores 1 \
  ~/spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar \
  100
The correct result looks like this:
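For reference, the SparkPi example prints its estimate to stdout; the digits after 3.14 vary from run to run because the estimate comes from random sampling, so the line below is only a sample of the expected shape:

```
Pi is roughly 3.141...
```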