Spark on YARN Deployment Notes

Environment

CentOS 7.4, 2 cores / 4 GB RAM / 150 GB disk, 3 nodes
master 10.0.43.241
slave1 10.0.43.242
slave2 10.0.43.243

VM installation (one node)

First install a single master node:
yum install -y net-tools
yum install -y vim
yum install -y wget
yum install -y openssh-clients

[root@master ~]# vi /etc/hosts
[root@master ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.43.241 master
10.0.43.242 slave1
10.0.43.243 slave2
[root@master ~]#
[root@master ~]# systemctl stop firewalld
[root@master ~]# systemctl disable firewalld
[root@master ~]# setenforce 0
[root@master ~]# sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
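
An optional sanity check that the firewall is stopped and SELinux will stay off after the next reboot:

systemctl is-active firewalld       # expect "inactive"
getenforce                          # "Permissive" now, "Disabled" after a reboot
grep 'SELINUX=' /etc/selinux/config # expect SELINUX=disabled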

Prepare a script sshUtil.sh under /root for the passwordless SSH setup (it will be executed later); its contents:
#!/bin/bash
ssh-keygen -q -t rsa -N "" -f /root/.ssh/id_rsa
ssh-copy-id -i localhost
ssh-copy-id -i master
ssh-copy-id -i slave1
ssh-copy-id -i slave2

[root@master ~]# tar -zxvf jdk-8u221-linux-x64.tar.gz -C /opt
[root@master ~]# vi /etc/profile.d/custom.sh
[root@master ~]# cat /etc/profile.d/custom.sh
#!/bin/bash
#java path
export JAVA_HOME=/opt/jdk1.8.0_221
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib
[root@master ~]# source /etc/profile.d/custom.sh
[root@master ~]# java -version
java version "1.8.0_221"
Java(TM) SE Runtime Environment (build 1.8.0_221-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.221-b01, mixed mode)
[root@master ~]#

Hadoop configuration

1. Prepare the required packages and extract them under /opt (Hadoop 2.10.0 here; Spark 3.0.0 and Scala 2.11.0 are used later).
2.hadoop-env.sh
[root@master hadoop]# pwd
/opt/hadoop-2.10.0/etc/hadoop
[root@master hadoop]# sed -i 's#export JAVA_HOME=${JAVA_HOME}#export JAVA_HOME=/opt/jdk1.8.0_221#' hadoop-env.sh
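
A quick check that the substitution landed:

grep '^export JAVA_HOME' hadoop-env.sh    # should now show /opt/jdk1.8.0_221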

3.core-site.xml
[root@master hadoop-2.10.0]# vi etc/hadoop/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/var/data/hadoop</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>65536</value>
    </property>
</configuration>
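
hadoop.tmp.dir is set to /var/data/hadoop. It is normally created automatically by the NameNode format and the daemons, but pre-creating it on every node is a simple, optional way to catch permission problems early:

mkdir -p /var/data/hadoop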

4.hdfs-site.xml
[root@node1 hadoop]# vi hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>slave2:50090</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.https-address</name>
        <value>slave2:50091</value>
    </property>    
</configuration>

5.slaves
[root@master hadoop]# cat slaves
master
slave1
slave2

6.mapred-site.xml
[root@master hadoop-2.10.0]# vi etc/hadoop/mapred-site.xml
[root@master hadoop-2.10.0]# cat etc/hadoop/mapred-site.xml

<?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> mapreduce.framework.name yarn

7.yarn-site.xml

<?xml version="1.0" encoding="utf-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> yarn.resourcemanager.hostname master yarn.nodemanager.aux-services mapreduce_shuffle

8. Configure environment variables
Edit /etc/profile.d/custom.sh and append the following:
#hadoop path

export HADOOP_HOME=/opt/hadoop-2.10.0
export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
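
After appending these lines, reload the profile and confirm that the hadoop command resolves:

source /etc/profile.d/custom.sh
hadoop version    # should report Hadoop 2.10.0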

Clone the VMs

Clone two more VMs from the master and update their IP addresses (see the sketch below) and hostnames accordingly:
hostnamectl set-hostname slave1
hostnamectl set-hostname slave2
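
How the IP address is changed depends on how the template VM was set up; a minimal sketch, assuming a static address configured in an ifcfg file for a NIC named eth0 (both the file name and the interface name are assumptions, adjust to your environment):

# on slave1, for example
sed -i 's/10.0.43.241/10.0.43.242/' /etc/sysconfig/network-scripts/ifcfg-eth0
systemctl restart network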

Passwordless SSH setup

Run the script on all three nodes and follow the prompts:
[root@master ~]# sh sshUtil.sh
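
After the script has been run on all three nodes, a quick check that passwordless login works from the current node:

for h in master slave1 slave2; do ssh $h hostname; done    # should print the three hostnames without any password prompt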

Start the Hadoop cluster

1. Clear old data
[root@master ~]# rm -rf /tmp/*

2. Format the NameNode
[root@master ~]# hdfs namenode -format

3. Start HDFS
[root@master ~]# start-dfs.sh

[root@master ~]# jps
11024 DataNode
11319 Jps
10890 NameNode
11195 SecondaryNameNode
[root@master ~]#

[root@slave2 ~]# jps
1282 Jps
1203 DataNode
[root@slave2 ~]#

[root@slave1 ~]# jps
1027 Jps
1948 DataNode
[root@slave1 ~]#
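
Besides jps, the NameNode itself can confirm how many DataNodes have registered:

hdfs dfsadmin -report | grep -i 'live datanodes'    # expect 3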

4. Start YARN
[root@slave1 ~]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.10.0/logs/yarn-root-resourcemanager-slave1.out
slave1: starting nodemanager, logging to /opt/hadoop-2.10.0/logs/yarn-root-nodemanager-slave1.out
master: starting nodemanager, logging to /opt/hadoop-2.10.0/logs/yarn-root-nodemanager-master.out
slave2: starting nodemanager, logging to /opt/hadoop-2.10.0/logs/yarn-root-nodemanager-slave2.out
[root@slave1 ~]#

[root@slave1 ~]# jps
8948 DataNode
9079 ResourceManager
9482 Jps
9183 NodeManager
[root@slave1 ~]#

[root@slave2 ~]# jps
7203 DataNode
7433 Jps
7325 NodeManager
[root@slave2 ~]#

[root@master ~]# jps
21024 DataNode
21481 Jps
20890 NameNode
21195 SecondaryNameNode
21371 NodeManager
[root@master ~]#
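
Likewise, the ResourceManager should now see three NodeManagers:

yarn node -list    # expect master, slave1 and slave2 in RUNNING state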

5. Web UIs
The NameNode runs on master, the ResourceManager on slave1, and every node runs a NodeManager.

NameNode UI: http://10.0.43.241:50070/
ResourceManager UI: http://10.0.43.242:8088/
NodeManager UI: http://10.0.43.241:8042

Spark configuration (YARN mode)

1. Rename the extracted directory
[root@master ~]# mv /opt/spark-3.0.0-bin-hadoop2.7/ /opt/spark-3.0.0

2. Configure environment variables
Edit /etc/profile.d/custom.sh and append the following:
#spark path

export SPARK_HOME=/opt/spark-3.0.0
export PATH=${SPARK_HOME}/bin:${SPARK_HOME}/sbin:$PATH
export SCALA_HOME=/opt/scala-2.11.0
export PATH=$PATH:$SCALA_HOME/bin

[root@master ~]# source /etc/profile.d/custom.sh

3. Add to spark-env.sh:

YARN_CONF_DIR=/opt/hadoop-2.10.0/etc/hadoop
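
If conf/spark-env.sh does not exist yet, create it from the template shipped with Spark before adding the line above:

cd /opt/spark-3.0.0/conf
cp spark-env.sh.template spark-env.sh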

4. Submit a job

bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn \
--deploy-mode client \
./examples/jars/spark-examples_2.12-3.0.0.jar \
100
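
In client mode the result is printed to the driver console (a line of the form "Pi is roughly 3.14..."). While the job runs it should also be visible in YARN:

yarn application -list    # the SparkPi application should appear as RUNNING

It also shows up in the ResourceManager UI at http://10.0.43.242:8088/.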

Spark configuration (standalone mode)

3. Add to spark-env.sh (steps 1 and 2 are the same as in YARN mode):

export LD_LIBRARY_PATH=$JAVA_LIBRARY_PATH
export JAVA_HOME=/opt/jdk1.8.0_221              # Java home
export SCALA_HOME=/opt/scala-2.11.0             # Scala home
export SPARK_WORKER_MEMORY=1g                   # maximum memory available on each worker node
export HADOOP_HOME=/opt/hadoop-2.10.0           # Hadoop installation path
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop  # Hadoop configuration directory
export SPARK_CLASSPATH=/opt/spark-3.0.0/libext  # put the MySQL driver jar here

SPARK_MASTER_IP=master   # Spark master host
SPARK_MASTER_PORT=7077   # Spark master port

4. Edit conf/slaves:
slave1
slave2

5. Distribute to the other nodes
[root@master conf]# scp -r /opt/spark-3.0.0 root@slave1:/opt/
[root@master conf]# scp -r /opt/spark-3.0.0 root@slave2:/opt/
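
Note that the Spark/Scala entries in /etc/profile.d/custom.sh were added after the VMs were cloned, so they exist only on master; if the spark commands should be available on the slaves too, copy that file as well (optional):

scp /etc/profile.d/custom.sh root@slave1:/etc/profile.d/
scp /etc/profile.d/custom.sh root@slave2:/etc/profile.d/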

Start Spark

[root@master sbin]# pwd
/opt/spark-3.0.0/sbin
[root@master sbin]# ./start-all.sh

[root@master sbin]# jps
4611 Master
2630 NameNode
5321 Jps
2767 DataNode

[root@slave1 opt]# jps
2820 ResourceManager
4005 Worker
2311 DataNode
4679 Jps
[root@slave1 opt]#

[root@slave2 sbin]# jps
4563 Jps
2149 DataNode
2264 SecondaryNameNode
3737 Worker
[root@slave2 sbin]#

Submit a job

bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master spark://master:7077 \
./examples/jars/spark-examples_2.12-3.0.0.jar \
100
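
The standalone Master web UI (default port 8080 on the master node) lists the registered workers and the completed application; the Pi result itself again appears in the driver console output.

Master UI: http://10.0.43.241:8080/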
