Setting Up a Spark Cluster

Log in to the ied virtual machine

  • Use SecureCRT on the lxy_win7 virtual machine to log in to the ied virtual machine

(2) Configure passwordless login

Generate an SSH key pair, then send the generated public key to the local machine (the virtual machine ied)

Verify that the virtual machine can log in to itself without a password: run ssh ied, then run exit
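The key generation and copy steps can be sketched as a small shell function. This is only a sketch under assumptions: the function name setup_passwordless_ssh is hypothetical, the OpenSSH tools ssh-keygen and ssh-copy-id are assumed to be installed, and the key is assumed to live at the default ~/.ssh/id_rsa path.

```shell
# Sketch only: enable passwordless SSH from the current user to a target host.
# Assumptions: OpenSSH client tools are installed; the default RSA key path is used.
setup_passwordless_ssh() {
  host="$1"
  key="$HOME/.ssh/id_rsa"
  mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
  # Generate an RSA key pair with an empty passphrase unless one already exists
  [ -f "$key" ] || ssh-keygen -t rsa -N "" -f "$key"
  # Append the public key to the target's authorized_keys (prompts for the password once)
  ssh-copy-id -i "$key.pub" "$host"
}
# On the virtual machine: setup_passwordless_ssh ied
```

After the function has run once, ssh ied should log in without asking for a password.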

 

Download the Hadoop package that matches the Spark version

Upload the Hadoop package to the /opt directory on the virtual machine ied

  • Change to the /opt directory, then upload the file with the rz command

Extract the Hadoop package to the target directory

  • Run: tar -zxvf hadoop-2.7.1.tar.gz -C /usr/local

 

 

(6) Inspect the Hadoop installation directory

1. Enter the Hadoop installation directory and list its contents

  • Run: cd /usr/local/hadoop-2.7.1, then ll

Inspect the etc/hadoop subdirectory

  • The most important Hadoop configuration files live here; the ones edited in the steps below are hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml

Inspect the sbin subdirectory

  • This directory holds the scripts that start and stop the dfs and yarn services, such as start-yarn.sh, stop-dfs.sh, and stop-yarn.sh

 

 

Configure Hadoop for pseudo-distributed mode

1. Edit the environment configuration file - hadoop-env.sh

  • Change to the Hadoop configuration directory (etc/hadoop under the installation directory) and run: vim hadoop-env.sh

Add or modify the following lines:
export JAVA_HOME=/usr/local/jdk1.8.0_231
export HADOOP_HOME=/usr/local/hadoop-2.7.1
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_OPTS="-Djava.library.path=${HADOOP_HOME}/lib/native"
 

Save and exit, then run source hadoop-env.sh so the settings take effect immediately

Edit the core configuration file - core-site.xml
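A typical core-site.xml for a pseudo-distributed setup might look like the sketch below. The port 9000 in fs.defaultFS is an assumption, and CONF_DIR is a hypothetical variable that defaults to a scratch directory so that nothing real is overwritten. The hadoop.tmp.dir value is inferred from the storage directory that appears in the NameNode format log.

```shell
# Sketch only: write a minimal core-site.xml into CONF_DIR (a scratch directory by default).
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Address of the default file system; host ied, port 9000 is an assumption -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ied:9000</value>
  </property>
  <!-- Base directory for files that Hadoop generates at runtime -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-2.7.1/tmp</value>
  </property>
</configuration>
EOF
echo "wrote $CONF_DIR/core-site.xml"
```

Setting CONF_DIR=/usr/local/hadoop-2.7.1/etc/hadoop before running would write into the real configuration directory.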

Edit the distributed file system configuration file - hdfs-site.xml
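With a single DataNode, the usual change in hdfs-site.xml is to lower the block replication factor to 1. The sketch below assumes that this is the only property needed; CONF_DIR is again a hypothetical variable pointing at a scratch directory by default.

```shell
# Sketch only: write a minimal hdfs-site.xml into CONF_DIR (a scratch directory by default).
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- One copy of each block is enough when there is only one DataNode -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
echo "wrote $CONF_DIR/hdfs-site.xml"
```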

Edit the MapReduce configuration file - mapred-site.xml

Run: vim mapred-site.xml
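Hadoop 2.7.x ships only mapred-site.xml.template, so the file usually has to be created from that template before it can be edited. A minimal sketch of the resulting content, assuming the goal is simply to run MapReduce jobs on YARN, is shown here; CONF_DIR is a hypothetical variable that defaults to a scratch directory.

```shell
# Sketch only: write a minimal mapred-site.xml into CONF_DIR (a scratch directory by default).
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Run MapReduce jobs on YARN instead of the local job runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
echo "wrote $CONF_DIR/mapred-site.xml"
```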

Edit the YARN configuration file - yarn-site.xml
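For YARN, a pseudo-distributed setup commonly names the ResourceManager host and enables the MapReduce shuffle service. The values below are assumptions based on the host name ied used in this post, and CONF_DIR is a hypothetical variable that defaults to a scratch directory.

```shell
# Sketch only: write a minimal yarn-site.xml into CONF_DIR (a scratch directory by default).
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
cat > "$CONF_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Host that runs the ResourceManager; ied is assumed here -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>ied</value>
  </property>
  <!-- Auxiliary service that lets NodeManagers serve MapReduce shuffle data -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
echo "wrote $CONF_DIR/yarn-site.xml"
```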

Configure the Hadoop environment variables

  • Edit /etc/profile; SPARK_HOME was already set there when the standalone version of Spark was configured
  • Save and exit, then run source /etc/profile so the settings take effect
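The lines added to /etc/profile are probably similar to the sketch below. HADOOP_HOME matches the path used in hadoop-env.sh; putting both bin and sbin on PATH is an assumption, made so that commands such as hdfs and start-yarn.sh can be run from any directory. PROFILE_FILE is a hypothetical variable that defaults to a scratch file instead of the real /etc/profile.

```shell
# Sketch only: append Hadoop variables to a profile file and load them.
# PROFILE_FILE defaults to a scratch file; point it at /etc/profile to apply for real.
PROFILE_FILE="${PROFILE_FILE:-$(mktemp)}"
cat >> "$PROFILE_FILE" <<'EOF'
export HADOOP_HOME=/usr/local/hadoop-2.7.1
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
EOF
# Load the new settings into the current shell (same effect as the source command)
. "$PROFILE_FILE"
echo "HADOOP_HOME=$HADOOP_HOME"
```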

Create a temporary directory for generated files

  • Return to the Hadoop installation directory and create a tmp subdirectory there (this gives /usr/local/hadoop-2.7.1/tmp)

Format the NameNode

Run: hdfs namenode -format to format the NameNode and create a usable HDFS distributed file system

The log line 22/02/22 21:09:34 INFO common.Storage: Storage directory /usr/local/hadoop-2.7.1/tmp/dfs/name has been successfully formatted. indicates that the NameNode was formatted successfully

Start and stop the Hadoop services

Run: start-dfs.sh to start the dfs service - distributed storage

Run: start-yarn.sh to start the yarn service - distributed computing

Run: jps to list the Hadoop processes
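In a pseudo-distributed Hadoop 2.x installation, jps normally lists NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager once both services are up. The helper below is a hypothetical convenience (the name missing_daemons is not part of Hadoop) that reads jps-style output and prints whichever of those daemons are absent.

```shell
# Sketch only: print the expected Hadoop daemons that are missing from
# jps-style output ("<pid> <name>" per line) read from standard input.
missing_daemons() {
  out=$(cat)
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare against the process-name column only, matching the whole name
    printf '%s\n' "$out" | awk '{print $2}' | grep -x "$d" >/dev/null || echo "$d"
  done
}
# On the virtual machine: jps | missing_daemons
```

An empty result means all five daemons are running.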

Stop the Hadoop services

Run: stop-dfs.sh to stop the dfs service

Run: stop-yarn.sh to stop the yarn service

Set up pseudo-distributed Spark

Run: cd $SPARK_HOME/conf

Create the environment configuration file - spark-env.sh

Edit the environment configuration file - spark-env.sh
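Spark ships a spark-env.sh.template in its conf directory, and spark-env.sh is usually created by copying it. The sketch below shows one plausible result. The variable values are assumptions: JAVA_HOME reuses the JDK path from hadoop-env.sh, SPARK_MASTER_HOST is the name used by Spark 2.x and later (older releases use SPARK_MASTER_IP), and 7077 is the default master port. CONF_DIR is a hypothetical variable that defaults to a scratch directory.

```shell
# Sketch only: create spark-env.sh in CONF_DIR (a scratch directory by default).
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
# Start from the shipped template when it is present and no file exists yet
if [ ! -f "$CONF_DIR/spark-env.sh" ] && [ -f "$CONF_DIR/spark-env.sh.template" ]; then
  cp "$CONF_DIR/spark-env.sh.template" "$CONF_DIR/spark-env.sh"
fi
# Append the settings for a single-host master (values are assumptions)
cat >> "$CONF_DIR/spark-env.sh" <<'EOF'
export JAVA_HOME=/usr/local/jdk1.8.0_231
export SPARK_MASTER_HOST=ied
export SPARK_MASTER_PORT=7077
EOF
echo "wrote $CONF_DIR/spark-env.sh"
```

Setting CONF_DIR="$SPARK_HOME/conf" before running would apply the change to the real installation.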

Configure the Spark environment variables

Start pseudo-distributed Spark

Start the Spark services
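One common way to start the Spark master and worker on a single host is the start-all.sh script in Spark's sbin directory. Hadoop's sbin directory also contains a script named start-all.sh, so calling the Spark one by its full path avoids picking up the wrong script from PATH. The wrapper below is a hypothetical helper, and it is an assumption that this is the script the original steps used.

```shell
# Sketch only: start the Spark master and worker through Spark's own start-all.sh.
start_spark() {
  # Abort with a message if SPARK_HOME is not set
  : "${SPARK_HOME:?SPARK_HOME must be set}"
  # Use the full path so Hadoop's start-all.sh on PATH is not run by mistake
  "$SPARK_HOME/sbin/start-all.sh"
}
# On the virtual machine: start_spark, then check the Master and Worker processes with jps
```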

 

Interactive Spark Shell

(1) Scala Spark shell

  • Run: spark-shell --master=local

At the scala> prompt, run :quit to exit the Scala Spark shell

Python Spark shell

  • Run: pyspark

At the >>> prompt, call the exit() function to exit the Python Spark shell

Access the Spark Web UI

Open http://192.168.1.110:4040 and note that the port number is 4040. This page is served by a running Spark application such as spark-shell, so it is only reachable while that application is running.

  • Stop and disable the firewall on the virtual machine ied

  • Run: systemctl stop firewalld.service

  • Run: systemctl disable firewalld.service

  • Run: systemctl status firewalld to check the firewall status

  • Turn off the firewall on lxy_win7

  • Open http://192.168.1.110:4040 again

     

 

 
