【Deploying YARN】
mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
yarn-site.xml:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ,HADOOP_MAPRED_HOME</value>
</property>
【Starting YARN】
[hadoop@bigdata31 ~]$ start-yarn.sh
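To confirm YARN came up, a quick check with jps; among the listed processes you should see ResourceManager and NodeManager (a minimal check, exact output will vary):
[hadoop@bigdata31 ~]$ jps | grep -E "ResourceManager|NodeManager"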
1.4 Open the RM web UI:
【Configuration files】:
hdfs:
core-site.xml
hdfs-site.xml
yarn:
mapred-site.xml
yarn-site.xml
HDFS default storage directory on Linux: /tmp/hadoop-hadoop/dfs/name
Linux /tmp keeps files for only 30 days 【and these files may never be modified】, so keeping HDFS data there is a storage problem.
*****The HDFS default storage directory must be changed
Change the HDFS storage directory
Keep the files from the previous storage directory
core-site.xml :
hadoop.tmp.dir => the Linux directory where HDFS stores its data
default: /tmp/hadoop-${user.name}
【Note】:
Stop the HDFS service first
Modify the parameter
Restart
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/data/hadoop</value>
</property>
cp -R /tmp/hadoop-hadoop /home/hadoop/data/hadoop
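Note that the cp above assumes /home/hadoop/data/hadoop does not exist yet, so the copy of /tmp/hadoop-hadoop becomes the new storage directory itself. After restarting HDFS with the new hadoop.tmp.dir, a quick sanity check that the data written earlier is still visible (a sketch; list whatever paths were stored before):
[hadoop@bigdata31 ~]$ start-dfs.sh
[hadoop@bigdata31 ~]$ hdfs dfs -ls /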
YARN web UI: RM, port 8088
Here we only need to change the YARN web UI port.
Note:
Stop the YARN service first
Modify the parameter
Restart
yarn-site.xml :
yarn.resourcemanager.webapp.address    default: ${yarn.resourcemanager.hostname}:8088
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>bigdata31:9527</value>
</property>
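With the port changed, restart YARN and the RM web UI moves from 8088 to 9527. A minimal check, where the curl is only meant to confirm that the new port answers:
[hadoop@bigdata31 ~]$ stop-yarn.sh
[hadoop@bigdata31 ~]$ start-yarn.sh
[hadoop@bigdata31 ~]$ curl -s -o /dev/null -w "%{http_code}\n" http://bigdata31:9527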
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar wordcount /data/wc.data /out2
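If the wordcount job succeeds, the result lands in the output directory; MapReduce writes reducer output as part-r-xxxxx files plus a _SUCCESS marker:
[hadoop@bigdata31 ~]$ hdfs dfs -ls /out2
[hadoop@bigdata31 ~]$ hdfs dfs -cat /out2/part-r-00000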
【Fully Distributed Deployment】
Cluster role assignment
hdfs:
【namenode nn
datanode dn
secondarynamenode snn】
yarn :
【resourcemanager rm
nodemanager nm
bigdata32 : nn dn nm
bigdata33 : dn rm nm
bigdata34 : snn dn nm】
Prepare 3 machines
4G RAM, 2 CPUs, 40G disk
Clone the machines
Change on each:
ip
[root@bigdata30 ~]# vim /etc/sysconfig/network-scripts/ifcfg-ens33
hostname
[root@bigdata30 ~]# vim /etc/hostname
ip mapping (see the sample /etc/hosts below)
[root@bigdata30 ~]# vim /etc/hosts
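A sketch of what /etc/hosts could contain on every machine; the 192.168.1.x addresses below are placeholders, substitute the actual IPs assigned to the three VMs:
192.168.1.132 bigdata32
192.168.1.133 bigdata33
192.168.1.134 bigdata34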
Connect remotely to all 3 machines
SSH passwordless login 【do this on all three machines】
[hadoop@bigdata32 ~]$ mkdir app software data shell project
[hadoop@bigdata32 ~]$ ssh-keygen -t rsa 【do this on all three machines】
Copy the public key 【do this on all three machines】
ssh-copy-id bigdata32
ssh-copy-id bigdata33
ssh-copy-id bigdata34
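A quick way to confirm the passwordless login works (repeat from each of the three machines; no password prompt should appear):
[hadoop@bigdata32 ~]$ ssh bigdata33 hostname
[hadoop@bigdata32 ~]$ ssh bigdata34 hostname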
JDK deployment 【needed on all three machines】
1. Install on all three machines separately, or
2. Install on one machine first, then sync to the others
File transfer commands:
scp:
scp [[user@]host1:]file1 ... [[user@]host2:]file2
scp bigdata32:~/1.log bigdata33:~
rsync:
rsync [OPTION]... SRC [SRC]... [USER@]HOST:DEST
-av (archive mode + verbose output)
rsync ~/1.log bigdata34:~
bigdata32:~/1.log: the file contents have been updated
rsync -av ~/1.log bigdata34:~
【Writing a file sync script (xsync)】
#!/bin/bash
# Distribute the given files to all three machines.
if [ $# -lt 1 ];then
echo "Not enough arguments"
echo "eg:$0 filename..."
exit 1
fi
# Loop over the hosts the files are sent to.
for host in bigdata32 bigdata33 bigdata34
do
echo "=============$host=================="
#1. Loop over the files to send
for file in $@
do
# Check whether the file exists
if [ -e ${file} ];then
pathdir=$(cd $(dirname ${file});pwd)
filename=$(basename ${file})
# Sync the file
ssh $host "mkdir -p $pathdir"
rsync -av $pathdir/$filename $host:$pathdir
else
echo "${file} does not exist"
fi
done
done
【Put the script directory on the PATH】:
vim ~/.bashrc
export SHELL_HOME=/home/hadoop/shell
export PATH=${PATH}:${SHELL_HOME}
source ~/.bashrc
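The script also has to be executable before it can be called as a command; a minimal test, assuming the script was saved as /home/hadoop/shell/xsync:
[hadoop@bigdata32 ~]$ chmod +x ~/shell/xsync
[hadoop@bigdata32 ~]$ xsync ~/1.log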
JDK deployment 【must be installed on all three machines】
Install the JDK on bigdata32 first
[hadoop@bigdata32 software]$ tar -zxvf jdk-8u45-linux-x64.gz -C ~/app/
[hadoop@bigdata32 app]$ ln -s jdk1.8.0_45/ java
[hadoop@bigdata32 app]$ vim ~/.bashrc
#JAVA_HOME
export JAVA_HOME=/home/hadoop/app/java
export PATH=${PATH}:${JAVA_HOME}/bin
[hadoop@bigdata32 app]$ which java
~/app/java/bin/java
[hadoop@bigdata32 app]$ java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
Sync the JDK installation directory from bigdata32 to the other machines, 33 and 34
[hadoop@bigdata32 app]$ xsync java/
[hadoop@bigdata32 app]$ xsync jdk1.8.0_45
[hadoop@bigdata32 app]$ xsync ~/.bashrc
Run source ~/.bashrc on all three machines
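To confirm the JDK is usable on all three machines, a quick loop over the hosts (the full path is used because a non-interactive ssh shell may not have sourced ~/.bashrc):
[hadoop@bigdata32 ~]$ for host in bigdata32 bigdata33 bigdata34; do echo "=====$host====="; ssh $host "/home/hadoop/app/java/bin/java -version"; done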
3.6 Deploying Hadoop
Deploy on bigdata32 first, then sync to the other machines
bigdata32 : nn dn nm
bigdata33 : dn rm nm
bigdata34 :snn dn nm
[hadoop@bigdata32 software]$ tar -zxvf hadoop-3.3.4.tar.gz -C ~/app/
[hadoop@bigdata32 app]$ ln -s hadoop-3.3.4/ hadoop
[hadoop@bigdata32 app]$ vim ~/.bashrc
#HADOOP_HOME
export HADOOP_HOME=/home/hadoop/app/hadoop
export PATH=${PATH}:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin
[hadoop@bigdata32 app]$ source ~/.bashrc
[hadoop@bigdata32 app]$ which hadoop
~/app/hadoop/bin/hadoop
【Do this on all three machines】
[hadoop@bigdata32 data]$ mkdir hadoop
[hadoop@bigdata32 hadoop]$ pwd
/home/hadoop/data/hadoop
【Configuring HDFS】
core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>hdfs://bigdata32:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/data/hadoop</value>
</property>
hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>bigdata34:9868</value>
</property>
<property>
<name>dfs.namenode.secondary.https-address</name>
<value>bigdata34:9869</value>
</property>
[hadoop@bigdata32 hadoop]$ pwd
/home/hadoop/app/hadoop/etc/hadoop
[hadoop@bigdata32 hadoop]$ cat workers
bigdata32
bigdata33
bigdata34
Sync the bigdata32 setup to bigdata33 and bigdata34
[hadoop@bigdata32 app]$ xsync hadoop
[hadoop@bigdata32 app]$ xsync hadoop-3.3.4
[hadoop@bigdata32 app]$ xsync ~/.bashrc
All three machines must run source ~/.bashrc
Format the NameNode:
[hadoop@bigdata32 app]$ hdfs namenode -format
【The format operation only needs to be done once, at deployment time】 Format on whichever machine the namenode runs on
【Starting HDFS】:
start-dfs.sh (run it on the machine where the namenode is)
Access the NameNode web UI (by default the NameNode web UI listens on port 9870, so bigdata32:9870):
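A quick way to confirm the daemons landed on the planned nodes is to run jps on each machine; based on the role plan above, the expected processes would roughly be as noted in the comments:
ssh bigdata32 "/home/hadoop/app/java/bin/jps"   # expect NameNode, DataNode
ssh bigdata33 "/home/hadoop/app/java/bin/jps"   # expect DataNode
ssh bigdata34 "/home/hadoop/app/java/bin/jps"   # expect SecondaryNameNode, DataNode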
【Configuring YARN】
mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
yarn-site.xml:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ,HADOOP_MAPRED_HOME</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>bigdata33</value>
</property>
Distribute the configuration files from bigdata32 to bigdata33 and bigdata34:
[hadoop@bigdata32 app]$ xsync hadoop-3.3.4
Start YARN:
start-yarn.sh => run it on the machine where the resourcemanager is
Note: start it on bigdata33
Access the RM web UI:
bigdata33:8088
【Pseudo-distributed】:
hdfs: start-dfs.sh
yarn: start-yarn.sh
Start hadoop:
start-all.sh
Stop hadoop:
stop-all.sh
【Fully distributed】:
You can use start-all.sh, but check which node the rm is on
It is better to write your own cluster start/stop script:
[hadoop@bigdata32 ~]$ vim shell/hadoop-cluster
#!/bin/bash
if [ $# -lt 1 ];then
echo "Usage:$0 start|stop"
exit
fi
case $1 in
"start")
echo "========启动hadoop集群========"
echo "========启动 hdfs========"
ssh bigdata32 "/home/hadoop/app/hadoop/sbin/start-dfs.sh"
echo "========启动 yarn========"
ssh bigdata33 "/home/hadoop/app/hadoop/sbin/start-yarn.sh"
;;
"stop")
echo "========停止hadoop集群========"
echo "========停止 yarn========"
ssh bigdata33 "/home/hadoop/app/hadoop/sbin/stop-yarn.sh"
echo "========停止 hdfs========"
ssh bigdata32 "/home/hadoop/app/hadoop/sbin/stop-dfs.sh"
;;
*)
echo "Usage:$0 start|stop"
;;
esac
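A minimal usage check, assuming the script is saved as /home/hadoop/shell/hadoop-cluster (that directory is already on the PATH via SHELL_HOME):
[hadoop@bigdata32 ~]$ chmod +x ~/shell/hadoop-cluster
[hadoop@bigdata32 ~]$ hadoop-cluster start
[hadoop@bigdata32 ~]$ hadoop-cluster stop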
【Writing a script to check the Java processes on every node】
[hadoop@bigdata32 ~]$ vim shell/jpsall
#!/bin/bash
# Show the Java processes on each node, filtering out the Jps command itself
for host in bigdata32 bigdata33 bigdata34
do
echo "==========$host========="
ssh $host "/home/hadoop/app/java/bin/jps| grep -v Jps"
done
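A minimal usage check, again assuming the script lives in /home/hadoop/shell:
[hadoop@bigdata32 ~]$ chmod +x ~/shell/jpsall
[hadoop@bigdata32 ~]$ jpsall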