1. Configure the JDK
2. Clone two more virtual machines
3. Configure the hostname (master / slave1 / slave2)
vim /etc/sysconfig/network and change the hostname:
NETWORKING=yes
HOSTNAME=master
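To apply the new hostname without a reboot (assuming a CentOS 6-style system, which this /etc/sysconfig/network layout implies):
hostname master   # temporary, for the current session; the file above makes it permanent on boot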
4. Configure /etc/hosts
vim /etc/hosts
192.168.164.129 master
192.168.164.130 slave2
192.168.164.128 slave1
5. Verify with ping
ping -c 3 slave2
ping -c 3 slave1
ping -c 3 master
6. Generate an SSH key pair on each machine
Key generation command: ssh-keygen -t rsa -P ''
For root, the key files are saved under /root/.ssh/
Inspect them: cd /root/.ssh/ shows the files id_rsa and id_rsa.pub
On each machine, append the public key to a common file:
cat id_rsa.pub >> authorized_keys
Then add the other servers' public keys into each machine's authorized_keys, e.g.:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDwyWE1fa07nuDV0lcpczMktjX76B2TkjRyirTf4jUB9nQKneWWdCu4tUoTmlMkuG3TRt4gHLxNFbdGUK0nO7rbczUm2AdQwUNmVJKGiBk9A9AYkqbjgB8k3J0cv7PAutVHF+R0jsPArYH0FmDugA....
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCs6RYt1z/JZrXaAXfinffEH8pYzZc11vmNQ7QvSXnBFGeGvHVX/vQhwXgoIONdYyz4pCDUnyWa1d2sbEmq+gN+Dg1I1CPb6kBP5ZZuah5A1IuIFzN+OUU0UqaKnjv+WGJdf4HmxiVy6VUQsMKxo...
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCyjeMfF92+Wvus9JdatBCGR6jKN+ZZvehDQ08fxIhq+gN+Dg1I1CPb6kBP5ZZuah5A1IuIFzN+OUU0UqaKnjv+WGJdf4HmxiVy6VUQsMKxoq+gN+Dg1I1CPb6kBP5ZZuah5A1IuIFzN+OUU0UqaK....
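An equivalent, less manual way to distribute the keys (a sketch, assuming ssh-copy-id is installed and password login is still allowed):
# run on every machine, once per target host
ssh-copy-id root@master
ssh-copy-id root@slave1
ssh-copy-id root@slave2
If sshd still prompts for a password afterwards, check permissions: sshd ignores authorized_keys unless /root/.ssh is 700 and the file itself is 600 (chmod 700 /root/.ssh; chmod 600 /root/.ssh/authorized_keys).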
7. Test passwordless SSH access
[root@slave2 ~]# ssh master
The authenticity of host 'master (192.168.164.129)' can't be established.
RSA key fingerprint is a3:f0:f3:2e:5f:70:01:6f:d0:3f:b1:14:4a:35:ed:d6.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,192.168.164.129' (RSA) to the list of known hosts.
Last login: Sat Apr 6 19:08:15 2019 from 192.168.164.1
[root@master ~]#
8. Unpack Hadoop into /opt/hadoop
Create the working directories:
mkdir /root/hadoop
mkdir /root/hadoop/tmp
mkdir /root/hadoop/var
mkdir /root/hadoop/dfs
mkdir /root/hadoop/dfs/name
mkdir /root/hadoop/dfs/data
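The same directories in one command, using bash brace expansion:
mkdir -p /root/hadoop/{tmp,var,dfs/name,dfs/data}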
Configure the environment variables in /etc/profile:
export HADOOP_HOME=/opt/hadoop/hadoop-3.1.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
source /etc/profile
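To confirm the variables took effect in the current shell:
echo $HADOOP_HOME        # should print /opt/hadoop/hadoop-3.1.0
hadoop version           # should report Hadoop 3.1.0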
9. Edit the configuration files on the master
(1) vim /opt/hadoop/hadoop-3.1.0/etc/hadoop/core-site.xml
Add the following between <configuration> and </configuration>:
<property>
<name>hadoop.tmp.dir</name>
<value>/root/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<!-- fs.default.name is a deprecated alias in Hadoop 3.x; fs.defaultFS is the current key -->
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
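A quick way to confirm Hadoop picks the value up (hdfs getconf reads the effective configuration):
hdfs getconf -confKey fs.defaultFS   # should print hdfs://master:9000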
(2) vim /opt/hadoop/hadoop-3.1.0/etc/hadoop/hadoop-env.sh
Add: export JAVA_HOME=${JAVA_HOME} (step 10 replaces this with the explicit JDK path, since Hadoop's non-interactive SSH invocations do not inherit the login shell's JAVA_HOME)
(3) vim /opt/hadoop/hadoop-3.1.0/etc/hadoop/hdfs-site.xml
Add the following between <configuration> and </configuration>:
<!-- HTTP address of the NameNode web UI -->
<property>
<name>dfs.namenode.http-address</name>
<value>master:50070</value>
</property>
<property>
<!-- dfs.name.dir is a deprecated alias for dfs.namenode.name.dir; both still work -->
<name>dfs.namenode.name.dir</name>
<value>/root/hadoop/dfs/name</value>
<description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
</property>
<property>
<!-- likewise, dfs.data.dir is the deprecated alias for dfs.datanode.data.dir -->
<name>dfs.datanode.data.dir</name>
<value>/root/hadoop/dfs/data</value>
<description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.permissions</name>
<value>true</value>
<description>Enable permission checking in HDFS (the default).</description>
</property>
Note: if dfs.permissions is set to false, files can be created on DFS without any permission check. Convenient, but it makes accidental deletion easy; set it to true, or delete the property entirely, since true is the default.
(4) vim /opt/hadoop/hadoop-3.1.0/etc/hadoop/mapred-site.xml
Add the following between <configuration> and </configuration>:
<property>
<!-- mapred.job.tracker is a Hadoop 1 (JobTracker) setting; it is ignored when mapreduce.framework.name is yarn -->
<name>mapred.job.tracker</name>
<value>master:49001</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/root/hadoop/var</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
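A hedged note: on Hadoop 3.x, submitted MapReduce jobs often fail with "Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster" unless the MapReduce home is also exposed to the containers. One commonly used addition (the path assumes this tutorial's layout) is:
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=/opt/hadoop/hadoop-3.1.0</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=/opt/hadoop/hadoop-3.1.0</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=/opt/hadoop/hadoop-3.1.0</value>
</property>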
(5) vim /opt/hadoop/hadoop-3.1.0/etc/hadoop/workers (this file was named slaves in Hadoop 2.x)
Delete localhost and add:
slave1
slave2
(6) vim /opt/hadoop/hadoop-3.1.0/etc/hadoop/yarn-site.xml
Add the following between <configuration> and </configuration>:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<description>The address of the applications manager interface in the RM.</description>
<name>yarn.resourcemanager.address</name>
<value>${yarn.resourcemanager.hostname}:8032</value>
</property>
<property>
<description>The address of the scheduler interface.</description>
<name>yarn.resourcemanager.scheduler.address</name>
<value>${yarn.resourcemanager.hostname}:8030</value>
</property>
<property>
<description>The http address of the RM web application.</description>
<name>yarn.resourcemanager.webapp.address</name>
<value>${yarn.resourcemanager.hostname}:8088</value>
</property>
<property>
<description>The https address of the RM web application.</description>
<name>yarn.resourcemanager.webapp.https.address</name>
<value>${yarn.resourcemanager.hostname}:8090</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>${yarn.resourcemanager.hostname}:8031</value>
</property>
<property>
<description>The address of the RM admin interface.</description>
<name>yarn.resourcemanager.admin.address</name>
<value>${yarn.resourcemanager.hostname}:8033</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
<description>Maximum memory allocation per container request, in MB (the default is 8192).</description>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
Note: yarn.nodemanager.vmem-check-enabled=false disables YARN's virtual-memory check. This matters when Hadoop runs inside virtual machines, where the check kills containers all too easily; on physical hardware with plenty of memory you can drop this property.
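A worked example of the check being disabled: with yarn.nodemanager.vmem-pmem-ratio = 2.1, a container granted 1024 MB of physical memory may use up to 1024 × 2.1 ≈ 2150 MB of virtual memory before YARN would kill it; JVMs on virtualized hosts routinely exceed that, which is why the check is switched off here.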
10. Configure hadoop-env.sh and yarn-env.sh
In both files, set the JDK path explicitly:
export JAVA_HOME=/opt/jdk1.8.0_131
On master, change into the bin directory: cd /opt/hadoop/hadoop-3.1.0/bin
Format the NameNode: ./hadoop namenode -format (this invocation is deprecated; ./hdfs namenode -format is the current form)
… (format output omitted) …
After a successful format, a new current directory appears under /root/hadoop/dfs/name/.
cd /opt/hadoop/hadoop-3.1.0/sbin
./start-all.sh
Running ./start-all.sh as root fails unless the launch scripts are patched.
On master and both slaves, edit four files: start-dfs.sh, stop-dfs.sh, start-yarn.sh, stop-yarn.sh.
Add the following at the top of start-dfs.sh and stop-dfs.sh:
#!/usr/bin/env bash
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
Add the following at the top of start-yarn.sh and stop-yarn.sh:
#!/usr/bin/env bash
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
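An alternative that avoids patching four scripts on every node (a sketch that sets the same user variables once in hadoop-env.sh, which the start/stop scripts also read):
# in /opt/hadoop/hadoop-3.1.0/etc/hadoop/hadoop-env.sh
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root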
After ./start-all.sh, the NameNode UI on 50070 was reachable but the ResourceManager UI on 8088 was not.
The fix: confirm that yarn-site.xml really contains yarn.resourcemanager.hostname and the RM address properties from step 9 (6) (the 8088 entry is yarn.resourcemanager.webapp.address), then restart.
After the restart, 8088 was reachable.
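To verify the cluster from the shell rather than the web UIs (PATH from step 8 assumed):
jps                     # on master: NameNode, SecondaryNameNode, ResourceManager
ssh slave1 jps          # on slaves: DataNode, NodeManager (may need the full path to jps)
hdfs dfsadmin -report   # should list two live datanodes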
Note: master and the slaves use identical configuration files; the entire Hadoop tree is the same on every node.
Reference: https://blog.csdn.net/boonya/article/details/80719245
Also: pushing files with scp
scp -r /opt/hadoop/ root@192.168.164.128:/opt/hadoop copies everything under /opt/hadoop, subdirectories included, to /opt/hadoop on 192.168.164.128
or, by hostname:
scp -r /opt/hadoop slave1:/opt/
II. HBase deployment (using the bundled ZooKeeper)
HBase needs the NameNode, the DataNodes, and ZooKeeper to be running before it can start.
HBase ships with a bundled ZooKeeper, which is used here.
1. Create the hbase directory
[root@master opt]# ls
hadoop hbase jdk1.8.0_131 rh
2. Unpack the HBase tarball into that directory
[root@master opt]# cd hbase/
[root@master hbase]# ls
hbase-1.2.0
[root@master hbase]# cd hbase-1.2.0/
[root@master hbase-1.2.0]# ls
bin CHANGES.txt conf docs hbase-webapps LEGAL lib LICENSE.txt logs NOTICE.txt README.txt
[root@master hbase-1.2.0]# cd conf
[root@master conf]# ls
hadoop-metrics2-hbase.properties hbase-env.cmd hbase-env.sh hbase-policy.xml hbase-site.xml log4j.properties regionservers
[root@master conf]#
3. Configuration files
In hbase-env.sh:
export JAVA_HOME=/opt/jdk1.8.0_131
export HBASE_MANAGES_ZK=true   # true means HBase itself starts and stops the bundled ZooKeeper
In hbase-site.xml, between <configuration> and </configuration>:
<property>
<!-- must point at the HDFS configured in core-site.xml; master resolves to 192.168.164.129 -->
<name>hbase.rootdir</name>
<value>hdfs://192.168.164.129:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master</name>
<value>master</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,slave1,slave2</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/data/zookeeprdata</value>
</property>
<property>
<name>hbase.master.info.port</name>
<value>60010</value>
</property>
60010 is the web UI port; once HBase has started, browse to http://masterIp:60010
In regionservers, list all region server hosts:
master
slave1
slave2
4. Verify
Copy the hbase directory to the other two servers, start Hadoop on master (start-all.sh), then start HBase (start-hbase.sh); the exact commands are spelled out below.
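The copy-and-start sequence, spelled out (hostnames as configured above):
scp -r /opt/hbase slave1:/opt/
scp -r /opt/hbase slave2:/opt/
cd /opt/hbase/hbase-1.2.0/bin
./start-hbase.sh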
Check the daemons with jps:
[root@master conf]# jps
2400 SecondaryNameNode
2165 NameNode
2663 ResourceManager
6839 HMaster
6776 HQuorumPeer
6955 HRegionServer
9116 Jps
[root@master conf]#
HBase web UI: http://192.168.164.129:60010/master-status
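A quick smoke test from the HBase shell (a sketch; the table and column-family names are made up):
hbase shell
create 't1', 'cf'
put 't1', 'row1', 'cf:a', 'value1'
scan 't1'
exit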