Hadoop Fully Distributed Cluster Setup (Three Nodes)
Step 1: Configure hostnames and hosts
1. Check the current hostname:
hostname
2. Change the hostname
Temporary (takes effect immediately, but is lost after a reboot):
hostname newname
Permanent:
vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=yourname
reboot
3. Edit the hosts file (/etc/hosts) and add the mappings for all three machines, in the form IP followed by hostname:
masterIP master
slave1IP slave1
slave2IP slave2
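For example, assuming the three machines use the placeholder addresses 192.168.1.101-103 (adjust to your own network), /etc/hosts on every node would contain:
192.168.1.101 master
192.168.1.102 slave1
192.168.1.103 slave2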
4. Set up passwordless SSH login
<!-- SSH note: the .ssh directory must have permission 700 and authorized_keys must have permission 600 -->
In the user's home directory, run:
ssh-keygen -t rsa
Press Enter three times; a .ssh directory containing id_rsa and id_rsa.pub is created in the user's home directory.
Then append the public key to the authorized_keys file:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
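If SSH keeps asking for a password later on, the permissions mentioned in the note above are the usual culprit; a minimal fix, assuming everything lives under the hadoop user's home directory:
chmod 700 ~/.ssh                    # the .ssh directory must be 700
chmod 600 ~/.ssh/authorized_keys    # the key file must be 600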
Test the SSH loopback:
ssh localhost or ssh master
**Note: if you are prompted for a password here, SSH is not configured correctly yet; edit the SSH configuration file /etc/ssh/sshd_config**
vim /etc/ssh/sshd_config
PubkeyAuthentication yes
RSAAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
Uncomment these three lines; if you cannot find them, add them directly.
Then restart the service: systemctl restart sshd
Running ssh localhost again should now succeed.
Run ssh-keygen -t rsa on every node.
Back on master, copy authorized_keys to slave1:
scp ~/.ssh/authorized_keys hadoop@slave1:~/.ssh/
<!-- a password is still required this time -->
Then log in to slave1
and append slave1's id_rsa.pub to authorized_keys.
Copy authorized_keys to slave2:
scp ~/.ssh/authorized_keys hadoop@slave2:~/.ssh/
<!-- a password is still required this time -->
Append slave2's id_rsa.pub to authorized_keys.
At this point slave2's authorized_keys contains the public keys of all nodes; copy this authorized_keys back to master and slave1.
Now all three machines can SSH to each other without a password.
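A minimal command sketch of the distribution just described (assuming the hadoop user and the hostnames from /etc/hosts):
# on slave1: append its own public key, then pass the file on to slave2
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys hadoop@slave2:~/.ssh/
# on slave2: append its own public key, then send the complete file back to master and slave1
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys hadoop@master:~/.ssh/
scp ~/.ssh/authorized_keys hadoop@slave1:~/.ssh/
# verify from master (repeat from the other two nodes)
ssh slave1 hostname
ssh slave2 hostname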
5. Extract Hadoop: tar -zxvf hadoop…
6. Configuration files
hadoop-env.sh -- the file already contains these two settings; find them and modify them:
export JAVA_HOME=/opt/soft/jdk1.8.0_181
export HADOOP_CONF_DIR=/home/hadoop/hadoop-2.7.7/etc/hadoop
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-2.7.7/tmp</value> <!-- create this directory yourself -->
</property>
</configuration>
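The hadoop.tmp.dir directory is not created automatically; assuming the path above, create it on every node before starting the cluster:
mkdir -p /home/hadoop/hadoop-2.7.7/tmp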
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/opt/soft/hadoop/hdfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/opt/soft/hadoop/hdfs/data</value>
<final>true</final>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
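Likewise, the NameNode and DataNode directories referenced above should exist beforehand; assuming the same layout, create them (the name directory is only used on master and the data directory on the DataNodes, but creating both everywhere does no harm):
mkdir -p /opt/soft/hadoop/hdfs/name
mkdir -p /opt/soft/hadoop/hdfs/data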
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
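Note that the Hadoop 2.7.7 distribution usually ships this file only as a template; if mapred-site.xml is missing from etc/hadoop, copy the template before editing:
cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml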
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:18030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:18088</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:18025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:18141</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
Configure the slaves file:
slave1
slave2
Configure the masters file:
master
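Both files live in etc/hadoop under the Hadoop installation directory; a quick way to write them, assuming the /home/hadoop/hadoop-2.7.7 install path used above:
cd /home/hadoop/hadoop-2.7.7/etc/hadoop
printf 'slave1\nslave2\n' > slaves
echo master > masters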
Send the configured Hadoop directory to the slave nodes:
scp -r hadoop …
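For example, assuming Hadoop is installed under /home/hadoop/hadoop-2.7.7 and the hadoop user exists on every node:
scp -r /home/hadoop/hadoop-2.7.7 hadoop@slave1:/home/hadoop/
scp -r /home/hadoop/hadoop-2.7.7 hadoop@slave2:/home/hadoop/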
Start Hadoop
1. Format the NameNode <!-- all of the following commands are run from the Hadoop installation directory -->
bin/hdfs namenode -format
After formatting succeeds, start the Hadoop cluster:
sbin/start-all.sh
Then check the running processes with jps.
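With the configuration above, jps should show roughly the following daemons (exact lists can vary slightly between versions):
on master: NameNode, SecondaryNameNode, ResourceManager, Jps
on slave1 / slave2: DataNode, NodeManager, Jps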