I cloned 4 virtual machines (5 in total) and gave each one its own IP address, hostname alias, and so on:
master1 master2 worker1 worker2 worker3
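I removed ZooKeeper from master1 and master2. The Hadoop version I installed is 2.8.3.
The following files need to be configured (under etc/hadoop in the Hadoop directory): hadoop-env.sh (set the Java path, JAVA_HOME)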
core-site.xml
<configuration>
<!-- Set the HDFS nameservice to ns1; it must match dfs.nameservices in hdfs-site.xml -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<!-- Directory for Hadoop temporary data -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/user0/app/hadoop-2.6.0/workspace/hdfs/temp</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>4096</value>
</property>
<!-- ZooKeeper quorum addresses -->
<property>
<name>ha.zookeeper.quorum</name>
<value>worker1:2181,worker2:2181,worker3:2181</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>master1:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>master1:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>master2:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>master2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://worker1:8485;worker2:8485;worker3:8485/ns1</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/root/hadoop-2.8.3/journal</value>
</property>
<!-- Enable automatic failover when a NameNode fails -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Client-side failover proxy provider; the key suffix must be the nameservice ID -->
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing method -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///hadoop-2.8.3/workspace/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///hadoop-2.8.3/workspace/hdfs/data</value>
</property>
<!-- Number of block replicas -->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
Configure mapred-site.xml
This file does not exist by default; create it by copying the template: cp mapred-site.xml.template mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Configure yarn-site.xml
<configuration>
<!-- Auxiliary service loaded by the NodeManager at startup: the MapReduce shuffle service -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Hostname of the ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master1</value>
</property>
</configuration>
Configure slaves
worker1
worker2
worker3
Then scp the configured hadoop directory to all the other virtual machines.
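The copy step can be sketched as a small loop. This is only a sketch: it assumes the configuration was edited on master1, that Hadoop lives in /root/hadoop-2.8.3 on every node (the HADOOP_HOME used in this post), and that root can ssh to the other hosts. The function name distribute_hadoop and the DRY_RUN switch are made up for illustration.

```shell
# Install path and host names are the ones used in this post; adjust as needed.
HADOOP_DIR=/root/hadoop-2.8.3
TARGETS="master2 worker1 worker2 worker3"

# Copy the configured Hadoop directory from this machine to every other node.
# With DRY_RUN=1 the scp commands are only printed, not executed.
distribute_hadoop() {
  for host in $TARGETS; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "scp -r $HADOOP_DIR root@$host:/root/"
    else
      scp -r "$HADOOP_DIR" "root@$host:/root/" || return 1
    fi
  done
}

# Example: preview the commands first, then run for real.
#   DRY_RUN=1 distribute_hadoop
#   distribute_hadoop
```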
---------------Next, start the cluster
First start ZooKeeper on the three worker machines.
The two master machines need the following environment variables:
export JAVA_HOME=/usr/java/jdk1.8.0_151
export HADOOP_HOME=/root/hadoop-2.8.3
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Then start the JournalNodes:
[root@master1 ~]# hadoop-daemons.sh start journalnode
Output:
worker1: starting journalnode, logging to /root/hadoop-2.8.3/logs/hadoop-root-journalnode-worker1.out
worker3: starting journalnode, logging to /root/hadoop-2.8.3/logs/hadoop-root-journalnode-worker3.out
worker2: starting journalnode, logging to /root/hadoop-2.8.3/logs/hadoop-root-journalnode-worker2.out
This means they started successfully. Afterwards, jps on each of the three worker machines shows a JournalNode process.
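Checking jps on three machines by hand gets tedious, so here is a rough helper. It assumes passwordless ssh from master1 to the workers and that jps is on the PATH of a non-interactive ssh session, which may not hold on every setup. The function names has_daemon and check_journalnodes are my own.

```shell
# has_daemon NAME: read `jps` output on stdin and succeed if NAME is listed.
# jps prints one "<pid> <main class>" pair per line.
has_daemon() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Report for each worker whether a JournalNode process is running.
check_journalnodes() {
  for host in worker1 worker2 worker3; do
    if ssh "$host" jps | has_daemon JournalNode; then
      echo "$host: JournalNode running"
    else
      echo "$host: JournalNode NOT running"
    fi
  done
}

# Example: run on master1 after starting the JournalNodes.
#   check_journalnodes
```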
Then run the command hdfs namenode -format
When I ran it, I got this error:
ERROR namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: Invalid configuration: a shared edits dir must not be specified if HA is not enabled.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:710)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:654)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:892)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1310)
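This means one of the config files has a mistake. In my case a name in hdfs-site.xml was wrong. Pay close attention to the nameservices ID: it must be spelled identically in dfs.nameservices and in every property name that embeds it (dfs.ha.namenodes.<id>, dfs.namenode.rpc-address.<id>.<nn>, and so on), otherwise Hadoop does not treat HA as enabled.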
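Since this kind of mismatch is easy to miss by eye, a rough consistency check like the one below may help. It is a sketch, not a real XML parser: it assumes each <name> and <value> sits on its own line, as in the config files in this post, and it only checks a few of the keys that embed the nameservice ID. The function names get_prop and check_nameservice are made up.

```shell
# get_prop FILE NAME: print the <value> of the property called NAME.
get_prop() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# check_nameservice CONF_DIR: verify the nameservice ID is used consistently.
check_nameservice() {
  hdfs_site="$1/hdfs-site.xml"
  ns=$(get_prop "$hdfs_site" dfs.nameservices)
  fs=$(get_prop "$1/core-site.xml" fs.defaultFS)
  if [ -z "$ns" ]; then
    echo "dfs.nameservices is not set"; return 1
  fi
  if [ "$fs" != "hdfs://$ns" ]; then
    echo "fs.defaultFS is '$fs' but dfs.nameservices is '$ns'"; return 1
  fi
  for key in "dfs.ha.namenodes.$ns" "dfs.client.failover.proxy.provider.$ns"; do
    if [ -z "$(get_prop "$hdfs_site" "$key")" ]; then
      echo "missing property $key"; return 1
    fi
  done
  echo "nameservice '$ns' is used consistently"
}

# Example:
#   check_nameservice /root/hadoop-2.8.3/etc/hadoop
```

Hadoop can also print the value it actually resolved, for example with hdfs getconf -confKey dfs.nameservices, which is a useful cross-check against what the script reads from the file.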
Then run these commands:
1. Format the ZKFC state in ZooKeeper: hdfs zkfc -formatZK
2. Format HDFS: hdfs namenode -format
The message "has been successfully formatted." means the format succeeded.
3. Start the NameNode: on the primary node run
sbin/hadoop-daemon.sh start namenode
Start the standby NameNode:
on the standby node, i.e. master2, run
hdfs namenode -bootstrapStandby
sbin/hadoop-daemon.sh start namenode
The message "has been successfully formatted." means the standby was bootstrapped successfully.
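4. Start the DataNodes
On the primary node run
hadoop-daemons.sh start datanode
5. Start YARN
Run this on the node that acts as the resource manager (here I chose master1):
start-yarn.sh
6. Start ZKFC
On both the primary and the standby node run
hadoop-daemon.sh start zkfc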
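Once ZKFC is running on both masters, one NameNode should be elected active and the other should stay standby. A quick way to confirm this from the command line is hdfs haadmin -getServiceState, wrapped here in a small loop. The NameNode IDs nn1 and nn2 come from the hdfs-site.xml above; the function name ha_states is my own.

```shell
# ha_states ID...: print the HA state (active/standby) of each NameNode ID.
ha_states() {
  for nn in "$@"; do
    printf '%s: %s\n' "$nn" "$(hdfs haadmin -getServiceState "$nn")"
  done
}

# Example: run on either master once all daemons are up.
#   ha_states nn1 nn2
```

If both come back as standby, ZKFC has most likely not elected an active node yet, so the zkfc logs and the ZooKeeper quorum are the first things to look at.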
Once everything is running,
open http://192.168.137.204:50070 (your own master1 IP, port 50070)
The page shows: