1.
2. Cluster setup
(1) hadoop-2.2.0/etc/hadoop/hadoop-env.sh
    export JAVA_HOME=/usr/java/jdk1.7.0_79
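A wrong JAVA_HOME usually only shows up when a daemon fails to start, so it can help to check it right after editing hadoop-env.sh. The following is a minimal sketch; the helper name check_java_home is hypothetical and not part of Hadoop.

```shell
# Returns 0 when the given directory looks like a usable Java home,
# i.e. it is non-empty and contains an executable bin/java.
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Example: warn early instead of failing later when a daemon starts.
if ! check_java_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME is unset or has no bin/java: ${JAVA_HOME:-<unset>}" >&2
fi
```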
(2) hadoop-2.2.0/etc/hadoop/core-site.xml
    <configuration>
        <!-- Set the HDFS nameservice to ns1 -->
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://ns1</value>
        </property>
        <!-- Hadoop temporary directory -->
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/usr/hc/hadoop-2.2.0/tmp</value>
        </property>
        <!-- ZooKeeper addresses; separate multiple entries with ASCII commas -->
        <property>
            <name>ha.zookeeper.quorum</name>
            <value>nameNode:2181,dataNode01:2181,dataNode02:2181</value>
        </property>
    </configuration>
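The ha.zookeeper.quorum value is a comma-separated list of host:port pairs. As a rough sketch, it can be split into one pair per line, which is convenient for checking each ZooKeeper server in turn. The function name quorum_hosts is hypothetical, and falling back to 2181 (ZooKeeper's default client port) when no port is given is an assumption of this example.

```shell
# Print one "host port" pair per line for a comma-separated quorum string.
# Entries without an explicit port fall back to 2181 (assumption).
quorum_hosts() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS=: read -r host port; do
        if [ -n "$host" ]; then
            printf '%s %s\n' "$host" "${port:-2181}"
        fi
    done
}

quorum_hosts "nameNode:2181,dataNode01:2181,dataNode02:2181"
```

Each pair can then be probed by hand, for example with `echo ruok | nc <host> <port>`, which a healthy server answers with `imok` on ZooKeeper versions where the four-letter commands are enabled.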
(3) hadoop-2.2.0/etc/hadoop/hdfs-site.xml
    <configuration>
        <!-- Set the HDFS nameservice to ns1; must match core-site.xml -->
        <property>
            <name>dfs.nameservices</name>
            <value>ns1</value>
        </property>
        <!-- The NameNode under ns1 is nn1 -->
        <property>
            <name>dfs.ha.namenodes.ns1</name>
            <value>nn1</value>
        </property>
        <!-- RPC address of nn1 -->
        <property>
            <name>dfs.namenode.rpc-address.ns1.nn1</name>
            <value>nameNode:9000</value>
        </property>
        <!-- HTTP address of nn1 -->
        <property>
            <name>dfs.namenode.http-address.ns1.nn1</name>
            <value>nameNode:50070</value>
        </property>
        <!-- Where the NameNode metadata (edit log) is stored on the JournalNodes -->
        <property>
            <name>dfs.namenode.shared.edits.dir</name>
            <value>qjournal://nameNode:</value>
        </property>
        <!-- Local disk directory where the JournalNode stores its data -->
        <property>
            <name>dfs.journalnode.edits.dir</name>
            <value>/usr/hc/hadoop-2.2.0/jour</value>
        </property>
        <!-- Enable automatic failover of the NameNode -->
        <property>
            <name>dfs.ha.automatic-failover.enabled</name>
            <value>true</value>
        </property>
        <!-- Implementation used by clients for automatic failover -->
        <property>
            <name>dfs.client.failover.proxy.provider.ns1</name>
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
        <!-- Fencing methods; separate multiple methods with newlines -->
        <property>
            <name>dfs.ha.fencing.methods</name>
            <value>
                sshfence
                shell(/bin/true)
            </value>
        </property>
        <!-- sshfence requires passwordless SSH login -->
        <property>
            <name>dfs.ha.fencing.ssh.private-key-files</name>
            <value>/root/.ssh/id_rsa</value>
        </property>
        <!-- Connection timeout for sshfence -->
        <property>
            <name>dfs.ha.fencing.ssh.connect-timeout</name>
            <value>3000</value>
        </property>
    </configuration>
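With this many properties in one file, a repeated name is easy to overlook, and only one of the values can take effect. A quick check along the following lines may help; it is a sketch that assumes each name element is not split across lines, and dup_props is a hypothetical helper name.

```shell
# List every <name> that occurs more than once in a Hadoop *-site.xml file.
# Assumes each <name>...</name> element sits on a single line.
dup_props() {
    grep -o '<name>[^<]*</name>' "$1" |
        sed -e 's/<name>//' -e 's/<\/name>//' |
        sort | uniq -d
}

# Typical use (prints nothing when every property name is unique):
#   dup_props hadoop-2.2.0/etc/hadoop/hdfs-site.xml
```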
(4) Rename hadoop-2.2.0/etc/hadoop/mapred-site.xml.template to mapred-site.xml
    <configuration>
        <!-- Run MapReduce on YARN -->
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    </configuration>
(5) hadoop-2.2.0/etc/hadoop/yarn-site.xml
    <configuration>
        <!-- ResourceManager host -->
        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>nameNode</value>
        </property>
        <!-- Auxiliary service the NodeManager loads at startup -->
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
    </configuration>
(6) hadoop-2.2.0/etc/hadoop/slaves
    dataNode01
    dataNode02
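With all six configuration files edited, follow the order below strictly. Start ZooKeeper first.
(7) hadoop-2.2.0/sbin contains hadoop-daemon.sh and hadoop-daemons.sh, which start a single process and multiple processes respectively.
Start the JournalNode: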
    ./hadoop-daemon.sh start journalnode
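Before moving on to formatting, it is worth confirming that the JournalNode process is actually running. The JDK's jps tool prints one "pid ClassName" line per Java process, so a small filter is enough. The helper name has_daemon below is hypothetical; this is only a sketch of the check.

```shell
# Succeeds if the given daemon class name appears in jps-style output on stdin.
# Each input line is expected to look like "<pid> <ClassName>".
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Typical use on a node that should be running a JournalNode:
#   jps | has_daemon JournalNode && echo "JournalNode is up"
```

The same filter works for the other daemons started later, such as NameNode, DataNode, ResourceManager and NodeManager.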
(8) Format HDFS
    hdfs namenode -format
(9) Format the HA state in ZooKeeper
    hdfs zkfc -formatZK
(10) Start HDFS and YARN separately, then test the cluster
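HDFS and YARN are normally started with start-dfs.sh and start-yarn.sh from hadoop-2.2.0/sbin. Once both are up, a simple round trip through HDFS makes a reasonable first test. The sketch below assumes hdfs is on the PATH and that /etc/hosts exists locally; the function name smoke_test and the scratch directory /smoke are hypothetical.

```shell
# Minimal HDFS smoke test: create a directory, upload a small file,
# list it, then clean up. Stops at the first failing step.
smoke_test() {
    dir=${1:-/smoke}                               # hypothetical scratch directory
    hdfs dfs -mkdir -p "$dir" || return 1
    hdfs dfs -put /etc/hosts "$dir/" || return 1   # upload a small local file
    hdfs dfs -ls "$dir" || return 1                # should list the uploaded file
    hdfs dfs -rm -r "$dir"                         # remove the scratch directory
}

# Typical use after start-dfs.sh and start-yarn.sh have finished:
#   smoke_test && echo "HDFS round trip OK"
```

The state of the NameNode can additionally be queried with `hdfs haadmin -getServiceState nn1`.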