Hadoop 2.x HA (High Availability) Installation and Deployment

1. Prepare the servers (I used CentOS 6.x)

Server: 192.168.0.20   runs: active NameNode, ResourceManager

Server: 192.168.0.21   runs: standby NameNode, NodeManager, JournalNode, DataNode

Server: 192.168.0.199  runs: NodeManager, JournalNode, DataNode

Server: 192.168.0.186  runs: NodeManager, JournalNode, DataNode


2. Synchronize the clocks on all servers

date -R
yum install ntp
/usr/sbin/ntpdate <IP of the time server to sync against> (e.g. ntpdate time.nist.gov)
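A one-off ntpdate lets the clocks drift apart again over time; a cron entry keeps them aligned. A sketch, assuming you designate one machine as the internal time source (the IP below is just an example, substitute your own NTP server):

```
# /etc/crontab fragment: re-sync every 30 minutes (example source IP)
*/30 * * * * root /usr/sbin/ntpdate 192.168.0.20 >/dev/null 2>&1
```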

3. Disable the firewall (iptables)

Check firewall status:

/etc/init.d/iptables status

Stop the firewall:

/etc/init.d/iptables stop


4. Change the hostname. For the reasons behind this, see: http://blog.csdn.net/shirdrn/article/details/6562292

vim /etc/sysconfig/network

NETWORKING=yes
NETWORKING_IPV6=yes
HOSTNAME=localhost.localdomain

Set the HOSTNAME value in /etc/sysconfig/network to localhost, or to a hostname of your choosing, and make sure that name maps to the correct IP address in /etc/hosts. Then restart the network service:

/etc/rc.d/init.d/network restart

5. Configure the /etc/hosts file
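
Every node needs identical hostname-to-IP mappings so the names used in the Hadoop configs below resolve. A sketch based on the server list in section 1 (the exact hostname-to-IP pairing is my assumption from the node roles; verify against your own machines):

```
# /etc/hosts — identical on all four nodes
192.168.0.20   hadoop-nn1
192.168.0.21   hadoop-nn2
192.168.0.199  hadoop-dn1
192.168.0.186  hadoop-dn2
```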



6. Configure passwordless SSH login (adapted from: http://haitao.iteye.com/blog/1744272)

SSH configuration

The host hadoop-nn1 needs passwordless login to hadoop-nn1, hadoop-nn2, hadoop-dn1, and hadoop-dn2.

First make sure the firewall is disabled on all hosts.

On hadoop-nn1, run the following:

 a. $cd ~/.ssh
 b. $ssh-keygen -t rsa   - press Enter at every prompt; the key pair is saved as .ssh/id_rsa (and id_rsa.pub).
 c. $cp id_rsa.pub authorized_keys


Once this is done you should be able to log in to the local machine without a password: ssh localhost should not prompt for one.


 d. $scp authorized_keys xiaojin@192.168.0.21:/home/xiaojin/.ssh   - copy the newly created authorized_keys file to each of the other hosts.
 e. $chmod 600 authorized_keys   - on each target host, cd into its .ssh directory and restrict the file's permissions.


Normally, after the steps above, an SSH connection from host A to host A or to host B asks for a password only on the very first login, and never again. Note, however, that a first login by IP address and a first login by hostname are treated as separate first logins (each gets its own known_hosts entry). If your Hadoop configuration uses hostnames, log in once by hostname first; after that, no password will be required.
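Steps a–c above can be scripted. A minimal non-interactive sketch (the key directory defaults to a local demo path so it is safe to try; set KEYDIR to ~/.ssh on the real hadoop-nn1, and the hostnames and user follow this tutorial's naming):

```shell
#!/bin/sh
# Generate a key pair and authorize it locally, non-interactively.
# KEYDIR defaults to a scratch directory for safe experimentation;
# use $HOME/.ssh on the actual hadoop-nn1 host.
KEYDIR="${KEYDIR:-$PWD/ssh-demo}"
mkdir -p "$KEYDIR"
ssh-keygen -t rsa -N "" -q -f "$KEYDIR/id_rsa"
cat "$KEYDIR/id_rsa.pub" >> "$KEYDIR/authorized_keys"
chmod 700 "$KEYDIR"
chmod 600 "$KEYDIR/authorized_keys"
# Distribution to the other nodes (step d) is still done with scp; this
# loop only prints the commands to run:
for h in hadoop-nn2 hadoop-dn1 hadoop-dn2; do
    echo "scp $KEYDIR/authorized_keys xiaojin@$h:~/.ssh/"
done
```

The loop deliberately echoes the scp commands instead of executing them; run them (or use ssh-copy-id, if your distribution ships it) against each node.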

 

Possible problems:


a. When you try to ssh in, you get: Agent admitted failure to sign using the key. Run:


 $ssh-add


b. If there is no error message and password login works, but passwordless login still fails, run the following on the machine being connected to (i.e. on B, when A connects to B):


 $chmod o-w ~/
 $chmod 700 ~/.ssh
 $chmod 600 ~/.ssh/authorized_keys


c. If passwordless login still fails after step b, try the following:

  

$ps -Af | grep agent    - check whether an ssh agent is already running; if so, kill it, then run the next command to start a fresh agent; if not, just run the next command
$ssh-agent              - if it still fails, restart the ssh service:
$sudo service sshd restart


7. Install the JDK

vim /etc/profile #edit the profile
#append the following
#set java environment
JAVA_HOME=/home/xiaojin/jdk1.8.0_73
CLASSPATH=.:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:$PATH
export JAVA_HOME CLASSPATH PATH


#apply the profile: source /etc/profile
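A quick sanity check after sourcing the profile (a sketch; the JDK path is this tutorial's example, substitute your own):

```shell
#!/bin/sh
# Verify that JAVA_HOME points at a usable JDK before starting Hadoop.
JAVA_HOME="${JAVA_HOME:-/home/xiaojin/jdk1.8.0_73}"
if [ -x "$JAVA_HOME/bin/java" ]; then
    "$JAVA_HOME/bin/java" -version
else
    echo "WARNING: no executable java under $JAVA_HOME/bin" >&2
fi
```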

 
 


8. Download Hadoop

#download the tar file, then unpack it
tar -zxvf hadoop-2.7.0.tar.gz


9. /tmp must be writable by everyone (the usual 1777 permissions), because Hadoop writes temporary files there


10. Hadoop directory layout

bin: Hadoop executables

sbin: startup and administration scripts


11. Edit the Hadoop configuration files

#hadoop/etc/hadoop/hadoop-env.sh  change only the JDK path; nothing else needs modification
export JAVA_HOME=/home/xiaojin/jdk1.8.0_73

#hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
        <name>fs.default.name</name>
        <value>hdfs://hadoop-nn1:8020</value>
        <!-- Hadoop HA: points at the active NameNode -->
</property>
</configuration>
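Note that fs.default.name is the deprecated key name; on Hadoop 2.x the current name is fs.defaultFS. In an HA deployment, clients usually address the logical nameservice rather than one fixed NameNode, so they can fail over transparently. A sketch using the nameservice defined in hdfs-site.xml below (this also requires dfs.client.failover.proxy.provider.<nameservice> to be configured in hdfs-site.xml, which this tutorial's manual-failover setup does not include):

```xml
<property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-server</value>
</property>
```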

#/hadoop/etc/hadoop/mapred-site.xml
<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
                <description>The runtime framework for executing MapReduce jobs.</description>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>hadoop-nn2:10020</value>
                <description>MapReduce JobHistory Server IPC host:port</description>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>hadoop-nn2:19888</value>
                <description>MapReduce JobHistory Server Web UI host:port</description>
        </property>
</configuration>

#hadoop/etc/hadoop/hdfs-site.xml
<configuration>
        <property>
                <name>dfs.nameservices</name>
                <value>hadoop-server</value>
                <description>Comma-separated list of nameservices.</description>
        </property>

        <property>
                <name>dfs.ha.namenodes.hadoop-server</name>
                <value>nn1,nn2</value>
                <description>The prefix for a given nameservice contains a comma-separated list of namenodes for that nameservice (e.g. EXAMPLENAMESERVICE).</description>
        </property>

        <property>
                <name>dfs.namenode.rpc-address.hadoop-server.nn1</name>
                <value>hadoop-nn1:8020</value>
                <description>RPC address for namenode1 of hadoop-server</description>
        </property>

        <property>
                <name>dfs.namenode.rpc-address.hadoop-server.nn2</name>
                <value>hadoop-nn2:8020</value>
                <description>RPC address for namenode2 of hadoop-server</description>
        </property>

        <property>
                <name>dfs.namenode.http-address.hadoop-server.nn1</name>
                <value>hadoop-nn1:50070</value>
                <description>The address and base port on which the dfs namenode1 web UI will listen.</description>
	</property>

        <property>
                <name>dfs.namenode.http-address.hadoop-server.nn2</name>
                <value>hadoop-nn2:50070</value>
                <description>The address and base port on which the dfs namenode2 web UI will listen.</description>
        </property>

        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:///home/xiaojin/hadoop/hdfs/name</value>
                <description>Determines where on the local filesystem the DFS name node should store the name table(fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
        </property>

        <property>
                <name>dfs.namenode.shared.edits.dir</name>
                <value>qjournal://hadoop-nn2:8485;hadoop-dn1:8485;hadoop-dn2:8485/hadoop-journal</value>
                <description>The shared edits directory used by the two namenodes. With the quorum journal manager this is the URI of the JournalNode quorum, of the form qjournal://host1:port;host2:port;host3:port/journalId.</description>
        </property>

        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:///home/xiaojin/hadoop/hdfs/data</value>
                <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
        </property>

        <property>
                <name>dfs.ha.automatic-failover.enabled</name>
                <value>false</value>
                <description>Whether automatic failover is enabled. See the HDFS High Availability documentation for details on automatic HA configuration.</description>
        </property>
        
	<property>
                <name>dfs.journalnode.edits.dir</name>
                <value>/home/xiaojin/hadoop/hdfs/journal/</value>
        </property>
</configuration>

#hadoop/etc/hadoop/yarn-site.xml
<configuration>

<!-- Site specific YARN configuration properties -->
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>hadoop-nn1</value>
                <description>The hostname of the RM.</description>
        </property>

        <property>
                <name>yarn.resourcemanager.address</name>
                <value>${yarn.resourcemanager.hostname}:8032</value>
                <description>The address of the applications manager interface in the RM.</description>
        </property>

        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>${yarn.resourcemanager.hostname}:8030</value>
                <description>The address of the scheduler interface.</description>
        </property>

        <property>
                <name>yarn.resourcemanager.webapp.address</name>
                <value>${yarn.resourcemanager.hostname}:8088</value>
                <description>The http address of the RM web application.</description>
        </property>

        <property>
                <name>yarn.resourcemanager.webapp.https.address</name>
                <value>${yarn.resourcemanager.hostname}:8090</value>
                <description>The https address of the RM web application.</description>
        </property>

	<property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>${yarn.resourcemanager.hostname}:8031</value>
        </property>

        <property>
                <name>yarn.resourcemanager.admin.address</name>
                <value>${yarn.resourcemanager.hostname}:8033</value>
                <description>The address of the RM admin interface</description>
        </property>

        <property>
                <name>yarn.resourcemanager.scheduler.class</name>
                <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
                <description>The class to use as the resource scheduler.</description>
        </property>

        <property>
                <name>yarn.scheduler.fair.allocation.file</name>
                <value>${yarn.home.dir}/etc/hadoop/fairscheduler.xml</value>
                <description>fair-scheduler conf location.</description>
        </property>

        <property>
                <name>yarn.nodemanager.local-dirs</name>
                <value>/home/xiaojin/hadoop/yarn/local</value>
                <description>List of directories to store localized files in. An application's localized file directory will be found in:${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}.Individual containers' work directories, called container_${contid}, will be subdirectories of this.</description>
        </property> 

	<property>
                <name>yarn.log-aggregation-enable</name>
                <value>true</value>
                <description>Whether to enable log aggregation</description>
        </property>

        <property>
                <name>yarn.nodemanager.remote-app-log-dir</name>
                <value>/tmp/logs</value>
                <description>Where to aggregate logs to.</description>
        </property>

        <property>
                <name>yarn.nodemanager.resource.memory-mb</name>
                <value>30720</value>
                <description>Amount of physical memory, in MB, that can be allocated for containers.</description>
        </property>

        <property>
                <name>yarn.nodemanager.resource.cpu-vcores</name>
                <value>12</value>
                <description>Number of CPU cores that can be allocated for containers.</description>
        </property>

        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
                <description>The valid service name should only contain a-zA-Z0-9_ and cannot start with numbers.</description>
        </property>
</configuration>

#hadoop/etc/hadoop/slaves
hadoop-nn2
hadoop-dn1
hadoop-dn2

#hadoop/etc/hadoop/fairscheduler.xml
<?xml version="1.0" ?>
<allocations>
        <queue name="infrastructure">
                <minResources>102400 mb, 50 vcores </minResources>
                <maxResources>153600 mb, 100 vcores </maxResources>
                <maxRunningApps>200</maxRunningApps>
                <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
                <weight>1.0</weight>
                <aclSubmitApps>root,xiaojin,yarn,search,hdfs</aclSubmitApps>
        </queue>

        <queue name="tool">
                <minResources>102400 mb, 30 vcores </minResources>
                <maxResources>153600 mb, 50 vcores </maxResources>
        </queue>

        <queue name="sentiment">
                <minResources>102400 mb, 30 vcores</minResources>
                <maxResources>153500 mb, 50 vcores</maxResources>
        </queue>

</allocations>
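The configuration above references several local directories (dfs.namenode.name.dir, dfs.datanode.data.dir, dfs.journalnode.edits.dir, yarn.nodemanager.local-dirs). A sketch that pre-creates them on each node (BASE defaults to a scratch path here so the sketch is safe to try; on the real nodes it is /home/xiaojin/hadoop, matching the config values):

```shell
#!/bin/sh
# Pre-create the local directories used by hdfs-site.xml and yarn-site.xml.
# BASE defaults to a scratch location; set BASE=/home/xiaojin/hadoop on real nodes.
BASE="${BASE:-$PWD/hadoop-local}"
mkdir -p "$BASE/hdfs/name" \
         "$BASE/hdfs/data" \
         "$BASE/hdfs/journal" \
         "$BASE/yarn/local"
echo "created under $BASE"
```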

12. Start the Hadoop cluster

Starting the Hadoop cluster:
Step 1:
On [nn1], run the JournalNode start command:
sbin/hadoop-daemons.sh start journalnode

Step 2:
On [nn1], format the NameNode and start it:
bin/hdfs namenode -format
sbin/hadoop-daemon.sh start namenode

Step 3:
On [nn2], sync nn1's metadata:
bin/hdfs namenode -bootstrapStandby

Step 4:
Start the NameNode on [nn2]:
sbin/hadoop-daemon.sh start namenode

After these four steps, nn1 and nn2 are both in standby state.

Step 5:
Switch [nn1] to active (you can verify afterwards with bin/hdfs haadmin -getServiceState nn1):
bin/hdfs haadmin -transitionToActive nn1

Step 6:
On [nn1], start all DataNodes:
sbin/hadoop-daemons.sh start datanode

Starting the YARN services:
On [nn1], start YARN:
sbin/start-yarn.sh

Stopping the Hadoop cluster:
On [nn1], run:
sbin/stop-dfs.sh

13. Run the MapReduce example job: bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar pi 2 100000


