Hadoop: Upgrading CDH4.5 to CDH5, with NameNode and YARN HA in Practice

CDH5 ships many new features, so we plan to upgrade the current CDH4.5 deployment to CDH5. The software layout remains based on the existing CDH4.5 cluster:

192.168.1.10    U-1  (Active) hadoop-yarn-resourcemanager  hadoop-hdfs-namenode hadoop-mapreduce-historyserver hadoop-yarn-proxyserver  hadoop-hdfs-zkfc
192.168.1.20    U-2  hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce  journalnode  zookeeper  zookeeper-server
192.168.1.30    U-3  hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce  journalnode  zookeeper  zookeeper-server
192.168.1.40    U-4  hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce  journalnode  zookeeper  zookeeper-server
192.168.1.50    U-5  hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce
192.168.1.70    U-7  (Standby) hadoop-yarn-resourcemanager  hadoop-hdfs-namenode  hadoop-hdfs-zkfc
Note: because this is an upgrade from CDH4.5 to CDH5, the table above does not list every package on each host. Some packages were already installed under CDH4.5, so the table only shows what you need to install during the upgrade.


The procedure is as follows:

1    Back Up Configuration Data and Stop Services

        1    Put the NameNode into safe mode and save the fsimage

su - hdfs
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace
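
        To confirm the NameNode really entered safe mode (a quick sanity check, not in the original):

hdfs dfsadmin -safemode get    # should print "Safe mode is ON"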

        2    Stop all Hadoop services in the cluster

for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x stop ; done

2    Back up the HDFS Metadata

        1    Locate dfs.namenode.name.dir

grep -C1 name.dir /etc/hadoop/conf/hdfs-site.xml

        2    Back up the directory specified by dfs.namenode.name.dir

tar czvf dfs.namenode.name.dir.tgz /data

3    Uninstall the CDH 4 Version of Hadoop

        1    Uninstall the Hadoop components

apt-get remove bigtop-utils bigtop-jsvc bigtop-tomcat sqoop2-client hue-common

        2    Remove the CDH4 repository files

mv /etc/apt/sources.list.d/cloudera-cdh4.list /root/

4    Download the Latest Version of CDH 5

        1    Download the CDH5 repository package

wget 'http://archive.cloudera.com/cdh5/one-click-install/precise/amd64/cdh5-repository_1.0_all.deb'

        2    Install the CDH5 repository and import the repository GPG key

dpkg -i cdh5-repository_1.0_all.deb
curl -s http://archive.cloudera.com/cdh5/ubuntu/precise/amd64/cdh/archive.key | apt-key add -
apt-get update    # refresh the package index so the CDH5 packages become visible

5    Install CDH 5 with YARN

        1    Install ZooKeeper
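
        The original gives no commands for this step. A minimal sketch of the in-place ZooKeeper upgrade, assuming the CDH zookeeper-server packaging and the ensemble hosts U-2, U-3 and U-4 from the table above:

# Run on each ZooKeeper host (U-2, U-3, U-4)
service zookeeper-server stop
apt-get install zookeeper zookeeper-server    # pulls in the CDH5 versions
service zookeeper-server start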

        2    Install the relevant components on each host

                1    Resource Manager host

apt-get install hadoop-yarn-resourcemanager

                2    NameNode host(s)

apt-get install hadoop-hdfs-namenode

                3    All cluster hosts except the Resource Manager

apt-get install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce

                4    One host in the cluster (the active NameNode)

apt-get install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver

                5    All client hosts

apt-get install hadoop-client

6    Install CDH 5 with MRv1

        CDH5 now treats YARN as the primary framework, so we no longer use MRv1 and skip this step.

7    In an HA Deployment, Upgrade and Start the Journal Nodes

        1    Install the JournalNode package (on U-2, U-3 and U-4)

apt-get install hadoop-hdfs-journalnode

        2    Start the JournalNode on each JournalNode host

service hadoop-hdfs-journalnode start

8    Upgrade the HDFS Metadata

        The upgrade procedure differs between HA and non-HA deployments. Since our CDH4.5 cluster runs in HA mode, we follow the HA procedure.

        1    On the active NameNode, run

service hadoop-hdfs-namenode upgrade

        2    Bootstrap and restart the standby NameNode

su - hdfs -c 'hdfs namenode -bootstrapStandby'    # on the standby NameNode (U-7): sync the upgraded metadata
service hadoop-hdfs-namenode start                # start the standby NameNode (as root)

        3    Start the DataNodes

service hadoop-hdfs-datanode start

        4    Check the version
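
        The original shows no command here. A simple check after the upgrade:

hadoop version           # should now report a CDH5 (Hadoop 2.x) build
hdfs dfsadmin -report    # confirm all DataNodes re-registered with the upgraded NameNode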


9    Start YARN

        1    Create the required directories in HDFS

su - hdfs
hadoop fs -mkdir -p /user/history
hadoop fs -chmod -R 1777 /user/history
hadoop fs -chown yarn /user/history
hadoop fs -mkdir -p /var/log/hadoop-yarn
hadoop fs -chown yarn:mapred /var/log/hadoop-yarn
hadoop fs -ls -R /

        2    Start the relevant services on the cluster hosts

service hadoop-yarn-resourcemanager start
service hadoop-yarn-nodemanager start
service hadoop-mapreduce-historyserver start

10   Configure NameNode HA

        1     NameNode HA is deployed exactly as on CDH4.5; the only change is to replace mapreduce.shuffle with mapreduce_shuffle in yarn-site.xml, as shown in the snippet below.
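
                Concretely, the aux-services entry in yarn-site.xml becomes (the same value appears in the full yarn-site.xml listed later):

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>  <!-- CDH4 used mapreduce.shuffle -->
  </property>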

        2    Verify
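
        The original leaves this step empty. A minimal check, assuming the NameNode IDs U-1 and U-7 from dfs.ha.namenodes.mycluster:

# Ask each NameNode for its HA state; exactly one should report "active"
hdfs haadmin -getServiceState U-1
hdfs haadmin -getServiceState U-7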


11    Configure YARN HA

        1    Stop all YARN daemons

service hadoop-yarn-nodemanager stop
service hadoop-yarn-resourcemanager stop
service hadoop-mapreduce-historyserver stop

        2    Update the configuration used by the ResourceManagers, NodeManagers and clients

                Below is the configuration on U-1. core-site.xml, hdfs-site.xml and mapred-site.xml need no changes; the only file that has to be modified is yarn-site.xml.

                core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster/</value>
  </property>

  <property>
    <name>ha.zookeeper.quorum</name>
    <value>U-2:2181,U-3:2181,U-4:2181</value>
  </property>

</configuration>

                hdfs-site.xml

<configuration>
  <property>
     <name>dfs.permissions.superusergroup</name>
     <value>hadoop</value>
  </property>

  <property>
     <name>dfs.namenode.name.dir</name>
     <value>/data</value>
  </property>

  <property>
     <name>dfs.datanode.data.dir</name>
     <value>/data01,/data02</value>
  </property>

  <property>
     <name>dfs.nameservices</name>
     <value>mycluster</value>
  </property>

<!--  HA Config  -->
  <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>U-1,U-7</value>
  </property>

  <property>
      <name>dfs.namenode.rpc-address.mycluster.U-1</name>
      <value>U-1:8020</value>
  </property>

  <property>
      <name>dfs.namenode.rpc-address.mycluster.U-7</name>
      <value>U-7:8020</value>
  </property>

  <property>
      <name>dfs.namenode.http-address.mycluster.U-1</name>
      <value>U-1:50070</value>
  </property>

  <property>
      <name>dfs.namenode.http-address.mycluster.U-7</name>
      <value>U-7:50070</value>
  </property>

  <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://U-2:8485;U-3:8485;U-4:8485/mycluster</value>
  </property>

  <property>
      <name>dfs.journalnode.edits.dir</name>
      <value>/jdata</value>
  </property>

  <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>

  <property>
      <name>dfs.ha.fencing.methods</name>
      <value>sshfence</value>
  </property>

  <property>
      <name>dfs.ha.fencing.ssh.private-key-files</name>
      <value>/var/lib/hadoop-hdfs/.ssh/id_rsa</value>
  </property>

  <property>
      <name>dfs.ha.automatic-failover.enabled</name>
      <value>true</value>
  </property>

</configuration>

                mapred-site.xml

<configuration>
 
<property>
 <name>mapreduce.framework.name</name>
 <value>yarn</value>
</property>

<property>
 <name>mapreduce.jobhistory.address</name>
 <value>U-1:10020</value>
</property>
<property>
 <name>mapreduce.jobhistory.webapp.address</name>
 <value>U-1:19888</value>
</property>

</configuration>

                yarn-site.xml

<configuration>
<!-- Resource Manager Configs -->
  <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>2000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-rm-cluster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>U-1,U-7</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.id</name>
    <value>U-1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>


  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>U-2:2181,U-3:2181,U-4:2181</value>
  </property>

  <property>
    <name>yarn.resourcemanager.zk.state-store.address</name>
    <value>U-1:2181</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
    <value>5000</value>
  </property>

  <!-- RM1 configs -->
  <property>
    <name>yarn.resourcemanager.address.U-1</name>
    <value>U-1:23140</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.U-1</name>
    <value>U-1:23130</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address.U-1</name>
    <value>U-1:23189</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.U-1</name>
    <value>U-1:23188</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.U-1</name>
    <value>U-1:23125</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.U-1</name>
    <value>U-1:23141</value>
  </property>

  <!-- RM2 configs -->
  <property>
    <name>yarn.resourcemanager.address.U-7</name>
    <value>U-7:23140</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.U-7</name>
    <value>U-7:23130</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address.U-7</name>
    <value>U-7:23189</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.U-7</name>
    <value>U-7:23188</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.U-7</name>
    <value>U-7:23125</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.U-7</name>
    <value>U-7:23141</value>
  </property>

<!-- Node Manager Configs -->
  <property>
    <description>Address where the localizer IPC is.</description>
    <name>yarn.nodemanager.localizer.address</name>
    <value>0.0.0.0:23344</value>
  </property>
  <property>
    <description>NM Webapp address.</description>
    <name>yarn.nodemanager.webapp.address</name>
    <value>0.0.0.0:23999</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/yarn/local</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/yarn/log</value>
  </property>
  <property>
    <name>mapreduce.shuffle.port</name>
    <value>23080</value>
  </property>
</configuration>

            Note: after copying yarn-site.xml to U-7, change the value of yarn.resourcemanager.ha.id in U-7's copy to U-7, otherwise that ResourceManager will not start. A sketch of the edit follows.
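
            A sketch of that single edit on U-7; it assumes yarn.resourcemanager.ha.id is the only property in this file whose value is exactly U-1, which holds for the configuration above:

# On U-7, after copying yarn-site.xml over from U-1
sed -i 's|<value>U-1</value>|<value>U-7</value>|' /etc/hadoop/conf/yarn-site.xml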

        3    Start all YARN daemons

service hadoop-yarn-resourcemanager start
service hadoop-yarn-nodemanager start

        4    Verify
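
        The screenshot from the original post did not survive. Judging from the mailing-list reply quoted below, the attempt presumably looked like this (U-1 and U-7 are the rm-ids from yarn.resourcemanager.ha.rm-ids):

# Check which ResourceManager is active
yarn rmadmin -getServiceState U-1
yarn rmadmin -getServiceState U-7

# Then attempt a manual failover -- this is the command that fails
yarn rmadmin -failover U-1 U-7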


                What the heck is this error? It cannot find a matching ZKFC address?



Experimenting with YARN's HA mechanism again today, I found the following explanation on the official mailing list:

Right now, RM HA does not use ZKFC. So we cannot use the command "yarn rmadmin -failover rm1 rm2" for now.

If you use the default HA configuration, you have set up automatic RM HA. In order to fail over manually, you have two options:

1. Set up manual RM HA by setting "yarn.resourcemanager.ha.automatic-failover.enabled" to false. Then you can use the commands "yarn rmadmin -transitionToActive rm1" and "yarn rmadmin -transitionToStandby rm2" to control yourself which RM goes active.
2. If you really want to experiment with manual failover while automatic failover is enabled, you can use the command "yarn rmadmin -transitionToActive --forcemanual rm2".

Thanks
        Turns out I was just doing it wrong...
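
        Translated into commands for this cluster (rm-ids U-1 and U-7), the two options from the reply look roughly like this; note that --forcemanual prompts for confirmation:

# Option 1: after setting yarn.resourcemanager.ha.automatic-failover.enabled to false
yarn rmadmin -transitionToStandby U-1
yarn rmadmin -transitionToActive U-7

# Option 2: keep automatic failover enabled and force a manual transition
yarn rmadmin -transitionToActive --forcemanual U-7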


        References:
        https://issues.apache.org/jira/browse/YARN-3006
        https://issues.apache.org/jira/browse/YARN-1177



Reposted from: https://my.oschina.net/guol/blog/269631
