Hadoop Manual Upgrade and HA Configuration Guide

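1 Hadoop component upgrade

This document is an operations manual for upgrading Apache Hadoop and HBase to the CDH builds of Hadoop and HBase. It also covers Hadoop HA configuration.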

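2 Hadoop upgrade

2.1 Hadoop upgrade preparation

2.1.1 Environment

The original and target Hadoop versions are Apache Hadoop 1.2.1 and Hadoop 2.5.0-cdh5.3.3, respectively.

2.1.2 Preparation steps

2.1.2.1 Upgrade the JDK

# Skip this step if the JDK is already version 1.7 or later

rpm -ivh oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm

Note: the default install location is /usr/java.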

2.1.2.2 Stop the applications that depend on HBase, then stop HBase

stop-hbase.sh

Note: ZooKeeper and the main Hadoop processes do not need to be stopped at this point.

2.1.2.3 Back up the Namenode metadata

# Put HDFS into safe mode first, merge the edits into the image, then back up the namenode metadata

hadoop dfsadmin -safemode enter

hadoop dfsadmin -saveNamespace

stop-all.sh

cp -r /app/data/name/* /app/data/name_bak/

Note: /app/data/name/ is the value of dfs.name.dir in the Hadoop 1.2.1 hdfs-site.xml.
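A backup is only useful if it is complete. As a minimal sketch (not part of the original procedure), the copy step could be wrapped in a small shell function that also compares the backup with the source. The function name backup_name_dir is hypothetical; the paths in the usage line are the ones used above.

```shell
# backup_name_dir SRC DST
# Copies the contents of the NameNode metadata directory SRC into DST
# (preserving attributes) and verifies the copy with a recursive diff.
backup_name_dir() {
    src=$1
    dst=$2
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    mkdir -p "$dst" || return 1
    # "$src/." copies the directory contents, including hidden files
    cp -a "$src/." "$dst/" || return 1
    if ! diff -r "$src" "$dst" >/dev/null; then
        echo "backup differs from source" >&2
        return 1
    fi
    echo "backup verified: $dst"
}
```

Usage, after stop-all.sh has finished: backup_name_dir /app/data/name /app/data/name_bak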

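2.1.2.4 Upload the new Hadoop release and set up its configuration files

# Upload the package hadoop-2.5.0-cdh5.3.3.tar.gz to the namenode host and unpack it, for example under /app/

tar -zxvf hadoop-2.5.0-cdh5.3.3.tar.gz

# Check that the executable directories of the unpacked package on the master node have execute permission

# Configure the following files:

# core-site.xml, hadoop-env.sh, hdfs-site.xml, mapred-site.xml, slaves, yarn-site.xml, yarn-env.sh, master

a)  Configure core-site.xml

# Mostly reuse the Hadoop 1 configuration.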

b)  hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/

export HADOOP_HEAPSIZE=70000

c)  hdfs-site.xml

# Rename the following parameters for Hadoop 2 compatibility; keep the other parameters as they are.

# Rename dfs.name.dir to dfs.namenode.name.dir

# Rename dfs.data.dir to dfs.datanode.data.dir

d)  mapred-site.xml

# Add the YARN parameter; the Hadoop 1 parameters can be commented out

<property>
     <name>mapreduce.framework.name</name>
     <value>yarn</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/home/shenl/hadoop1.2.1/system</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/home/shenl/hadoop1.2.1/local</value>
</property>

e)  slaves

Reuse the Hadoop 1 configuration

f)  yarn-site.xml

<property>
     <name>yarn.resourcemanager.address</name>
     <value>master1:8032</value>
</property>
<property>
     <name>yarn.resourcemanager.scheduler.address</name>
     <value>master1:8030</value>
</property>
<property>
     <name>yarn.resourcemanager.resource-tracker.address</name>
     <value>master1:8031</value>
</property>
<property>
     <name>yarn.resourcemanager.admin.address</name>
     <value>master1:8033</value>
</property>
<property>
     <name>yarn.resourcemanager.webapp.address</name>
     <value>master1:8088</value>
</property>
<property>
     <name>yarn.nodemanager.aux-services</name>
     <value>mapreduce_shuffle</value>
</property>
<property>
     <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

Note: master1 here is the hostname of the namenode.

g)  yarn-env.sh

# Set JAVA_HOME

export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/

h)  master

# Create a new file named master containing the hostname of the namenode
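The property renames in hdfs-site.xml (item c above) can be scripted rather than edited by hand. The following is a sketch under the assumption that each property name sits on a single line in the form <name>...</name>; the function name rename_hdfs_keys is hypothetical, and the original file is kept as a .bak copy.

```shell
# rename_hdfs_keys FILE
# Rewrites the Hadoop 1 property names to their Hadoop 2 equivalents,
# keeping the untouched original as FILE.bak.
rename_hdfs_keys() {
    file=$1
    cp "$file" "$file.bak" || return 1
    # Read from the backup and write the result over the original;
    # this avoids relying on the non-portable "sed -i".
    sed -e 's|<name>dfs\.name\.dir</name>|<name>dfs.namenode.name.dir</name>|' \
        -e 's|<name>dfs\.data\.dir</name>|<name>dfs.datanode.data.dir</name>|' \
        "$file.bak" > "$file"
}
```

Usage: rename_hdfs_keys /app/hadoop-2.5.0-cdh5.3.3/etc/hadoop/hdfs-site.xml, then compare the result against the .bak copy before distributing it.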

2.1.2.5 Distribute the configured Hadoop 2 installation from the Namenode node

scp -rq /app/hadoop-2.5.0-cdh5.3.3 hadoop@datanode1:/app/

scp -rq /app/hadoop-2.5.0-cdh5.3.3 hadoop@datanode2:/app/
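With more than a few data nodes, the scp commands can be generated from the slaves file instead of being typed per host. A sketch, assuming one hostname per line; the function name print_scp_commands and the default user hadoop are illustrative. It only prints the commands, so they can be reviewed before being run.

```shell
# print_scp_commands SLAVES_FILE SRC_DIR DEST_PARENT [USER]
# Prints one scp command per host listed in SLAVES_FILE.
# Blank lines and lines starting with # are skipped.
print_scp_commands() {
    slaves=$1
    src=$2
    dest=$3
    user=${4:-hadoop}
    while IFS= read -r host || [ -n "$host" ]; do
        case $host in
            ''|'#'*) continue ;;
        esac
        echo "scp -rq $src $user@$host:$dest"
    done < "$slaves"
}
```

Usage: print_scp_commands /app/hadoop-2.5.0-cdh5.3.3/etc/hadoop/slaves /app/hadoop-2.5.0-cdh5.3.3 /app/ and, once the output looks right, pipe it to sh.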

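2.1.2.6 Set HADOOP_HOME on every node and make it take effect

vi ~/.bash_profile

export HADOOP_HOME=/home/shenl/hadoop-2.5.0-cdh5.3.3

# bin holds hadoop and sbin holds hadoop-daemon.sh; both must be on PATH for the checks below
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

source ~/.bash_profile

which hadoop

which hadoop-daemon.sh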
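The same check has to pass on every node, so it may be convenient to have it return a status that a script can act on. A minimal sketch; the function name require_commands is hypothetical.

```shell
# require_commands NAME...
# Reports every listed command that cannot be found on PATH and
# returns non-zero if at least one is missing.
require_commands() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "not on PATH: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: require_commands hadoop hadoop-daemon.sh yarn-daemon.sh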

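2.2 Hadoop upgrade and rollback

2.2.1 Upgrade Hadoop

2.2.1.1 Upgrade the Namenode

# Watch the namenode log

tail -f hadoop-hadoop-namenode-bigdata01.log

hadoop-daemon.sh start namenode -upgrade

Note: once the Namenode log settles, the upgrade can be considered successful.

2.2.1.2 Upgrade the Datanodes

# All data nodes can be upgraded at once from the Namenode

hadoop-daemons.sh start datanode

# Then check the data node logs.

Note: the upgrade is complete only when the namenode log shows that every data node has started successfully.

2.2.1.3 Start and verify YARN

# Run on the namenode node

yarn-daemon.sh start resourcemanager

yarn-daemons.sh start nodemanager

mr-jobhistory-daemon.sh start historyserver

# Run wordcount as a check:

hadoop jar /home/shenl/hadoop-2.5.0-cdh5.3.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.3.jar wordcount /shenl/gc.log /shenl3/

Note: the job has succeeded when the console shows map 100% reduce 100%; the job can also be checked on port 8088.

2.2.2 Roll back Hadoop

# Restore the Hadoop 1 environment variables and reload them, so that the configuration files point to Hadoop 1

2.2.2.1 Roll back the Namenode

hadoop-daemon.sh start namenode -rollback

2.2.2.2 Roll back the Datanodes

hadoop-daemons.sh start datanode -rollback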

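3 HBase upgrade

3.1 HBase upgrade preparation

3.1.1 Environment

The original and target HBase versions are HBase 0.96.1.1 and HBase 0.98.6-cdh5.3.3, respectively.

3.1.2 Preparation steps

3.1.2.1 Upload and unpack the installation package

Upload the package (hbase-0.98.6-cdh5.3.3.tar.gz) to the HMaster host and unpack it, for example under /app/

tar -zxvf hbase-0.98.6-cdh5.3.3.tar.gz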

3.1.2.2 Edit the parameters in hbase-env.sh

# Point to the new JDK

export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/

export HBASE_HEAPSIZE=8000

export HBASE_PID_DIR=/home/shenl/pids/hbase96

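3.1.2.3 Copy the configuration from the old HBase

# Copy hbase-site.xml and regionservers from the old HBase conf directory into the new conf directory.

3.1.2.4 Copy HBase from the HMaster to each slave node

# Distribute the new HBase to every node

scp -r /app/hbase-0.98.6-cdh5.3.3 hadoop@datanode1:/app/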

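3.1.2.5 Update the HBase environment variables on every node

# Edit the user's environment: set $HBASE_HOME and append $HBASE_HOME/bin to PATH

vi ~/.bash_profile

export HBASE_HOME=/home/impala/hbase-0.98.6-cdh5.3.3

export PATH=$PATH:$HBASE_HOME/bin

source ~/.bash_profile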

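3.2 HBase upgrade and rollback

3.2.1 Upgrade HBase

Run the upgrade check and then the upgrade itself (on the HMaster node)

hbase upgrade -check

hbase upgrade -execute

# Start HBase (run on the HMaster node)

start-hbase.sh

3.2.2 Roll back HBase

Restore the previous environment variables, reload them, and start HBase.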

5 Hadoop HA configuration

5.1 Hadoop and YARN HA configuration

5.1.1 hdfs-site.xml settings; note the HA Configure part

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 
<configuration>
 
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/shenl/home/impala/data/hadoop1.2.1</value>
</property>
 
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/shenl/home/impala/dfs_data01/dfs1.2.1</value>
</property>
 
 
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
 
<property>
<name>hadoop.tmp.dir</name>
<value>/home/shenl/var/hadoop1.2.1/tmp</value>
</property>
 
<property>
<name>dfs.http.address</name>
<value>master1:50070</value>
</property>
 
<property>
<name>dfs.datanode.du.reserved</name>
<value>10737418240</value>
</property>
 
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
 
<property>
<name>dfs.secondary.http.address</name>
<value>data1:50090</value>
<description>
The secondary namenode http server address and port.
If the port is 0 then the server will start on a free port.
</description>
 
</property>
<property>
<name>dfs.datanode.max.xcievers</name>
<value>4096</value>
</property>
 
<property>
<name>dfs.hosts.exclude</name>
<value>/home/shenl/home/impala/src/hadoop-1.2.1/conf/slaves.ex</value>
</property>
 
<!-- HA Configure -->
<property>
   <name>dfs.nameservices</name>
   <value>zzg</value>
</property>
<property>
   <name>dfs.ha.namenodes.zzg</name>
   <value>master1,data1</value>
</property>
<property>
   <name>dfs.namenode.rpc-address.zzg.master1</name>
   <value>master1:9000</value>
</property>
<property>
   <name>dfs.namenode.rpc-address.zzg.data1</name>
   <value>data1:9000</value>
</property>
<property>
    <name>dfs.namenode.http-address.zzg.master1</name>
   <value>master1:50070</value>
</property>
<property>
   <name>dfs.namenode.http-address.zzg.data1</name>
   <value>data1:50070</value>
</property>
<property>
   <name>dfs.namenode.servicerpc-address.zzg.master1</name>
   <value>master1:53310</value>
</property>
<property>
   <name>dfs.namenode.servicerpc-address.zzg.data1</name>
   <value>data1:53310</value>
</property>
<property>
   <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://data1:8485;data2:8485;data3:8485/zzg</value>
</property>
<property>
   <name>dfs.journalnode.edits.dir</name>
   <value>/home/shenl/usr/local/cloud/data/hadoop/ha/journal</value>
</property>
<property>
   <name>dfs.client.failover.proxy.provider.zzg</name>
   <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
   <name>dfs.ha.automatic-failover.enabled</name>
   <value>true</value>
</property>
<property>
       <name>ha.zookeeper.quorum</name>
       <value>data1:2181,data2:2181,data3:2181</value>
</property>
<property>
   <name>dfs.ha.fencing.methods</name>
   <value>sshfence</value>
</property>
<property>
   <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/shenl/.ssh/id_rsa</value>
</property>
 
</configuration>

5.1.2  yarn-site.xml settings; note the HA Support part.

<configuration>
         <property> 
                   <name>yarn.resourcemanager.address</name> 
                   <value>master1:8032</value>
         </property>
         <property> 
                   <name>yarn.resourcemanager.scheduler.address</name> 
                   <value>master1:8030</value> 
         </property> 
         <property> 
                   <name>yarn.resourcemanager.resource-tracker.address</name> 
                   <value>master1:8031</value> 
         </property>
 
         <property>
                   <name>yarn.resourcemanager.admin.address</name> 
                   <value>master1:8033</value> 
         </property>
         <property>
                   <name>yarn.resourcemanager.webapp.address</name> 
                   <value>master1:8088</value> 
         </property> 
         <property> 
                   <name>yarn.nodemanager.aux-services</name> 
                  <value>mapreduce_shuffle</value> 
         </property> 
         <property> 
                   <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name> 
                   <value>org.apache.hadoop.mapred.ShuffleHandler</value> 
         </property> 
        
<!-- HA Support -->
 
       <property>
                <name>yarn.resourcemanager.connect.retry-interval.ms</name>
               <value>2000</value>
       </property>
        <property>
               <name>yarn.resourcemanager.ha.enabled</name>
               <value>true</value>
       </property>
       <property>
               <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
               <value>true</value>
       </property>
       <property>
               <name>yarn.resourcemanager.ha.rm-ids</name>
               <value>rm1,rm2</value>
       </property>
        <property>
               <name>yarn.resourcemanager.recovery.enabled</name>
                <value>true</value>
       </property>
       <property>
               <name>yarn.resourcemanager.zk-state-store.address</name>
               <value>data1:2181,data2:2181,data3:2181</value>
       </property>
 
       <property>
               <name>yarn.resourcemanager.store.class</name>
               <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
       </property>
       <property>
               <name>yarn.resourcemanager.zk-address</name>
               <value>data1:2181,data2:2181,data3:2181</value>
       </property>
       <property>
               <name>yarn.resourcemanager.cluster-id</name>
               <value>zzg</value>
       </property>
        <property>
               <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
               <value>5000</value>
        </property>
       <property>
               <name>yarn.resourcemanager.address.rm1</name>
               <value>master1:23140</value>
       </property>
       <property>
               <name>yarn.resourcemanager.scheduler.address.rm1</name>
                <value>master1:23130</value>
       </property>
       <property>
               <name>yarn.resourcemanager.webapp.address.rm1</name>
               <value>master1:23188</value>
       </property>
       <property>
               <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
               <value>master1:23125</value>
       </property>
        <property>
               <name>yarn.resourcemanager.admin.address.rm1</name>
               <value>master1:23141</value>
       </property>
       <property>
               <name>yarn.resourcemanager.ha.admin.address.rm1</name>
               <value>master1:23142</value>
       </property>
        <property>
               <name>yarn.resourcemanager.address.rm2</name>
               <value>data1:23140</value>
       </property>
       <property>
               <name>yarn.resourcemanager.scheduler.address.rm2</name>
               <value>data1:23130</value>
       </property>
       <property>
               <name>yarn.resourcemanager.webapp.address.rm2</name>
               <value>data1:23188</value>
       </property>
       <property>
               <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
               <value>data1:23125</value>
       </property>
       <property>
               <name>yarn.resourcemanager.admin.address.rm2</name>
               <value>data1:23141</value>
       </property>
       <property>
               <name>yarn.resourcemanager.ha.admin.address.rm2</name>
               <value>data1:23142</value>
       </property>
       <property>
               <description>Address where the localizer IPC is.</description>
               <name>yarn.nodemanager.localizer.address</name>
               <value>0.0.0.0:23344</value>
       </property>
        <property>
               <description>NM Webapp address.</description>
               <name>yarn.nodemanager.webapp.address</name>
               <value>0.0.0.0:23999</value>
       </property>
        <property>
               <name>yarn.nodemanager.local-dirs</name>
               <value>/home/shenl/usr/local/cloud/data/hadoop/yarn/local</value>
       </property>
       <property>
               <name>yarn.nodemanager.log-dirs</name>
               <value>/home/shenl/usr/local/cloud/data/logs/hadoop</value>
       </property>
       <property>
               <name>mapreduce.shuffle.port</name>
               <value>23080</value>
       </property>
        <property>
                <name>yarn.client.failover-proxy-provider</name>
                <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
        </property>
        
</configuration>

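5.2 Hadoop HA initialization

5.2.1 Stop the HBase and Hadoop services

stop-hbase.sh

stop-all.sh

5.2.2 Start the JournalNode service

hadoop-daemon.sh start journalnode

Note: run this on each node that hdfs-site.xml lists with port 8485.

# Verify: open the web pages data1:8480, data2:8480, data3:8480 (the JournalNode hosts from dfs.namenode.shared.edits.dir), or check the processes with jps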

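5.2.3 Format all JournalNodes

hdfs namenode -initializeSharedEdits -force

Notes:

1 master1 is taken as the primary namenode and data1 as the standby; run the command above on master1.

2 This operation affects the directory set by dfs.journalnode.edits.dir in hdfs-site.xml (reference value: /home/shenl/data/hadoop/ha/journal).

3 The operation formats all JournalNodes and copies the edit logs from master1 to every JournalNode.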

5.2.4 Format the ZooKeeper HA state on master1

hdfs zkfc -formatZK

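5.2.5 Copy the primary namenode's metadata to the standby node

scp -r home/impala/data/hadoop1.2.1/* hadoop@data1:/home/shenl/data/hadoop1.2.1

Note:

Copy the contents of the dfs.namenode.name.dir directory and of the shared dfs.namenode.shared.edits.dir directory on master1 into the corresponding directories on data1.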

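5.2.6 Start the namenode on master1

hadoop-daemon.sh start namenode

5.2.7 Start the namenode on data1

hadoop-daemon.sh start namenode

5.2.8 Start all datanodes from master1

hadoop-daemons.sh start datanode

Note: at this point the pages master1:50070 and data1:50070 (the dfs.namenode.http-address values configured above) show both namenodes in standby state, because the election service has not been started yet.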

5.2.9 Start the automatic failover (election) service on master1 and data1

hadoop-daemon.sh start zkfc

5.2.10 Verify HA from master1

hdfs haadmin -getServiceState master1

hdfs haadmin -failover master1 data1

# Alternatively, kill -9 the active namenode and check that the standby namenode becomes active
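A healthy HA pair has exactly one active namenode. As a sketch, the states printed by hdfs haadmin -getServiceState can be piped into a small check; the function name exactly_one_active is hypothetical, and the input is assumed to be one state ("active" or "standby") per line.

```shell
# exactly_one_active
# Reads one service state per line on stdin and succeeds only when
# exactly one line is "active".
exactly_one_active() {
    # grep -c exits non-zero when the count is 0, so guard it with "|| true"
    n=$(grep -c '^active$' || true)
    [ "$n" -eq 1 ]
}

# Usage on a live cluster (NameNode IDs taken from dfs.ha.namenodes.zzg):
#   for id in master1 data1; do
#       hdfs haadmin -getServiceState "$id"
#   done | exactly_one_active && echo "HA state OK"
```

The check fails both when no namenode is active (zkfc not running) and when both report active, which are the two states worth catching after a failover test.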

5.3 YARN HA verification

5.3.1   Configure the YARN HA parameters

# The HA parameters were already set during the Hadoop HA configuration

5.3.2   Verification

1)    Run on both the primary and the standby namenode hosts

yarn-daemon.sh start resourcemanager

2)    Run on the primary namenode host

yarn-daemons.sh start nodemanager

3)    Run on the primary namenode host

mr-jobhistory-daemon.sh start historyserver

4)    Check the YARN HA state

yarn rmadmin -getServiceState rm1

yarn rmadmin -getServiceState rm2

5)    Test by running kill -9 on the active resourcemanager

6)    Run the wordcount MR program with the following command:

hadoop jar /home/shenl/hadoop-2.5.0-cdh5.3.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.3.jar wordcount /shenl/gc.log /shenl3/

If the job succeeds, YARN HA is considered working.
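After the kill -9 in step 5, the surviving resourcemanager needs a moment to take over, so a single state query may still report standby. A sketch of a polling helper follows; the function name wait_for_active is hypothetical, and it assumes the wrapped command prints just the state, as yarn rmadmin -getServiceState does.

```shell
# wait_for_active TIMEOUT_SECONDS CMD [ARGS...]
# Re-runs CMD about once per second until it prints "active" or the
# timeout expires. Returns 0 on success, 1 on timeout.
wait_for_active() {
    timeout=$1
    shift
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # A failing query (for example while the daemon restarts) counts as "not yet"
        state=$("$@" 2>/dev/null || true)
        if [ "$state" = "active" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

Usage, if rm1 was the active resourcemanager that was killed: wait_for_active 60 yarn rmadmin -getServiceState rm2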

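6 HBase HA configuration

6.1 HBase HA configuration

6.1.1   Configure the HBase HA parameters

1)    Copy core-site.xml and hdfs-site.xml, as configured for Hadoop HA, into the conf directory on the HMaster node

cp /home/shenl/hadoop-2.5.0-cdh5.3.3/etc/hadoop/core-site.xml .

cp /home/shenl/hadoop-2.5.0-cdh5.3.3/etc/hadoop/hdfs-site.xml .

2)    In the HMaster's conf directory, create a backup-masters file containing the hostname of the backup master (for example data1)

3)    scp the contents of the HMaster node's conf directory to every slave node

4)    Run start-hbase.sh on the HBase master node

6.2 HBase HA verification

kill -9 an active HMaster, then run the following in the HBase shell

put 'shenl','row11','a:name','hello'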

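7 Finalize the Hadoop upgrade

# Once the cluster is stable, finalize to commit this upgrade.

hadoop dfsadmin -finalizeUpgrade

8 Summary

When problems occur, analyze them with the help of the logs.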
