Hadoop Cluster Installation, Deployment, and Configuration (2015-01-15)

1. Cluster Environment Overview

Host list

(figure: host list)

Notes:

In the text below, the blue parts are the actual commands to execute, the red parts are important configuration values, and anything after "##" is a comment.

a. The native libraries bundled in the binary package from the official Hadoop site are 32-bit by default, so they need to be rebuilt for 64-bit.

sqoop-1.4.5 does not currently support hadoop-2.5.0, so it also has to be compiled manually.

The firewall and SELinux must be disabled on every node.
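
A minimal sketch of disabling the firewall and SELinux (assuming CentOS/RHEL 6 style tooling; adapt to the actual OS), to be run on every node:

# service iptables stop                ##stop the firewall now
# chkconfig iptables off               ##keep it disabled across reboots
# setenforce 0                         ##put SELinux into permissive mode immediately
# sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config   ##disable SELinux permanently (effective after reboot)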

b. Required software versions:

hbase-0.98.9-hadoop2-bin.tar.gz

hadoop-2.5.0.tar.gz

apache-hive-0.13.1-bin.tar.gz

jdk-7u71-linux-x64.gz

sqoop-1.4.5.tar.gz

zookeeper-3.4.6.tar.gz

c. Cluster architecture diagram:

(figure: cluster architecture)

2. Hadoop Cluster Environment Configuration

2.1. Package preparation and JDK installation

a) Download the JDK and the other packages (by default, all downloaded packages are placed in /opt)

# wget http://download.oracle.com/otn-pub/java/jdk/7u71-b14/jdk-7u71-linux-x64.tar.gz 

Download hadoop, hbase, hive, pig, and zookeeper from https://archive.apache.org/dist/hadoop/ (example commands below)

# wget http://mirrors.cnnic.cn/apache/sqoop/1.4.5/sqoop-1.4.5.tar.gz
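
For reference, one possible set of download commands for the remaining packages; the exact archive paths are assumptions and should be checked against https://archive.apache.org/dist/ before use:

# wget https://archive.apache.org/dist/hadoop/common/hadoop-2.5.0/hadoop-2.5.0.tar.gz
# wget https://archive.apache.org/dist/hbase/hbase-0.98.9/hbase-0.98.9-hadoop2-bin.tar.gz
# wget https://archive.apache.org/dist/hive/hive-0.13.1/apache-hive-0.13.1-bin.tar.gz
# wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz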

b) Install the JDK

# rpm -qa | grep java   ##if other Java versions are installed, remove them with "rpm -e <package-name>"

# mkdir /usr/java

# tar -zxf jdk-7u71-linux-x64.tar.gz -C /usr/java

# vim /etc/profile   ##append the following

#java path

export JAVA_HOME=/usr/java/jdk1.7.0_71

export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib

export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin

# source /etc/profile

# echo $JAVA_HOME; java -version    ##verify

 

##Note: the Java environment must be configured on every node; a distribution sketch follows
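
A minimal sketch of pushing the JDK and the /etc/profile additions from namenode1 to the other nodes (the host names are the ones used throughout this article; this assumes root SSH access, by password until section 2.3 is completed):

for node in namenode2 rmanager datanode1 datanode2 datanode3 datanode4 datanode5; do
    rsync -av /usr/java/ ${node}:/usr/java/            ##copy the extracted JDK
    rsync -av /etc/profile ${node}:/etc/profile        ##copy the environment variables
    ssh ${node} 'source /etc/profile; java -version'   ##quick sanity check
done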

2.2. Edit the hosts file

# vim /etc/hosts    ##add the following entries

10.40.214.1   namenode1

10.40.214.3   namenode2

10.40.214.4   rmanager

10.40.214.5   datanode1

10.40.214.6   datanode2

10.40.214.7   datanode3

10.40.214.8   datanode4

10.40.214.9   datanode5

##Note: the above must be done on every node

2.3. Configure passwordless SSH login

Run the following commands on each of 10.40.214.[1,3,4,5,6,7,8,9]:

# ssh-keygen -t rsa

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.40.214.1

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.40.214.3

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.40.214.4

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.40.214.5

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.40.214.6

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.40.214.7

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.40.214.8

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.40.214.9

# ssh <ip>    ##verify

 

##Note: it is better to create a dedicated hadoop user, switch to it (su - hadoop), perform the steps above as that user, and put the environment variables in that user's .bash_profile instead of /etc/profile (see the sketch below)
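
A minimal sketch of that approach (the user name and ownership choices are assumptions; run on every node):

# useradd hadoop && passwd hadoop      ##create the user and set a password
# chown -R hadoop:hadoop /var/data     ##hand over the install directory used below, once it exists
# su - hadoop
$ vim ~/.bash_profile                  ##put the JAVA_HOME/HADOOP_HOME/PATH exports here
$ ssh-keygen -t rsa                    ##then repeat the ssh-copy-id steps as the hadoop user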

2.4. ZooKeeper cluster installation and configuration

a) First install ZooKeeper on 10.40.214.1

# tar -zxf zookeeper-3.4.6.tar.gz -C /var/data/; mv /var/data/zookeeper-3.4.6 /var/data/zookeeper

# vim /var/data/zookeeper/conf/zoo.cfg   ##edit the main ZooKeeper configuration file (copy conf/zoo_sample.cfg to zoo.cfg first if it does not exist); the key settings:

dataDir=/var/data/zookeeper/data

server.0=namenode1:2888:3888

server.1=namenode2:2888:3888

server.2=rmanager:2888:3888

# vim /var/data/zookeeper/data/myid   ##create the file myid with the content 0

0         ##note: this 0 must match the number after "server." for this host in zoo.cfg

# vim /etc/profile              ##add the ZooKeeper environment variables

                  #zookeeper path

export ZOOKEEPER_HOME=/var/data/zookeeper

export PATH=$PATH:$HADOOP_HOME/bin:$ZOOKEEPER_HOME/bin

# source /etc/profile

b) Once the above is done, sync ZooKeeper to the other nodes (at least 3 nodes in total, and keep the node count odd)

# rsync -av zookeeper namenode2:/var/data/

# rsync -av zookeeper rmanager:/var/data/

# vim /var/data/zookeeper/data/myid   ##change it to 1 and 2 respectively, matching the server.X numbers defined in zoo.cfg (see the sketch after this step)

# vim /etc/profile              ##add the ZooKeeper environment variables

                  #zookeeper path

export ZOOKEEPER_HOME=/var/data/zookeeper

export PATH=$PATH:$HADOOP_HOME/bin:$ZOOKEEPER_HOME/bin

# source /etc/profile
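
For clarity, a minimal way of setting myid on the other two nodes, with the values taken from the server.N lines in zoo.cfg above:

# ssh namenode2 "echo 1 > /var/data/zookeeper/data/myid"
# ssh rmanager "echo 2 > /var/data/zookeeper/data/myid"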

c) Starting and stopping ZooKeeper

Start: # cd /var/data/zookeeper/bin; ./zkServer.sh start   ##must be run on every node in the cluster

Stop: # cd /var/data/zookeeper/bin; ./zkServer.sh stop   ##must be run on every node in the cluster

Other available commands: # ./zkServer.sh --help

./zkServer.sh {start|start-foreground|stop|restart|status|upgrade|print-cmd}

d) Verification

# cd /var/data/zookeeper/bin/; sh zkCli.sh

[zk: localhost:2181(CONNECTED) 0] ls /      ##"ls /" is the user input

                  [zookeeper]                ##expected output, indicating everything is correct
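
It is also worth checking each node's role in the ensemble; a minimal check, run on every ZooKeeper node:

# /var/data/zookeeper/bin/zkServer.sh status   ##one node should report "Mode: leader", the other two "Mode: follower"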

2.5. Hadoop cluster installation and configuration

a) Overview of the main Hadoop configuration files

(figure: overview of the main Hadoop configuration files)

b) Install Hadoop

# tar -zxf hadoop-2.5.0.tar.gz -C /var/data/; mv /var/data/hadoop-2.5.0 /var/data/hadoop

# vim /etc/profile              ##add the Hadoop environment variables (these must be added on every node)

         #hadoop path

export HADOOP_HOME_WARN_SUPPRESS=1

export HADOOP_HOME=/var/data/hadoop

HADOOP_CONF_DIR=/var/data/hadoop/etc/hadoop

export HADOOP_CONF_DIR

HADOOP_LOG_DIR=/var/data/hadoop/logs

export HADOOP_LOG_DIR

# source /etc/profile

c) Go to /var/data/hadoop/etc/hadoop and apply the following configuration

====== Configuration begins ======

# mkdir /var/hadoop/

# vim core-site.xml

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

 <property>

    <name>hadoop.tmp.dir</name>

    <value>/var/hadoop/tmp</value>

    <description>A base for other temporary directories.</description>

 </property>

 <property>

    <name>dfs.replication</name>

    <value>2</value>

 </property>

 <property>

    <name>fs.trash.interval</name>

    <value>1440</value>

    <description>Number of minutes between trash checkpoints.

      If zero, the trash feature is disabled.

    </description>

 </property>

 <property>

    <name>fs.defaultFS</name>

    <value>hdfs://cluster1</value>

 </property>

 <property>

    <name>ha.zookeeper.quorum</name>

    <value>namenode1:2181,namenode2:2181,rmanager:2181</value>

 </property>

</configuration>

# vim hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

  <property>

      <name>dfs.replication</name>

      <value>2</value>

  </property>

  <property>

      <name>dfs.nameservices</name>

      <value>cluster1</value>                               ##this name is arbitrary; it is referenced by the settings below

      <description>

          Comma-separated list of nameservices, the same as fs.defaultFS in core-site.xml.

      </description>

  </property>

 

  <property>

      <name>dfs.ha.namenodes.cluster1</name>

      <value>namenode1,namenode2</value>   ##the NameNodes that form the HA pair

  </property>

 

  <property>

       <name>dfs.namenode.rpc-address.cluster1.namenode1</name>  ##settings for namenode1

       <value>namenode1:9000</value>

  </property>

  <property>

       <name>dfs.namenode.http-address.cluster1.namenode1</name>

       <value>namenode1:50070</value>

  </property>

  <property>

       <name>dfs.namenode.servicerpc-address.cluster1.namenode1</name>

       <value>namenode1:53310</value>

  </property>

  <property>

       <name>dfs.namenode.rpc-address.cluster1.namenode2</name>  ##settings for namenode2

       <value>namenode2:9000</value>

  </property>

  <property>

       <name>dfs.namenode.http-address.cluster1.namenode2</name>

       <value>namenode2:50070</value>

  </property>

  <property>

       <name>dfs.namenode.servicerpc-address.cluster1.namenode2</name>

       <value>namenode2:53310</value>

  </property>

  <property>

       <name>dfs.ha.automatic-failover.enabled.cluster1</name>  ##enable automatic failover for the cluster

       <value>true</value>

       <description>

            Whether automatic failover is enabled. See the HDFS High

           Availability documentation for details on automatic HA

           configuration.

       </description>

  </property>

  <property>

       <name>dfs.client.failover.proxy.provider.cluster1</name>
       <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

       <description>

           Configure the name of the Java class which will be used

           by the DFS Client to determine which NameNode is the current Active,

           and therefore which NameNode is currently serving client requests.

       </description>

  </property>

  <property>

       <name>dfs.namenode.shared.edits.dir</name>  ##the JournalNode quorum

       <value>qjournal://namenode1:8485;namenode2:8485;rmanager:8485/cluster1</value>

       <description>shared storage used by HA to record edits; usually an NFS mount point or, as here, a QJM URI</description>

  </property>

  <property>

       <name>dfs.journalnode.edits.dir</name>  ##local directory where each JournalNode stores the shared edits

       <value>/var/hadoop/qjm</value>

  </property>

  <property>

       <name>dfs.ha.fencing.methods</name>  ##fencing method used during a failover

       <value>sshfence</value>

       <description>how to communicate during the switch process</description>

  </property>

   <property>

       <name>dfs.ha.fencing.ssh.private-key-files</name>  ##local path of the SSH private key used for fencing

       <value>/root/.ssh/id_rsa</value>

       <description>the location of the stored ssh key</description>

  </property>

  <property>

       <name>dfs.ha.fencing.ssh.connect-timeout</name>

       <value>1000</value>

  </property>

<!--

  <property>

       <name>dfs.namenode.handler.count</name>

       <value>7</value>

       <description>

           More NameNode server threads to handle RPCs from large number of DataNodes.

       </description>

  </property>

  <property>

       <name>dfs.block.size</name>

       <value>268435456</value>

       <description>The default block size for new files</description>

  </property>

  <property>

       <name>dfs.datanode.max.xcievers</name>

       <value>10240</value>

       <description>

           An Hadoop HDFS datanode has an upper bound on the number of files that it will serve at any one time.

       </description>

  </property>

  <property>

       <name>dfs.datanode.du.reserved</name>

       <value>32212254720</value>

       <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.</description>

  </property>

 -->

</configuration>

# mv mapred-site.xml.template mapred-site.xml

# vim mapred-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

  <property>

      <name>mapreduce.framework.name</name>

      <value>yarn</value>

      <description>Execution framework set to Hadoop YARN.</description>

  </property>

<!-- MapReduce Applications related configuration ***BEGIN*** -->

  <property>

      <name>mapred.reduce.tasks</name>

      <value>10</value>

      <description>The default number of reduce tasks per job.</description>

  </property>

  <property>

      <name>mapreduce.map.memory.mb</name>

      <value>2048</value>   ##assumed value, sized to match mapreduce.reduce.memory.mb below

      <description>Larger resource limit for maps.</description>

  </property>

  <property>

      <name>mapreduce.map.java.opts</name>

      <value>-Xmx2048M</value>

      <description>Larger heap-size for child jvms of maps.</description>

  </property>

  <property>

      <name>mapreduce.reduce.memory.mb</name>

      <value>2048</value>

      <description>Larger resource limit for reduces.</description>

  </property>

  <property>

      <name>mapreduce.reduce.java.opts</name>

      <value>-Xmx2048M</value>

      <description>Larger heap-size for child jvms of reduces.</description>

  </property>

  <property>

      <name>mapreduce.task.io.sort.mb</name>

       <value>1024</value>

      <description>Higher memory-limit while sorting data for efficiency.</description>

  </property>

  <property>

      <name>mapreduce.task.io.sort.factor</name>

      <value>10</value>

      <description>More streams merged at once while sorting files.</description>

  </property>

  <property>

      <name>mapreduce.reduce.shuffle.parallelcopies</name>

      <value>20</value>

      <description>Higher number of parallel copies run by reduces to fetch outputs from very large number of maps</description>

  </property>

  <!-- MapReduce Applications related configuration ***END*** -->

  <!-- MapReduce JobHistory Server related configuration ***BEGIN***-->

  <property>

      <name>mapreduce.jobhistory.address</name>

      <value>rmanager:10020</value>

      <description>MapReduce JobHistory Server host:port.    Default port is 10020.</description>

  </property>

  <property>

      <name>mapreduce.jobhistory.webapp.address</name>

      <value>rmanager:19888</value>

      <description>MapReduce JobHistory Server Web UI host:port. Default port is 19888.</description>

  </property>

  <property>

      <name>mapreduce.jobhistory.intermediate-done-dir</name>

      <value>/var/hadoop/mr_history/tmp</value>

      <description>Directory where history files are written by MapReduce jobs.</description>

  </property>

  <property>

      <name>mapreduce.jobhistory.done-dir</name>

      <value>/var/hadoop/mr_history/done</value>

      <description>Directory where history files are managed by the MR JobHistory Server.</description>

 </property>

 <!-- MapReduce JobHistory Server related configuration ***END***-->

</configuration>

 

# vim yarn-site.xml

<?xml version="1.0"?>

<configuration>

   <property>

      <name>yarn.resourcemanager.hostname</name>  ##hostname of the ResourceManager

      <value>rmanager</value>

      <description>The hostname of the RM.</description>

   </property>    

<property>

     <name>yarn.nodemanager.aux-services</name>

     <value>mapreduce_shuffle</value>

     <description>Shuffle service that needs to be set for

Map Reduce applications.

</description>

  </property>

</configuration>

 

# vim hadoop-env.sh   ##set the correct JDK installation path

export JAVA_HOME=/usr/java/jdk1.7.0_71

 

# vim yarn-env.sh     ##set the correct JDK installation path

export JAVA_HOME=/usr/java/jdk1.7.0_71

 

# vim slaves         ##add the DataNode hosts

namenode1

namenode2

rmanager

datanode1

datanode2

datanode3

datanode4

datanode5

 

# vim masters

         rmanager

====== Configuration ends ======

2.6. Distribute Hadoop to the remaining nodes

# rsync -avz -P hadoop namenode2:/var/data/

# rsync -avz -P hadoop rmanager:/var/data/

# rsync -avz -P hadoop datanode1:/var/data/

# rsync -avz -P hadoop datanode2:/var/data/

# rsync -avz -P hadoop datanode3:/var/data/

# rsync -avz -P hadoop datanode4:/var/data/

# rsync -avz -P hadoop datanode5:/var/data/
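
The working directory used in the configuration above (/var/hadoop) and the /etc/profile additions are needed on the other nodes as well; a minimal sketch of pushing them out (creating the directory up front is harmless even where a daemon would create it itself):

for node in namenode2 rmanager datanode1 datanode2 datanode3 datanode4 datanode5; do
    ssh ${node} "mkdir -p /var/hadoop"
    rsync -av /etc/profile ${node}:/etc/profile
done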

2.7. Format the ZooKeeper cluster (create the HA znode)

Run the following commands on namenode1:

# /var/data/hadoop/bin/hdfs zkfc -formatZK  ##creates the HA znodes for the cluster in the ZooKeeper ensemble

# /var/data/zookeeper/bin/zkCli.sh

[zk: localhost:2181(CONNECTED) 0] ls /hadoop-ha

         [cluster1]                ##seeing this output means it worked

 

##The purpose of the format step is to create a znode in the ZooKeeper ensemble that stores the state of cluster1's NameNodes

2.8. Start the JournalNode cluster

Run the following on namenode1, namenode2, and rmanager:

# /var/data/hadoop/sbin/hadoop-daemon.sh start journalnode   ##start the JournalNode

# jps        ##verify

         10661 JournalNode    ##expected output, indicating OK

# ll /var/hadoop/qjm/

total 4

drwxr-xr-x 3 root root 4096 Jan  9 01:38 cluster1

 

##After the JournalNodes start, a directory (/var/hadoop/qjm) is created on the local disk; it is used to store the NameNode edits data

2.9. Format the NameNode for cluster1

Pick either namenode1 or namenode2 to format; here namenode1 is chosen. Run the following on namenode1:

# /var/data/hadoop/bin/hdfs namenode -format -clusterId cluster1   ##format the cluster's NameNode

# ll /var/hadoop/tmp/dfs/          ##verify

total 8

drwx------ 3 root root 4096 Jan  9 01:42 data

drwxr-xr-x 3 root root 4096 Jan  9 01:38 name

 

##Formatting the NameNode creates a directory on disk (/var/hadoop/tmp/dfs/name) that holds the NameNode's fsimage and edits files

2.10. Start the NameNode

On namenode1, start the NameNode that was formatted in 2.9:

# /var/data/hadoop/sbin/hadoop-daemon.sh start namenode  ##start the NameNode

# jps     ##verify

10782 NameNode

 

## Web verification: http://10.40.214.1:50070/

2.11. Sync the NameNode metadata to the other NameNode

On namenode2, run the sync command and then start the NameNode:

# /var/data/hadoop/bin/hdfs namenode -bootstrapStandby  ##sync the NameNode metadata

# ll /var/hadoop/tmp/dfs/   ##verify

total 8

drwx------ 3 root root 4096 Jan  9 01:42 data

drwxr-xr-x 3 root root 4096 Jan  9 01:39 name

# /var/data/hadoop/sbin/hadoop-daemon.sh start namenode  ##start the NameNode

# jps     ##verify

         10774 NameNode

 

## Web verification: http://10.40.214.3:50070/

2.12. Start all the DataNodes

Run the following on namenode1:

# /var/data/hadoop/sbin/hadoop-daemons.sh start datanode  ##start the DataNodes

# jps     ##verify

11075 DataNode

2.13. Start YARN

Run the following on rmanager:

# /var/data/hadoop/sbin/start-yarn.sh    ##start YARN

# jps     ##verify; the ResourceManager and NodeManager Java processes should appear

         22232 ResourceManager

22337 NodeManager          ##note: the other nodes run only a NodeManager
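
As an extra check, the NodeManagers registered with the ResourceManager can be listed from rmanager; a minimal sketch:

# /var/data/hadoop/bin/yarn node -list   ##every host in the slaves file should appear as RUNNING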

2.14. Start the ZooKeeperFailoverController

Run the following on namenode1 and namenode2:

# /var/data/hadoop/sbin/hadoop-daemon.sh start zkfc  ##start the ZKFC

# jps      ##verify; the DFSZKFailoverController Java process should appear

11316 DFSZKFailoverController
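
With both ZKFCs running, the active/standby role of each NameNode can be queried using the service IDs defined in hdfs-site.xml; a minimal check:

# /var/data/hadoop/bin/hdfs haadmin -getServiceState namenode1   ##prints "active" or "standby"
# /var/data/hadoop/bin/hdfs haadmin -getServiceState namenode2   ##should print the other of the two states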

2.15. Starting and stopping the Hadoop services

Start the services

# cd /var/data/hadoop

# sbin/start-dfs.sh    ##run on namenode1

# sbin/start-yarn.sh   ##run on rmanager

 

Stop the services

# cd /var/data/hadoop

# sbin/stop-dfs.sh   ##run on namenode1

# sbin/stop-yarn.sh  ##run on rmanager

2.16. Verification

Check the running processes

# jps

 

Access via a browser

NameNode status: http://10.40.214.1:50070/ and http://10.40.214.3:50070/

YARN ResourceManager: http://10.40.214.4:8088/  ##as defined in yarn-site.xml
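
Finally, a small end-to-end smoke test can be run from namenode1; the example jar path below assumes the standard hadoop-2.5.0 binary layout and should be adjusted if it differs:

# /var/data/hadoop/bin/hdfs dfs -mkdir -p /tmp/smoketest
# /var/data/hadoop/bin/hdfs dfs -put /etc/hosts /tmp/smoketest/
# /var/data/hadoop/bin/hdfs dfs -ls /tmp/smoketest        ##confirms HDFS writes and reads work
# /var/data/hadoop/bin/hadoop jar /var/data/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar pi 5 100   ##confirms YARN can run a MapReduce job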