Hadoop 2.2.0 HA High-Availability Distributed Cluster Setup (HBase, Hive, Sqoop, Spark)


1 Required Software

Hadoop-2.2.0

HBase-0.96.2 (use this version; it is matched to Hadoop-2.2.0, so there is no need to overwrite any jar files)

Hive-0.13.1

ZooKeeper-3.4.6 (ZooKeeper-3.4.5 is recommended instead, so you do not have to replace the zookeeper-3.4.5.jar bundled with Storm and Hive)

Sqoop-1.4.5

Scala-2.10.4

Spark-1.0.2-bin-hadoop2

JDK 1.7.0_51

2 Cluster Structure Diagram

NN: NameNode
JN: JournalNode
DN: DataNode
ZK: ZooKeeper
HM: HMaster
HRS: HRegionServer
SpkMS: Spark Master
SpkWK: Spark Worker
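(The original structure diagram image is not reproduced here. The role assignment can be reconstructed from the configuration files below: NameNodes on hadoop59 and hadoop60, JournalNodes on hadoop26-28, the ZooKeeper ensemble on hadoop37, hadoop38, and hadoop40-44, and DataNodes on the twelve hosts listed in the slaves file.)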




3 ZooKeeper-3.4.6

Add the environment variables:

## set ZooKeeper environment
export ZOOKEEPER_HOME=/home/cloud/zookeeper346
export PATH=$PATH:$ZOOKEEPER_HOME/bin

3.1 Modify the zoo.cfg configuration file

cloud@hadoop37:~/zookeeper346/conf> ls
configuration.xsl  log4j.properties  zookeeper.out  zoo_sample.cfg
cloud@hadoop37:~/zookeeper346/conf> cp zoo_sample.cfg zoo.cfg
cloud@hadoop37:~/zookeeper346/conf> vi zoo.cfg

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/cloud/zookeeper346/zkdata
dataLogDir=/home/cloud/zookeeper346/logs
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

server.1=hadoop37:2888:3888
server.2=hadoop38:2888:3888
server.3=hadoop40:2888:3888
server.4=hadoop41:2888:3888
server.5=hadoop42:2888:3888
server.6=hadoop43:2888:3888
server.7=hadoop44:2888:3888
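In each server.N line, 2888 is the port quorum peers use to talk to the leader and 3888 is the leader-election port; N must match the myid written on that host in the next step. With seven servers, the ensemble keeps a quorum through up to three failures.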

3.2 Create the myid file under dataDir

cloud@hadoop37:~/zookeeper346/zkdata> vi myid
cloud@hadoop37:~/zookeeper346/zkdata> ll
total 12
-rw-r--r-- 1 cloud hadoop 2 May 28 18:54 myid
cloud@hadoop37:~/zookeeper346/zkdata>

3.3 Copy (scp) to the other servers

cloud@hadoop37:~ > scp -r /home/cloud/zookeeper346 cloud@hadoop38:~/
cloud@hadoop37:~ > scp -r /home/cloud/zookeeper346 cloud@hadoop40:~/
cloud@hadoop37:~ > scp -r /home/cloud/zookeeper346 cloud@hadoop41:~/
cloud@hadoop37:~ > scp -r /home/cloud/zookeeper346 cloud@hadoop42:~/
cloud@hadoop37:~ > scp -r /home/cloud/zookeeper346 cloud@hadoop43:~/
cloud@hadoop37:~ > scp -r /home/cloud/zookeeper346 cloud@hadoop44:~/

Then just edit the …data/myid file on each server to the corresponding id:

write 1 on hadoop37,

write 2 on hadoop38,

and so on …
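A minimal sketch that does the same in one pass (it assumes passwordless ssh as the cloud user and the sequential host-to-id mapping from zoo.cfg above):

id=1
for h in hadoop37 hadoop38 hadoop40 hadoop41 hadoop42 hadoop43 hadoop44; do
    # server.N in zoo.cfg must match the myid written on that host
    ssh cloud@$h "echo $id > /home/cloud/zookeeper346/zkdata/myid"
    id=$((id + 1))
done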

4 Hadoop-2.2.0

Add the environment variables:

## set Hadoop environment
export HADOOP_HOME=/home/cloud/hadoop220
export YARN_HOME=/home/cloud/hadoop220
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_PREFIX=${HADOOP_HOME}
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib"
export PATH=$PATH:$HADOOP_HOME/bin
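A quick sanity check (assuming the variables above were added to ~/.bashrc):

source ~/.bashrc
hadoop version    # should report Hadoop 2.2.0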

4.1 Modify the seven configuration files

~/hadoop220/etc/hadoop/hadoop-env.sh
~/hadoop220/etc/hadoop/core-site.xml
~/hadoop220/etc/hadoop/hdfs-site.xml
~/hadoop220/etc/hadoop/mapred-site.xml
~/hadoop220/etc/hadoop/yarn-env.sh
~/hadoop220/etc/hadoop/yarn-site.xml
~/hadoop220/etc/hadoop/slaves

4.1.1 Modify hadoop-env.sh (JDK path)

cloud@hadoop59:~/hadoop220/etc/hadoop> pwd
/home/cloud/hadoop220/etc/hadoop
cloud@hadoop59:~/hadoop220/etc/hadoop> vi hadoop-env.sh

…
# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.7.0_51

4.1.2 Modify core-site.xml (note the fs.defaultFS setting)

cloud@hadoop59:~/hadoop220/etc/hadoop> vi core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/cloud/hadoop220/temp</value>
    </property>
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop37:2181,hadoop38:2181,hadoop40:2181,hadoop41:2181,hadoop42:2181,hadoop43:2181,hadoop44:2181</value>
    </property>
</configuration>
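Note that fs.defaultFS points at the logical nameservice name (mycluster) rather than a single NameNode host; clients resolve it through the failover proxy provider configured in hdfs-site.xml below, so a NameNode failover is transparent to them.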

4.1.3 Modify hdfs-site.xml

cloud@hadoop59:~/hadoop220/etc/hadoop> vi hdfs-site.xml

 

<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>hadoop59,hadoop60</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.hadoop59</name>
        <value>hadoop59:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.hadoop60</name>
        <value>hadoop60:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.hadoop59</name>
        <value>hadoop59:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.hadoop60</name>
        <value>hadoop60:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop26:8485;hadoop27:8485;hadoop28:8485/mycluster</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///home/cloud/hadoop220/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///data1,file:///data2,file:///data3</value>
    </property>
    <!-- Only data1 through data3 are used for now; to add more data disks
         later, just extend this value and restart the cluster. -->
    <property>
        <name>dfs.ha.automatic-failover.enabled.mycluster</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/cloud/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/cloud/hadoop220/tmp/journal</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>
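The sshfence method means that, on failover, the active ZKFC logs in to the other NameNode over SSH to kill the stale process, so passwordless SSH between hadoop59 and hadoop60 with the key named in dfs.ha.fencing.ssh.private-key-files must already be in place (e.g. set up with ssh-copy-id).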

4.1.4 Modify mapred-site.xml

cloud@hadoop59:~/hadoop220/etc/hadoop> cp mapred-site.xml.template mapred-site.xml
cloud@hadoop59:~/hadoop220/etc/hadoop> vi mapred-site.xml

 

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

4.1.5 Modify yarn-env.sh

cloud@hadoop59:~/hadoop220/etc/hadoop> vi yarn-env.sh

# some Java parameters
export JAVA_HOME=/usr/java/jdk1.7.0_51

4.1.6 Modify yarn-site.xml

cloud@hadoop59:~/hadoop220/etc/hadoop> vi yarn-site.xml

 

<configuration>
<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop59</value>
    </property>
</configuration>
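Note that this configures a single ResourceManager on hadoop59: unlike HDFS above, YARN in this setup has no HA, so the ResourceManager remains a single point of failure (ResourceManager HA only arrived in later Hadoop 2.x releases).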

4.1.7 Modify the slaves file

cloud@hadoop59:~/hadoop220/etc/hadoop> vi slaves

 

hadoop26
hadoop27
hadoop28
hadoop29
hadoop36
hadoop37
hadoop38
hadoop40
hadoop41
hadoop42
hadoop43
hadoop44
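The configured hadoop220 directory still has to reach every node. A sketch that pushes it to each slave plus the standby NameNode (assuming the same /home/cloud layout and passwordless ssh everywhere):

for h in $(cat /home/cloud/hadoop220/etc/hadoop/slaves) hadoop60; do
    scp -r /home/cloud/hadoop220 cloud@$h:~/
done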

5 Hadoop configuration is done; start the services (these notes keep only the important log output)

5.1 Start ZooKeeper on every node

cloud@hadoop37:~/zookeeper346> pwd
/home/cloud/zookeeper346
cloud@hadoop37:~/zookeeper346> bin/zkServer.sh start
JMX enabled by default
Using config: /home/cloud/zookeeper346/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

Start it the same way on the other servers; not repeated here …
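A loop sketch that starts the remaining ensemble members from hadoop37 (assuming passwordless ssh):

for h in hadoop38 hadoop40 hadoop41 hadoop42 hadoop43 hadoop44; do
    ssh cloud@$h "/home/cloud/zookeeper346/bin/zkServer.sh start"
done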

# Verify that ZooKeeper started successfully (1)

Checking ZooKeeper's status on hadoop41 shows that it is the leader:

cloud@hadoop41:~> zkServer.sh status
JMX enabled by default
Using config: /home/cloud/zookeeper346/bin/../conf/zoo.cfg
Mode: leader
cloud@hadoop41:~>

Checking the status on the other machines shows they are followers.
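A status sweep over the whole ensemble (same ssh assumption) should show exactly one leader and six followers:

for h in hadoop37 hadoop38 hadoop40 hadoop41 hadoop42 hadoop43 hadoop44; do
    echo -n "$h: "
    ssh cloud@$h "/home/cloud/zookeeper346/bin/zkServer.sh status" 2>/dev/null | grep Mode
done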

# Verify that ZooKeeper started successfully (2)

cloud@hadoop41:~> zookeeper346/bin/zkCli.sh
Connecting to localhost:2181

2015-06-01 15:50:50,888 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2015-06-01 15:50:50,895 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=hadoop41
2015-06-01 15:50:50,895 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.7.0_51
2015-06-01 15:50:50,900 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2015-06-01 15:50:50,900 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/java/jdk1.7.0_51/jre
2015-06-01 15:50:50,900 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/home/cloud/zookeeper346/bin/../build/classes:/home/cloud/zookeeper346/bin/../build/lib/*.jar:/home/cloud/zookeeper346/bin/../lib/slf4j-log4j12-1.6.1.jar:/home/cloud/zookeeper346/bin/../lib/slf4j-api-1.6.1.jar:/home/cloud/zookeeper346/bin/../lib/netty-3.7.0.Final.jar:/home/cloud/zookeeper346/bin/../lib/log4j-1.2.16.jar:/home/cloud/zookeeper346/bin/../lib/jline-0.9.94.jar:/home/cloud/zookeeper346/bin/../zookeeper-3.4.6.jar:/home/cloud/zookeeper346/bin/../src/java/lib/*.jar:/home/cloud/zookeeper346/bin/../conf::/usr/java/jdk1.7.0_51/lib:/usr/java/jdk1.7.0_51/jre/lib
2015-06-01 15:50:50,901 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2015-06-01 15:50:50,901 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2015-06-01 15:50:50,901 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2015-06-01 15:50:50,901 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2015-06-01 15:50:50,902 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2015-06-01 15:50:50,902 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=3.0.13-0.27-default
2015-06-01 15:50:50,902 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=cloud
2015-06-01 15:50:50,902 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/home/cloud
2015-06-01 15:50:50,903 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/home/cloud
2015-06-01 15:50:50,906 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@75e5d16d
Welcome to ZooKeeper!
2015-06-01 15:50:50,959 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@975] - Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
2015-06-01 15:50:50,969 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@852] - Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
JLine support is enabled
2015-06-01 15:50:51,009 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1235] - Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x44d9d846f630001, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0]
[zk: localhost:2181(CONNECTED) 1] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 2]

If you see a prompt like this, ZooKeeper has started successfully.

5.2 Format ZooKeeper on hadoop59 (this must be run on one of the machines configured as a NameNode, otherwise it fails with an error; this is a known bug)

See https://issues.apache.org/jira/browse/HDFS-6731 for details.

cloud@hadoop59:~/hadoop220/bin > hdfs zkfc -formatZK

/cloud/hadoop220/contrib/capacity-scheduler/*.jar
15/06/01 09:42:20 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/home/cloud/hadoop220/lib/native
15/06/01 09:42:20 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
15/06/01 09:42:20 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
15/06/01 09:42:20 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
15/06/01 09:42:20 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
15/06/01 09:42:20 INFO zookeeper.ZooKeeper: Client environment:os.version=3.0.76-0.11-default
15/06/01 09:42:20 INFO zookeeper.ZooKeeper: Client environment:user.name=cloud
15/06/01 09:42:20 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/cloud
15/06/01 09:42:20 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/cloud/hadoop220/bin
15/06/01 09:42:20 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=hadoop37:2181,hadoop38:2181,hadoop40:2181,hadoop41:2181,hadoop42:2181,hadoop43:2181,hadoop44:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@58d48756
15/06/01 09:42:20 INFO zookeeper.ClientCnxn: Opening socket connection to server hadoop37/192.168.100.37:2181. Will not attempt to authenticate using SASL (unknown error)
15/06/01 09:42:20 INFO zookeeper.ClientCnxn: Socket connection established to hadoop37/192.168.100.37:2181, initiating session
15/06/01 09:42:20 INFO zookeeper.ClientCnxn: Session establishment complete on server hadoop37/192.168.100.37:2181, sessionid = 0x14d9d846f810001, negotiated timeout = 5000
15/06/01 09:42:20 INFO ha.ActiveStandbyElector: Session connected.
15/06/01 09:42:20 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/mycluster in ZK.
15/06/01 09:42:20 INFO zookeeper.ZooKeeper: Session: 0x14d9d846f810001 closed
15/06/01 09:42:20 INFO zookeeper.ClientCnxn: EventThread shut down

5.3 Verify that the zkfc format succeeded

cloud@hadoop59:~/hadoop220/bin > pwd
/home/cloud/hadoop220/bin
cloud@hadoop41:~> zookeeper346/bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[hadoop-ha, zookeeper]
[zk: localhost:2181(CONNECTED) 2] ls /hadoop-ha
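The second ls should list [mycluster], matching the /hadoop-ha/mycluster znode that the formatZK step reported creating.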
