Provided by Youling Studio
For questions, leave a comment on Sina Weibo: http://weibo.com/youlingR
1. Node preparation
Three nodes:
master 192.168.1.150
NameNode, ResourceManager, DataNode, NodeManager, ZooKeeper, JournalNode, DFSZKFailoverController
slave1 192.168.1.151
NameNode, DataNode, NodeManager, ZooKeeper, JournalNode, DFSZKFailoverController
slave2 192.168.1.152
DataNode, NodeManager, ZooKeeper, JournalNode
2. Basic configuration
Configure the hostname and IP address on every node.
Install the JDK and set its environment variables (append to /etc/profile):
export JAVA_HOME=/usr/java/jdk1.7.0
export PATH=$PATH:$JAVA_HOME/bin
source /etc/profile
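After sourcing /etc/profile, a quick sanity check (a sketch, assuming the JDK path above) confirms the JDK is actually on the PATH:

```shell
# Check that the JDK is visible after sourcing /etc/profile.
if command -v java >/dev/null 2>&1; then
  msg=$(java -version 2>&1 | head -n 1)
else
  msg="java not found on PATH - check JAVA_HOME in /etc/profile"
fi
echo "$msg"
```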
Configure passwordless SSH
Three directions are needed:
ResourceManager -> NodeManager (SSH used when starting the NodeManagers)
NameNode -> DataNode (SSH used when starting the DataNodes)
NameNode -> NameNode (SSH between the two NameNodes)
SSH for ResourceManager -> NodeManager and NameNode -> DataNode (run as the hadoop user on master):
cd $HOME/.ssh/
ssh-keygen -t rsa   # generate the key pair
ssh-copy-id -i ~/.ssh/id_rsa.pub master
ssh-copy-id -i ~/.ssh/id_rsa.pub slave1   # copy the public key to each node
ssh-copy-id -i ~/.ssh/id_rsa.pub slave2
ssh hadoop@slave1   # test the passwordless login
SSH between the NameNodes (run on slave1, so that it can reach master):
cd $HOME/.ssh/
ssh-keygen -t rsa   # generate the key pair
ssh-copy-id -i ~/.ssh/id_rsa.pub master
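The key setup above can be verified in one pass; a sketch (host list matches this guide; BatchMode makes ssh fail fast instead of falling back to a password prompt):

```shell
# Try a passwordless login to every node; report which ones still need ssh-copy-id.
hosts="master slave1 slave2"
for h in $hosts; do
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null; then
    echo "$h: passwordless ssh OK"
  else
    echo "$h: FAILED - rerun ssh-copy-id for this host"
  fi
done
```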
Configure the hosts file
sudo vim /etc/hosts   (the hadoop user needs sudo privileges for this)
192.168.1.150 master
192.168.1.151 slave1
192.168.1.152 slave2
scp /etc/hosts root@slave1:/etc/hosts
scp /etc/hosts root@slave2:/etc/hosts
3. ZooKeeper installation
Basic installation:
sudo mkdir /cloud/   # create the target directory first
sudo chown hadoop:hadoop /cloud/
wget http://apache.claz.org/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5.tar.gz   # download zk
tar -zxvf zookeeper-3.4.5.tar.gz -C /cloud/
ZooKeeper configuration (in /cloud/zookeeper-3.4.5/conf/):
cp zoo_sample.cfg zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/cloud/zookeeper-3.4.5/data
# the port at which the clients will connect
clientPort=2181
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
Copy the installation to the other nodes:
scp -r /cloud/zookeeper-3.4.5/ hadoop@slave1:/cloud/
scp -r /cloud/zookeeper-3.4.5/ hadoop@slave2:/cloud/
Setup
Configure each node's zk id. The myid file must live in the dataDir set in zoo.cfg, and its number must match the server.N entry for that host:
mkdir -p /cloud/zookeeper-3.4.5/data   # create the dataDir on every node
echo "1" > /cloud/zookeeper-3.4.5/data/myid   # on master
echo "2" > /cloud/zookeeper-3.4.5/data/myid   # on slave1
echo "3" > /cloud/zookeeper-3.4.5/data/myid   # on slave2
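Writing the three myid files by hand is error-prone; a small sketch that derives the id from the hostname instead (the mapping follows the server.N lines in zoo.cfg above):

```shell
# Map each hostname to its ZooKeeper id (must match server.N in zoo.cfg).
myid_for() {
  case "$1" in
    master) echo 1 ;;
    slave1) echo 2 ;;
    slave2) echo 3 ;;
    *) echo "unknown host: $1" >&2; return 1 ;;
  esac
}
# On each node, after creating the dataDir:
#   mkdir -p /cloud/zookeeper-3.4.5/data
#   myid_for "$(hostname)" > /cloud/zookeeper-3.4.5/data/myid
```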
Start
ZK_HOME=/cloud/zookeeper-3.4.5/
$ZK_HOME/bin/zkServer.sh start   # start on every node
jps   # every node should now show a QuorumPeerMain process
bin/zkServer.sh status   # one leader, the rest followers
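With passwordless ssh in place, the role check can be done from one machine; a sketch assuming the install path used in this guide:

```shell
# Query each node's role; zkServer.sh prints "Mode: leader" or "Mode: follower".
nodes="master slave1 slave2"
for h in $nodes; do
  printf '%s: ' "$h"
  ssh "$h" /cloud/zookeeper-3.4.5/bin/zkServer.sh status 2>/dev/null | grep Mode \
    || echo "not running"
done
```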
Test that ZooKeeper was installed correctly:
[hadoop@master zookeeper-3.4.5]$ bin/zkCli.sh
Connecting to localhost:2181
2014-06-10 01:37:16,763 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2014-06-10 01:37:16,774 [myid:] - INFO [main:Environment@100] - Client environment:host.name=master
2014-06-10 01:37:16,775 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.7.0
2014-06-10 01:37:16,775 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2014-06-10 01:37:16,776 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/java/jdk1.7.0/jre
2014-06-10 01:37:16,776 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/cloud/zookeeper-3.4.5/bin/../build/classes:/cloud/zookeeper-3.4.5/bin/../build/lib/*.jar:/cloud/zookeeper-3.4.5/bin/../lib/slf4j-log4j12-1.6.1.jar:/cloud/zookeeper-3.4.5/bin/../lib/slf4j-api-1.6.1.jar:/cloud/zookeeper-3.4.5/bin/../lib/netty-3.2.2.Final.jar:/cloud/zookeeper-3.4.5/bin/../lib/log4j-1.2.15.jar:/cloud/zookeeper-3.4.5/bin/../lib/jline-0.9.94.jar:/cloud/zookeeper-3.4.5/bin/../zookeeper-3.4.5.jar:/cloud/zookeeper-3.4.5/bin/../src/java/lib/*.jar:/cloud/zookeeper-3.4.5/bin/../conf:
2014-06-10 01:37:16,781 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/i386:/lib:/usr/lib
2014-06-10 01:37:16,782 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2014-06-10 01:37:16,782 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2014-06-10 01:37:16,782 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
2014-06-10 01:37:16,784 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=i386
2014-06-10 01:37:16,784 [myid:] - INFO [main:Environment@100] - Client environment:os.version=2.6.32-358.el6.i686
2014-06-10 01:37:16,785 [myid:] - INFO [main:Environment@100] - Client environment:user.name=hadoop
2014-06-10 01:37:16,785 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/home/hadoop
2014-06-10 01:37:16,786 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/cloud/zookeeper-3.4.5
2014-06-10 01:37:16,791 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@149f041
Welcome to ZooKeeper!
2014-06-10 01:37:16,898 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@966] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2014-06-10 01:37:16,911 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@849] - Socket connection established to localhost/127.0.0.1:2181, initiating session
JLine support is enabled
2014-06-10 01:37:16,992 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1207] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x14684e16e110000, negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 1] create /rolin rolin
Created /rolin
[zk: localhost:2181(CONNECTED) 2] ls
ZooKeeper -server host:port cmd args
connect host:port
get path [watch]
ls path [watch]
set path data [version]
rmr path
delquota [-n|-b] path
quit
printwatches on|off
create [-s] [-e] path data acl
stat path [watch]
close
ls2 path [watch]
history
listquota path
setAcl path acl
getAcl path
sync path
redo cmdno
addauth scheme auth
delete path [version]
setquota -n|-b val path
[zk: localhost:2181(CONNECTED) 3] ls /
[rolin, zookeeper]
[zk: localhost:2181(CONNECTED) 4] get /rolin
rolin
cZxid = 0x100000002
ctime = Tue Jun 10 01:37:50 PDT 2014
mZxid = 0x100000002
mtime = Tue Jun 10 01:37:50 PDT 2014
pZxid = 0x100000002
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 0
[zk: localhost:2181(CONNECTED) 5] set /rolin my name is rolin
Command failed: java.lang.NumberFormatException: For input string: "name"
[zk: localhost:2181(CONNECTED) 6] set /rolin youling
cZxid = 0x100000002
ctime = Tue Jun 10 01:37:50 PDT 2014
mZxid = 0x100000003
mtime = Tue Jun 10 01:38:45 PDT 2014
pZxid = 0x100000002
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 7
numChildren = 0
[zk: localhost:2181(CONNECTED) 7] get /rolin
youling
cZxid = 0x100000002
ctime = Tue Jun 10 01:37:50 PDT 2014
mZxid = 0x100000003
mtime = Tue Jun 10 01:38:45 PDT 2014
pZxid = 0x100000002
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 7
numChildren = 0
[zk: localhost:2181(CONNECTED) 8] delete /rolin
[zk: localhost:2181(CONNECTED) 9] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 10] quit
Quitting...
2014-06-10 01:39:11,351 [myid:] - INFO [main:ZooKeeper@684] - Session: 0x14684e16e110000 closed
2014-06-10 01:39:11,351 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@509] - EventThread shut down
4. Hadoop installation
4.1 Extract Hadoop into /cloud
tar -zxvf ~/Downloads/hadoop-2.2.0.tar.gz -C /cloud/
4.2 Edit the Hadoop configuration files
Six files need to be modified: hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, and slaves.
They are located in /cloud/hadoop-2.2.0/etc/hadoop/.
4.2.1 File hadoop-env.sh
Add the JDK environment variable (the same JAVA_HOME as in /etc/profile):
export JAVA_HOME=/usr/java/jdk1.7.0
4.2.2 File core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://cluster1</value>
</property>
[This is the default HDFS path. There is only one HDFS cluster here, so it is specified directly; the value cluster1 matches dfs.nameservices in hdfs-site.xml.]
<property>
<name>hadoop.tmp.dir</name>
<value>/cloud/hadoop-2.2.0/data</value>
</property>
[This path is the common base directory where NameNode, DataNode, JournalNode, etc. store their data; each of the three can also be given its own directory. Create the configured directory (/cloud/hadoop-2.2.0/data) yourself.]
<property>
<name>ha.zookeeper.quorum</name>
<value>master:2181,slave1:2181,slave2:2181</value>
</property>
[The addresses and ports of the ZooKeeper ensemble. Note: the number of ZooKeeper nodes must be odd, and no fewer than three.]
</configuration>
4.2.3 File hdfs-site.xml
This is the core file of the HA setup:
<configuration>
<property>
<name>dfs.nameservices</name>
<value>cluster1</value>
</property>
<property>
<name>dfs.ha.namenodes.cluster1</name>
<value>nn1,nn2</value>
</property>
<!-- RPC address of nn1 -->
<property>
<name>dfs.namenode.rpc-address.cluster1.nn1</name>
<value>master:9000</value>
</property>
<!-- HTTP address of nn1 -->
<property>
<name>dfs.namenode.http-address.cluster1.nn1</name>
<value>master:50070</value>
</property>
<!-- RPC address of nn2 -->
<property>
<name>dfs.namenode.rpc-address.cluster1.nn2</name>
<value>slave1:9000</value>
</property>
<!-- HTTP address of nn2 -->
<property>
<name>dfs.namenode.http-address.cluster1.nn2</name>
<value>slave1:50070</value>
</property>
<!-- Where the NameNode's shared edit log is stored on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://master:8485;slave1:8485;slave2:8485/cluster1</value>
</property>
<!-- Where each JournalNode stores its data on local disk -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/cloud/hadoop-2.2.0/journal</value>
</property>
<!-- Enable automatic NameNode failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Failover proxy provider used by clients to find the active NameNode -->
<property>
<name>dfs.client.failover.proxy.provider.cluster1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing methods; separate multiple methods with newlines, one per line -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<!-- sshfence requires passwordless SSH between the NameNodes -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<!-- Timeout for the sshfence method -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>
4.2.4 File mapred-site.xml (if it does not exist yet, copy it from mapred-site.xml.template)
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
[Specifies that MapReduce jobs run on YARN; this is one of the differences from Hadoop 1.]
4.2.5 File yarn-site.xml
<configuration>
<!-- Hostname of the ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<!-- Auxiliary service the NodeManagers load: the MapReduce shuffle service -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
4.2.6 File slaves
Add the machines that should run a DataNode/NodeManager; here all three machines in the cluster are listed (every machine in a cluster can act as a DataNode):
master
slave1
slave2
4.3 Configure /etc/profile
Set HADOOP_HOME and add Hadoop's bin and sbin directories to PATH on every node.
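A minimal sketch of that /etc/profile addition (paths follow this guide; append on every node, then source /etc/profile):

```shell
# Hadoop environment variables for /etc/profile
export HADOOP_HOME=/cloud/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```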
4.4 Copy Hadoop to the other nodes
scp -r /cloud/hadoop-2.2.0 hadoop@slave1:/cloud/
scp -r /cloud/hadoop-2.2.0 hadoop@slave2:/cloud/
4.5 Start the cluster
4.5.1 Start the ZooKeeper ensemble (run on master, slave1, and slave2)
cd /cloud/zookeeper-3.4.5/bin/
./zkServer.sh start
# check the status: one leader, two followers
./zkServer.sh status
4.5.2 Start the JournalNodes (run on master; note that this uses hadoop-daemons.sh, the script with the plural s, which starts the daemon on all listed nodes)
cd /cloud/hadoop-2.2.0
sbin/hadoop-daemons.sh start journalnode
# verify with jps: master, slave1, and slave2 should each now show a JournalNode process
4.5.3 Format HDFS
# run on master:
hdfs namenode -format
# Formatting writes files under the directory configured as hadoop.tmp.dir in core-site.xml (here /cloud/hadoop-2.2.0/data). Copy that directory to slave1 so the standby NameNode starts from the same namespace:
scp -r /cloud/hadoop-2.2.0/data slave1:/cloud/hadoop-2.2.0/
# (Alternatively, with the JournalNodes running, you can run "hdfs namenode -bootstrapStandby" on slave1 instead.)
4.5.4 Format the ZKFC state in ZooKeeper (run on master only)
hdfs zkfc -formatZK
4.5.5 Start HDFS (run on master)
sbin/start-dfs.sh
4.5.6 Start YARN (##### note #####: run start-yarn.sh on the node where the ResourceManager is configured, here master. In larger clusters the NameNode and ResourceManager are often placed on separate machines, since both consume a lot of resources; in that case run start-yarn.sh on the ResourceManager's machine.)
sbin/start-yarn.sh
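Once everything is up, a quick way to check that the daemon layout matches section 1; a sketch (assumes passwordless ssh and the placement from this guide):

```shell
# Expected after startup (per section 1):
#   master: NameNode, DataNode, DFSZKFailoverController, JournalNode,
#           ResourceManager, NodeManager, QuorumPeerMain
#   slave1: NameNode, DataNode, DFSZKFailoverController, JournalNode,
#           NodeManager, QuorumPeerMain
#   slave2: DataNode, JournalNode, NodeManager, QuorumPeerMain
nodes="master slave1 slave2"
for h in $nodes; do
  echo "== $h =="
  ssh "$h" jps 2>/dev/null || echo "(unreachable)"
done
# Check which NameNode is active (nn1/nn2 as defined in hdfs-site.xml):
#   hdfs haadmin -getServiceState nn1
#   hdfs haadmin -getServiceState nn2
```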