Hadoop Cluster Installation and Deployment

Installing Hadoop


Hardware environment
DELL R710
VMware ESXi 5.0
Virtual machines
    OS: CentOS 6.4 x86_64 (Base Server)
    Disk: 40 GB
    Memory: 2 GB
    CPU: 2 x 2
    NIC: 1 x 1000 Mbps
Configuration plan
hosts
188.188.3.241 Hadoop1
188.188.3.242 Hadoop2
188.188.3.243 Hadoop3
188.188.3.244 Hadoop4
188.188.3.245 Hadoop5
188.188.3.246 Hadoop6
188.188.3.247 Hadoop7

Hadoop cluster architecture overview
The cluster as a whole divides into two parts: a storage cluster and a compute cluster.
Storage cluster (HDFS) = NameNode (master) + DataNodes (workers)    # Hadoop1 is the NameNode
Compute cluster (MapReduce) = JobTracker (master) + TaskTrackers (workers)    # Hadoop1 is the JobTracker
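These two master roles are what the setup script below encodes into the generated config files. Purely as an illustration (the script writes the real values; the JobTracker port shown here is my assumption), the key properties look roughly like this:

<!-- core-site.xml: where clients find the NameNode -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://Hadoop1:8020</value>
</property>
<!-- mapred-site.xml: where clients find the JobTracker; port is illustrative -->
<property>
  <name>mapred.job.tracker</name>
  <value>Hadoop1:9001</value>
</property>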

Deployment plan
a. Install the dependencies
b. Plan installation directories and permissions
c. Set up passwordless SSH trust
d. Plan installation directories and users
e. Configure and start
f. Inspect and test

Now on to the actual setup.
Fetch the packages and install the dependencies (on every node)
wget http://mirror.bit.edu.cn/apache/hadoop/core/hadoop-1.2.0/hadoop-1.2.0.tar.gz
Ever since Oracle married Sun, the JDK has to be downloaded from Oracle. Below is the link for the 64-bit RPM; if you need something else, grab it yourself from http://www.oracle.com
http://download.oracle.com/otn-pub/java/jdk/7u25-b15/jdk-7u25-linux-x64.rpm?AuthParam=1373339631_ad64091e4bb2f5d02d6a7e3aa3392831
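A minimal sketch of installing that RPM once downloaded (Oracle's links require an accepted-license cookie, so the wget may need extra handling or a browser; the install path matches the JAVA_HOME used throughout this post):

[root@Hadoop1 src]# rpm -ivh jdk-7u25-linux-x64.rpm    # installs into /usr/java/jdk1.7.0_25
[root@Hadoop1 src]# /usr/java/jdk1.7.0_25/bin/java -version    # confirm the JDK runs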

[root@Hadoop1 src]# cd /usr/src
[root@Hadoop1 src]# tar -zxvf hadoop-1.2.0.tar.gz -C /usr/local && mv /usr/local/hadoop-1.2.0 /usr/local/hadoop
[root@Hadoop1 src]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
188.188.3.241 Hadoop1
188.188.3.242 Hadoop2
188.188.3.243 Hadoop3
188.188.3.244 Hadoop4
188.188.3.245 Hadoop5
188.188.3.246 Hadoop6
188.188.3.247 Hadoop7
[root@Hadoop1 src]# echo "export JAVA_HOME=/usr/java/jdk1.7.0_25" >>/root/.bash_profile && echo "export HADOOP_HOME=/usr/local/hadoop" >> /root/.bash_profile && export JAVA_HOME=/usr/java/jdk1.7.0_25 && export HADOOP_HOME=/usr/local/hadoop    #配置JAVA和HADOOP的家目录
[root@Hadoop1 src]# echo "export PATH=$PATH:/usr/local/hadoop/bin:/usr/local/hadoop/sbin" >>~/.bash_profile && export PATH=$PATH:/usr/local/hadoop/bin:/usr/local/hadoop/sbin #配置环境变量
[root@Hadoop1 src]# mkdir ~/.ssh && chmod -R 700 ~/.ssh
[root@Hadoop1 src]# mkdir -p /opt/hadoop /opt/hadoop/log/ /opt/hadoop/hdfs/namenode /opt/hadoop/hdfs/datanode /opt/hadoop/mapred /opt/hadoop/pidfile
[root@Hadoop1 src]# chmod -R 700 /opt/hadoop/hdfs/datanode
[root@Hadoop1 src]# cat /usr/local/hadoop/conf/masters # edit this file to contain the following
Hadoop1
[root@Hadoop1 src]# cat /usr/local/hadoop/conf/slaves # edit this file to contain the following
Hadoop2
Hadoop3
Hadoop4
Hadoop5
Hadoop6
Hadoop7
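If you would rather not open an editor, a sketch of writing both files in one shot:

[root@Hadoop1 src]# echo Hadoop1 > /usr/local/hadoop/conf/masters
[root@Hadoop1 src]# cat > /usr/local/hadoop/conf/slaves <<EOF
Hadoop2
Hadoop3
Hadoop4
Hadoop5
Hadoop6
Hadoop7
EOF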
[root@Hadoop1 src]# hadoop-setup-conf.sh
Setup Hadoop Configuration

Where would you like to put config directory? (/etc/hadoop) /usr/local/hadoop/conf
Where would you like to put log directory? (/var/log/hadoop) /opt/hadoop/log
Where would you like to put pid directory? (/var/log/hadoop) /opt/hadoop/pidfile
What is the host of the namenode? (localhost.localdomain) Hadoop1
Where would you like to put namenode data directory? (/var/lib/hadoop/hdfs/namenode) /opt/hadoop/hdfs/namenode
Where would you like to put datanode data directory? (/var/lib/hadoop/hdfs/datanode) /opt/hadoop/hdfs/datanode
What is the host of the jobtracker? (localhost.localdomain) Hadoop1
Where would you like to put jobtracker/tasktracker data directory? (/var/lib/hadoop/mapred) /opt/hadoop/mapred
Where is JAVA_HOME directory? (/usr/java/jdk1.7.0_25)
Would you like to create directories/copy conf files to localhost? (Y/n) Y

Review your choices:

Config directory            : /usr/local/hadoop/conf
Log directory               : /opt/hadoop/log
PID directory               : /opt/hadoop/pidfile
Namenode host               : Hadoop1    # Hadoop1 is the NameNode
Namenode directory          : /opt/hadoop/hdfs/namenode
Datanode directory          : /opt/hadoop/hdfs/datanode
Jobtracker host             : Hadoop1    # Hadoop1 is the JobTracker
Mapreduce directory         : /opt/hadoop/mapred
Task scheduler              : org.apache.hadoop.mapred.JobQueueTaskScheduler
JAVA_HOME directory         : /usr/java/jdk1.7.0_25
Create dirs/copy conf files : Y

Proceed with generate configuration? (y/N) y
chown: invalid group: "root:hadoop"    # the setup script emits this; since I like absolute power, everything here is deployed as root, so this error can be treated as if it never happened
chown: invalid group: "root:hadoop"
chown: invalid group: "root:hadoop"

Configuration file has been generated in:

/usr/local/hadoop/conf/core-site.xml
/usr/local/hadoop/conf/hdfs-site.xml
/usr/local/hadoop/conf/mapred-site.xml
/usr/local/hadoop/conf/hadoop-env.sh
/usr/local/hadoop/conf/hadoop-policy.xml
/usr/local/hadoop/conf/commons-logging.properties
/usr/local/hadoop/conf/taskcontroller.cfg
/usr/local/hadoop/conf/capacity-scheduler.xml
/usr/local/hadoop/conf/log4j.properties
/usr/local/hadoop/conf/hadoop-metrics2.properties

 to /usr/local/hadoop/conf on all nodes, and proceed to run hadoop-setup-hdfs.sh on namenode.
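The script only wrote the conf files on the local node; as its closing message says, they still have to be copied to every other machine. A sketch of pushing them out, assuming the identical directory layout everywhere (this needs the SSH trust configured in the next section, so run it after that step):

[root@Hadoop1 ~]# for i in 2 3 4 5 6 7; do scp -r /usr/local/hadoop/conf/* Hadoop$i:/usr/local/hadoop/conf/; done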

At this point all of the node-generic deployment work is done.
Next, set up SSH trust between the nodes.
On every Hadoop* node:
[root@Hadoop1 ~]# cd ~/.ssh && ssh-keygen  -t  rsa
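ssh-keygen prompts for a file name and passphrase; a non-interactive sketch that accepts the defaults (empty passphrase, default key path):

[root@Hadoop1 ~]# ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa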
On Hadoop1:
[root@Hadoop1 .ssh]# ssh hadoop1 cat ~/.ssh/id_rsa.pub >> authorized_keys
[root@Hadoop1 .ssh]# ssh hadoop2 cat ~/.ssh/id_rsa.pub >> authorized_keys
[root@Hadoop1 .ssh]# ssh hadoop3 cat ~/.ssh/id_rsa.pub >> authorized_keys
[root@Hadoop1 .ssh]# ssh hadoop4 cat ~/.ssh/id_rsa.pub >> authorized_keys
[root@Hadoop1 .ssh]# ssh hadoop5 cat ~/.ssh/id_rsa.pub >> authorized_keys
[root@Hadoop1 .ssh]# ssh hadoop6 cat ~/.ssh/id_rsa.pub >> authorized_keys
[root@Hadoop1 .ssh]# ssh hadoop7 cat ~/.ssh/id_rsa.pub >> authorized_keys
[root@Hadoop1 .ssh]# cat authorized_keys |wc -l # count the lines to check that no machine was missed
7
[root@Hadoop1 .ssh]# cat known_hosts |wc -l    # likewise
7
[root@Hadoop1 .ssh]# scp known_hosts authorized_keys hadoop2:~/.ssh/
[root@Hadoop1 .ssh]# scp known_hosts authorized_keys hadoop3:~/.ssh/
[root@Hadoop1 .ssh]# scp known_hosts authorized_keys hadoop4:~/.ssh/   
[root@Hadoop1 .ssh]# scp known_hosts authorized_keys hadoop5:~/.ssh/  
[root@Hadoop1 .ssh]# scp known_hosts authorized_keys hadoop6:~/.ssh/  
[root@Hadoop1 .ssh]# scp known_hosts authorized_keys hadoop7:~/.ssh/
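The six scp commands above can of course be collapsed into a loop; an equivalent sketch:

[root@Hadoop1 .ssh]# for i in 2 3 4 5 6 7; do scp known_hosts authorized_keys hadoop$i:~/.ssh/; done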
Verify that any node can SSH to any other node without a password.
On Hadoop2:
[root@Hadoop2 .ssh]# ssh hadoop4 date
Wed Jul 10 03:15:05 GMT 2013
[root@Hadoop2 .ssh]# ssh hadoop7 date
Wed Jul 10 03:15:08 GMT 2013
[root@Hadoop2 .ssh]# ssh hadoop1 date
Wed Jul 10 03:15:11 GMT 2013
On Hadoop7:
[root@Hadoop7 .ssh]# ssh hadoop3 date
Wed Jul 10 03:16:44 GMT 2013
[root@Hadoop7 .ssh]# ssh hadoop5 date
Wed Jul 10 03:16:50 GMT 2013
[root@Hadoop7 .ssh]# ssh hadoop1 date
Wed Jul 10 03:16:54 GMT 2013
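Or sweep all seven nodes from one seat; a sketch:

[root@Hadoop7 .ssh]# for i in 1 2 3 4 5 6 7; do ssh hadoop$i date; done    # none of these should ask for a password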

Start the Hadoop cluster
On Hadoop1:
[root@Hadoop1 root]# start-dfs.sh
[root@Hadoop1 root]# start-mapred.sh
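On the very first start, remember that the NameNode has to be formatted first (see the errors section below for the exact symptom and fix). Once the scripts return, a sketch of confirming the daemons with jps, which ships with the JDK (the full path is used over SSH because a non-login shell may not have it on PATH):

[root@Hadoop1 root]# jps    # expect NameNode, SecondaryNameNode and JobTracker here
[root@Hadoop1 root]# ssh Hadoop2 /usr/java/jdk1.7.0_25/bin/jps    # expect DataNode and TaskTracker on each worker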

HDFS web UI
http://188.188.3.241:50070
JobTracker web UI
http://188.188.3.241:50030
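The same can be checked from the command line, plus a quick smoke test with the examples jar that ships inside the 1.2.0 tarball; a sketch:

[root@Hadoop1 root]# hadoop dfsadmin -report    # all six DataNodes should show up as live
[root@Hadoop1 root]# hadoop jar /usr/local/hadoop/hadoop-examples-1.2.0.jar pi 10 100    # tiny MapReduce job to exercise the JobTracker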
**********************************************************
    The errors I hit, and their fixes, have already been folded into the steps above
**********************************************************
Error
2013-07-09 08:46:59,301 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Hadoop1/188.188.3.241:8020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
Log
2013-07-09 08:43:45,812 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: NameNode is not formatted.
Fix
[root@Hadoop1 root]# hadoop namenode -format

Error
2013-07-09 08:38:26,370 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for /opt/hadoop/hdfs/datanode, expected: rwx------, while actual: rwxr-xr-x
2013-07-09 08:38:26,370 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in dfs.data.dir are invalid.
Fix
chmod -R 700 /opt/hadoop/hdfs
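An alternative fix I did not test here: Hadoop 1.x reads the expected mode from dfs.datanode.data.dir.perm, so you could instead declare the permission you actually use in hdfs-site.xml:

<property>
  <name>dfs.datanode.data.dir.perm</name>
  <value>700</value>
</property>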

**********************************************************


Continued in the next post: http://blog.csdn.net/caiwenguang1992/article/details/9307299


#####################################################

This article is the author's original work.

Author: john

Please credit the source when reposting.
