Test environment (three machines)
One machine acts as the master (running the NameNode and JobTracker);
the other two act as slaves (each running a DataNode and TaskTracker).
The IP addresses are planned as follows:
Hostname | IP address    | Role
master   | 192.168.1.240 | NameNode
slave1   | 192.168.1.241 | DataNode
slave2   | 192.168.1.242 | DataNode
Step 1: Install Linux in the virtual machines
OS version: Red Hat Linux AS4 Update 7
All software packages were selected during OS installation.
Step 2: Install the JDK on all three machines
The JDK installer (jdk-6u37-linux-i586-rpm.bin) can be downloaded from:
http://www.oracle.com/technetwork/java/javase/downloads/jdk6u37-downloads-1859587.html
Upload the downloaded JDK to the Linux server and copy it into /usr/local/jdk1.6.0:
[root@master ~]# cp jdk-6u37-linux-i586-rpm.bin /usr/local/jdk1.6.0/
[root@master ~]#
[root@master ~]#
[root@master ~]# cd /usr/local/jdk1.6.0/
[root@master jdk1.6.0]#
[root@master jdk1.6.0]# ls -l
total 67072
-rw-r--r-- 1 root root 68604823 Nov 20 14:28 jdk-6u37-linux-i586-rpm.bin
[root@master jdk1.6.0]# chmod 777 jdk-6u37-linux-i586-rpm.bin
[root@master jdk1.6.0]#
[root@master jdk1.6.0]# ./jdk-6u37-linux-i586-rpm.bin
Unpacking...
Checksumming...
Extracting...
UnZipSFX 5.50 of 17 February 2002, by Info-ZIP (Zip-Bugs@lists.wku.edu).
inflating: jdk-6u37-linux-i586.rpm
inflating: sun-javadb-common-10.6.2-1.1.i386.rpm
inflating: sun-javadb-core-10.6.2-1.1.i386.rpm
inflating: sun-javadb-client-10.6.2-1.1.i386.rpm
inflating: sun-javadb-demo-10.6.2-1.1.i386.rpm
inflating: sun-javadb-docs-10.6.2-1.1.i386.rpm
inflating: sun-javadb-javadoc-10.6.2-1.1.i386.rpm
Preparing... ########################################### [100%]
package jdk-1.6.0_37-fcs is already installed
Java(TM) SE Development Kit 6 successfully installed.
Product Registration is FREE and includes many benefits:
* Notification of new versions, patches, and updates
* Special offers on Oracle products, services and training
* Access to early releases and documentation
Product and system data will be collected. If your configuration
supports a browser, the JDK Product Registration form will
be presented. If you do not register, none of this information
will be saved. You may also register your JDK later by
opening the register.html file (located in the JDK installation
directory) in a browser.
For more information on what data Registration collects and
how it is managed and used, see:
http://java.sun.com/javase/registration/JDKRegistrationPrivacy.html
Press Enter to continue.....
Done.
After installation, Java ends up under /usr/java by default, which is a little surprising.
Set the Java path by appending the following lines to the global configuration file /etc/profile:
export JAVA_HOME=/usr/java/jdk1.6.0_37
export PATH="$JAVA_HOME/bin:$PATH"
export CLASSPATH=".:$JAVA_HOME/lib:$CLASSPATH"
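To pick up the new settings without logging out and back in, the profile can be re-sourced in the current shell; a quick sanity check is that the JDK's bin directory is now the first entry on PATH (the path assumes the default JDK RPM install location):

```shell
# Apply the same settings in the current session (equivalent to
# "source /etc/profile" after editing the file).
export JAVA_HOME=/usr/java/jdk1.6.0_37
export PATH="$JAVA_HOME/bin:$PATH"
export CLASSPATH=".:$JAVA_HOME/lib:$CLASSPATH"
# The JDK bin directory should now be the first PATH entry:
echo "$PATH" | cut -d: -f1
```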
Check the newly installed Java version:
[root@master ~]# java -version
java version "1.6.0_37"
Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
Java HotSpot(TM) Client VM (build 20.12-b01, mixed mode, sharing)
Step 3: Preparation before installing Hadoop
1: On each of the three hosts, add the following lines to /etc/hosts:
192.168.1.240 master
192.168.1.241 slave1
192.168.1.242 slave2
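After editing /etc/hosts on each node, it is worth confirming that the three names actually resolve; getent consults the same resolver sources that sshd and the Hadoop daemons will use (on a machine without these hosts entries, the loop simply reports the names as unresolvable):

```shell
# Check that all three cluster hostnames resolve on this node.
for h in master slave1 slave2; do
  getent hosts "$h" || echo "$h: not resolvable"
done
```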
2: On each of the three hosts, create an OS group and a hadoop user (the user and group names are your choice).
The group ID and user name must be identical on all three nodes.
[root@master ~]# groupadd hadoop
[root@master ~]# useradd -g hadoop -d /home/hadoop hadoop
[root@master ~]# passwd hadoop
Changing password for user hadoop.
New UNIX password:
BAD PASSWORD: it is based on a dictionary word
Retype new UNIX password:
passwd: all authentication tokens updated successfully.
[root@master ~]#
3: On each of the three hosts, configure passwordless SSH login.
Configure SSH as the hadoop user. Note: authorized_keys was created here with the cp command; when it was built with "cat *.pub >> authorized_keys" instead, ssh kept prompting for a password. (A common cause of that symptom is permissions: sshd ignores the key file unless ~/.ssh is mode 700 and authorized_keys is mode 600.)
On the master node:
[hadoop@master ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
e9:f2:fc:6b:1d:0b:93:d8:40:2c:d8:37:0a:b9:59:b6 hadoop@master
[hadoop@master ~]$
[hadoop@master ~]$ cd .ssh
[hadoop@master .ssh]$ ls -l
total 8
-rw------- 1 hadoop hadoop 883 Nov 20 20:52 id_rsa
-rw-r--r-- 1 hadoop hadoop 223 Nov 20 20:52 id_rsa.pub
[hadoop@master .ssh]$
[hadoop@master .ssh]$
[hadoop@master .ssh]$ cp id_rsa.pub authorized_keys
[hadoop@master .ssh]$
[hadoop@master .ssh]$
[hadoop@master .ssh]$ ls -l
total 12
-rw-r--r-- 1 hadoop hadoop 223 Nov 20 20:53 authorized_keys
-rw------- 1 hadoop hadoop 883 Nov 20 20:52 id_rsa
-rw-r--r-- 1 hadoop hadoop 223 Nov 20 20:52 id_rsa.pub
On the slave1 node:
[hadoop@slave1 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
71:d6:c2:93:87:4d:e1:6f:98:fa:81:d6:73:b0:ef:25 hadoop@slave1
[hadoop@slave1 ~]$
[hadoop@slave1 ~]$ cd .ssh
[hadoop@slave1 .ssh]$ ls -l
total 8
-rw------- 1 hadoop hadoop 887 Nov 20 21:01 id_rsa
-rw-r--r-- 1 hadoop hadoop 223 Nov 20 21:01 id_rsa.pub
[hadoop@slave1 .ssh]$
[hadoop@slave1 .ssh]$ cp id_rsa.pub authorized_keys
[hadoop@slave1 .ssh]$
[hadoop@slave1 .ssh]$ ls -l
total 12
-rw-r--r-- 1 hadoop hadoop 223 Nov 20 21:02 authorized_keys
-rw------- 1 hadoop hadoop 887 Nov 20 21:01 id_rsa
-rw-r--r-- 1 hadoop hadoop 223 Nov 20 21:01 id_rsa.pub
[hadoop@slave1 .ssh]$
[hadoop@slave1 .ssh]$
[hadoop@slave1 .ssh]$ scp authorized_keys hadoop@master:/home/hadoop/.ssh/s1_keys
The authenticity of host 'master (192.168.1.240)' can't be established.
RSA key fingerprint is c9:b6:f6:08:29:9e:bc:ff:5f:89:a2:38:66:1e:f3:df.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,192.168.1.240' (RSA) to the list of known hosts.
hadoop@master's password:
authorized_keys 100% 223 0.2KB/s 00:00
On the slave2 node:
[root@slave2 ~]# su - hadoop
[hadoop@slave2 ~]$
[hadoop@slave2 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
92:ff:5d:cd:46:a9:5d:43:12:1d:d0:a4:2e:4c:f8:67 hadoop@slave2
[hadoop@slave2 ~]$
[hadoop@slave2 ~]$ cd .ssh
[hadoop@slave2 .ssh]$ ls -l
total 8
-rw------- 1 hadoop hadoop 887 Nov 20 21:04 id_rsa
-rw-r--r-- 1 hadoop hadoop 223 Nov 20 21:04 id_rsa.pub
[hadoop@slave2 .ssh]$
[hadoop@slave2 .ssh]$ cp id_rsa.pub authorized_keys
[hadoop@slave2 .ssh]$
[hadoop@slave2 .ssh]$ ls -l
total 12
-rw-r--r-- 1 hadoop hadoop 223 Nov 20 21:04 authorized_keys
-rw------- 1 hadoop hadoop 887 Nov 20 21:04 id_rsa
-rw-r--r-- 1 hadoop hadoop 223 Nov 20 21:04 id_rsa.pub
[hadoop@slave2 .ssh]$ scp authorized_keys hadoop@master:/home/hadoop/.ssh/s2_keys
The authenticity of host 'master (192.168.1.240)' can't be established.
RSA key fingerprint is c9:b6:f6:08:29:9e:bc:ff:5f:89:a2:38:66:1e:f3:df.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,192.168.1.240' (RSA) to the list of known hosts.
hadoop@master's password:
authorized_keys 100% 223 0.2KB/s 00:00
[hadoop@slave2 .ssh]$
Then switch back to the master node and merge the public keys:
[hadoop@master ~]$ cd .ssh
[hadoop@master .ssh]$ ls -l
total 24
-rw-r--r-- 1 hadoop hadoop 223 Nov 20 20:53 authorized_keys
-rw------- 1 hadoop hadoop 883 Nov 20 20:52 id_rsa
-rw-r--r-- 1 hadoop hadoop 223 Nov 20 20:52 id_rsa.pub
-rw-r--r-- 1 hadoop hadoop 230 Nov 20 20:54 known_hosts
-rw-r--r-- 1 hadoop hadoop 223 Nov 20 21:09 s1_keys
-rw-r--r-- 1 hadoop hadoop 223 Nov 20 21:11 s2_keys
[hadoop@master .ssh]$
[hadoop@master .ssh]$ cat s1_keys >> authorized_keys
[hadoop@master .ssh]$
[hadoop@master .ssh]$ cat s2_keys >> authorized_keys
[hadoop@master .ssh]$
[hadoop@master .ssh]$
[hadoop@master .ssh]$
[hadoop@master .ssh]$ scp authorized_keys hadoop@slave1:/home/hadoop/.ssh/authorized_keys
The authenticity of host 'slave1 (192.168.1.241)' can't be established.
RSA key fingerprint is c9:b6:f6:08:29:9e:bc:ff:5f:89:a2:38:66:1e:f3:df.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave1,192.168.1.241' (RSA) to the list of known hosts.
hadoop@slave1's password:
authorized_keys 100% 669 0.7KB/s 00:00
[hadoop@master .ssh]$
[hadoop@master .ssh]$
[hadoop@master .ssh]$ scp authorized_keys hadoop@slave2:/home/hadoop/.ssh/authorized_keys
The authenticity of host 'slave2 (192.168.1.242)' can't be established.
RSA key fingerprint is c9:b6:f6:08:29:9e:bc:ff:5f:89:a2:38:66:1e:f3:df.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave2,192.168.1.242' (RSA) to the list of known hosts.
hadoop@slave2's password:
authorized_keys 100% 669 0.7KB/s 00:00
[hadoop@master .ssh]$
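The per-node key setup above can be condensed into a short sketch; the chmod lines matter because sshd's StrictModes check silently rejects authorized_keys when it or ~/.ssh is group- or world-writable, which is the usual reason ssh keeps asking for a password even after the keys are distributed (run as the hadoop user on each node):

```shell
# Generate a passphrase-less RSA key pair and authorize it locally.
# Idempotent: keygen is skipped if a key already exists.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa -q
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# StrictModes: sshd ignores authorized_keys if group/world-writable.
chmod 600 ~/.ssh/authorized_keys
```

The cross-node part stays the same: each slave's public key is appended to the master's authorized_keys, and the merged file is copied back out to every node.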
4: Test SSH from all three nodes
The master node:
[hadoop@master .ssh]$ ssh slave1
Last login: Tue Nov 20 21:03:24 2012 from slave1
[hadoop@slave1 ~]$
[hadoop@master .ssh]$ ssh slave2
Last login: Tue Nov 20 21:08:05 2012 from slave2
[hadoop@slave2 ~]$
The slave1 node:
[hadoop@slave1 ~]$ ssh master
Last login: Tue Nov 20 21:18:19 2012 from slave2
[hadoop@master ~]$
[hadoop@slave1 ~]$ ssh slave2
Last login: Tue Nov 20 21:19:35 2012 from slave1
[hadoop@slave2 ~]$
The slave2 node:
[hadoop@slave2 ~]$ ssh master
Last login: Tue Nov 20 20:54:49 2012 from master
[hadoop@master ~]$
[hadoop@slave2 ~]$ ssh slave1
Last login: Tue Nov 20 21:18:09 2012 from slave2
[hadoop@slave1 ~]$
Step 4: Install Hadoop
Upload the tarball to the master node and unpack it:
[root@master ~]# cd /home/hadoop/
[root@master hadoop]# ls -l
total 43580
-rw-r--r-- 1 root root 44575568 Nov 20 13:32 hadoop-0.20.2.tar.gz
[root@master hadoop]# chown hadoop:hadoop hadoop-0.20.2.tar.gz
[root@master hadoop]#
[root@master hadoop]# su - hadoop
[hadoop@master ~]$ ls -l
total 43580
-rw-r--r-- 1 hadoop hadoop 44575568 Nov 20 13:32 hadoop-0.20.2.tar.gz
[hadoop@master ~]$ gunzip hadoop-0.20.2.tar.gz
[hadoop@master ~]$
[hadoop@master ~]$ ls -l
total 132648
-rw-r--r-- 1 hadoop hadoop 135690240 Nov 20 13:32 hadoop-0.20.2.tar
[hadoop@master ~]$ tar -xvf hadoop-0.20.2.tar
[hadoop@master ~]$ ls -l
total 132652
drwxr-xr-x 12 hadoop hadoop 4096 Feb 19 2010 hadoop-0.20.2
-rw-r--r-- 1 hadoop hadoop 135690240 Nov 20 13:32 hadoop-0.20.2.tar
[hadoop@master ~]$
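As an aside, the separate gunzip and tar steps above can be combined: tar's -z flag decompresses on the fly, so "tar -xzf hadoop-0.20.2.tar.gz" does both in one go. Demonstrated here with a small throwaway archive rather than the real tarball:

```shell
# Build a tiny .tar.gz, then extract it in one step with -xzf
# (equivalent to gunzip followed by tar -xvf).
cd "$(mktemp -d)"
mkdir demo && echo hello > demo/a.txt
tar -czf demo.tar.gz demo && rm -r demo
tar -xzf demo.tar.gz
cat demo/a.txt
```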
Once unpacked, go into the hadoop-0.20.2/conf directory and configure Hadoop.
1: Edit core-site.xml
[hadoop@master conf]$ vi core-site.xml
Note: the format must be exactly as follows; stray whitespace between the tags caused errors later during the namenode format step. Add this property between the two <configuration> tags:
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:9000</value>
</property>
2: Edit hdfs-site.xml
Manually create the /home/hadoop/data directory to hold the HDFS data files.
[hadoop@master conf]$ vi hdfs-site.xml
Since there are two datanodes, dfs.replication is set to 2.
Add the following between the two <configuration> tags:
<property>
  <name>dfs.data.dir</name>
  <value>/home/hadoop/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
3: Edit mapred-site.xml
[hadoop@master conf]$ vi mapred-site.xml
Add the following between the two <configuration> tags:
<property>
  <name>mapred.job.tracker</name>
  <value>master:9001</value>
</property>
4: Edit conf/slaves
Add the host names of the two datanodes:
[hadoop@master conf]$ vi slaves
slave1
slave2
5: Edit conf/masters
Add the master node's host name:
[hadoop@master conf]$ vi masters
master
6: Edit hadoop-env.sh
Uncomment JAVA_HOME and point it at the installed JDK:
# The java implementation to use. Required.
export JAVA_HOME=/usr/java/jdk1.6.0_37
With all of this configured, copy the hadoop-0.20.2 directory from the master to the slave1 and slave2 nodes.
On the master node, run:
[hadoop@master ~]$ scp -r hadoop-0.20.2 hadoop@slave1:/home/hadoop
[hadoop@master ~]$ scp -r hadoop-0.20.2 hadoop@slave2:/home/hadoop
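If the slave list grows, the two scp commands generalize to a loop. This sketch is a dry run that only prints the commands; drop the echo to actually copy, relying on the passwordless SSH set up in step 3:

```shell
# Print one scp command per slave (dry run; remove echo to execute).
for host in slave1 slave2; do
  echo scp -r /home/hadoop/hadoop-0.20.2 "hadoop@$host:/home/hadoop"
done
```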
7: Format the namenode
Run the format command on the master node. (Note in the output below that, because dfs.name.dir was left unset, the namenode metadata lands under /tmp/hadoop-hadoop/dfs/name; /tmp may be cleared on reboot, so set dfs.name.dir explicitly for anything beyond a test setup.)
[hadoop@master bin]$ pwd
/home/hadoop/hadoop-0.20.2/bin
[hadoop@master bin]$ ls
hadoop hadoop-daemons.sh start-all.sh start-mapred.sh stop-dfs.sh
hadoop-config.sh rcc start-balancer.sh stop-all.sh stop-mapred.sh
hadoop-daemon.sh slaves.sh start-dfs.sh stop-balancer.sh
[hadoop@master bin]$
[hadoop@master bin]$ ./hadoop namenode -format
12/11/21 14:17:43 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/192.168.1.240
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
12/11/21 14:17:44 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
12/11/21 14:17:44 INFO namenode.FSNamesystem: supergroup=supergroup
12/11/21 14:17:44 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/11/21 14:17:44 INFO common.Storage: Image file of size 96 saved in 0 seconds.
12/11/21 14:17:44 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
12/11/21 14:17:44 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.1.240
************************************************************/
8: Start Hadoop
Start the HDFS and MapReduce daemons from the master node:
[hadoop@master bin]$ pwd
/home/hadoop/hadoop-0.20.2/bin
[hadoop@master bin]$ ls
hadoop hadoop-daemons.sh start-all.sh start-mapred.sh stop-dfs.sh
hadoop-config.sh rcc start-balancer.sh stop-all.sh stop-mapred.sh
hadoop-daemon.sh slaves.sh start-dfs.sh stop-balancer.sh
[hadoop@master bin]$
[hadoop@master bin]$ ./start-all.sh
starting namenode, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-datanode-slave1.out
master: starting secondarynamenode, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-jobtracker-master.out
slave2: starting tasktracker, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-tasktracker-slave2.out
slave1: starting tasktracker, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-tasktracker-slave1.out
[hadoop@master bin]$
Then check the daemons on each of the three nodes with jps.
The master node:
[hadoop@master bin]$ /usr/java/jdk1.6.0_37/bin/jps
9412 NameNode
9541 SecondaryNameNode
9588 JobTracker
9733 Jps
The slave1 node:
[hadoop@slave1 bin]$ /usr/java/jdk1.6.0_37/bin/jps
9551 Jps
9422 TaskTracker
9359 DataNode
The slave2 node:
[hadoop@slave2 conf]$ /usr/java/jdk1.6.0_37/bin/jps
9150 Jps
8957 DataNode
9022 TaskTracker
Step 5: Test the finished Hadoop installation
Port 50070 is the namenode's HTTP server port.
Port 50030 is the jobtracker's HTTP server port.
In a browser, open http://master:50070 (HDFS status) and http://master:50030 (MapReduce status).
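Before reaching for the browser, a quick local check on the master confirms the two HTTP ports are actually listening (on systems without netstat, "ss -tln" is equivalent):

```shell
# Report whether the NameNode (50070) and JobTracker (50030) web UIs
# are listening on this host.
for port in 50070 50030; do
  if netstat -tln 2>/dev/null | grep -q ":$port "; then
    echo "port $port: listening"
  else
    echo "port $port: not listening"
  fi
done
```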
Configuration file summary:
hadoop-env.sh: environment variables used by the scripts
core-site.xml: configuration options for Hadoop Core
hdfs-site.xml: configuration options for the HDFS daemons
mapred-site.xml: configuration options for the MapReduce daemons
Source: ITPUB Blog, http://blog.itpub.net/24862808/viewspace-749725/ — please credit the source when reposting.