I. Install CentOS 6.3
(Omitted.)
II. Install the JDK
Below, the JDK is installed on the Master node; install it on the other nodes the same way (or simply copy Master's java folder to the corresponding directory on each slave). All of the following operations are performed as root:
1. Copy the downloaded jdk-6u27-linux-i586.bin to the /usr/local/java directory on the Master node (create the java directory under /usr/local/ first);
2. Extract the JDK into the current directory, then remove the installer (jdk-6u27-linux-i586.bin is a self-extracting installer, so it is executed directly rather than unpacked with tar):
[hadoop@Master java]$ chmod u+x jdk-6u27-linux-i586.bin
[hadoop@Master java]$ ./jdk-6u27-linux-i586.bin
[hadoop@Master java]$ rm -rf jdk-6u27-linux-i586.bin
3. Configure the environment variables
[hadoop@Master local]$ vi /etc/profile
Add the following to the profile file:
#set java path
export JAVA_HOME=/usr/local/java/jdk1.6.0_27
export JRE_HOME=$JAVA_HOME/jre
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
Apply the changes:
[hadoop@Master local]$ source /etc/profile
or:
[hadoop@Master local]$ . /etc/profile
Verify that the JDK installed correctly:
[hadoop@Master local]$ java -version
java version "1.6.0_27"
Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
Java HotSpot(TM) Client VM (build 20.2-b06, mixed mode, sharing)
[hadoop@Master local]$
III. Passwordless login between nodes
Set up passwordless SSH login between all the nodes.
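A minimal sketch of the usual approach, assuming the hadoop user exists on every node and the slave IPs are the ones used later in this article (192.168.137.128 and 192.168.137.129):
[hadoop@Master ~]$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
[hadoop@Master ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@Master ~]$ chmod 600 ~/.ssh/authorized_keys
[hadoop@Master ~]$ ssh-copy-id hadoop@192.168.137.128
[hadoop@Master ~]$ ssh-copy-id hadoop@192.168.137.129
[hadoop@Master ~]$ ssh 192.168.137.128 date
The last command should print the slave's date without prompting for a password.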
IV. Hadoop installation and configuration
First install and configure Hadoop on the Master machine; all of these steps are performed as root:
1. Install Hadoop
1) Copy hadoop-2.2.0.tar.gz to the /usr/local directory of Master.Hadoop;
2) Extract hadoop-2.2.0.tar.gz
[root@Master local]# tar -zxvf hadoop-2.2.0.tar.gz
3) Rename the extracted hadoop-2.2.0 directory to hadoop
[root@Master local]# mv hadoop-2.2.0 hadoop
4) Delete the hadoop-2.2.0.tar.gz archive
[root@Master local]# rm -rf hadoop-2.2.0.tar.gz
5) Make the hadoop user the owner of the hadoop folder
[root@Master local]# chown -R hadoop:hadoop hadoop
6) Set the Hadoop path (edit /etc/profile)
[root@Master hadoop]# vi /etc/profile
Add to the file ($HADOOP_HOME/sbin is included as well, so that scripts such as start-all.sh, used later, are on the PATH):
#set hadoop path
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
7) Reload /etc/profile
Use source /etc/profile or . /etc/profile:
[root@Master hadoop]# . /etc/profile
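To confirm the PATH took effect, a quick sanity check, which should print Hadoop 2.2.0 followed by build details:
[root@Master hadoop]# hadoop version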
2. Configure Hadoop
All of hadoop-2.2.0's configuration files live in /usr/local/hadoop/etc/hadoop, for example: hadoop-env.sh, yarn-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml.template, yarn-site.xml
The following operations are performed as the hadoop user:
(Step 1) below is not strictly necessary: when hadoop namenode -format is run later during testing, these directories are created automatically.)
Note: the directories configured later for dfs.namenode.name.dir and dfs.datanode.data.dir are not under /usr/local/hadoop; we put them under /home instead, because the partition mounted at / is short on disk space. If they were placed under /usr/local/hadoop, uploading files to HDFS would fail.
1) As the hadoop user, create the tmp and dfs folders under /usr/local/hadoop/, then create name and data under dfs
[hadoop@Master hadoop]$ pwd
/usr/local/hadoop
[hadoop@Master hadoop]$ mkdir tmp
[hadoop@Master hadoop]$ mkdir dfs
[hadoop@Master hadoop]$ ll
total 60
drwxr-xr-x. 2 hadoop hadoop 4096 Mar 31 04:49 bin
drwxrwxr-x. 2 hadoop hadoop 4096 Aug 2 05:15 dfs
................
drwxr-xr-x. 2 hadoop hadoop 4096 Aug 2 04:34 tmp
[hadoop@Master hadoop]$ cd dfs
[hadoop@Master dfs]$ ll
total 0
[hadoop@Master dfs]$ mkdir name
[hadoop@Master dfs]$ mkdir data
[hadoop@Master dfs]$ ll
total 8
drwxrwxr-x. 2 hadoop hadoop 4096 Aug 2 05:17 data
drwxrwxr-x. 2 hadoop hadoop 4096 Aug 2 05:17 name
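The same layout can be created in a single command with mkdir -p:
[hadoop@Master hadoop]$ mkdir -p tmp dfs/name dfs/data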
2) Configure hadoop-env.sh
[root@Master hadoop]# vi hadoop-env.sh
Add:
# The java implementation to use.
export JAVA_HOME=/usr/local/java/jdk1.6.0_27
3) Configure yarn-env.sh
[hadoop@Master hadoop]$ vi yarn-env.sh
Add:
# some Java parameters
export JAVA_HOME=/usr/local/java/jdk1.6.0_27
4) Configure the slaves file
[root@Master hadoop]# vi slaves
Add:
#localhost
192.168.137.128
192.168.137.129
5) Configure core-site.xml
[root@Master hadoop]# vi core-site.xml
Between <configuration> and </configuration>, add:
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.137.120:9000/</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/usr/local/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
6) Configure hdfs-site.xml
[hadoop@Master hadoop]$ vi hdfs-site.xml
Between <configuration> and </configuration>, add:
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>Master.Hadoop:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/${user.name}/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/${user.name}/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
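Because ${user.name} expands to the user running the daemons (hadoop here), the actual storage paths are /home/hadoop/dfs/name and /home/hadoop/dfs/data. A sketch of pre-creating them, to be run as the hadoop user on the Master and on each slave, under that assumption:
[hadoop@Master ~]$ mkdir -p /home/hadoop/dfs/name /home/hadoop/dfs/data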
7) Configure mapred-site.xml
[hadoop@Master hadoop]$ mv mapred-site.xml.template mapred-site.xml
[hadoop@Master hadoop]$ vi mapred-site.xml
Between <configuration> and </configuration>, add:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>Master.Hadoop:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>Master.Hadoop:19888</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/mr-history/tmp</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/mr-history/done</value>
</property>
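Note that start-all.sh (used in the testing section below) does not launch the JobHistory server configured here; once HDFS and YARN are up, it can be started with the daemon script that ships with Hadoop 2.2:
[hadoop@Master hadoop]$ mr-jobhistory-daemon.sh start historyserver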
8) Configure yarn-site.xml
[hadoop@Master hadoop]$ vi yarn-site.xml
Between <configuration> and </configuration>, add:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>Master.Hadoop:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>Master.Hadoop:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>Master.Hadoop:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>Master.Hadoop:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>Master.Hadoop:8088</value>
</property>
Configure the other machines:
Copy the configured hadoop folder /usr/local/hadoop from Master to /usr/local on every slave. Below we take slave1 as the example; slave2 is done the same way:
1) Copy to the other slave machines
Copy to slave1:
[hadoop@Master hadoop]$ scp -r /usr/local/hadoop root@192.168.137.128:/usr/local
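With more than a couple of slaves, the copy can be looped over the same IP list that went into the slaves file (a sketch, assuming root SSH access to each slave):
[hadoop@Master hadoop]$ for ip in 192.168.137.128 192.168.137.129; do scp -r /usr/local/hadoop root@$ip:/usr/local; done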
2) As root, change the owner and group of the hadoop folder
[root@Salve1 local]# pwd
/usr/local
[root@Salve1 local]# chown -R hadoop:hadoop hadoop
3) Turn off the firewall on every machine in the cluster
Master:
[root@Master local]# chkconfig iptables off
[root@Master local]# service iptables status
iptables: Firewall is not running.
Slave1:
[root@Salve1 local]# chkconfig iptables off
[root@Salve1 local]# service iptables status
iptables: Firewall is not running.
Slave2:
[root@Salve2 local]# chkconfig iptables off
[root@Salve2 local]# service iptables status
iptables: Firewall is not running.
4) Sync the system time and the hardware clock
Check the system time and the hardware clock:
[hadoop@Master hadoop]$ date;hwclock -r
Tue Aug 19 20:27:58 CST 2014
Tue 19 Aug 2014 08:20:52 PM CST -0.286125 seconds
Sync the system time from the time server time.nist.gov:
[hadoop@Master hadoop]$ ntpdate time.nist.gov
Write the system time to the hardware clock:
[hadoop@Master hadoop]$ hwclock -w
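To keep the nodes from drifting apart again, one option is a root crontab entry that re-syncs every hour (a sketch; the ntpdate and hwclock paths are assumed for CentOS 6):
0 * * * * /usr/sbin/ntpdate time.nist.gov && /sbin/hwclock -w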
Extra:
Turning the Linux firewall on and off:
1. Persists across reboots
chkconfig iptables off    # disable the firewall
chkconfig iptables on     # enable the firewall
2. Takes effect immediately but is lost after a reboot
service iptables start    # start the firewall
service iptables stop     # stop the firewall
3. Check the firewall status
service iptables status
The following operations are performed on Master as the hadoop user:
1) Format the NameNode
[hadoop@Master hadoop]$ hadoop namenode -format
2) Start all the nodes
[hadoop@Master hadoop]$ start-all.sh
3) Check the running processes
[hadoop@Master hadoop]$ jps
17483 SecondaryNameNode
28569 Jps
17627 ResourceManager
17317 NameNode
[hadoop@Salve1 ~]$ jps
14967 Jps
13251 NodeManager
13160 DataNode
[hadoop@Salve2 ~]$ jps
12020 DataNode
13779 Jps
12113 NodeManager
[hadoop@Master hadoop]$ hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
14/08/19 20:40:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 31334506496 (29.18 GB)
Present Capacity: 29394505728 (27.38 GB)
DFS Remaining: 29394436096 (27.38 GB)
DFS Used: 69632 (68 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Live datanodes:
Name: 192.168.137.128:50010 (Salve1.Hadoop)
Hostname: Salve1.Hadoop
Decommission Status : Normal
Configured Capacity: 15667253248 (14.59 GB)
DFS Used: 45056 (44 KB)
Non DFS Used: 969990144 (925.05 MB)
DFS Remaining: 14697218048 (13.69 GB)
DFS Used%: 0.00%
DFS Remaining%: 93.81%
Last contact: Tue Aug 19 20:40:56 CST 2014

Name: 192.168.137.129:50010 (Salve2.Hadoop)
Hostname: Salve2.Hadoop
Decommission Status : Normal
Configured Capacity: 15667253248 (14.59 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 970010624 (925.07 MB)
DFS Remaining: 14697218048 (13.69 GB)
DFS Used%: 0.00%
DFS Remaining%: 93.81%
Last contact: Tue Aug 19 20:40:57 CST 2014
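The cluster can also be checked from a browser: in Hadoop 2.2 the NameNode web UI listens on port 50070 by default, and the ResourceManager UI on the yarn.resourcemanager.webapp.address configured above:
http://192.168.137.120:50070
http://192.168.137.120:8088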
V. Testing
1. Operate on the Master node as the hadoop user
Create two files, file1.txt and file2.txt, in the /opt directory on the Master node and put some content in them:
[hadoop@Master hadoop]$ cd /opt
[hadoop@Master opt]$ ll
total 24
-rwxrwxrwx 1 hadoop hadoop 66 Aug 8 16:30 file1.txt
-rwxrwxrwx 1 hadoop hadoop 67 Aug 8 16:31 file2.txt
drwxr-xr-x 2 root root 4096 Aug 4 15:23 tools
[hadoop@Master opt]$ cat file1.txt
Hello, i love coding
are you ok?
Hello, i love hadoop
are you ok?
[hadoop@Master opt]$ cat file2.txt
Hello, i love coding
are you ok ?
Hello i love hadoop
are you ok ?
[hadoop@Master opt]$
2. Create a directory in the HDFS filesystem
[hadoop@Master ~]$ hadoop fs -mkdir -p /home/input
3. Upload the files to the HDFS directory
Upload:
[hadoop@Master ~]$ hadoop fs -put /opt/file*.txt /home/input
Check that the upload succeeded:
[hadoop@Master ~]$ hadoop fs -ls /home/input
14/08/19 21:42:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r-- 1 hadoop supergroup 66 2014-08-19 15:27 /home/input/file1.txt
-rw-r--r-- 1 hadoop supergroup 67 2014-08-19 15:27 /home/input/file2.txt
[hadoop@Master ~]$
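For a fuller end-to-end check, the stock wordcount example that ships with the distribution can be run against these files (a sketch; the jar path assumes a standard hadoop-2.2.0 layout, and /home/output must not exist yet):
[hadoop@Master ~]$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /home/input /home/output
[hadoop@Master ~]$ hadoop fs -cat /home/output/part-r-00000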
That completes the setup. When something does go wrong, the usual suspects are: version mismatches, the firewall, the system time, and the logs.
All rights reserved. If you repost this article, please say so and link to the original: http://blog.csdn.net/hbs321123/article/details/38689523