Machine type: 64-bit CentOS. Host list:
master
slave1
slave2
slave3
slave4
0 Preparation
0.1 Set each machine's hostname to its name in the list
For master, for example, make the following changes:
[root@master ~]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=master
[root@master ~]# vi /etc/hosts
#127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
127.0.0.1 localhost
192.168.11.99 master
192.168.12.174 slave1
192.168.12.178 slave2
192.168.12.18 slave3
192.168.11.94 slave4
Save the files; after a reboot the hostname becomes the one we set. (Running "hostname master" applies it immediately without rebooting.)
0.2 Set up passwordless SSH login from master to the slaves
Briefly, how passwordless SSH login works: to log in from host A to host B without a password, first run ssh-keygen on A to generate the private key file id_rsa and the matching public key file id_rsa.pub, then append the contents of the public key file to ~/.ssh/authorized_keys on host B. That is all it takes for A to log in to B without a password.
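The steps above can be sketched as follows. This is only an illustration run against a throwaway directory; on the real master you would omit -f so the keys land in ~/.ssh, and slave1 stands in for each slave in turn:

```shell
# Generate an RSA key pair with an empty passphrase (-N "") non-interactively.
# Using a temporary directory here purely for demonstration; the real keys
# would go to the default ~/.ssh/id_rsa location.
keydir=$(mktemp -d)
ssh-keygen -t rsa -N "" -f "$keydir/id_rsa" -q

# The public key is one line. Appending it to a slave's authorized_keys is
# what ssh-copy-id automates, e.g.:
#   ssh-copy-id root@slave1
# or, done by hand:
#   cat ~/.ssh/id_rsa.pub | ssh root@slave1 'cat >> ~/.ssh/authorized_keys'
head -c 7 "$keydir/id_rsa.pub"   # prints the key type prefix: ssh-rsa
```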
Check the result of logging in from master to slave1:
[root@master ~]# ssh slave1
Last login: Mon Nov 24 16:45:15 2014 from master
[root@slave1 ~]#
0.3 Install the JDK on every machine
Download the JDK rpm package from Oracle and install it. I installed 1.8; for this Hadoop version the better-tested choice is probably 1.7:
rpm -ivh jdk-8u25-linux-x64.rpm
Set the environment variables in /etc/profile (vi /etc/profile), adding the PATH and CLASSPATH entries:
export JAVA_HOME=/usr/java/jdk1.8.0_25
export PATH=/usr/hbase/bin:/usr/hadoop/bin:/usr/hadoop/sbin:$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Apply the changes to the current shell:
source /etc/profile
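Note that the file must be sourced, not executed: a command run in a child shell cannot change the parent shell's environment. A quick illustration with a throwaway file (the /tmp path and DEMO_VAR name are just for demonstration):

```shell
echo 'export DEMO_VAR=hello' > /tmp/demo_profile

sh /tmp/demo_profile           # runs in a subshell; DEMO_VAR is lost
echo "${DEMO_VAR:-unset}"      # prints: unset

source /tmp/demo_profile       # same as ". /tmp/demo_profile": runs in-shell
echo "$DEMO_VAR"               # prints: hello
```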
1 Install and configure Hadoop
1.1 Download and unpack Hadoop
Strictly speaking you should compile a 64-bit native build yourself (the stock tarball's native libraries are 32-bit, which causes a harmless warning on 64-bit systems); I didn't bother...
cp hadoop-2.5.1.tar.gz /usr/
cd /usr
tar -zxf hadoop-2.5.1.tar.gz
mv hadoop-2.5.1 hadoop
Update PATH (this is already part of the /etc/profile edit above):
export PATH=/usr/hbase/bin:/usr/hadoop/bin:/usr/hadoop/sbin:$JAVA_HOME/bin:$PATH
1.2 Edit the configuration files
The configuration files live here:
[root@slave1 hadoop]# pwd
/usr/hadoop/etc/hadoop
[root@slave1 hadoop]# ll
total 140
drwxr-xr-x 2 root root 4096 Nov 15 14:31 .
drwxr-xr-x 3 10011 10011 4096 Oct 28 16:42 ..
-rw-r--r-- 1 root root 3589 Nov 13 13:57 capacity-scheduler.xml
-rw-r--r-- 1 root root 1335 Nov 13 13:57 configuration.xsl
-rw-r--r-- 1 root root 318 Nov 13 13:57 container-executor.cfg
-rw-r--r-- 1 root root 1339 Nov 13 13:57 core-site.xml
-rw-r--r-- 1 root root 3670 Nov 13 13:57 hadoop-env.cmd
-rw-r--r-- 1 root root 3452 Nov 13 13:57 hadoop-env.sh
-rw-r--r-- 1 root root 1774 Nov 13 13:57 hadoop-metrics2.properties
-rw-r--r-- 1 root root 2490 Nov 13 13:57 hadoop-metrics.properties
-rw-r--r-- 1 root root 9201 Nov 13 13:57 hadoop-policy.xml
-rw-r--r-- 1 root root 2372 Nov 13 13:57 hdfs-site.xml
-rw-r--r-- 1 root root 1449 Nov 13 13:57 httpfs-env.sh
-rw-r--r-- 1 root root 1657 Nov 13 13:57 httpfs-log4j.properties
-rw-r--r-- 1 root root 21 Nov 13 13:57 httpfs-signature.secret
-rw-r--r-- 1 root root 620 Nov 13 13:57 httpfs-site.xml
-rw-r--r-- 1 root root 11118 Nov 13 13:57 log4j.properties
-rw-r--r-- 1 root root 938 Nov 13 13:57 mapred-env.cmd
-rw-r--r-- 1 root root 1383 Nov 13 13:57 mapred-env.sh
-rw-r--r-- 1 root root 4113 Nov 13 13:57 mapred-queues.xml.template
-rw-r--r-- 1 root root 844 Nov 13 13:57 mapred-site.xml
-rw-r--r-- 1 root root 758 Nov 13 13:57 mapred-site.xml.template
-rw-r--r-- 1 root root 29 Nov 13 13:57 slaves
-rw-r--r-- 1 root root 2316 Nov 13 13:57 ssl-client.xml.example
-rw-r--r-- 1 root root 2268 Nov 13 13:57 ssl-server.xml.example
-rw-r--r-- 1 root root 2237 Nov 13 13:57 yarn-env.cmd
-rw-r--r-- 1 root root 4606 Nov 13 13:57 yarn-env.sh
-rw-r--r-- 1 root root 1875 Nov 13 13:57 yarn-site.xml
-rw-r--r-- 1 root root 1087 Oct 31 16:42 yarn-site.xml.bak
Files that need to be modified:
hadoop-env.sh
yarn-env.sh
core-site.xml
hdfs-site.xml
mapred-site.xml
slaves
yarn-site.xml
a. hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_25
b. yarn-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_25
c. slaves
Add the slave hostnames, one per line:
slave1
slave2
slave3
slave4
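The start-dfs.sh and start-yarn.sh scripts read this file to decide which hosts to SSH into, so the names must match /etc/hosts exactly. One quick way to generate it (written to a temp path here for illustration; the real target is /usr/hadoop/etc/hadoop/slaves):

```shell
# Write the four slave hostnames, one per line, to a temporary file
slaves_file=$(mktemp)
printf '%s\n' slave1 slave2 slave3 slave4 > "$slaves_file"
cat "$slaves_file"
```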
d. core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:8020</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/root/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
e. hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:50090</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/root/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/root/dfs/data</value>
  </property>
</configuration>
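One property worth knowing here, though not set above, is the HDFS block replication factor, which defaults to 3. With four DataNodes the default is fine, but on a smaller test cluster you might lower it; for example (the value 2 is just an illustration):

```xml
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
```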
f. mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
g. yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
</configuration>
When the edits are done, copy the same configuration to every machine. Note that a wildcard like slave* is not expanded into hostnames by the shell, so loop over the slaves explicitly:
for host in slave1 slave2 slave3 slave4; do
  scp -r /usr/hadoop/etc/hadoop root@$host:/usr/hadoop/etc/
done
2 Test
Before the very first start, format the NameNode on master:
hdfs namenode -format
Then start HDFS and YARN, and check the Java processes with jps:
start-dfs.sh
start-yarn.sh
jps
On master, jps should list NameNode, SecondaryNameNode, and ResourceManager; on each slave, DataNode and NodeManager. The web UIs are at http://master:50070 (HDFS) and http://master:8088 (YARN).