Preliminary notes:
Installation directory: /usr/local
The Hadoop cluster consists of three hosts, with passwordless SSH login already configured between them:
spark1 (192.168.1.191): NameNode, SecondaryNameNode, DataNode, ResourceManager
spark2 (192.168.1.192): DataNode, NodeManager
spark3 (192.168.1.193): DataNode, NodeManager
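The hostnames spark1/spark2/spark3 must resolve to the IPs above on every node. A minimal sketch of the /etc/hosts entries, assuming no DNS handles these names:
# /etc/hosts (same entries on all three hosts)
192.168.1.191 spark1
192.168.1.192 spark2
192.168.1.193 spark3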
1. Basic setup
Unpack the Hadoop tarball: tar -zxvf hadoop-2.4.1.tar.gz
Rename the Hadoop directory: mv hadoop-2.4.1 hadoop
Configure the Hadoop environment variables:
vi ~/.bashrc
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source ~/.bashrc
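To confirm the environment variables took effect, a quick check (the exact output depends on your build):
hadoop version                 # should print Hadoop 2.4.1
which hadoop                   # expected: /usr/local/hadoop/bin/hadoop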
Edit the following configuration files under hadoop/etc/hadoop (each <property> block below goes inside the file's <configuration> element):
2. Edit core-site.xml
<property>
<name>fs.default.name</name>
<value>hdfs://spark1:9000</value>
</property>
3. Edit hdfs-site.xml to set the HDFS data directories. Be sure to create the /usr/local/data directory on each machine first (see the commands after the properties below).
<property>
<name>dfs.name.dir</name>
<value>/usr/local/data/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/usr/local/data/datanode</value>
</property>
<property>
<name>dfs.tmp.dir</name>
<value>/usr/local/data/tmp</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
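A minimal sketch of creating the data directory on all three hosts from spark1, assuming the passwordless SSH described above and that the cluster runs as root:
mkdir -p /usr/local/data
ssh spark2 "mkdir -p /usr/local/data"
ssh spark3 "mkdir -p /usr/local/data"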
4. Edit mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
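In the Hadoop 2.4.1 distribution this file usually does not exist yet and is created from the bundled template; a sketch, assuming the default etc/hadoop layout:
cd /usr/local/hadoop/etc/hadoop
cp mapred-site.xml.template mapred-site.xml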
5. Edit yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>spark1</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
6. Edit the slaves file and list the hostnames of the machines in the cluster:
spark1
spark2
spark3
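All three hosts need the same Hadoop installation and configuration, and the NameNode must be formatted once before the first start. A hedged sketch of syncing from spark1 and formatting (the scp target paths are assumptions based on the layout above):
# copy the configured installation to the other nodes
scp -r /usr/local/hadoop spark2:/usr/local/
scp -r /usr/local/hadoop spark3:/usr/local/
# format the NameNode once, on spark1 only (destroys any existing HDFS metadata)
hdfs namenode -format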
7. Start HDFS
[root@spark1 data]# start-dfs.sh
Starting namenodes on [spark1]
spark1: starting namenode, logging to /usr/local/spark/hadoop/logs/hadoop-root-namenode-spark1.out
spark1: starting datanode, logging to /usr/local/spark/hadoop/logs/hadoop-root-datanode-spark1.out
spark2: starting datanode, logging to /usr/local/spark/hadoop/logs/hadoop-root-datanode-spark2.out
spark3: starting datanode, logging to /usr/local/spark/hadoop/logs/hadoop-root-datanode-spark3.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/spark/hadoop/logs/hadoop-root-secondarynamenode-spark1.out
[root@spark1 data]# jps
5915 SecondaryNameNode
6014 Jps
5772 DataNode
5654 NameNode
[root@spark2 local]# jps
1434 Jps
1377 DataNode
[root@spark3 local]# jps
1411 Jps
1354 DataNode
Open port 50070 in a web browser:
http://192.168.1.191:50070 — if the page loads normally, the HDFS cluster is up.
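Beyond the web UI, a quick command-line check from spark1; a sketch assuming the environment variables above are in effect (the test directory name is arbitrary):
hdfs dfsadmin -report          # should list three live DataNodes
hdfs dfs -mkdir /test          # hypothetical test directory
hdfs dfs -ls /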
8. Startup sometimes fails; the usual fix is:
1. Delete everything under the data directory on all hosts.
2. Re-format the NameNode (see the sketch below).
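A hedged sketch of that recovery procedure, run from spark1 and assuming the passwordless SSH and /usr/local/data layout described above (this wipes all HDFS data):
stop-dfs.sh
rm -rf /usr/local/data/*
ssh spark2 "rm -rf /usr/local/data/*"
ssh spark3 "rm -rf /usr/local/data/*"
hdfs namenode -format
start-dfs.sh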
9. Start the YARN cluster
[root@spark1 data]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/spark/hadoop/logs/yarn-root-resourcemanager-spark1.out
spark2: starting nodemanager, logging to /usr/local/spark/hadoop/logs/yarn-root-nodemanager-spark2.out
spark3: starting nodemanager, logging to /usr/local/spark/hadoop/logs/yarn-root-nodemanager-spark3.out
spark1: starting nodemanager, logging to /usr/local/spark/hadoop/logs/yarn-root-nodemanager-spark1.out
[root@spark1 data]# jps
6468 Jps
5915 SecondaryNameNode
5772 DataNode
6091 ResourceManager
6186 NodeManager
5654 NameNode
[root@spark2 data]# jps
1834 DataNode
1935 NodeManager
2045 Jps
[root@spark3 data]# jps
1803 DataNode
1904 NodeManager
2014 Jps
Open port 8088 in a web browser:
http://192.168.1.191:8088 — if the page loads normally, the YARN cluster is up.
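To exercise YARN end to end, a sample MapReduce job can be submitted; a sketch assuming the standard examples jar shipped with Hadoop 2.4.1 (the jar path may differ in your build):
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar pi 2 10
# the job should show up on http://192.168.1.191:8088 and print an estimate of pi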
10. If all the checks above look normal, the Hadoop cluster has been set up successfully.