1 Download and install on Linux; a Java environment is required (download the binary tar directly, no compilation needed)
2 (1) In hadoop (the install directory)/etc/hadoop, edit hadoop-env.sh and replace JAVA_HOME=${JAVA_HOME} with the local Java path (find it with echo $JAVA_HOME)
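A minimal sketch of that hadoop-env.sh change; the JDK path below is an example, not a given — use whatever `echo $JAVA_HOME` prints on your own machine:

```shell
# Replace the JAVA_HOME=${JAVA_HOME} line in hadoop-env.sh with an explicit path.
# /usr/lib/jvm/java-8-openjdk is an assumed example path.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk
```

Hadoop daemons do not reliably inherit JAVA_HOME from the login shell, which is why the file needs the path hard-coded.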
(2) Edit core-site.xml and add the following inside <configuration> (hdfs-site.xml has default values)
<property>
<name>fs.defaultFS</name>
<!-- NameNode address -->
<value>hdfs://192.168.244.130:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/root/hadoopdata</value>
</property>
(3) Edit mapred-site.xml.template and rename it to mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
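The rename in step (3) is just a copy of the shipped template; this demo runs against a scratch directory so it is self-contained — in a real install the directory is $HADOOP_HOME/etc/hadoop:

```shell
# Demo of step (3) on a scratch directory (real path: $HADOOP_HOME/etc/hadoop)
conf_dir=$(mktemp -d)
touch "$conf_dir/mapred-site.xml.template"   # stand-in for the shipped template
cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
```

Copying (rather than moving) keeps the pristine template around for reference.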
(4) Edit yarn-site.xml (mine was empty)
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>192.168.244.130</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
3 Add environment variables: edit the /etc/profile file, then reload it with source /etc/profile
(The following lines are appended to the file's existing content)
HADOOP_HOME=/usr/local/hadoop (use your own install path)
Add to PATH: PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_HOME PATH (note: export takes variable names, so no $ here)
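Put together, the /etc/profile additions look like this; /usr/local/hadoop is an assumed install path, replace it with your own:

```shell
# Appended to /etc/profile; /usr/local/hadoop is an assumed install path.
HADOOP_HOME=/usr/local/hadoop
# Prepend Hadoop's bin and sbin so its commands are found; keep the old PATH.
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_HOME PATH
```

After source /etc/profile, running hadoop version should print the installed version if the path is correct.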
4 Map hostnames to IP addresses in /etc/hosts
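For example (192.168.244.130 is the master IP used in core-site.xml; the second IP and the hostnames master/slave1 are placeholders for your own nodes):

```
192.168.244.130 master
192.168.244.131 slave1
```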
5 Format the HDFS filesystem
hadoop namenode -format (newer releases use hdfs namenode -format instead)
All of the configuration above must also be applied to the other cluster nodes.
6 Start on the host of the master node (the master node is the host configured in core-site.xml)
To start: in the sbin directory, run hadoop-daemon.sh start namenode
Check 192.168.244.130:50070 (the master host); at this point the number of live nodes is 0.
Now start Hadoop on the other cluster nodes with hadoop-daemon.sh start datanode (be careful not to misspell the last word). Check 192.168.244.130:50070 again and the live node count shows 1.
To be continued-----------------------------------