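1. Download Hadoop 2.7.6 from the official website.
2. Log in to the CentOS machine remotely and upload the Hadoop archive. (The directory is up to you. This article originally used /home, which is not recommended, so I switched to /usr/local/hadoop/.)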
3. Extract the archive
tar -xzvf hadoop-2.7.6.tar.gz (after extracting, I renamed the hadoop-2.7.6 folder to hadoop to make configuration easier)
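The extract-and-rename step can be wrapped in a small helper. This is only a sketch: the function name extract_hadoop is made up for illustration, and it assumes the archive unpacks into a folder named after the tarball (hadoop-2.7.6 for hadoop-2.7.6.tar.gz).

```shell
# Sketch: unpack a Hadoop tarball into a target directory and rename the
# versioned folder to plain "hadoop". extract_hadoop is an illustrative name.
extract_hadoop() {
    local tarball="$1"   # e.g. hadoop-2.7.6.tar.gz
    local dest="$2"      # e.g. /usr/local
    local versioned

    # Assumption: the top-level folder in the archive matches the tarball name
    versioned="$(basename "$tarball" .tar.gz)"

    # Refuse to overwrite an existing installation
    if [ -e "$dest/hadoop" ]; then
        echo "refusing to overwrite $dest/hadoop" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1
    tar -xzf "$tarball" -C "$dest" || return 1
    mv "$dest/$versioned" "$dest/hadoop"
}
```

Example call: extract_hadoop hadoop-2.7.6.tar.gz /usr/local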
4. Enter the hadoop folder
5. Set the Hadoop environment variables
Set JAVA_HOME to the location of the JDK you installed earlier, for example: export JAVA_HOME=/home/java/jdk1.8.0_142
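6. Add the Hadoop command directories to the PATH environment variable
vim /etc/profile
Append this as the last line: export PATH=$PATH:/home/hadoop/bin:/home/hadoop/sbin
(adjust the paths to match your Hadoop installation directory)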
Note: the steps above were written for an installation under /home. As of the 9/23 update, Hadoop is installed under /usr/local, and the steps below use that path.
7. Reload the profile
source /etc/profile
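Editing /etc/profile by hand works, but the same change can be scripted so that running it twice does not add a duplicate line. A minimal sketch, assuming a helper name (add_hadoop_to_path) that is not part of Hadoop:

```shell
# Sketch: append the Hadoop bin/ and sbin/ directories to PATH in a profile
# file, but only if that exact line is not already present.
add_hadoop_to_path() {
    local profile="$1"       # e.g. /etc/profile
    local hadoop_home="$2"   # e.g. /usr/local/hadoop
    # \$PATH stays literal so it is expanded when the profile is sourced
    local line="export PATH=\$PATH:$hadoop_home/bin:$hadoop_home/sbin"

    # grep -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF "$line" "$profile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$profile"
    fi
}
```

Example call (as root): add_hadoop_to_path /etc/profile /usr/local/hadoop, followed by source /etc/profile.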
8. Enter the Hadoop configuration folder
cd /usr/local/hadoop/etc/hadoop
Edit the environment script: vi hadoop-env.sh and set JAVA_HOME to /usr/local/java/jdk1.8.0_171
Edit core-site.xml
vi /usr/local/hadoop/etc/hadoop/core-site.xml
Change it to the following:
<configuration>
<!-- Default file system URI: the NameNode runs on host master, port 9000 -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<!-- Size of read/write buffer used in SequenceFiles. -->
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<!-- Hadoop temporary directory; create it yourself -->
<property>
<name>hadoop.tmp.dir</name>
<value>/var/hadoop/tmp</value>
</property>
</configuration>
9. Edit hdfs-site.xml
vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
<!-- Number of replicas HDFS keeps for each block -->
<property>
<name>dfs.replication</name>
<value>3</value>
<description>Replica count. The default is 3; it should not exceed the number of DataNode machines.</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/var/hadoop/dfs/name</value>
<description>Where the NameNode stores the HDFS namespace metadata</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/var/hadoop/dfs/data</value>
<description>Physical location of the data blocks on each DataNode</description>
</property>
</configuration>
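The configuration above points at three local directories: /var/hadoop/tmp (hadoop.tmp.dir), /var/hadoop/dfs/name and /var/hadoop/dfs/data. The temporary directory has to be created by hand; creating the other two up front is harmless and makes permission problems show up early. A minimal sketch, with make_hadoop_dirs as an illustrative name:

```shell
# Sketch: create the local directories referenced in core-site.xml and
# hdfs-site.xml. The root defaults to /var/hadoop, as used in this article.
make_hadoop_dirs() {
    local root="${1:-/var/hadoop}"
    # -p creates missing parents and does not fail if the directory exists
    mkdir -p "$root/tmp" "$root/dfs/name" "$root/dfs/data"
}
```

Run make_hadoop_dirs on every node, as the user that will start the Hadoop daemons (or adjust ownership afterwards).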
10. Edit mapred-site.xml (the distribution may only ship it as mapred-site.xml.template; copy that to mapred-site.xml and edit the copy)
cd /usr/local/hadoop/etc/hadoop/
<configuration>
<!-- Tell Hadoop to run MapReduce on YARN -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
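The copy step for mapred-site.xml can be written so that it never overwrites a file that has already been edited. A small sketch, where ensure_mapred_site is a made-up helper name:

```shell
# Sketch: create mapred-site.xml from the shipped template, but only when
# mapred-site.xml does not exist yet, so existing edits are preserved.
ensure_mapred_site() {
    local conf_dir="$1"   # e.g. /usr/local/hadoop/etc/hadoop
    if [ ! -f "$conf_dir/mapred-site.xml" ]; then
        cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    fi
}
```

Example call: ensure_mapred_site /usr/local/hadoop/etc/hadoop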
11. Edit yarn-site.xml
vi yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.address</name>
<value>master:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:18030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:18088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:18025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:18141</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
12. Edit the slaves file
vi /usr/local/hadoop/etc/hadoop/slaves
Remove localhost and add the host name of each worker node, one per line
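The slaves file is just a list of host names, one per line, so it can also be generated instead of edited by hand. A sketch under assumptions: write_slaves is an illustrative helper, and slave1 and slave2 in the example call are placeholder host names, not names taken from this article.

```shell
# Sketch: overwrite the slaves file with the given worker host names,
# one per line. This replaces the default "localhost" entry.
write_slaves() {
    local file="$1"   # e.g. /usr/local/hadoop/etc/hadoop/slaves
    shift
    # Require at least one host name
    if [ "$#" -eq 0 ]; then
        echo "usage: write_slaves <file> <host>..." >&2
        return 1
    fi
    printf '%s\n' "$@" > "$file"
}
```

Example call: write_slaves /usr/local/hadoop/etc/hadoop/slaves slave1 slave2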
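14. Copy the installation to the other nodes with scp, or clone the machine again (remember to change the IP, hostname, and SSH setup on each copy)
Format the NameNode: hdfs namenode -format (the older form hadoop namenode -format still works in 2.7.6 but is deprecated)
If a node reports an error after formatting, go to /var/hadoop/dfs/data/current/ and edit the clusterID in the VERSION file. If you have changed too much, don't panic: delete it and restart the node, and it is regenerated automatically.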
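The usual reason for this error is that re-formatting gave the NameNode a new clusterID while the DataNode still holds the old one. Assuming that is the case here, the manual edit can be sketched as copying the clusterID line from the NameNode's VERSION file into the DataNode's. The function name sync_cluster_id is made up for illustration; on a multi-node cluster the NameNode's VERSION file has to be copied to the DataNode machine first.

```shell
# Sketch: make the DataNode's clusterID match the NameNode's.
# A backup of the DataNode VERSION file is kept next to it as VERSION.bak.
sync_cluster_id() {
    local name_version="$1"   # e.g. /var/hadoop/dfs/name/current/VERSION
    local data_version="$2"   # e.g. /var/hadoop/dfs/data/current/VERSION
    local cid

    # Extract the value after "clusterID=" from the NameNode file
    cid="$(sed -n 's/^clusterID=//p' "$name_version")"
    if [ -z "$cid" ]; then
        echo "no clusterID found in $name_version" >&2
        return 1
    fi

    cp "$data_version" "$data_version.bak" || return 1
    # Rewrite only the clusterID line; all other lines are kept as they are
    sed "s/^clusterID=.*/clusterID=$cid/" "$data_version.bak" > "$data_version"
}
```

Stop the DataNode before running this and start it again afterwards.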
15. Start daemons on a single node
hadoop-daemon.sh start datanode
hadoop-daemon.sh start namenode
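end. Verify
Run hadoop; if it prints its usage information, the installation works.
start-all.sh (deprecated in Hadoop 2.x; it runs start-dfs.sh and then start-yarn.sh)
start-yarn.sh (only needed if YARN was not already started by the previous command)
Check the running daemons with jps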
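Reading the jps output by eye is fine for a few nodes, but the check can also be scripted. The sketch below reads jps output on standard input and reports which expected daemons are missing. The function name missing_daemons is illustrative, and which daemons to expect on which machine depends on your cluster layout.

```shell
# Sketch: read `jps` output on stdin and print every expected daemon name
# that does not appear. Returns 0 if all are present, 1 otherwise.
missing_daemons() {
    local out name missing=0
    out="$(cat)"
    for name in "$@"; do
        # jps prints "<pid> <name>"; compare the second column as a whole
        # line so that SecondaryNameNode does not count as NameNode
        if ! printf '%s\n' "$out" | awk '{print $2}' | grep -qx "$name"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, on the master: jps | missing_daemons NameNode ResourceManager, and on a worker: jps | missing_daemons DataNode NodeManager.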