============== Installing Hadoop ==================
Configure the Hadoop environment variables
Hadoop version: hadoop-2.6.4.tar.gz. Upload the Hadoop and JDK 8 archives to /opt/soft/ with the rz command.
#vim /etc/profile.d/hadoop-eco.sh
export JAVA_HOME=/opt/jdk8
export HADOOP_HOME=/opt/hadoop
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
Reload the configuration
#source /etc/profile.d/hadoop-eco.sh
Verify the Java environment
#java -version
Extract the archive
#tar -zxvf /opt/soft/hadoop-2.6.4.tar.gz -C /opt/
Rename the directory
#mv /opt/hadoop-2.6.4/ /opt/hadoop
Create the data storage directories
1) NameNode data directory: /opt/hadoop-repo/name
2) SecondaryNameNode data directory: /opt/hadoop-repo/secondary
3) DataNode data directory: /opt/hadoop-repo/data
4) Temporary data directory: /opt/hadoop-repo/tmp
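The four directories above can be created in one step; a minimal sketch, assuming the /opt/hadoop-repo base path used throughout this guide:

```shell
# Create the HDFS storage directories referenced by hdfs-site.xml
# and core-site.xml below (base path as used in this guide).
mkdir -p /opt/hadoop-repo/name \
         /opt/hadoop-repo/secondary \
         /opt/hadoop-repo/data \
         /opt/hadoop-repo/tmp
ls /opt/hadoop-repo
```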
Configuration
All of the following files live under /opt/hadoop/etc/hadoop/
1) Edit hadoop-env.sh
export JAVA_HOME=/opt/jdk8
2) Edit yarn-env.sh
export JAVA_HOME=/opt/jdk8
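Instead of opening each file in vim, the JAVA_HOME line can be appended non-interactively; a sketch assuming the /opt/hadoop install path from this guide (the `mkdir -p` is only there so the sketch runs standalone; on a real install the directory already exists):

```shell
# Append JAVA_HOME to hadoop-env.sh and yarn-env.sh without an editor.
CONF_DIR=/opt/hadoop/etc/hadoop
mkdir -p "$CONF_DIR"   # no-op on a real install
for f in hadoop-env.sh yarn-env.sh; do
  echo 'export JAVA_HOME=/opt/jdk8' >> "$CONF_DIR/$f"
done
```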
Configure hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///opt/hadoop-repo/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///opt/hadoop-repo/data</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file:///opt/hadoop-repo/secondary</value>
</property>
<!-- SecondaryNameNode HTTP address -->
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:9001</value>
</property>
<!-- number of block replicas -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<!-- allow access to HDFS over the web (WebHDFS) -->
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<!-- disable permission checks -->
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
Configure core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:///opt/hadoop-repo/tmp</value>
</property>
</configuration>
Configure mapred-site.xml (copy it from the shipped template first: #cp mapred-site.xml.template mapred-site.xml)
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- job history server RPC address -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<!-- job history server web UI address -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
<property>
<name>mapreduce.map.log.level</name>
<value>INFO</value>
</property>
<property>
<name>mapreduce.reduce.log.level</name>
<value>INFO</value>
</property>
</configuration>
Configure yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
</configuration>
Format the Hadoop filesystem / verify the configuration
To format more than once, first delete and recreate the directories under /opt/hadoop-repo/; only then will a second format succeed cleanly.
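The cleanup described above can be scripted; a sketch assuming the /opt/hadoop-repo layout from this guide (the daemon and format commands are left as comments since they require a live install):

```shell
# Wipe and recreate the storage directories so no stale clusterID
# survives in the DataNode's VERSION file before the next format.
# (On a real node, stop the daemons first: stop-all.sh)
rm -rf /opt/hadoop-repo
mkdir -p /opt/hadoop-repo/name /opt/hadoop-repo/secondary \
         /opt/hadoop-repo/data /opt/hadoop-repo/tmp
# then re-run: hdfs namenode -format
```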
#hdfs namenode -format
Note: the clusterID in the current/VERSION file must be identical under the NameNode and DataNode directories.
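That consistency check can be scripted; check_cluster_id is a hypothetical helper, shown here with the VERSION paths that follow from the /opt/hadoop-repo layout above:

```shell
# check_cluster_id: succeed only when the two VERSION files carry the
# same clusterID; a mismatch after re-formatting is the usual reason
# the DataNode refuses to start.
check_cluster_id() {
  nn=$(grep '^clusterID=' "$1" | cut -d= -f2)
  dn=$(grep '^clusterID=' "$2" | cut -d= -f2)
  [ -n "$nn" ] && [ "$nn" = "$dn" ]
}

# On a real node:
#   check_cluster_id /opt/hadoop-repo/name/current/VERSION \
#                    /opt/hadoop-repo/data/current/VERSION \
#     && echo "clusterIDs match" || echo "clusterID mismatch"
```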
Start Hadoop
#start-all.sh
or, separately:
#start-dfs.sh
#start-yarn.sh
After a successful start, the jps (Java Process Status) command, e.g. jps -m, shows five processes:
NameNode
SecondaryNameNode
DataNode
ResourceManager
NodeManager
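Checking the process list by eye is error-prone; check_daemons is a hypothetical helper that reads `jps` output on stdin and verifies all five daemons from the list above are present:

```shell
# check_daemons: verify that all five Hadoop daemons appear in the
# jps output piped in on stdin; reports the first one missing.
check_daemons() {
  out=$(cat)
  for d in NameNode SecondaryNameNode DataNode ResourceManager NodeManager; do
    printf '%s\n' "$out" | grep -qw "$d" || { echo "missing: $d"; return 1; }
  done
  echo "all 5 daemons running"
}

# On a real node: jps | check_daemons
```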
Verification
Disable the firewall
#systemctl stop firewalld
1) Run the following on the command line:
#hdfs dfs -ls /
2) Open http://master:50070 in a browser
Put a file into HDFS
#hdfs dfs -put love.txt hdfs://master:9000/
View the file's contents in HDFS
#hdfs dfs -text hdfs://master:9000/love.txt
Hadoop web UI
master:50070
HDFS file browser
master:50070/explorer.html
Job (YARN) web UI
master:8088/cluster
In Hadoop's share/hadoop/mapreduce directory, run the wordcount example (counts word occurrences):
#yarn jar hadoop-mapreduce-examples-2.6.4.jar wordcount hdfs://master:9000/love.txt hdfs://master:9000/out/mr/wc
List the job output from the console
#hdfs dfs -ls hdfs://master:9000/out/mr/wc
Console output:
Found 2 items
-rw-r--r-- 1 root supergroup 0 2020-07-31 01:29 hdfs://master:9000/out/mr/wc/_SUCCESS
-rw-r--r-- 1 root supergroup 39 2020-07-31 01:29 hdfs://master:9000/out/mr/wc/part-r-00000 (the result file)
View the computed result
#hdfs dfs -text hdfs://master:9000/out/mr/wc/part-r-00000
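The result can be sanity-checked locally: a minimal shell equivalent of what the wordcount job computes (the sample love.txt content here is made up; the real file's contents are whatever was uploaded earlier):

```shell
# A local stand-in for the wordcount example: split text into words,
# count each word, and print "word<TAB>count" like part-r-00000 does.
printf 'hello world\nhello hadoop\n' > love.txt   # made-up sample input
tr -s ' \t' '\n' < love.txt | sort | uniq -c | awk '{print $2 "\t" $1}'
```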
Check HDFS disk usage
#hdfs dfs -df -h /   (or: hdfs dfs -df /)
Show per-file HDFS usage
#hadoop fs -du -h /
==== Hadoop installation complete; next chapter: installing Hive ====