Hadoop configuration, testing, and examples
Material taken from other authors is marked as such.
0 Environment
- jdk1.8.0_221
- hadoop-2.7.7
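- CentOS 7.7 (NAT mode)
Note: the combination jdk1.8.0_191 + hadoop-3.3.0 did not work for me.
1 Network configuration
NAT mode
See the network configuration steps written by zoxiii 👉 link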
vim /etc/sysconfig/network-scripts/ifcfg-ens33
Virtual network IP settings
BOOTPROTO="static"
IPADDR=192.168.10.110
GATEWAY=192.168.10.2
NETMASK=255.255.255.0
DNS1=192.168.10.2
DNS2=114.114.114.114
- Edit hostname and hosts
vi /etc/hostname
vi /etc/hosts
- Restart the network
service network restart
- Check the hostname and the IP it resolves to
hostname
hostname -i
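The hostname and hosts edits above are done by hand in an editor. As a rough sketch, the hosts mapping can also be added from the shell so that repeated runs do not duplicate the line. The hostname `hadoop01` and the helper name `add_host_entry` are hypothetical; only the IP address comes from the settings above.

```shell
# Append "IP NAME" to a hosts file unless that exact line is already present.
# Usage: add_host_entry IP NAME [FILE]   (FILE defaults to /etc/hosts)
add_host_entry() {
    ahe_ip="$1"
    ahe_name="$2"
    ahe_file="${3:-/etc/hosts}"
    # -q quiet, -x whole-line match, -F fixed string (dots are literal)
    if ! grep -qxF "$ahe_ip $ahe_name" "$ahe_file" 2>/dev/null; then
        printf '%s %s\n' "$ahe_ip" "$ahe_name" >> "$ahe_file"
    fi
}

# Example (as root; hadoop01 is an assumed hostname):
#   hostnamectl set-hostname hadoop01
#   add_host_entry 192.168.10.110 hadoop01
```

After this, `hostname -i` should print the static IP rather than a loopback address, provided the hostname is mapped only to that IP.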
2 Uninstall and install the JDK
- Query the installed Java packages
rpm -qa | grep java -i
- Remove each package found
rpm -e --nodeps <package name returned by the query>
- Install
cd /root
mkdir /usr/local/src/jdk
cp jdk-8u221-linux-x64.tar.gz /usr/local/src/jdk/
rm -f jdk-8u221-linux-x64.tar.gz
cd /usr/local/src/jdk
tar -zxvf jdk-8u221-linux-x64.tar.gz
rm -f jdk-8u221-linux-x64.tar.gz
vim /etc/profile
export JAVA_HOME=/usr/local/src/jdk/jdk1.8.0_221
export PATH=$PATH:$JAVA_HOME/bin
- Apply the environment changes immediately
source /etc/profile
- Verify
java -version
4 Configure Hadoop
4.1 Prerequisites
- Install
cd /root
mkdir /usr/local/src/hadoop
cp hadoop-2.7.7.tar.gz /usr/local/src/hadoop/
rm -f hadoop-2.7.7.tar.gz
cd /usr/local/src/hadoop
tar -zxvf hadoop-2.7.7.tar.gz
rm -f hadoop-2.7.7.tar.gz
- Create the working directories
cd /usr/local/src/hadoop/hadoop-2.7.7
mkdir hdfs
mkdir tmp
cd hdfs
mkdir data
mkdir name
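The directory steps above can be collapsed into a single call. This is only a sketch; `make_hadoop_dirs` is a hypothetical helper name, and the layout is the one created above (`tmp`, `hdfs/data`, `hdfs/name`).

```shell
# Create tmp/ and hdfs/{data,name} under a Hadoop install directory.
# Usage: make_hadoop_dirs BASE_DIR
make_hadoop_dirs() {
    # -p creates missing parents and does not fail if a directory already exists
    mkdir -p "$1/tmp" "$1/hdfs/data" "$1/hdfs/name"
}

# Example:
#   make_hadoop_dirs /usr/local/src/hadoop/hadoop-2.7.7
```

Because of `-p`, running it a second time is harmless.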
4.2 Edit the configuration files
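This section lists no file contents. As a rough sketch only: a single-node setup usually needs at least `core-site.xml` and `hdfs-site.xml` under `$HADOOP_HOME/etc/hadoop`. Everything below is an assumption rather than the confirmed original settings. `fs.defaultFS` is guessed from the `hdfs://localhost:9000` address used in section 6.1.3, the directories are the ones created in 4.1, and `write_minimal_conf` is a hypothetical helper name. YARN and MapReduce settings are not covered.

```shell
# Write a minimal core-site.xml and hdfs-site.xml into directory $1.
# $2 is the Hadoop install directory that holds tmp/ and hdfs/.
# Note: this overwrites existing files, so back them up first.
write_minimal_conf() {
    wmc_conf="$1"
    wmc_base="$2"
    cat > "$wmc_conf/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$wmc_base/tmp</value>
  </property>
</configuration>
EOF
    cat > "$wmc_conf/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://$wmc_base/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://$wmc_base/hdfs/data</value>
  </property>
</configuration>
EOF
}

# Example:
#   H=/usr/local/src/hadoop/hadoop-2.7.7
#   write_minimal_conf "$H/etc/hadoop" "$H"
```

A replication factor of 1 is chosen here because a single machine has only one DataNode.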
4.3 Configure the environment
vim /etc/profile
export JAVA_HOME=/usr/local/src/jdk/jdk1.8.0_221
export HADOOP_HOME=/usr/local/src/hadoop/hadoop-2.7.7
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
source /etc/profile
- Check the Hadoop version
hadoop version
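Instead of editing `/etc/profile` directly, the same variables could live in a separate fragment under `/etc/profile.d/`, which CentOS login shells source automatically. A minimal sketch follows. The helper name `write_hadoop_env` and the file name `hadoop.sh` are hypothetical, and adding `$HADOOP_HOME/sbin` to `PATH` is an optional extra that the original settings do not include.

```shell
# Write the JDK and Hadoop variables to a profile fragment.
# Usage: write_hadoop_env FILE
write_hadoop_env() {
    # The quoted 'EOF' keeps $PATH and friends literal in the written file.
    cat > "$1" <<'EOF'
export JAVA_HOME=/usr/local/src/jdk/jdk1.8.0_221
export HADOOP_HOME=/usr/local/src/hadoop/hadoop-2.7.7
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF
}

# Example (as root):
#   write_hadoop_env /etc/profile.d/hadoop.sh
#   . /etc/profile.d/hadoop.sh
```

With `sbin` on the `PATH`, `start-dfs.sh` and `start-yarn.sh` can be called from any directory.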
4.4 Initialize HDFS
- Format the NameNode (first-time setup only)
hdfs namenode -format
4.5 Start Hadoop
- Start HDFS and YARN
cd /usr/local/src/hadoop/hadoop-2.7.7/sbin
./start-dfs.sh
./start-yarn.sh
- Check the running processes
jps
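jps should list 6 processes. On a single-node setup these are typically NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager, and Jps itself.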
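Counting the lines by eye is easy to get wrong, so here is a small sketch that names any daemon missing from the `jps` output. It assumes the usual single-node daemon set (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager); `check_daemons` is a hypothetical helper name.

```shell
# Read `jps` output on stdin and report expected daemons that are missing.
# Returns 0 when all five daemons are present, 1 otherwise.
check_daemons() {
    cd_out=$(cat)
    cd_missing=0
    for cd_name in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # -w matches whole words, so "NameNode" does not match "SecondaryNameNode"
        if ! printf '%s\n' "$cd_out" | grep -qw "$cd_name"; then
            echo "missing: $cd_name"
            cd_missing=1
        fi
    done
    return "$cd_missing"
}

# Example:
#   jps | check_daemons && echo "all daemons up"
```

If a daemon is reported missing, its log file under `$HADOOP_HOME/logs` is usually the first place to look.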
- Check the firewall state, then stop the firewall
firewall-cmd --state
systemctl stop firewalld.service
4.6 Open the web UIs
192.168.10.110:50070 #hdfs
192.168.10.110:8088 #yarn
5 Test Hadoop
6 Hadoop example: counting word frequencies
6.1 Writing the program
6.1.1 Install Eclipse
mkdir /usr/local/src/eclipse
cd /root
cp eclipse-jee-kepler-SR2-linux-gtk-x86_64.tar.gz /usr/local/src/eclipse/
rm -f eclipse-jee-kepler-SR2-linux-gtk-x86_64.tar.gz
cd /usr/local/src/eclipse
tar -zxvf eclipse-jee-kepler-SR2-linux-gtk-x86_64.tar.gz
cd eclipse
# then launch Eclipse
./eclipse
6.1.2 Java programming
See this write-up by another author 👉 link
6.1.3 Upload the file
- Upload the file wordTest.txt
hadoop fs -put wordTest.txt hdfs://localhost:9000/wordTest.txt
- List the files
hadoop fs -ls /
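6.1.4 Run the program
- Run WordCount.jar and write the result to /WCResult
time hadoop jar WordCount.jar [package.]WordCount /wordTest.txt /WCResult
# [package.] is the main class's package prefix, if it has one; prefixing hadoop with time reports the elapsed time
- View the result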
hadoop fs -ls /WCResult/
hadoop fs -cat /WCResult/part-r-00000
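A MapReduce job normally refuses to start when its output directory already exists, so a second run of the command above tends to fail until `/WCResult` is removed. A minimal sketch of a wrapper that clears the directory first; `rerun_job` is a hypothetical helper name, and it relies only on `hadoop fs -test -e`, `hadoop fs -rm -r`, and `hadoop jar`.

```shell
# Remove an HDFS output directory if it exists, then run the job.
# Usage: rerun_job OUTPUT_DIR JAR [MAIN_CLASS] INPUT...
rerun_job() {
    rj_out="$1"
    shift
    # `hadoop fs -test -e PATH` exits 0 when PATH exists
    if hadoop fs -test -e "$rj_out"; then
        hadoop fs -rm -r "$rj_out"
    fi
    # The output directory is passed as the last job argument
    hadoop jar "$@" "$rj_out"
}

# Example:
#   rerun_job /WCResult WordCount.jar WordCount /wordTest.txt
```

Note that this deletes the previous result without asking, so copy anything worth keeping beforehand.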
6.2 Using Hadoop's bundled example
- Check the file uploaded earlier
hadoop fs -ls /
hadoop fs -cat /wordTest.txt
- Locate the required jar
cd $HADOOP_HOME/share/hadoop/mapreduce
ls
- Run the wordcount example from that jar
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount /wordTest.txt /WCResultAuto
- View the result
hadoop fs -ls /WCResultAuto/
hadoop fs -cat /WCResultAuto/part-r-00000
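To sanity-check the job output on a small file, the same counts can be produced locally with standard tools. This is a sketch that assumes the job splits words on whitespace and writes `word<TAB>count` lines sorted by word; `local_wordcount` is a hypothetical helper name.

```shell
# Count whitespace-separated words in a local file.
# Output: one "word<TAB>count" line per word, sorted in byte order.
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2 "\t" $1 }'
}

# Example: compare with the job output
#   local_wordcount wordTest.txt > local.txt
#   hadoop fs -cat /WCResultAuto/part-r-00000 | diff - local.txt
```

An empty `diff` suggests the two counts agree. Differences in punctuation handling or sort order would show up as mismatched lines.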
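7 Delete files
hadoop fs -rm -r /<path to delete>
8 Miscellaneous notes
- If the hadoop command is not found, try running it from the Hadoop install directory: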
./bin/hadoop
- The firewall has to be stopped and Hadoop started again every time (for example, after each reboot)
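For the "command not found" note above, a small sketch that locates the binary either on `PATH` or under `$HADOOP_HOME`; `find_hadoop` is a hypothetical helper name.

```shell
# Print the path of the hadoop launcher, or fail with a hint.
find_hadoop() {
    if command -v hadoop >/dev/null 2>&1; then
        # Found on PATH
        command -v hadoop
    elif [ -n "${HADOOP_HOME:-}" ] && [ -x "$HADOOP_HOME/bin/hadoop" ]; then
        # Not on PATH, but present in the install directory
        echo "$HADOOP_HOME/bin/hadoop"
    else
        echo "hadoop not found; check HADOOP_HOME and PATH" >&2
        return 1
    fi
}

# Example:
#   "$(find_hadoop)" version
```

If only the second branch succeeds, the `PATH` line in `/etc/profile` was probably not applied to the current shell; running `source /etc/profile` again may help.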