Setting Up a Hadoop Environment (user account)
1. Install and Configure the JDK
1. Download JDK 1.8.0
wget https://mirrors.tuna.tsinghua.edu.cn/AdoptOpenJDK/8/jdk/x64/linux/OpenJDK8U-jdk_x64_linux_openj9_linuxXL_8u282b08_openj9-0.24.0.tar.gz
2. Extract the JDK 1.8.0 archive
tar -zxvf OpenJDK8U-jdk_x64_linux_openj9_linuxXL_8u282b08_openj9-0.24.0.tar.gz
3. Switch to the root user, then move and rename the JDK directory
sudo -s
mv jdk8u282-b08/ /usr/java8
4. Configure the Java environment variables
echo 'export JAVA_HOME=/usr/java8' >> /etc/profile
echo 'export PATH=$PATH:$JAVA_HOME/bin' >> /etc/profile
source /etc/profile
5. Check the Java version:
java -version
If output like the following is returned, the installation succeeded (the exact version and build numbers depend on the JDK build you installed):
openjdk version "1.8.0_242"
OpenJDK Runtime Environment (build 1.8.0_242-b08)
OpenJDK 64-Bit Server VM (build 25.242-b08, mixed mode)
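To double-check that the variables took effect in the current shell, a quick sanity check (the expected values assume the paths configured above):
echo $JAVA_HOME    # expected: /usr/java8
which java         # expected: /usr/java8/bin/java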
2. Install Hadoop
1. Download the Hadoop 2.10.1 package
wget http://mirrors.ustc.edu.cn/apache/hadoop/common/hadoop-2.10.1/hadoop-2.10.1.tar.gz
2. Extract the Hadoop 2.10.1 package to /opt/hadoop
tar -zxvf hadoop-2.10.1.tar.gz -C /opt/
mv /opt/hadoop-2.10.1 /opt/hadoop
3. Configure the Hadoop environment variables
echo 'export HADOOP_HOME=/opt/hadoop/' >> /etc/profile
echo 'export PATH=$PATH:$HADOOP_HOME/bin' >> /etc/profile
echo 'export PATH=$PATH:$HADOOP_HOME/sbin' >> /etc/profile
source /etc/profile
4. Set JAVA_HOME in the configuration files yarn-env.sh and hadoop-env.sh
echo "export JAVA_HOME=/usr/java8" >> /opt/hadoop/etc/hadoop/yarn-env.sh
echo "export JAVA_HOME=/usr/java8" >> /opt/hadoop/etc/hadoop/hadoop-env.sh
5. Check the Hadoop version
hadoop version
If output like the following is returned, the installation succeeded:
Hadoop 2.10.1
Subversion https://github.com/apache/hadoop -r 1827467c9a56f133025f28557bfc2c562d78e816
Compiled by centos on 2020-09-14T13:17Z
Compiled with protoc 2.5.0
From source with checksum 3114edef868f1f3824e7d0f68be03650
This command was run using /opt/hadoop/share/hadoop/common/hadoop-common-2.10.1.jar
3. Configure Hadoop
Ⅰ. Edit the Hadoop configuration file core-site.xml
vim /opt/hadoop/etc/hadoop/core-site.xml
Press i to enter insert mode, then insert the following between the <configuration> and </configuration> tags:
<property>
    <name>hadoop.tmp.dir</name>
    <value>file:/opt/hadoop/tmp</value>
    <description>location to store temporary files</description>
</property>
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
</property>
Press Esc, then type :wq and press Enter to save and exit.
Ⅱ. Edit the Hadoop configuration file hdfs-site.xml
vim /opt/hadoop/etc/hadoop/hdfs-site.xml
Press i to enter insert mode, then insert the following between the <configuration> and </configuration> tags:
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/opt/hadoop/tmp/dfs/name</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/opt/hadoop/tmp/dfs/data</value>
</property>
Press Esc, then type :wq and press Enter to save and exit.
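As an optional sanity check that both edited files are still well-formed XML (assumes the xmllint tool is installed; it prints nothing on success):
xmllint --noout /opt/hadoop/etc/hadoop/core-site.xml
xmllint --noout /opt/hadoop/etc/hadoop/hdfs-site.xml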
4. Configure Passwordless SSH Login
1. Generate a public/private key pair with ssh-keygen (press Enter at every prompt)
ssh-keygen -t rsa
2. Copy the public key to the host (Hadoop below is the target hostname; this can take a while, so be patient)
ssh-copy-id -i ~/.ssh/id_rsa.pub Hadoop
3. Append the public key to authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
4. Use scp to copy authorized_keys into the target machine's .ssh directory
scp ~/.ssh/authorized_keys root@localhost:~/.ssh/
(When prompted Are you sure you want to continue connecting (yes/no)?, type yes and press Enter.)
5. The .ssh directory and the user's home directory must have 700 permissions
chmod 700 /root/.ssh
6. The authorized_keys file in .ssh must have 600 or 644 permissions
chmod 600 /root/.ssh/authorized_keys
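The paths above assume the commands are still running as root (from the earlier sudo -s). If you set up SSH for a regular user instead, the equivalent commands are a straightforward substitution of that user's home directory:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys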
7. Restart the SSH service
service sshd restart
8. Log in without a password
ssh localhost
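If the key setup worked, this logs in without a password prompt. A non-interactive way to test the same thing:
ssh localhost 'echo ok'    # should print ok without asking for a password
If you did log in interactively with ssh localhost, type exit to return to the original shell.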
5. Start Hadoop
Ⅰ. Format the NameNode
hadoop namenode -format
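Note: in Hadoop 2.x this form still works but prints a deprecation warning; the preferred equivalent is:
hdfs namenode -format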
Ⅱ. Start Hadoop
start-dfs.sh
Watch for the prompt: Are you sure you want to continue connecting (yes/no)?
Type yes and press Enter. Then start YARN:
start-yarn.sh
Ⅲ. Check that the daemons started successfully
jps
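For a healthy pseudo-distributed setup, jps should list the HDFS and YARN daemons, roughly as follows (the PIDs are illustrative):
12305 NameNode
12437 DataNode
12621 SecondaryNameNode
12789 ResourceManager
12901 NodeManager
13012 Jps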
Open a browser to verify that the Hadoop pseudo-distributed environment is up:
http://localhost:8088 (YARN ResourceManager web UI)
http://localhost:50070 (HDFS NameNode web UI)
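If no browser is available on the machine, a rough command-line check that both web UIs respond (assumes curl is installed):
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:50070    # expect 200
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8088     # expect 200 or a 3xx redirect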
6. Troubleshooting: NameNode Is Missing
Cause:
By default the NameNode keeps its files under /tmp, but /tmp is wiped on shutdown, so when the master is started again the files no longer match and the NameNode fails to start.
Solution:
Set the temporary-file location in core-site.xml to a path outside /tmp (any path works as long as it is not under /tmp), then reformat the NameNode. Note that if hadoop.tmp.dir was already set to file:/opt/hadoop/tmp as in section 3, this failure should not occur; the fix below applies when the default was left in place.
1. Stop the Hadoop cluster:
stop-all.sh
2. Edit the Hadoop configuration file core-site.xml
vim /opt/hadoop/etc/hadoop/core-site.xml
Find the property:
<name>hadoop.tmp.dir</name>
<value>file:/opt/hadoop/tmp</value>
and change the content of its value element to a path outside /tmp, for example:
<value>/usr/grid/hadoop1.7.0_17/hadoop_${user.name}</value>
3. Reformat the NameNode:
hadoop namenode -format
4. Start the Hadoop cluster
start-all.sh
5. Check the running processes
jps