Hadoop Pseudo-Distributed Environment Setup
Set up passwordless SSH to this machine
- Generate a public/private key pair
ssh-keygen
- Copy the public key to this machine
ssh-copy-id hadoop102   (this machine's hostname or IP; hostname configuration is covered below)
- Test
ssh hadoop102
[cgd@hadoop102 hadoop]$ ssh hadoop102
Last login: Fri Sep 25 13:00:35 xxx from xx.xx.xx.xx
[cgd@hadoop102 ~]$
If you are not prompted for a password, the configuration succeeded;
if a password is still required, it failed.
JDK installation and environment variables
- Remove the system JDK
List installed packages:  rpm -qa | grep jdk
or:  yum list installed | grep jdk
Remove:  rpm -e jdk1.8-1.8.0_221-fcs.x86_64
or:  yum -y remove .......
- Install
Install via yum
List the available JDK packages:
yum search java | grep -i --color jdk
Install the chosen version:
yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel
# or install all JDK 1.8.0 packages with:
yum install -y java-1.8.0-openjdk*
yum installs the JDK under /usr/lib/jvm/ by default
Tarball install
tar -zxvf jdk-1.8.0_231.gz -C /opt/module
- Set environment variables
vim /etc/profile   (for a non-root user, edit ~/.bash_profile instead)
export JAVA_HOME=/opt/module/jdk
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
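A quick sanity check on what these lines do to PATH (a sketch using the example paths from this guide; run it in a throwaway shell so it does not affect your session):

```shell
# Sketch using the example JAVA_HOME from this guide; the exports
# prepend the JDK bin directories so they win PATH lookup
export JAVA_HOME=/opt/module/jdk
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
# the first two PATH entries are now the JDK's:
echo "$PATH" | tr ':' '\n' | head -n 2
# → /opt/module/jdk/bin
# → /opt/module/jdk/jre/bin
```

Note that /etc/profile (or ~/.bash_profile) is only read at login; run source /etc/profile for the change to take effect in the current shell.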
- Verify (run source /etc/profile first, or log in again, so the changes take effect)
[cgd@hadoop102 lib]$ java -version
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
Configure the hostname
vim /etc/sysconfig/network
Modify or add the following two lines:
NETWORKING=yes   # enable networking
HOSTNAME=hadoop102   # hostname
vim /etc/hosts
ip hadoop102   # this machine's IP address followed by its hostname
To change the hostname temporarily (effective immediately, lost on reboot):
hostname hadoop102
hostname   # prints the current hostname
Reboot to apply the /etc/sysconfig/network change:  reboot  (or shutdown -r now)
Check the hostname:
[cgd@hadoop102 lib]$ hostname
hadoop102
Download and extract Hadoop
This guide uses hadoop-2.7.1, available from the Apache Hadoop site.
tar -zxf /opt/software/hadoop-2.7.1_64bit.tar.gz -C /opt/module/
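The -C flag in the command above tells tar to extract into the given directory instead of the current one; a self-contained demonstration with a scratch archive (nothing here touches the Hadoop tarball itself):

```shell
# Build a tiny archive in one temp dir and extract it into another,
# mirroring how the Hadoop tarball is unpacked into /opt/module
src=$(mktemp -d); dst=$(mktemp -d); arc=$(mktemp)
echo hadoop > "$src/demo.txt"
tar -czf "$arc" -C "$src" demo.txt
tar -zxf "$arc" -C "$dst"
cat "$dst/demo.txt"   # → hadoop
```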
Configure Hadoop environment variables
vim ~/.bash_profile   (non-root user)
Append the following at the end:
export HADOOP_HOME=/opt/module/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Configure HDFS
Change to /opt/module/hadoop/etc/hadoop (all configuration files below are in this directory)
vim hadoop-env.sh
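The change usually needed in hadoop-env.sh is to hard-code JAVA_HOME, because Hadoop's daemon scripts do not always inherit it from the login shell. The path below is the example install location from this guide; adjust it to your own:

```shell
# In hadoop-env.sh: point Hadoop at the JDK installed earlier
export JAVA_HOME=/opt/module/jdk
```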
vim core-site.xml
<configuration>
    <!-- Address of the HDFS NameNode -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop102:9000</value>
    </property>
    <!-- Directory for files Hadoop generates at runtime -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/module/hadoop/tmp</value>
    </property>
</configuration>
vim hdfs-site.xml
<configuration>
    <!-- Number of HDFS replicas -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
Format the NameNode
hadoop namenode -format
Start HDFS
Start the NameNode:
hadoop-daemon.sh start namenode
Start the DataNode:
hadoop-daemon.sh start datanode
Or start both at once:
start-dfs.sh
Verify with jps:
13586 NameNode
13668 DataNode
13786 Jps
Open in a browser:
http://hadoop102:50070
Configure YARN
vim yarn-site.xml
<configuration>
    <!-- How reducers obtain map output (the shuffle service) -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Address of the YARN ResourceManager -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop102</value>
    </property>
</configuration>
cp mapred-site.xml.template mapred-site.xml
vim mapred-site.xml
<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
Start YARN
(a) Make sure the NameNode and DataNode are already running
(b) Start the ResourceManager:
yarn-daemon.sh start resourcemanager
(c) Start the NodeManager:
yarn-daemon.sh start nodemanager
or simply:  start-yarn.sh
Verify with jps again; ResourceManager and NodeManager should now be listed. The YARN web UI is at:
http://hadoop102:8088
Run a MapReduce job
Upload data to HDFS
[cgd@hadoop102 data]$ cat input
If you wish to succeed, you should use persistence as your good friend,
experience as your reference, prudence as your brother and hope as your sentry.
[cgd@hadoop102 data]$
Run the job:
[cgd@hadoop102 data]$ hadoop fs -put input /data/input
[cgd@hadoop102 data]$ hadoop jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /data/input /data/output
View the results:
[cgd@hadoop102 data]$ hadoop fs -cat /data/output/*
If 1
and 1
as 4
brother 1
experience 1
friend, 1
good 1
hope 1
persistence 1
prudence 1
reference, 1
sentry. 1
should 1
succeed, 1
to 1
use 1
wish 1
you 2
your 4
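The counts above can be sanity-checked locally with plain coreutils, since wordcount is essentially split, sort, and count (a sketch on a short sample line, not a substitute for the MapReduce job):

```shell
# Split on whitespace, sort so duplicates are adjacent, then count:
# the same shape of result wordcount produces on the HDFS input
printf 'you wish you succeed\n' | tr -s ' ' '\n' | sort | uniq -c
# →  1 succeed
# →  1 wish
# →  2 you
```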
The running job can also be monitored in the YARN web UI at http://hadoop102:8088.