Component versions
JDK 8.x (preferably nothing from 9 onward)
hadoop-3.1.2
zookeeper-3.4.9
Two hosts: master and slave
Add a user
sudo useradd -m hadoop -s /bin/bash  # create the user
sudo passwd hadoop  # set its password
sudo adduser hadoop sudo  # grant administrator (sudo) rights
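Set the hostnames
hadoop@master:$ sudo vim /etc/hostname
master
hadoop@slave:$ sudo vim /etc/hostname
slave
Configure /etc/hosts on every host
sudo vim /etc/hosts
127.0.0.1 localhost
192.168.48.137 master
192.168.48.143 slave
Note: keep only this one line for localhost; remove any other loopback entry (for example, a 127.0.1.1 line for the hostname)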
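A quick way to confirm the hosts file is right is to check that each cluster hostname resolves to a non-loopback address. The following is a minimal sketch; check_hosts is a hypothetical helper name, not part of any tool.

```shell
# Check that a hosts file maps each hostname to a non-loopback address.
# Usage: check_hosts FILE HOSTNAME...
# check_hosts is a hypothetical helper written for this walkthrough.
check_hosts() {
    local file host ip status
    file=$1; shift
    status=0
    for host in "$@"; do
        # First address whose line lists the hostname as a whole field;
        # lines starting with # are skipped.
        ip=$(awk -v h="$host" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) { print $1; exit } }' "$file")
        case $ip in
            "")    echo "missing: $host"; status=1 ;;
            127.*) echo "loopback: $host -> $ip"; status=1 ;;
            *)     echo "ok: $host -> $ip" ;;
        esac
    done
    return $status
}
```

Run it on each host as: check_hosts /etc/hosts master slave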
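Passwordless SSH login
Run the following steps on both hosts
Install ssh
sudo apt-get install openssh-server
Generate a public/private key pair
ssh-keygen -t rsa
Copy each host's public key to master
ssh-copy-id master
From master, copy the collected authorized_keys file to slave
scp ~/.ssh/authorized_keys hadoop@slave:~/.ssh/
Check from each host that passwordless login works
ssh master
ssh slave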
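The manual check can also be scripted. This is a sketch under a few assumptions: check_ssh is a hypothetical helper, and SSH_BIN is a made-up override variable that exists only so the function can be dry-run without a real connection. BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
# Verify passwordless SSH to each host given as an argument.
# BatchMode=yes makes ssh fail instead of prompting for a password,
# so a missing key shows up as an error rather than a hanging prompt.
# SSH_BIN is a hypothetical override for dry runs; it defaults to ssh.
check_ssh() {
    local host status
    status=0
    for host in "$@"; do
        if "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            status=1
        fi
    done
    return $status
}
```

Run it on both hosts as: check_ssh master slave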
Install the JDK, ZooKeeper, and Hadoop
No details here: unpack each package and add its path to your environment variables
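For reference, the environment variables might look like the sketch below. The three install paths are assumptions chosen to match the paths used elsewhere in these notes; change them to wherever you unpacked each package.

```shell
# Append to ~/.bashrc on both hosts, then reload it with: source ~/.bashrc
# All three install paths are assumptions; adjust them to your layout.
export JAVA_HOME=/usr/local/jvm
export ZK_HOME=/usr/local/zookeeper
export HADOOP_HOME=/usr/local/hadoop
# bin holds hdfs, yarn and zkServer.sh; sbin holds the start-*.sh scripts.
export PATH="$PATH:$JAVA_HOME/bin:$ZK_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```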
Configure ZooKeeper
cd $ZK_HOME/conf  # enter the ZooKeeper configuration directory
cp zoo_sample.cfg zoo.cfg
mkdir -p /usr/local/zookeeper/zkdatas/
vim zoo.cfg
dataDir=/usr/local/zookeeper/zkdatas
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
server.1=master:2888:3888
server.2=slave:2888:3888
hadoop@master:$ echo 1 > /usr/local/zookeeper/zkdatas/myid
hadoop@slave:$ echo 2 > /usr/local/zookeeper/zkdatas/myid
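The same edits can be scripted so both hosts end up with identical files apart from myid. This is a sketch, not the only way to do it: write_zk_config is a hypothetical helper, and the tickTime, initLimit, syncLimit and clientPort values are the ones shipped in zoo_sample.cfg.

```shell
# Write a minimal zoo.cfg plus the matching myid file.
# Usage: write_zk_config CONF_DIR DATA_DIR MYID
# write_zk_config is a hypothetical helper; tickTime, initLimit, syncLimit
# and clientPort repeat the values found in zoo_sample.cfg.
write_zk_config() {
    local conf_dir data_dir myid
    conf_dir=$1; data_dir=$2; myid=$3
    mkdir -p "$conf_dir" "$data_dir"
    cat > "$conf_dir/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=$data_dir
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
server.1=master:2888:3888
server.2=slave:2888:3888
EOF
    # The number in myid must match this host's server.N line.
    echo "$myid" > "$data_dir/myid"
}
```

On master this would be called as write_zk_config "$ZK_HOME/conf" /usr/local/zookeeper/zkdatas 1, and on slave with 2 as the last argument.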
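Start ZooKeeper
Run on both master and slave
$ZK_HOME/bin/zkServer.sh start
Check the status
$ZK_HOME/bin/zkServer.sh status
If the output shows follower or leader, the server started correctly
status only reports normally once at least two nodes are up;
until then it reports an error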
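To script the status check, the role can be pulled out of the "Mode:" line that zkServer.sh status prints. A minimal sketch, with zk_mode as a hypothetical helper name:

```shell
# Print the role (leader, follower, ...) from `zkServer.sh status` output
# read on stdin; return non-zero when no "Mode:" line is present.
# zk_mode is a hypothetical helper written for this walkthrough.
zk_mode() {
    awk '/^Mode:/ { print $2; found = 1 } END { exit !found }'
}
```

Usage: "$ZK_HOME/bin/zkServer.sh" status 2>&1 | zk_mode
A non-zero exit status suggests the quorum has not formed yet, so start the other node and retry.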
Configure Hadoop 3.x in distributed mode
Edit hadoop-env.sh
export JAVA_HOME=/usr/local/jvm
Edit core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop/tmp</value>
    </property>
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>hadoop</value>
    </property>
</configuration>
Edit hdfs-site.xml
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
</configuration>
Edit mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
    </property>
    <property>
        <name>mapreduce.map.env</name>
        <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
    </property>
    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>
Edit workers
master
slave
Edit yarn-site.xml
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
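Since both hosts presumably need the same values in these files, it can help to read a property back and compare it across hosts. The sketch below is a line-oriented reader, not a real XML parser: get_prop is a hypothetical helper, and it assumes each <name> and <value> element sits on its own line, as in the files above.

```shell
# Print the <value> that follows <name>NAME</name> in a Hadoop *-site.xml file.
# Usage: get_prop FILE NAME
# get_prop is a hypothetical helper; it assumes one element per line.
get_prop() {
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { want = 1; next }
        want && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}
```

For example, get_prop "$HADOOP_HOME/etc/hadoop/core-site.xml" fs.defaultFS should print the same value on master and slave.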
Create the directories
mkdir -p /usr/local/hadoop/datas/tmp
mkdir -p /usr/local/hadoop/datas/dfs/nn/snn/edits
mkdir -p /usr/local/hadoop/datas/namenode/namenodedatas
mkdir -p /usr/local/hadoop/datas/datanode/datanodeDatas
mkdir -p /usr/local/hadoop/datas/dfs/nn/edits
mkdir -p /usr/local/hadoop/datas/dfs/snn/name
mkdir -p /usr/local/hadoop/datas/jobhistory/intermediateDoneDatas
mkdir -p /usr/local/hadoop/datas/jobhistory/DoneDatas
mkdir -p /usr/local/hadoop/datas/nodemanager/nodemanagerDatas
mkdir -p /usr/local/hadoop/datas/nodemanager/nodemanagerLogs
mkdir -p /usr/local/hadoop/datas/remoteAppLog/remoteAppLogs
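The repeated mkdir calls can be folded into a loop over subdirectory names, which makes it easier to run the same list on both hosts. A small sketch; make_dirs is a hypothetical helper name.

```shell
# Create several subdirectories under one base directory.
# Usage: make_dirs BASE SUBDIR...
# make_dirs is a hypothetical helper written for this walkthrough.
make_dirs() {
    local base sub
    base=$1; shift
    for sub in "$@"; do
        mkdir -p "$base/$sub"
    done
}
```

For example: make_dirs /usr/local/hadoop/datas tmp dfs/nn/edits dfs/snn/name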
Format and start
bin/hdfs namenode -format
If the output contains "has been successfully formatted", the format succeeded
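The format output is long, so it is easy to miss the success line. One option is to save the output to a file and search it. A minimal sketch, with format_succeeded as a hypothetical helper and format.log as an arbitrary file name:

```shell
# Succeed only if the given log file contains the NameNode success message.
# Usage: format_succeeded LOGFILE
# format_succeeded is a hypothetical helper written for this walkthrough.
format_succeeded() {
    grep -q "has been successfully formatted" "$1"
}
```

Usage: bin/hdfs namenode -format 2>&1 | tee format.log, then format_succeeded format.log && echo "format ok"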
Start HDFS
sbin/start-dfs.sh
Check with jps:
master should list
Jps
NameNode
SecondaryNameNode
QuorumPeerMain
DataNode
slave should list
Jps
QuorumPeerMain
DataNode
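Comparing jps output against the expected daemon list by eye gets tedious, so the comparison can be scripted. A sketch, with missing_daemons as a hypothetical helper; it relies on jps printing one "<pid> <main class>" pair per line.

```shell
# Report which expected daemons are absent from `jps` output read on stdin.
# Usage: jps | missing_daemons NAME...
# missing_daemons is a hypothetical helper written for this walkthrough.
missing_daemons() {
    local out name status
    out=$(cat)
    status=0
    for name in "$@"; do
        # Match the daemon name exactly against the second field.
        if ! printf '%s\n' "$out" | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "missing: $name"
            status=1
        fi
    done
    return $status
}
```

On master: jps | missing_daemons NameNode SecondaryNameNode QuorumPeerMain DataNode
On slave: jps | missing_daemons QuorumPeerMain DataNode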
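Start YARN
sbin/start-yarn.sh
Check jps again: master should now also list ResourceManager, and both hosts NodeManager
The cluster is now configured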
Verify the configuration
hdfs dfsadmin -report
Check the DataNode information in the report
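For a two-host cluster where both hosts run a DataNode, the number to look for in the report is 2. The sketch below extracts it, assuming the report contains a line of the form "Live datanodes (N):" as Hadoop 3.x releases print; live_datanodes is a hypothetical helper name.

```shell
# Print the live DataNode count from `hdfs dfsadmin -report` output on stdin.
# Assumes a line of the form "Live datanodes (N):" in the report.
# live_datanodes is a hypothetical helper written for this walkthrough.
live_datanodes() {
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}
```

Usage: hdfs dfsadmin -report | live_datanodes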
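Some pitfalls
1. When following someone else's configuration, check paths, hostnames, and usernames carefully
2. Do not use Java 9 or later; running YARN reports errors
3. The firewall must be turned off
4. In a cloud environment, open the required ports in the security group, or the cluster will not start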