First attempt at Hadoop, solved!

Preface: Hadoop is important and has strong practical applications, so I'm getting hands-on to learn it.

Goal: gain basic Hadoop knowledge, including setting up the required components and successfully installing a Hadoop cluster.

Book: Hadoop: The Definitive Guide (Chinese edition: Storage and Analysis of Big Data, Tsinghua University Press)

blog:https://www.cnblogs.com/biehongli/p/7640469.html

http://hadoop.apache.org/docs/r3.0.0/index.html

 

Step 1: I have three Linux CentOS 7 servers available. First, make the three servers able to reach each other and allow SSH login as root.

# Use ssh-keygen -t rsa to generate a key pair on each server, and copy the public key to the other servers.

# Use cat <public key file> >> authorized_keys to append each public key to the authorized_keys file.

# Attempting to log in with ssh <ip> produced an error, as shown in the screenshot

Analysis: this is a permission-denied error, which means the network itself is fine and the connection goes through; the fix should be to sort out the relevant authentication configuration. When logging in over SSH, the client presents its private key, which has to match the public key on the target. The problem was that the client had not been configured with this server's private key, so it was like trying to open a door without bringing the key; naturally the door would not open.

Fix:

-- Configure the client-side private key on CentOS 7: vi /etc/ssh/ssh_config (a sketch of the relevant lines follows below)

-- Restart the SSH service

systemctl restart sshd.service
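
For reference, the client-side option that points ssh at a private key is IdentityFile; a sketch of the relevant lines (default root key path assumed). Wrong permissions on ~/.ssh are another common cause of this permission-denied error, so those are shown too.

# in /etc/ssh/ssh_config (or ~/.ssh/config)
Host *
    IdentityFile ~/.ssh/id_rsa

# on the target server: permissions sshd insists on
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys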

# On each server, edit /etc/sysconfig/network

# On each server, edit /etc/hosts (see the example after these steps)

# Reboot all three servers

# On each server, test with ping master, ping slaver1 and ping slaver2

# Log in with ssh <hostname>
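
For reference, a minimal /etc/hosts sketch for the three nodes (the IP addresses are placeholders; use the servers' real internal addresses), followed by the login test:

192.168.1.10   master
192.168.1.11   slaver1
192.168.1.12   slaver2

# from master, these should now log in without a password
ssh slaver1
ssh slaver2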

 

Step 2: Install the JDK on each server

See step 3 of https://blog.csdn.net/weixin_39139129/article/details/80434728
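
In short (a sketch, assuming the JDK is unpacked to /opt/java/jdk1.8.0_171, the path used in the steps below): add the following to /etc/profile on each server, then verify.

export JAVA_HOME=/opt/java/jdk1.8.0_171
export PATH=$PATH:$JAVA_HOME/bin

source /etc/profile
java -version   # should report java version "1.8.0_171"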

 

Step 3: Create a hadoop user on each server

adduser hadoop
passwd hadoop

Add the hadoop user to the hadoop group

sudo usermod -a -G hadoop hadoop

Grant the hadoop user root privileges so it can use sudo

vi /etc/sudoers

Below the line root ALL=(ALL) ALL, add

hadoop ALL=(ALL) ALL
:wq!

Make the hadoop user the owner of the hadoop directory

chown -R hadoop:hadoop /opt/hadoop
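
A quick check that the account is set up as intended (a sketch; output will vary):

su - hadoop
sudo whoami          # should print: root
ls -ld /opt/hadoop   # owner should be hadoop:hadoop
exit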

 

Step 4: Download and extract the Hadoop archive, and edit the configuration files

# Download and extract the archive

# Change to the target directory
cd /opt/hadoop
# Download the archive
wget http://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.7.6/hadoop-2.7.6.tar.gz
# Extract
tar -zxvf hadoop-2.7.6.tar.gz

# Configure the environment variables

vi /etc/profile

# Add the following
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.6
export PATH=$PATH:$HADOOP_HOME/sbin
export PATH=$PATH:$HADOOP_HOME/bin

# Without these, starting Hadoop or HBase prints warnings about the native library failing to load
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

source /etc/profile
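
A quick check that the new PATH works:

hadoop version   # should report Hadoop 2.7.6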

# Configure hadoop-env.sh and yarn-env.sh

vi /opt/hadoop/hadoop-2.7.6/etc/hadoop/hadoop-env.sh

# Add
export JAVA_HOME=/opt/java/jdk1.8.0_171

vi /opt/hadoop/hadoop-2.7.6/etc/hadoop/yarn-env.sh

# Add
export JAVA_HOME=/opt/java/jdk1.8.0_171

# Configure core-site.xml

vi /opt/hadoop/hadoop-2.7.6/etc/hadoop/core-site.xml

# Add
<configuration>
    <property>
        <name>fs.defaultFS</name> <!-- URI of the NameNode -->
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name> <!-- directory for Hadoop temporary files -->
        <value>/opt/hadoop/hadoop-2.7.6/temp</value>
    </property>
</configuration>

# Configure hdfs-site.xml

vi /opt/hadoop/hadoop-2.7.6/etc/hadoop/hdfs-site.xml

# Add
<configuration>
    <property>
        <!-- local filesystem path where the NameNode persists the namespace and transaction logs -->
        <name>dfs.namenode.name.dir</name>
        <value>/opt/hadoop/hadoop-2.7.6/dfs/name</value>
        <!-- no need to create this directory in advance; it is created automatically -->
    </property>
    <property>
        <!-- local filesystem path where the DataNode stores block data -->
        <name>dfs.datanode.data.dir</name>
        <value>/opt/hadoop/hadoop-2.7.6/dfs/data</value>
    </property>
    <property>
        <!-- replication factor; must not exceed the number of machines in the cluster; default is 3 -->
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>
    <property>
        <!-- set to true so HDFS can be browsed at ip:port in a web browser -->
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>

# Configure mapred-site.xml

cp /opt/hadoop/hadoop-2.7.6/etc/hadoop/mapred-site.xml.template /opt/hadoop/hadoop-2.7.6/etc/hadoop/mapred-site.xml
vi /opt/hadoop/hadoop-2.7.6/etc/hadoop/mapred-site.xml

<configuration>
    <property>
        <!-- MapReduce runs on the YARN framework, so set this to yarn -->
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <!-- JobHistory server, for viewing MapReduce job records -->
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>

# Configure yarn-site.xml

vi /opt/hadoop/hadoop-2.7.6/etc/hadoop/yarn-site.xml

<configuration>
    <property>
        <!-- auxiliary service run on each NodeManager, required for MapReduce -->
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <!-- address the ResourceManager exposes to clients -->
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <!-- address the ResourceManager exposes to ApplicationMasters -->
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <!-- address the ResourceManager exposes to NodeManagers -->
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <!-- address the ResourceManager exposes to administrators -->
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <!-- ResourceManager web UI address, viewable in a browser -->
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
</configuration>

# Configure the slaves file

vi /opt/hadoop/hadoop-2.7.6/etc/hadoop/slaves

# Comment out
localhost

# Add
slaver1
slaver2

 

# Configuration done. Use scp to copy the configured Hadoop from master to slaver1 and slaver2 (a quick check follows after the commands)

scp -r /opt/hadoop/hadoop-2.7.6 slaver1:/opt/hadoop/hadoop-2.7.6
scp -r /opt/hadoop/hadoop-2.7.6 slaver2:/opt/hadoop/hadoop-2.7.6
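
A quick sanity check that the copy landed, run from master (note that scp only copies the Hadoop tree; the JDK and the /etc/profile entries still need to be in place on the slavers):

ssh slaver1 "ls /opt/hadoop/hadoop-2.7.6 && java -version"
ssh slaver2 "ls /opt/hadoop/hadoop-2.7.6 && java -version"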

 

Step 5: Start Hadoop

# In the Hadoop directory, format HDFS

bin/hdfs namenode -format

# Start

sbin/start-all.sh

# Check jps on master

# Check jps on the slavers (expected output sketched below)
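
For reference, with the three-node layout above, jps should show roughly the following (process IDs are placeholders):

# on master
12001 NameNode
12002 SecondaryNameNode
12003 ResourceManager
12004 Jps

# on slaver1 / slaver2
13001 DataNode
13002 NodeManager
13003 Jps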

# Stop

sbin/stop-all.sh

 

Question: the jps output on the master server was not right.

Check the hadoop-root-namenode-x.log log:

2018-07-27 11:25:21,493 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Need to save fs image? false (staleImage=false, haEnabled=false, isRollingUpgrade=false)
2018-07-27 11:25:21,494 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 1
2018-07-27 11:25:21,609 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 0 entries 0 lookups
2018-07-27 11:25:21,609 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 317 msecs
2018-07-27 11:25:21,746 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: RPC server is binding to master:9000
2018-07-27 11:25:21,752 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 1000
2018-07-27 11:25:21,762 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for active state
2018-07-27 11:25:21,763 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 1
2018-07-27 11:25:21,764 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 1 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 11 
2018-07-27 11:25:21,765 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /opt/hadoop/hadoop-2.7.6/dfs/name/current/edits_inprogress_0000000000000000001 -> /opt/hadoop/hadoop-2.7.6/dfs/name/current/edits_0000000000000000001-0000000000000000002
2018-07-27 11:25:21,783 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for active state
2018-07-27 11:25:21,783 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for standby state
2018-07-27 11:25:21,788 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:50070
2018-07-27 11:25:21,790 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2018-07-27 11:25:21,791 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2018-07-27 11:25:21,791 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2018-07-27 11:25:21,796 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.net.BindException: Problem binding to [master:9000] java.net.BindException: Cannot assign requested address; For more details see:  http://wiki.apache.org/hadoop/BindException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:721)
	at org.apache.hadoop.ipc.Server.bind(Server.java:484)
	at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:690)
	at org.apache.hadoop.ipc.Server.<init>(Server.java:2379)
	at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:951)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:534)
	at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:509)
	at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:796)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.<init>(NameNodeRpcServer.java:351)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:675)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:648)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:820)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:804)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1516)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1582)
Caused by: java.net.BindException: Cannot assign requested address
	at sun.nio.ch.Net.bind0(Native Method)
	at sun.nio.ch.Net.bind(Net.java:433)
	at sun.nio.ch.Net.bind(Net.java:425)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
	at org.apache.hadoop.ipc.Server.bind(Server.java:467)
	... 13 more
2018-07-27 11:25:21,799 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2018-07-27 11:25:21,802 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at iz2zeeiutul2gfdixjzqcoz/172.17.5.165
************************************************************/

Reading the error and searching around suggested that the address master:9000 might not be bindable, so I reconfigured core-site.xml

vi /opt/hadoop/hadoop-2.7.6/etc/hadoop/core-site.xml

<value>hdfs://master:9000</value>
-->
<value>hdfs://127.0.0.1:9000</value>

Even after this change, the ResourceManager, NameNode and SecondaryNameNode processes on master still failed to start.

My guess was that the three servers are not on a shared internal network and cannot reach each other over a LAN; the next step was to try a pseudo-distributed setup on a virtual machine.
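
For reference, a couple of checks that help pin down this kind of BindException (standard CentOS 7 commands; master is the hostname used above):

getent hosts master    # what does "master" actually resolve to on this machine?
ip addr | grep inet    # which addresses are configured on local interfaces?
ss -lntp | grep 9000   # is anything already listening on port 9000?

A BindException with "Cannot assign requested address" usually means the name resolves to an IP that is not configured on any local interface (for example a cloud server's public IP), which is consistent with the hostname/hosts fix that finally worked below.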

 

After the second attempt on a virtual machine, I was about to settle for running a standalone Hadoop, but then I came across an article suggesting that Hadoop might be failing because the machine's hostname had not been set. After setting the hostname and redeploying by the steps above, it worked.

# Set the hostname on CentOS 7

hostnamectl set-hostname master

# Edit the hosts file

vim /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 master
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
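
A quick check that the hostname change took effect (output will vary):

hostname              # should print: master
getent hosts master   # should resolve through the hosts entry above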

 

/*

Step 4: Extract the Hadoop archive and edit the configuration files (failed! abandoned)

# Extract the downloaded archive into /opt/hadoop

tar -zxvf hadoop-3.1.0.tar.gz

# Configure environment variables

vi /etc/profile

#set hadoop
export HADOOP_HOME=/opt/hadoop/hadoop-3.1.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:

source /etc/profile

 

# Configuration files

cd /opt/hadoop/hadoop-3.1.0/etc/hadoop

# Configure hadoop-env.sh

vi /opt/hadoop/hadoop-3.1.0/etc/hadoop/hadoop-env.sh

export JAVA_HOME=/opt/java/jdk1.8.0_171
export HADOOP_HOME=/opt/hadoop/hadoop-3.1.0

# Configure hdfs-site.xml

vi /opt/hadoop/hadoop-3.1.0/etc/hadoop/hdfs-site.xml

<configuration>
        <property>
                <!-- logical name of the nameservice; can be anything -->
                <name>dfs.nameservices</name>
                <value>hbzx</value>
        </property>
        <property>
                <!-- disable permission checks -->
                <name>dfs.permissions.enabled</name>
                <value>false</value>
        </property>
        <property>
                <!-- NameNode IDs; separate multiple IDs with commas -->
                <name>dfs.ha.namenodes.hbzx</name>
                <value>nn1,nn2</value>
        </property>
        <property>
                <!-- dfs.namenode.rpc-address.[nameservice ID].[name node ID]: host and RPC port of this NameNode -->
                <name>dfs.namenode.rpc-address.hbzx.nn1</name>
                <value>master:9820</value>
        </property>
        <property>
                <!-- dfs.namenode.rpc-address.[nameservice ID].[name node ID]: host and RPC port of this NameNode -->
                <name>dfs.namenode.rpc-address.hbzx.nn2</name>
                <value>slaver1:9820</value>
        </property>
        <property>
                <!-- dfs.namenode.http-address.[nameservice ID].[name node ID]: HTTP port this NameNode listens on -->
                <name>dfs.namenode.http-address.hbzx.nn1</name>
                <value>master:9870</value>
        </property>
        <property>
                <!-- dfs.namenode.http-address.[nameservice ID].[name node ID]: HTTP port this NameNode listens on -->
                <name>dfs.namenode.http-address.hbzx.nn2</name>
                <value>slaver1:9870</value>
        </property>

        <property>
                <!-- shared edits directory for the NameNodes: JournalNode hosts and listening ports -->
                <name>dfs.namenode.shared.edits.dir</name>
                <value>qjournal://master:8485;slaver1:8485;slaver2:8485/hbzx</value>
        </property>

        <property>
                <!-- failover proxy class for NameNode HA clients -->
                <name>dfs.client.failover.proxy.provider.hbzx</name>
                <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>

        <property>
                <!-- fence the old active NameNode via passwordless SSH -->
                <name>dfs.ha.fencing.methods</name>
                <value>sshfence</value>
        </property>

        <property>
                <name>dfs.ha.fencing.ssh.private-key-files</name>
                <value>/root/.ssh/master</value>
        </property>

        <property>
                <!-- where the JournalNode stores its data -->
                <name>dfs.journalnode.edits.dir</name>
                <value>/opt/data/journal/node/local/data</value>
        </property>

        <property>
                <!-- enable automatic NameNode failover -->
                <name>dfs.ha.automatic-failover.enabled</name>
                <value>true</value>
        </property>

</configuration>
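
For reference only (I never got this far): the usual bring-up order for this HA layout is to start the JournalNodes everywhere, format and start the first NameNode, bootstrap the standby, then format the ZKFC state in ZooKeeper. A sketch with Hadoop 3.x commands:

# on master, slaver1 and slaver2
hdfs --daemon start journalnode

# on master (nn1)
hdfs namenode -format
hdfs --daemon start namenode

# on slaver1 (nn2)
hdfs namenode -bootstrapStandby
hdfs --daemon start namenode

# on master, with ZooKeeper already running
hdfs zkfc -formatZK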

# Configure core-site.xml

vi /opt/hadoop/hadoop-3.1.0/etc/hadoop/core-site.xml

<configuration>
    <property>
        <!-- default HA filesystem URI for Hadoop clients -->
        <name>fs.defaultFS</name>
        <value>hdfs://hbzx</value>
    </property>
    <property>
        <!-- base path for Hadoop data; the NameNode and DataNode paths both derive from it.
        Do not prefix it with file:/, just use an absolute path.
        NameNode default path: file://${hadoop.tmp.dir}/dfs/name
        DataNode default path: file://${hadoop.tmp.dir}/dfs/data
        -->
        <name>hadoop.tmp.dir</name>
        <value>/opt/data/hadoop/</value>
    </property>

    <property>
        <!-- nodes where ZooKeeper runs -->
        <name>ha.zookeeper.quorum</name>
        <value>master:2181,slaver1:2181,slaver2:2181</value>
    </property>
</configuration>

# Configure yarn-site.xml (starting from the single-node defaults)

vi /opt/hadoop/hadoop-3.1.0/etc/hadoop/yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>

    <property>
        <!-- enable YARN high availability -->
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <!-- unique identifier for the cluster -->
        <name>yarn.resourcemanager.cluster-id</name>
        <value>hbzx</value>
    </property>
    <property>
        <!--  ResourceManager ID -->
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <!-- node hosting this ResourceManager -->
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>master</value>
    </property>
    <property>
        <!-- node hosting this ResourceManager -->
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>slaver1</value>
    </property>
    <property>
        <!-- HTTP address this ResourceManager listens on -->
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>master:8088</value>
    </property>
    <property>
        <!-- HTTP address this ResourceManager listens on -->
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>slaver1:8088</value>
    </property>
    <property>
        <!-- nodes where ZooKeeper runs -->
        <name>yarn.resourcemanager.zk-address</name>
        <value>master:2181,slaver1:2181,slaver2:2181</value>
    </property>

    <property>
        <!-- auto-detect the node's memory and CPU; minimum memory is 1 GB -->
        <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
        <value>true</value>
    </property>
</configuration>

# Configure mapred-site.xml

vi /opt/hadoop/hadoop-3.1.0/etc/hadoop/mapred-site.xml


<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

 

# Startup failed with this configuration; rolled back and redid Step 4

# Configure core-site.xml

vi /opt/hadoop/hadoop-3.1.0/etc/hadoop/core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>   
     <value>hdfs://master:9000</value>  
  </property>
  <property>
      <name>hadoop.tmp.dir</name>  
      <value>file:///opt/hadoop/hadoop-3.1.0/tmp</value>  
  </property>
</configuration>

# Configure hdfs-site.xml

vi /opt/hadoop/hadoop-3.1.0/etc/hadoop/hdfs-site.xml

<configuration>
 <property>
   <name>dfs.replication</name>
   <value>2</value>
 </property>
 <property>
   <name>dfs.namenode.name.dir</name>
   <value>file:///opt/hadoop/hadoop-3.1.0/hdfs/name</value>
 </property>
 <property>
   <name>dfs.datanode.data.dir</name>
   <value>file:///opt/hadoop/hadoop-3.1.0/hdfs/data</value>
 </property>
 <property>
   <name>dfs.namenode.secondary.http-address</name>
   <value>slaver1:9001</value>
 </property>
</configuration>

# Configure yarn-site.xml

vi /opt/hadoop/hadoop-3.1.0/etc/hadoop/yarn-site.xml

<configuration>
<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8025</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8040</value>
    </property>
</configuration>

# Configure mapred-site.xml

vi /opt/hadoop/hadoop-3.1.0/etc/hadoop/mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

<property>
 <name>mapreduce.application.classpath</name>
 <value>
  /opt/hadoop/hadoop-3.1.0/etc/hadoop,
  /opt/hadoop/hadoop-3.1.0/share/hadoop/common/*,
  /opt/hadoop/hadoop-3.1.0/share/hadoop/common/lib/*,
  /opt/hadoop/hadoop-3.1.0/share/hadoop/hdfs/*,
  /opt/hadoop/hadoop-3.1.0/share/hadoop/hdfs/lib/*,
  /opt/hadoop/hadoop-3.1.0/share/hadoop/mapreduce/*,
  /opt/hadoop/hadoop-3.1.0/share/hadoop/mapreduce/lib/*,
  /opt/hadoop/hadoop-3.1.0/share/hadoop/yarn/*,
  /opt/hadoop/hadoop-3.1.0/share/hadoop/yarn/lib/*
 </value>
</property>
</configuration>

# Configure the workers file

vi /opt/hadoop/hadoop-3.1.0/etc/hadoop/workers

# Fill in the IP addresses of the three machines
39.106.4.66
101.132.236.106
39.106.27.129

# Set JAVA_HOME in hadoop-env.sh and yarn-env.sh

vi /opt/hadoop/hadoop-3.1.0/etc/hadoop/hadoop-env.sh

export JAVA_HOME=/opt/java/jdk1.8.0_171

vi /opt/hadoop/hadoop-3.1.0/etc/hadoop/yarn-env.sh

export JAVA_HOME=/opt/java/jdk1.8.0_171

 

Step 5: Start Hadoop

1. Format the NameNode

Go into the bin directory

./hdfs namenode -format

 

2. Start Hadoop

Go into the sbin directory

./start-dfs.sh

Error:

ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.

Added the missing environment variables to hadoop-env.sh:

export HDFS_DATANODE_SECURE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root

 

Start again.

Error:

ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.

Add the following parameters at the top of both start-dfs.sh and stop-dfs.sh:

#!/usr/bin/env bash
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

start-yarn.sh and stop-yarn.sh also need the following added at the top:

#!/usr/bin/env bash
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
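
For the record, an alternative to patching the scripts is to define the same user variables once in etc/hadoop/hadoop-env.sh (Hadoop 3.x reads them from there), e.g.:

export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root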

ERROR: Cannot set priority of datanode process 10539

Since this new error gave no clear direction, and most deployment write-ups online target Hadoop 2.x, I fell back to Hadoop 2.x. Roll back to Step 4.

./start-yarn.sh

*/

 
