open-falcon Hadoop Setup

Purpose

Deploy Hadoop

Overview

Hadoop is an open-source Apache framework written in Java.
It allows large data sets to be processed in a distributed fashion across clusters of computers using simple programming models.
Applications built on the Hadoop framework run in an environment that provides distributed storage and computation across the cluster.
Hadoop is designed to scale from a single server up to thousands of machines, each providing local computation and storage.

References

Hadoop introduction

Software Download

Official Apache Hadoop download page

Servers and Roles

Role             ns-yun-020022  ns-yun-020023  ns-yun-020024  Config file
zookeeper        Y              Y              Y
nameNode                        Y              Y              hdfs-site.xml
dataNode         Y              Y              Y
resourceManager                 Y              Y              yarn-site.xml
nodeManager      Y              Y              Y
journalNode      Y              Y              Y
zkFC                            Y              Y

Installation

Extract

tar xf /usr/src/falcon-src/hadoop/hadoop-2.10.0.tar.gz -C /apps/svr/
ln -s /apps/svr/hadoop-2.10.0 /apps/svr/hadoop
mkdir -p /apps/svr/hadoop/tmp  /apps/svr/hadoop/hadoopdata/dfs/{name,data,journaldata}

System Environment

HADOOP_HOME=/apps/svr/hadoop/
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$ZOOKEEPER_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export  JAVA_HOME JRE_HOME CLASS_PATH  PATH USER LOGNAME MAIL HOSTNAME HISTSIZE HISTCONTROL ZOOKEEPER_HOME HADOOP_HOME
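After wiring the variables into the shell profile, it is worth confirming that PATH actually picks up the Hadoop bin and sbin directories. The helper below is a minimal sketch (the function name `check_hadoop_env` is hypothetical, and paths match this deployment's layout):

```shell
# Sketch: verify HADOOP_HOME is set and PATH contains its bin/sbin dirs.
# check_hadoop_env <hadoop_home> <path_value>
check_hadoop_env() {
    local home="$1" path="$2"
    [ -n "$home" ] || { echo "HADOOP_HOME is empty"; return 1; }
    case ":$path:" in
        *":$home/bin:"*) ;;
        *) echo "PATH is missing $home/bin"; return 1 ;;
    esac
    case ":$path:" in
        *":$home/sbin:"*) ;;
        *) echo "PATH is missing $home/sbin"; return 1 ;;
    esac
    echo "ok"
}
```

In practice you would call it as `check_hadoop_env "$HADOOP_HOME" "$PATH"` after sourcing the profile.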

It is recommended to add IP-to-hostname entries to /etc/hosts on every machine.
Even when a DNS server is available, adding the IP and fully qualified hostname entries is still recommended. Example:

1.1.1.1    ns-yun-020022.133.com
1.1.1.2    ns-yun-020023.133.com
1.1.1.3    ns-yun-020024.133.com

Otherwise, when starting Hadoop you may see entries like `ip=1.1.1.1 hostname=1.1.1.1` and a datanode error: `Datanode denied communication with namenode because hostname cannot be resolved`.
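Before starting the cluster, a quick check that every node's hosts file covers all cluster hostnames can save a round of debugging that error. This is a sketch only; the helper name `check_hosts_file` is hypothetical, and the hostnames are this deployment's:

```shell
# Sketch: verify every given hostname appears as a field in a hosts file.
# check_hosts_file <hosts_file> <hostname>...
check_hosts_file() {
    local file="$1"; shift
    local host missing=0
    for host in "$@"; do
        # match the hostname as an exact whitespace-separated field (column 2+)
        awk -v h="$host" '{for (i=2; i<=NF; i++) if ($i == h) found=1} END {exit !found}' "$file" || {
            echo "missing: $host"; missing=1
        }
    done
    return $missing
}
```

Run it on each node as `check_hosts_file /etc/hosts ns-yun-020022.133.com ns-yun-020023.133.com ns-yun-020024.133.com`.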

Hadoop Configuration

Java environment: edit /apps/svr/hadoop-2.10.0/etc/hadoop/hadoop-env.sh

export JAVA_HOME=/apps/svr/java

Edit the main cluster configuration /apps/svr/hadoop/etc/hadoop/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://ha01/</value>
        </property>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/apps/svr/hadoop/tmp</value>
        </property>
        <property>
                <name>ha.zookeeper.quorum</name>
                <value>ns-yun-020022.133.com:2181,ns-yun-020023.133.com:2181,ns-yun-020024.133.com:2181</value>
        </property>
        <property>
                <name>ha.zookeeper.session-timeout.ms</name>
                <value>1000</value>
                <description>ms</description>
        </property>
</configuration>
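A malformed config file (a stray `>` or a mistyped closing tag) fails only at daemon startup, so it can help to validate the XML before distributing it. A minimal sketch, assuming python3 is available (the helper name `validate_hadoop_xml` is hypothetical; `xmllint` would work too if installed):

```shell
# Sketch: check a Hadoop XML config file is well-formed and has the
# expected <configuration>/<property>/<name>/<value> shape.
validate_hadoop_xml() {
    python3 - "$1" <<'EOF'
import sys
import xml.etree.ElementTree as ET

tree = ET.parse(sys.argv[1])          # raises on malformed XML
root = tree.getroot()
# Hadoop expects a <configuration> root containing <property> elements
assert root.tag == "configuration", "root element must be <configuration>"
for prop in root.findall("property"):
    assert prop.find("name") is not None, "property missing <name>"
    assert prop.find("value") is not None, "property missing <value>"
print("ok:", sys.argv[1])
EOF
}
```

For example, `validate_hadoop_xml /apps/svr/hadoop/etc/hadoop/core-site.xml`.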

Configure /apps/svr/hadoop/etc/hadoop/hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>dfs.replication</name>
                <value>2</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>/apps/svr/hadoop/hadoopdata/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>/apps/svr/hadoop/hadoopdata/dfs/data</value>
        </property>
        <property>
                <name>dfs.webhdfs.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>dfs.nameservices</name>
                <value>ha01</value>
        </property>
        <property>
                <name>dfs.ha.namenodes.ha01</name>
                <value>nn1,nn2</value>
        </property>
        <property>
                <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
                <value>false</value>
        </property>
        <property>
                <name>dfs.namenode.rpc-address.ha01.nn1</name>
                <value>ns-yun-020024.133.com:9000</value>
        </property>
        <property>
                <name>dfs.namenode.http-address.ha01.nn1</name>
                <value>ns-yun-020024.133.com:50070</value>
        </property>
        <property>
                <name>dfs.namenode.rpc-address.ha01.nn2</name>
                <value>ns-yun-020023.133.com:9000</value>
        </property>
        <property>
                <name>dfs.namenode.http-address.ha01.nn2</name>
                <value>ns-yun-020023.133.com:50070</value>
        </property>
        <property>
                <name>dfs.namenode.shared.edits.dir</name>
                <value>qjournal://ns-yun-020022.133.com:8485;ns-yun-020023.133.com:8485;ns-yun-020024.133.com:8485/ha01</value>
        </property>
        <property>
                <name>dfs.journalnode.edits.dir</name>
                <value>/apps/svr/hadoop/hadoopdata/dfs/journaldata</value>
        </property>
        <property>
                <name>dfs.ha.automatic-failover.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>dfs.client.failover.proxy.provider.ha01</name>
                <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
        <property>
                <name>dfs.ha.fencing.methods</name>
                <value>
                        sshfence
                        shell(/bin/true)
                </value>
        </property>
        <property>
                <name>dfs.ha.fencing.ssh.private-key-files</name>
                <value>/home/apps/.ssh/id_rsa</value>
        </property>
        <property>
                <name>dfs.ha.fencing.ssh.connect-timeout</name>
                <value>30000</value>
        </property>
        <property>
                <name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
                <value>60000</value>
        </property>
</configuration>

/apps/svr/hadoop/etc/hadoop/mapred-site.xml
This needs to be configured on the servers that run jobs.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>ns-yun-020022.133.com:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>ns-yun-020022.133.com:19888</value>
        </property>
</configuration>

Configure the YARN service: /apps/svr/hadoop/etc/hadoop/yarn-site.xml

<?xml version="1.0"?>

<configuration>
        <property>
                <name>yarn.resourcemanager.ha.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>yarn.resourcemanager.cluster-id</name>
                <value>yrc</value>
        </property>
        <property>
                <name>yarn.resourcemanager.ha.rm-ids</name>
                <value>rm1,rm2</value>
        </property>
        <property>
                <name>yarn.resourcemanager.hostname.rm1</name>
                <value>ns-yun-020024.133.com</value>
        </property>
        <property>
                <name>yarn.resourcemanager.hostname.rm2</name>
                <value>ns-yun-020023.133.com</value>
        </property>
        <property>
                <name>yarn.resourcemanager.zk-address</name>
                <value>ns-yun-020022.133.com:2181,ns-yun-020023.133.com:2181,ns-yun-020024.133.com:2181</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.log-aggregation-enable</name>
                <value>true</value>
        </property>
        <property>
                <name>yarn.log-aggregation.retain-seconds</name>
                <value>86400</value>
        </property>
        <property>
                <name>yarn.resourcemanager.recovery.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>yarn.resourcemanager.store.class</name>
                <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
        </property>
</configuration>

/apps/svr/hadoop/etc/hadoop/slaves

ns-yun-020024.133.com
ns-yun-020023.133.com
ns-yun-020022.133.com
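The configuration files must be identical on every node, so after editing them on one machine they need to be pushed out. A minimal sketch of that step (the helper name `distribute_conf` and the `DRY_RUN` switch are hypothetical conveniences; hostnames are this deployment's):

```shell
# Sketch: copy the edited config directory to the other nodes.
# DRY_RUN=1 prints the scp commands instead of running them.
distribute_conf() {
    local conf_dir="$1"; shift
    local host cmd
    for host in "$@"; do
        # copy into the same parent directory on the remote host
        cmd="scp -r $conf_dir ${host}:${conf_dir%/*}/"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
}
```

For example: `distribute_conf /apps/svr/hadoop/etc/hadoop ns-yun-020023 ns-yun-020024`.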

Starting the Cluster

Check the current ZooKeeper connection first.
In a healthy cluster, every server is already running the following service:

[root@ns-yun-020022 hadoop]# jps
144379 QuorumPeerMain
152910 Jps

Start the journalNode (must be run on 22, 23, and 24)

[root@ns-yun-020022 hadoop]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /apps/svr/hadoop-2.10.0/logs/hadoop-root-journalnode-ns-yun-020022.133.com.out

[root@ns-yun-020022 hadoop]# jps
153046 Jps
152967 JournalNode    <--- now started
144379 QuorumPeerMain

Format the nameNode (only needs to be run on one node)

[root@ns-yun-020022 hadoop]# hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
20/08/14 15:59:59 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ns-yun-020022.133.com/(redacted)
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.10.0
STARTUP_MSG:   classpath = (omitted)
STARTUP_MSG:   build = ssh://git.corp.linkedin.com:29418/hadoop/hadoop.git -r e2f1f118e465e787d8567dfa6e2f3b72a0eb9194; compiled by 'jhung' on 2019-10-22T19:10Z
STARTUP_MSG:   java = 1.8.0_261
************************************************************/
20/08/14 15:59:59 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
20/08/14 15:59:59 INFO namenode.NameNode: createNameNode [-format]
20/08/14 16:00:00 INFO common.Util: Assuming 'file' scheme for path /apps/svr/hadoop/hadoopdata/dfs/name in configuration.
20/08/14 16:00:00 INFO common.Util: Assuming 'file' scheme for path /apps/svr/hadoop/hadoopdata/dfs/name in configuration.
Formatting using clusterid: CID-cc847283-f796-43c5-ad7f-e8aa8a2a73a7
20/08/14 16:00:00 INFO namenode.FSEditLog: Edit logging is async:true
20/08/14 16:00:00 INFO namenode.FSNamesystem: KeyProvider: null
20/08/14 16:00:00 INFO namenode.FSNamesystem: fsLock is fair: true
20/08/14 16:00:00 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
20/08/14 16:00:00 INFO namenode.FSNamesystem: fsOwner             = root (auth:SIMPLE)
20/08/14 16:00:00 INFO namenode.FSNamesystem: supergroup          = supergroup
20/08/14 16:00:00 INFO namenode.FSNamesystem: isPermissionEnabled = true
20/08/14 16:00:00 INFO namenode.FSNamesystem: Determined nameservice ID: ha01
20/08/14 16:00:00 INFO namenode.FSNamesystem: HA Enabled: true
20/08/14 16:00:00 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
20/08/14 16:00:00 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
20/08/14 16:00:00 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
20/08/14 16:00:00 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
20/08/14 16:00:00 INFO blockmanagement.BlockManager: The block deletion will start around 2020 Aug 14 16:00:00
20/08/14 16:00:00 INFO util.GSet: Computing capacity for map BlocksMap
20/08/14 16:00:00 INFO util.GSet: VM type       = 64-bit
20/08/14 16:00:00 INFO util.GSet: 2.0% max memory 958.5 MB = 19.2 MB
20/08/14 16:00:00 INFO util.GSet: capacity      = 2^21 = 2097152 entries
20/08/14 16:00:00 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
20/08/14 16:00:00 WARN conf.Configuration: No unit for dfs.heartbeat.interval(3) assuming SECONDS
20/08/14 16:00:00 WARN conf.Configuration: No unit for dfs.namenode.safemode.extension(30000) assuming MILLISECONDS
20/08/14 16:00:00 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
20/08/14 16:00:00 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
20/08/14 16:00:00 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
20/08/14 16:00:00 INFO blockmanagement.BlockManager: defaultReplication         = 2
20/08/14 16:00:00 INFO blockmanagement.BlockManager: maxReplication             = 512
20/08/14 16:00:00 INFO blockmanagement.BlockManager: minReplication             = 1
20/08/14 16:00:00 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
20/08/14 16:00:00 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
20/08/14 16:00:00 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
20/08/14 16:00:00 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
20/08/14 16:00:00 INFO namenode.FSNamesystem: Append Enabled: true
20/08/14 16:00:00 INFO namenode.FSDirectory: GLOBAL serial map: bits=24 maxEntries=16777215
20/08/14 16:00:00 INFO util.GSet: Computing capacity for map INodeMap
20/08/14 16:00:00 INFO util.GSet: VM type       = 64-bit
20/08/14 16:00:00 INFO util.GSet: 1.0% max memory 958.5 MB = 9.6 MB
20/08/14 16:00:00 INFO util.GSet: capacity      = 2^20 = 1048576 entries
20/08/14 16:00:00 INFO namenode.FSDirectory: ACLs enabled? false
20/08/14 16:00:00 INFO namenode.FSDirectory: XAttrs enabled? true
20/08/14 16:00:00 INFO namenode.NameNode: Caching file names occurring more than 10 times
20/08/14 16:00:00 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: falseskipCaptureAccessTimeOnlyChange: false
20/08/14 16:00:00 INFO util.GSet: Computing capacity for map cachedBlocks
20/08/14 16:00:00 INFO util.GSet: VM type       = 64-bit
20/08/14 16:00:00 INFO util.GSet: 0.25% max memory 958.5 MB = 2.4 MB
20/08/14 16:00:00 INFO util.GSet: capacity      = 2^18 = 262144 entries
20/08/14 16:00:00 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
20/08/14 16:00:00 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
20/08/14 16:00:00 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
20/08/14 16:00:00 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
20/08/14 16:00:00 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
20/08/14 16:00:00 INFO util.GSet: Computing capacity for map NameNodeRetryCache
20/08/14 16:00:00 INFO util.GSet: VM type       = 64-bit
20/08/14 16:00:00 INFO util.GSet: 0.029999999329447746% max memory 958.5 MB = 294.5 KB
20/08/14 16:00:00 INFO util.GSet: capacity      = 2^15 = 32768 entries
20/08/14 16:00:00 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1547870701-10.189.20.22-1597392000704
20/08/14 16:00:00 INFO common.Storage: Storage directory /apps/svr/hadoop-2.10.0/hadoopdata/dfs/name has been successfully formatted.
20/08/14 16:00:00 INFO namenode.FSImageFormatProtobuf: Saving image file /apps/svr/hadoop-2.10.0/hadoopdata/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
20/08/14 16:00:00 INFO namenode.FSImageFormatProtobuf: Image file /apps/svr/hadoop-2.10.0/hadoopdata/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 323 bytes saved in 0 seconds .
20/08/14 16:00:00 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
20/08/14 16:00:00 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid = 0 when meet shutdown.
20/08/14 16:00:00 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ns-yun-020022.133.com/(redacted)
************************************************************/

Formatting creates the following directories:

ls /apps/svr/hadoop-2.10.0/hadoopdata/dfs/*
/apps/svr/hadoop-2.10.0/hadoopdata/dfs/data:
/apps/svr/hadoop-2.10.0/hadoopdata/dfs/journaldata:
ha01
/apps/svr/hadoop-2.10.0/hadoopdata/dfs/name:
current

Copy the data to the other servers

cd /apps/svr/hadoop-2.10.0
scp -r  hadoopdata  ns-yun-020023:`pwd`/.
scp -r  hadoopdata  ns-yun-020024:`pwd`/.

Run the following command on only one of the nameNodes (the hosts defined as nn1/nn2 in hdfs-site.xml):

[root@ns-yun-020024 hadoop]# hdfs zkfc -formatZK
20/08/14 16:10:21 INFO tools.DFSZKFailoverController: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DFSZKFailoverController
STARTUP_MSG:   host = ns-yun-020024.133.com/10.189.20.24
STARTUP_MSG:   args = [-formatZK]
STARTUP_MSG:   version = 2.10.0
STARTUP_MSG:   classpath = (omitted)
STARTUP_MSG:   build = ssh://git.corp.linkedin.com:29418/hadoop/hadoop.git -r e2f1f118e465e787d8567dfa6e2f3b72a0eb9194; compiled by 'jhung' on 2019-10-22T19:10Z
STARTUP_MSG:   java = 1.8.0_261
************************************************************/
20/08/14 16:10:21 INFO tools.DFSZKFailoverController: registered UNIX signal handlers for [TERM, HUP, INT]
20/08/14 16:10:21 INFO tools.DFSZKFailoverController: Failover controller configured for NameNode NameNode at ns-yun-020024.133.com/(redacted):9000
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.9-1757313, built on 08/23/2016 06:50 GMT
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:host.name=ns-yun-020024.133.com
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:java.version=1.8.0_261
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:java.home=/apps/svr/jdk1.8.0_261/jre
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:java.class.path=(omitted)
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/apps/svr/hadoop-2.10.0/lib/native
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:os.version=3.10.0-862.9.1.el7.x86_64
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:user.name=root
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Client environment:user.dir=/apps/svr/hadoop-2.10.0/etc/hadoop
20/08/14 16:10:21 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=ns-yun-020022.133.com:2181,ns-yun-020023.133.com:2181,ns-yun-020024.133.com:2181 sessionTimeout=1000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@2667f029
20/08/14 16:10:21 INFO zookeeper.ClientCnxn: Opening socket connection to server 10.189.20.23/10.189.20.23:2181. Will not attempt to authenticate using SASL (unknown error)
20/08/14 16:10:21 INFO zookeeper.ClientCnxn: Socket connection established to 10.189.20.23/10.189.20.23:2181, initiating session
20/08/14 16:10:22 INFO zookeeper.ClientCnxn: Session establishment complete on server 10.189.20.23/10.189.20.23:2181, sessionid = 0x20aeb5eeb8d0000, negotiated timeout = 4000
20/08/14 16:10:22 INFO ha.ActiveStandbyElector: Session connected.
20/08/14 16:10:22 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/ha01 in ZK.
20/08/14 16:10:22 INFO zookeeper.ZooKeeper: Session: 0x20aeb5eeb8d0000 closed
20/08/14 16:10:22 INFO zookeeper.ClientCnxn: EventThread shut down for session: 0x20aeb5eeb8d0000
20/08/14 16:10:22 INFO tools.DFSZKFailoverController: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DFSZKFailoverController at ns-yun-020024.133.com/(redacted)
************************************************************/

Start HDFS (only needs to be run on one machine)

The start-dfs.sh command starts the following services:

namenode
datanode
journal node
zkfc
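The role table at the top determines which daemons each node should show in jps. A small sketch of that check (the helper name `check_daemons` is hypothetical; it reads jps output on stdin):

```shell
# Sketch: verify that jps output contains every expected daemon name.
# Usage: jps | check_daemons NameNode DataNode JournalNode DFSZKFailoverController
check_daemons() {
    local seen want missing=0
    seen=$(cat)                      # jps output, e.g. "7720 NameNode"
    for want in "$@"; do
        # the daemon name is the second field of each jps line
        echo "$seen" | awk '{print $2}' | grep -qx "$want" || {
            echo "missing daemon: $want"; missing=1
        }
    done
    [ $missing -eq 0 ] && echo "all daemons running"
    return $missing
}
```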

Reference output:

[root@ns-yun-020024 hadoop]# start-dfs.sh
Starting namenodes on [ns-yun-020024.133.com ns-yun-020023.133.com]
ns-yun-020024.133.com: starting namenode, logging to /apps/svr/hadoop-2.10.0/logs/hadoop-root-namenode-ns-yun-020024.133.com.out
ns-yun-020023.133.com: starting namenode, logging to /apps/svr/hadoop-2.10.0/logs/hadoop-root-namenode-ns-yun-020023.133.com.out
ns-yun-020024.133.com: datanode running as process 7388. Stop it first.
ns-yun-020022.133.com: datanode running as process 155507. Stop it first.
ns-yun-020023.133.com: datanode running as process 176242. Stop it first.
Starting journal nodes [ns-yun-020022.133.com ns-yun-020023.133.com ns-yun-020024.133.com]
ns-yun-020024.133.com: starting journalnode, logging to /apps/svr/hadoop-2.10.0/logs/hadoop-root-journalnode-ns-yun-020024.133.com.out
ns-yun-020023.133.com: starting journalnode, logging to /apps/svr/hadoop-2.10.0/logs/hadoop-root-journalnode-ns-yun-020023.133.com.out
ns-yun-020022.133.com: starting journalnode, logging to /apps/svr/hadoop-2.10.0/logs/hadoop-root-journalnode-ns-yun-020022.133.com.out
Starting ZK Failover Controllers on NN hosts [ns-yun-020024.133.com ns-yun-020023.133.com]
ns-yun-020024.133.com: zkfc running as process 6683. Stop it first.
ns-yun-020023.133.com: zkfc running as process 175832. Stop it first.

Verify the services on each node

[root@ns-yun-020022 hadoop]# jps
155507 DataNode
155834 Jps
144379 QuorumPeerMain
155741 JournalNode


[root@ns-yun-020023 hadoop]# jps
176722 Jps
176242 DataNode
176550 JournalNode
175832 DFSZKFailoverController
176424 NameNode
166153 QuorumPeerMain


[root@ns-yun-020024 hadoop]# jps
8036 JournalNode
7720 NameNode
8313 Jps
6683 DFSZKFailoverController
3324 QuorumPeerMain
7388 DataNode

Start the resourcemanager; pick any one machine to run this on

[root@ns-yun-020024 hadoop]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /apps/svr/hadoop-2.10.0/logs/yarn-root-resourcemanager-ns-yun-020024.133.com.out
ns-yun-020024.133.com: starting nodemanager, logging to /apps/svr/hadoop-2.10.0/logs/yarn-root-nodemanager-ns-yun-020024.133.com.out
ns-yun-020023.133.com: starting nodemanager, logging to /apps/svr/hadoop-2.10.0/logs/yarn-root-nodemanager-ns-yun-020023.133.com.out
ns-yun-020022.133.com: starting nodemanager, logging to /apps/svr/hadoop-2.10.0/logs/yarn-root-nodemanager-ns-yun-020022.133.com.out

Verify

[root@ns-yun-020023 hadoop]# jps
176829 NodeManager                   <- every machine now runs a NodeManager

Note:
node 24 additionally runs a ResourceManager

[root@ns-yun-020024 hadoop]# jps
8962 Jps
8531 NodeManager
8036 JournalNode
7720 NameNode
6683 DFSZKFailoverController
3324 QuorumPeerMain
7388 DataNode
8399 ResourceManager

Node 23 needs its resourcemanager started manually

[root@ns-yun-020023 hadoop]# jps
176242 DataNode
176550 JournalNode
175832 DFSZKFailoverController
176424 NameNode
166153 QuorumPeerMain
177145 Jps
176829 NodeManager

Start the standby resourcemanager

[root@ns-yun-020023 hadoop]# yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /apps/svr/hadoop-2.10.0/logs/yarn-root-resourcemanager-ns-yun-020023.133.com.out


[root@ns-yun-020023 hadoop]# jps
176242 DataNode
177296 Jps
176550 JournalNode
175832 DFSZKFailoverController
176424 NameNode
177209 ResourceManager
166153 QuorumPeerMain
176829 NodeManager

Start the MapReduce job history server (only needs to be run on one of the NameNodes)

[root@ns-yun-020024 hadoop]# mr-jobhistory-daemon.sh start  historyserver
starting historyserver, logging to /apps/svr/hadoop-2.10.0/logs/mapred-root-historyserver-ns-yun-020024.133.com.out

[root@ns-yun-020024 hadoop]# jps
8531 NodeManager
8036 JournalNode
7720 NameNode
9928 Jps
6683 DFSZKFailoverController
9851 JobHistoryServer                    <--- job server

Status Queries

Query the HDFS node state

[root@ns-yun-020023 hadoop]# hdfs haadmin -getServiceState nn1
active
[root@ns-yun-020023 hadoop]# hdfs haadmin -getServiceState nn2
standby
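A healthy HA pair always has exactly one active node and one standby. The check below is a sketch that separates the counting logic from the `hdfs haadmin` calls so it can be reused for the YARN pair as well (the helper name `check_ha_pair` is hypothetical):

```shell
# Sketch: given the reported states of an HA pair, verify exactly one is active.
# Real use: check_ha_pair "$(hdfs haadmin -getServiceState nn1)" \
#                         "$(hdfs haadmin -getServiceState nn2)"
check_ha_pair() {
    local active=0 s
    for s in "$@"; do
        [ "$s" = "active" ] && active=$((active + 1))
    done
    if [ "$active" -eq 1 ]; then
        echo "healthy: one active node"
    else
        echo "unhealthy: $active active nodes"
        return 1
    fi
}
```

The same call works for YARN with `yarn rmadmin -getServiceState rm1` / `rm2`.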

Query the YARN node state

[root@ns-yun-020023 hadoop]# yarn rmadmin -getServiceState rm1
active
[root@ns-yun-020023 hadoop]# yarn rmadmin -getServiceState rm2
standby

Web Checks

http://ns-yun-020024:50070

http://ns-yun-020023:50070

Check the data nodes; you can see that 3 data nodes have been added. (screenshot: datanode list)

Hadoop Web UI

http://ns-yun-020024:8088


Hadoop job history Web UI

http://ns-yun-020024:19888

Starting Hadoop Manually

Start the journal node

On 22, 23, 24:

hadoop-daemon.sh start journalnode

Start the namenode

On 23, 24:

hadoop-daemon.sh start namenode

Reference log:

# cat  /apps/svr/hadoop-2.10.0/logs/hadoop-root-namenode-ns-yun-020023.vclound.com.log
Checkpointing active NN to possible NNs: [http://ns-yun-020024.vclound.com:50070]
Serving checkpoints at http://ns-yun-020023.vclound.com:50070

# cat  /apps/svr/hadoop-2.10.0/logs/hadoop-root-namenode-ns-yun-020024.vclound.com.log
Checkpointing active NN to possible NNs: [http://ns-yun-020023.vclound.com:50070]
Serving checkpoints at http://ns-yun-020024.vclound.com:50070

Start the datanode

On 22, 23, 24:

hadoop-daemon.sh start datanode 

Reference log:

cat /apps/svr/hadoop-2.10.0/logs/hadoop-root-datanode-ns-yun-020022.vclound.com.log
2020-08-17 20:19:42,282 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Successfully sent block report 0x9c92e3cdca1e8be1,  containing 1 storage report(s), of which we sent 1. The reports had 0 total blocks and used 1 RPC(s). This took 4 msec to generate and 42 msecs for RPC and NN processing. Got back no commands.
2020-08-17 20:19:42,282 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Successfully sent block report 0xe9f315da9193379,  containing 1 storage report(s), of which we sent 1. The reports had 0 total blocks and used 1 RPC(s). This took 4 msec to generate and 42 msecs for RPC and NN processing. Got back no commands.

Start zkfc

On 23, 24:

hadoop-daemon.sh start zkfc

Reference log:

cat /apps/svr/hadoop-2.10.0/logs/hadoop-root-zkfc-ns-yun-020023.vclound.com.log
2020-08-17 20:26:20,724 INFO org.apache.hadoop.ha.ZKFailoverController: ZK Election indicated that NameNode at ns-yun-020023.vclound.com/10.189.20.23:9000 should become standby
2020-08-17 20:26:20,733 INFO org.apache.hadoop.ha.ZKFailoverController: Successfully transitioned NameNode at ns-yun-020023.vclound.com/10.189.20.23:9000 to standby state

Start YARN

Run on any one machine:

start-yarn.sh

Reference log:

cat /apps/svr/hadoop-2.10.0/logs/yarn-root-nodemanager-ns-yun-020024.vclound.com.log
2020-08-17 20:28:18,620 INFO org.apache.hadoop.yarn.server.nodemanager.security.NMTokenSecretManagerInNM: Rolling master-key for container-tokens, got key with id -2028131293
2020-08-17 20:28:18,620 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registered with ResourceManager as ns-yun-020024.vclound.com:40490 with total resource of <memory:8192, vCores:8>

Start the resourcemanager

# yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /apps/svr/hadoop-2.10.0/logs/yarn-root-resourcemanager-ns-yun-020023.vclound.com.out

Reference log:

cat /apps/svr/hadoop-2.10.0/logs/yarn-root-resourcemanager-ns-yun-020023.vclound.com.log
2020-08-17 20:29:53,779 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Already in standby state
2020-08-17 20:29:53,779 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root     OPERATION=transitionToStandby TARGET=RM       RESULT=SUCCESS

Start the historyserver

# mr-jobhistory-daemon.sh start  historyserver
starting historyserver, logging to /apps/svr/hadoop-2.10.0/logs/mapred-root-historyserver-ns-yun-020024.vclound.com.out

Reference log:

# cat /apps/svr/hadoop-2.10.0/logs/mapred-root-historyserver-ns-yun-020024.vclound.com.log
 
2020-08-17 20:32:21,121 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2020-08-17 20:32:21,121 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 10020: starting
.....
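The manual steps above follow a fixed order: journalnodes first, then namenodes, datanodes, zkfc, YARN, the standby resourcemanager, and finally the history server. They can be collected into one script; this is a sketch only (the `run_on` and `manual_start` helpers and the `DRY_RUN` switch are hypothetical, and the host lists match this deployment):

```shell
# Sketch: the full manual startup order as one script.
# DRY_RUN=1 prints "<host>: <command>" lines instead of ssh-ing.
ALL_NODES="ns-yun-020022 ns-yun-020023 ns-yun-020024"
NN_NODES="ns-yun-020023 ns-yun-020024"

run_on() {   # run_on "<space-separated hosts>" "<command>"
    local host
    for host in $1; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$host: $2"
        else
            ssh "$host" "$2"
        fi
    done
}

manual_start() {
    run_on "$ALL_NODES"      "hadoop-daemon.sh start journalnode"
    run_on "$NN_NODES"       "hadoop-daemon.sh start namenode"
    run_on "$ALL_NODES"      "hadoop-daemon.sh start datanode"
    run_on "$NN_NODES"       "hadoop-daemon.sh start zkfc"
    run_on "ns-yun-020024"   "start-yarn.sh"
    run_on "ns-yun-020023"   "yarn-daemon.sh start resourcemanager"
    run_on "ns-yun-020024"   "mr-jobhistory-daemon.sh start historyserver"
}
```

Running `DRY_RUN=1 manual_start` prints the plan without touching the cluster, which is a cheap way to review the order before executing it for real.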