Overview
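HDFS exposes its functionality to the outside world through a set of internal services. At startup, HDFS mainly starts the following services:
- HTTPServer, used to view the current system status dynamically;
- JvmPauseMonitor, which records whether the running JVM has been paused;
- NameNodeResourceChecker, which periodically checks the free space of the local directories available to the system;
- BlockManager, which manages all Block-related information in the system;
- RPC Server, mainly used to communicate with the DataNode, secondary NameNode, backup NameNode, checkpoint NameNode, HA node (ZooKeeper), FSClient, and so on.
BlockManager and the RPC Server are the two most important modules in the whole system, so only the initialization and startup of these two modules are examined here.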
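BlockManager Startup
BlockManager Instantiation
FSNamesystem instantiates BlockManager by calling its constructor. Instantiation mainly initializes the following:
- a reference to FSNamesystem and a reference to FSClusterStats are stored; in the implementation both references point to the same object, namely FSNamesystem;
- DatanodeManager is instantiated;
- HeartbeatManager is initialized;
- 2% of system memory is allocated to the BlocksMap object, which is internally a linked hash map implementation and mainly stores Block metadata and Datanode information;
- the BlockPlacementPolicy object is initialized; it is mainly used to choose suitable datanodes for storing blocks and their replicas;
- the PendingReplicationBlocks object is initialized; it mainly holds information about Blocks whose replication has not yet reached the configured level;
- the BlockTokenSecretManager object is instantiated;
- all Block-related configuration is read, including:
  - defaultReplication
  - maxReplication
  - minReplication
  - maxReplicationStreams
  - shouldCheckForEnoughRacks
  - replicationRecheckInterval
  - encryptDataTransfer
  - maxNumBlocksToLog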
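The configuration step above can be sketched roughly as follows. This is a minimal, self-contained illustration rather than Hadoop's BlockManager code: `ReplicationSettings`, the plain `Map` used in place of Hadoop's `Configuration`, and the sanity checks are all hypothetical, and the property names and defaults are assumptions recalled from hdfs-default.xml that should be verified against the Hadoop version in use.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for some of the block settings listed above; not Hadoop's BlockManager. */
final class ReplicationSettings {
    final int defaultReplication;
    final int maxReplication;
    final int minReplication;
    final int maxReplicationStreams;
    final long replicationRecheckIntervalMs;
    final boolean encryptDataTransfer;

    ReplicationSettings(Map<String, String> conf) {
        // Property names and defaults are assumptions recalled from hdfs-default.xml.
        defaultReplication = getInt(conf, "dfs.replication", 3);
        maxReplication = getInt(conf, "dfs.replication.max", 512);
        minReplication = getInt(conf, "dfs.namenode.replication.min", 1);
        maxReplicationStreams = getInt(conf, "dfs.namenode.replication.max-streams", 2);
        // The interval is read in seconds and kept here in milliseconds.
        replicationRecheckIntervalMs = getInt(conf, "dfs.namenode.replication.interval", 3) * 1000L;
        encryptDataTransfer =
            Boolean.parseBoolean(conf.getOrDefault("dfs.encrypt.data.transfer", "false"));

        // Illustrative sanity checks so that inconsistent bounds fail at startup.
        if (minReplication <= 0) {
            throw new IllegalArgumentException("minReplication must be positive");
        }
        if (minReplication > maxReplication) {
            throw new IllegalArgumentException("minReplication exceeds maxReplication");
        }
        if (defaultReplication < minReplication || defaultReplication > maxReplication) {
            throw new IllegalArgumentException("defaultReplication is outside [min, max]");
        }
    }

    // Returns the integer value stored under key, or def when the key is absent.
    private static int getInt(Map<String, String> conf, String key, int def) {
        String value = conf.get(key);
        return value == null ? def : Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("dfs.replication", "2");
        ReplicationSettings s = new ReplicationSettings(conf);
        System.out.println("default=" + s.defaultReplication + " max=" + s.maxReplication);
    }
}
```

Reading everything once in the constructor and validating it immediately means a misconfigured cluster fails at startup instead of misbehaving later.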
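BlockManager Initialization
During initialization, BlockManager mainly performs the following tasks:
- starts the pendingReplications thread;
- starts the datanodeManager;
- starts the ReplicationThread, which at a fixed interval computes the Datanode workload and processes all Blocks waiting to be replicated.
The most important of these is starting DatanodeManager.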
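The fixed-interval behavior of the replication thread described above can be sketched as a daemon loop. This is a simplified illustration under stated assumptions: `ReplicationMonitorSketch`, its queue of block IDs, and the `maxBlocksPerPass` limit are hypothetical names, and "scheduling" a block is reduced to removing it from the queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical fixed-interval worker; a sketch of the idea, not Hadoop's replication thread. */
final class ReplicationMonitorSketch implements Runnable {
    private final Queue<String> neededReplications = new ArrayDeque<>();
    private final long recheckIntervalMs;
    private final int maxBlocksPerPass;
    private volatile boolean running = true;

    ReplicationMonitorSketch(long recheckIntervalMs, int maxBlocksPerPass) {
        this.recheckIntervalMs = recheckIntervalMs;
        this.maxBlocksPerPass = maxBlocksPerPass;
    }

    // Queues a block that still needs more replicas.
    synchronized void enqueue(String blockId) {
        neededReplications.add(blockId);
    }

    // One pass: takes at most maxBlocksPerPass blocks off the queue and returns how many were taken.
    synchronized int computeWork() {
        int scheduled = 0;
        while (scheduled < maxBlocksPerPass && !neededReplications.isEmpty()) {
            neededReplications.poll();
            scheduled++;
        }
        return scheduled;
    }

    synchronized int pending() {
        return neededReplications.size();
    }

    @Override
    public void run() {
        while (running) {
            computeWork();
            try {
                Thread.sleep(recheckIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    void stop() {
        running = false;
    }

    public static void main(String[] args) throws InterruptedException {
        ReplicationMonitorSketch monitor = new ReplicationMonitorSketch(50, 2);
        for (int i = 0; i < 5; i++) {
            monitor.enqueue("blk_" + i);
        }
        Thread t = new Thread(monitor, "ReplicationMonitorSketch");
        t.setDaemon(true); // a daemon thread does not keep the JVM alive on shutdown
        t.start();
        Thread.sleep(300);
        monitor.stop();
        t.interrupt();
        t.join();
        System.out.println("pending=" + monitor.pending());
    }
}
```

Capping the work done per pass keeps a single pass short, so a large backlog is drained over several intervals rather than all at once.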
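DatanodeManager Startup
DatanodeManager Instantiation
Instantiating DatanodeManager mainly initializes the following resources:
- the NetworkTopology object;
- the heartbeatManager object;
- the HostFileManager;
- the DNSToSwitchMapping object;
- all DataNode-related configuration, which is read at this point.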
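To make the roles of the host-to-rack mapping and the topology more concrete, here is a small sketch. It assumes a static lookup table instead of a script or DNS-based resolver; `RackResolverSketch` and its methods are hypothetical and only mirror the idea behind DNSToSwitchMapping and NetworkTopology. The fallback `/default-rack` follows Hadoop's convention for unresolved hosts.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical rack resolver plus a tiny topology; not Hadoop's NetworkTopology. */
final class RackResolverSketch {
    static final String DEFAULT_RACK = "/default-rack";

    private final Map<String, String> hostToRack;
    private final Map<String, SortedSet<String>> rackToHosts = new TreeMap<>();

    RackResolverSketch(Map<String, String> hostToRack) {
        this.hostToRack = new HashMap<>(hostToRack);
    }

    // Maps a host to its rack, falling back to the default rack for unknown hosts.
    String resolve(String host) {
        return hostToRack.getOrDefault(host, DEFAULT_RACK);
    }

    // Adds the host under its rack and returns its full network location.
    String register(String host) {
        String rack = resolve(host);
        rackToHosts.computeIfAbsent(rack, r -> new TreeSet<>()).add(host);
        return rack + "/" + host;
    }

    int numOfRacks() {
        return rackToHosts.size();
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("dn1", "/rack1");
        RackResolverSketch topology = new RackResolverSketch(table);
        System.out.println(topology.register("dn1"));
        System.out.println(topology.register("dn9"));
    }
}
```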
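DatanodeManager Initialization
DatanodeManager initialization mainly performs the following tasks:
- starts the DecommissionManager, which mainly monitors the Datanodes being decommissioned and records them in the log;
- starts the heartbeatManager, which handles the requests coming from DataNodes.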
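As an illustration of what tracking DataNode heartbeats involves, the sketch below records the last heartbeat time per node and reports nodes whose heartbeat is older than an expiry window. `HeartbeatTrackerSketch`, its method names, and the expiry parameter are hypothetical, and the clock is passed in explicitly so the behavior is deterministic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical heartbeat bookkeeping; a sketch of the idea, not Hadoop's HeartbeatManager. */
final class HeartbeatTrackerSketch {
    private final Map<String, Long> lastSeenMs = new HashMap<>();
    private final long expiryMs;

    HeartbeatTrackerSketch(long expiryMs) {
        this.expiryMs = expiryMs;
    }

    // Records a heartbeat from the given datanode at time nowMs.
    synchronized void onHeartbeat(String datanodeId, long nowMs) {
        lastSeenMs.put(datanodeId, nowMs);
    }

    // Removes and returns (sorted) every datanode whose last heartbeat is older than expiryMs.
    synchronized List<String> removeExpired(long nowMs) {
        List<String> dead = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastSeenMs.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs - entry.getValue() > expiryMs) {
                dead.add(entry.getKey());
                it.remove();
            }
        }
        Collections.sort(dead);
        return dead;
    }

    synchronized int liveCount() {
        return lastSeenMs.size();
    }

    public static void main(String[] args) {
        HeartbeatTrackerSketch tracker = new HeartbeatTrackerSketch(1000);
        tracker.onHeartbeat("dn1", 0);
        tracker.onHeartbeat("dn2", 500);
        System.out.println("expired at t=1200: " + tracker.removeExpired(1200));
    }
}
```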
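RPC Server Startup
The RPC Server implements several important communication protocols:
- ClientProtocol, used to communicate with clients;
- DatanodeProtocol, used to communicate with DataNodes;
- NamenodeProtocol, used to communicate with the backup NameNode, the secondary NameNode, and the checkpoint NameNode;
- RefreshAuthorizationPolicyProtocol, used to interact with administration tools;
- RefreshUserMappingsProtocol, used to refresh user information;
- GetUserMappingsProtocol, used to obtain the current user information;
- HAServiceProtocol, which ZooKeeper uses to switch the state of the NameNode.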
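One way to picture a single RPC server serving several protocols is a registry keyed by protocol interface. The sketch below is purely illustrative: `ProtocolRegistrySketch`, `ClientOps`, `DatanodeOps`, and `Server` are hypothetical stand-ins and do not reproduce Hadoop's RPC classes or the real protocol method signatures.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical protocol registry; a sketch of one server object exposing several interfaces. */
final class ProtocolRegistrySketch {
    // Hypothetical stand-ins for protocol interfaces such as ClientProtocol.
    interface ClientOps { String getFileInfo(String path); }
    interface DatanodeOps { String sendHeartbeat(String datanodeId); }

    // A single server object implementing several protocol interfaces.
    static final class Server implements ClientOps, DatanodeOps {
        @Override public String getFileInfo(String path) { return "info:" + path; }
        @Override public String sendHeartbeat(String datanodeId) { return "ack:" + datanodeId; }
    }

    private final Map<Class<?>, Object> implementations = new LinkedHashMap<>();

    // Registers impl as the handler for calls made through the given protocol interface.
    <T> void addProtocol(Class<T> protocol, T impl) {
        if (!protocol.isInterface()) {
            throw new IllegalArgumentException(protocol.getName() + " is not an interface");
        }
        implementations.put(protocol, impl);
    }

    // Returns the handler registered for the protocol, typed as that protocol.
    <T> T lookup(Class<T> protocol) {
        Object impl = implementations.get(protocol);
        if (impl == null) {
            throw new IllegalStateException("No handler for " + protocol.getName());
        }
        return protocol.cast(impl);
    }

    public static void main(String[] args) {
        ProtocolRegistrySketch registry = new ProtocolRegistrySketch();
        Server server = new Server();
        registry.addProtocol(ClientOps.class, server);
        registry.addProtocol(DatanodeOps.class, server);
        System.out.println(registry.lookup(ClientOps.class).getFileInfo("/tmp"));
    }
}
```

Callers only see the interface they looked up, so a client-facing caller cannot reach DataNode-facing methods even though one object backs both.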
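Once the RPC Server has started, all of the NameNode's common services are ready. The NameNode then decides, based on whether HA is enabled in the configuration, to enter the Standby or the Active state, and starts the corresponding services:
if HA is enabled, the NameNode enters the standby state and starts the standby services, and ZooKeeper later selects the active NameNode;
otherwise, the NameNode enters the active state directly, starts the active services, and begins serving requests.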
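The decision described above can be condensed into a few lines. This is a sketch only: `StartupStateSketch` and its methods are hypothetical, and the service names are plain strings taken from this article rather than real service objects.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the initial HA state choice; not the NameNode's HAContext code. */
final class StartupStateSketch {
    enum HaState { ACTIVE, STANDBY }

    // With HA enabled the node starts as standby; otherwise it becomes active directly.
    static HaState initialState(boolean haEnabled) {
        return haEnabled ? HaState.STANDBY : HaState.ACTIVE;
    }

    // Lists the services started for the given state, using the names from this article.
    static List<String> servicesFor(HaState state, boolean standbyShouldCheckpoint) {
        List<String> services = new ArrayList<>();
        if (state == HaState.STANDBY) {
            services.add("EditLogTailer");
            if (standbyShouldCheckpoint) {
                services.add("StandbyCheckpointer");
            }
        } else {
            services.add("LeaseManager monitor");
            services.add("NameNodeResourceMonitor");
        }
        return services;
    }

    public static void main(String[] args) {
        HaState state = initialState(true);
        System.out.println(state + " -> " + servicesFor(state, true));
    }
}
```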
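HAContext Service Startup
Standby Service Startup
Starting the standby services is fairly simple: it mainly opens the FSEditLog in read-only mode and reads the latest operation log records from the active NameNode: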
    LOG.info("Starting services required for standby state");
    if (!dir.fsImage.editLog.isOpenForRead()) {
      // During startup, we're already open for read.
      dir.fsImage.editLog.initSharedJournalsForRead();
    }
    blockManager.setPostponeBlocksFromFuture(true);
    editLogTailer = new EditLogTailer(this, conf);
    editLogTailer.start();
    if (standbyShouldCheckpoint) {
      standbyCheckpointer = new StandbyCheckpointer(conf, this);
      standbyCheckpointer.start();
    }
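The idea behind tailing the edit log on the standby can be sketched as applying, in order, every transaction newer than the last one already applied. The class below is a simplified model under stated assumptions: `EditTailerSketch` is hypothetical, the shared journal is represented by a sorted map from transaction ID to an operation string, and "applying" an edit just appends it to a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical model of edit-log tailing; not Hadoop's EditLogTailer. */
final class EditTailerSketch {
    private long lastAppliedTxId = 0;
    private final List<String> appliedOps = new ArrayList<>();

    // Applies all consecutive edits after lastAppliedTxId and returns how many were applied.
    int tail(SortedMap<Long, String> sharedEdits) {
        int applied = 0;
        for (Map.Entry<Long, String> entry : sharedEdits.tailMap(lastAppliedTxId + 1).entrySet()) {
            long txId = entry.getKey();
            if (txId != lastAppliedTxId + 1) {
                break; // stop at a gap; later edits must wait for the missing transaction
            }
            appliedOps.add(entry.getValue());
            lastAppliedTxId = txId;
            applied++;
        }
        return applied;
    }

    long getLastAppliedTxId() {
        return lastAppliedTxId;
    }

    public static void main(String[] args) {
        SortedMap<Long, String> journal = new TreeMap<>();
        journal.put(1L, "mkdir /a");
        journal.put(2L, "create /a/f");
        EditTailerSketch tailer = new EditTailerSketch();
        System.out.println("applied=" + tailer.tail(journal)
            + " lastAppliedTxId=" + tailer.getLastAppliedTxId());
    }
}
```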
Active Service Startup
Starting the active services mainly initializes the services set up earlier and starts two monitors:
    FSEditLog editLog = dir.fsImage.getEditLog();

    if (!editLog.isOpenForWrite()) {
      // During startup, we're already open for write during initialization.
      editLog.initJournalsForWrite();
      // May need to recover
      editLog.recoverUnclosedStreams();

      LOG.info("Catching up to latest edits from old active before " +
          "taking over writer role in edits logs");
      editLogTailer.catchupDuringFailover();

      blockManager.setPostponeBlocksFromFuture(false);
      blockManager.getDatanodeManager().markAllDatanodesStale();
      blockManager.clearQueues();
      blockManager.processAllPendingDNMessages();

      if (!isInSafeMode() ||
          (isInSafeMode() && safeMode.isPopulatingReplQueues())) {
        LOG.info("Reprocessing replication and invalidation queues");
        blockManager.processMisReplicatedBlocks();
      }

      if (LOG.isDebugEnabled()) {
        LOG.debug("NameNode metadata after re-processing " +
            "replication and invalidation queues during failover:\n" +
            metaSaveAsString());
      }

      long nextTxId = dir.fsImage.getLastAppliedTxId() + 1;
      LOG.info("Will take over writing edit logs at txnid " +
          nextTxId);
      editLog.setNextTxId(nextTxId);

      dir.fsImage.editLog.openForWrite();
    }
    if (haEnabled) {
      // Renew all of the leases before becoming active.
      // This is because, while we were in standby mode,
      // the leases weren't getting renewed on this NN.
      // Give them all a fresh start here.
      leaseManager.renewAllLeases();
    }
    leaseManager.startMonitor();
    startSecretManagerIfNecessary();

    //ResourceMonitor required only at ActiveNN. See HDFS-2914
    this.nnrmthread = new Daemon(new NameNodeResourceMonitor());
    nnrmthread.start();
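As the code above shows, the active phase mainly makes sure that the FSEditLog is open for writing, sets the working mode of the BlockManager, and updates the state of all datanodes;
it then starts the leaseManager monitor and the NameNodeResourceMonitor.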
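To illustrate the kind of check a resource monitor performs, the sketch below tests whether every monitored directory still has a minimum amount of usable space. It is an assumption-laden simplification: `ResourceCheckSketch`, the `reservedBytes` threshold, and the all-volumes-must-pass rule are hypothetical, and what the real NameNodeResourceMonitor does when space runs low is not covered in this article. Only `java.io.File.getUsableSpace()` is a real JDK call here.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

/** Hypothetical disk-space check; a sketch of the idea, not Hadoop's NameNodeResourceChecker. */
final class ResourceCheckSketch {
    // Returns true only if every volume has at least reservedBytes of usable space.
    static boolean hasEnoughSpace(List<File> volumes, long reservedBytes) {
        for (File volume : volumes) {
            // getUsableSpace() returns 0 when the path does not name a partition.
            if (volume.getUsableSpace() < reservedBytes) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<File> volumes = Arrays.asList(new File("."));
        long reservedBytes = 100L * 1024 * 1024; // hypothetical 100 MB threshold
        System.out.println("enough space: " + hasEnoughSpace(volumes, reservedBytes));
    }
}
```

A monitor thread would call such a check at a fixed interval and react when it starts returning false.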
At this point all services have been started; the main thread then enters a wait, and NameNode startup is complete.