Hadoop源码分析(24)
namenode启动
在文档(5)到文档(8)分析namenode的启动方法,其本质是是创建了一个namenode对象。而在创建这两个namenode对象的时候有一个重要的方法:initialize方法,这个方法会加载namenode持久化到磁盘上的元数据信息,恢复namenode关闭前的状态,然后启动namenode需要的服务。
从文档(9)到文档(23)详细分析了namenode加载元数据的方法,其主要分为两部分:加载FSImage和Editlog。其中FSImage类似于快照文件,记录了指定时刻的Namenode的状态,加载该文件可以恢复该时刻namenode的状态。editlog是namenode的操作日志文件,namenode通过加载editlog,并重新执行日志中的内容,可以将namenode的状态恢复到namenode关闭前的状态。
FSImage和Editlog各自有各自的优点与缺点。NameNode采取了两者相结合的方式来存储与加载元数据。namenode使用txid来将FSImage与EditLog联合起来。一个txid代表了namenode所执行的一个操作,namenode每执行一个操作便会更新一个txid,然后将txid与对应的操作持久化到editlog中去。然后隔一段时间后,namenode会持久化一份FSImage文件,这个文件也会存储当时namenode所用的txid,用来代表FSImage持久化的时间节点。
然后再namenode恢复元数据的时候,会先加载FSImage文件,恢复存储该文件时namenode的状态,同时加载的时候会读取存储FSImage的txid,然后根据这个txid去存储Editlog的地方读取该txid以后的日志文件。最后再将读取到的日志文件加载到namenode中,恢复namenode关闭前的状态。
在namenode的启动过程中,除了上述所说的加载元数据恢复namenode关闭前的状态外,还有一个任务就是启动服务。namenode启动服务分成了两部分,一部分是通常服务,例如http、rpc等等,这些服务是在initialize方法中启动的。还有一部分是较为特殊的服务,这里服务与namenode的角色有关,这些服务的启动在state的enterState方法中。
这里首先分析http,rpc等通常服务,在initialize方法中启动http的方法如下:
这里的startHttpServer方法内容如下:
private void startHttpServer(final Configuration conf) throws IOException {
httpServer = new NameNodeHttpServer(conf, this, getHttpServerBindAddress(conf));
httpServer.start();
httpServer.setStartupProgress(startupProgress);
}
这里主要是启动一个httpserver并启动。
然后是RPC服务,其创建方法在initialize方法中,调用的方法如下:
这个方法的具体内容如下:
protected NameNodeRpcServer createRpcServer(Configuration conf)
throws IOException {
return new NameNodeRpcServer(conf, this);
}
这里是创建了一个NameNodeRpcServer类,这个类的构造方法如下:
public NameNodeRpcServer(Configuration conf, NameNode nn)
throws IOException {
this.nn = nn;
this.namesystem = nn.getNamesystem();
this.retryCache = namesystem.getRetryCache();
this.metrics = NameNode.getNameNodeMetrics();
int handlerCount =
conf.getInt(DFS_NAMENODE_HANDLER_COUNT_KEY,
DFS_NAMENODE_HANDLER_COUNT_DEFAULT);
RPC.setProtocolEngine(conf, ClientNamenodeProtocolPB.class,
ProtobufRpcEngine.class);
ClientNamenodeProtocolServerSideTranslatorPB
clientProtocolServerTranslator =
new ClientNamenodeProtocolServerSideTranslatorPB(this);
BlockingService clientNNPbService = ClientNamenodeProtocol.
newReflectiveBlockingService(clientProtocolServerTranslator);
DatanodeProtocolServerSideTranslatorPB dnProtoPbTranslator =
new DatanodeProtocolServerSideTranslatorPB(this);
BlockingService dnProtoPbService = DatanodeProtocolService
.newReflectiveBlockingService(dnProtoPbTranslator);
NamenodeProtocolServerSideTranslatorPB namenodeProtocolXlator =
new NamenodeProtocolServerSideTranslatorPB(this);
BlockingService NNPbService = NamenodeProtocolService
.newReflectiveBlockingService(namenodeProtocolXlator);
RefreshAuthorizationPolicyProtocolServerSideTranslatorPB refreshAuthPolicyXlator =
new RefreshAuthorizationPolicyProtocolServerSideTranslatorPB(this);
BlockingService refreshAuthService = RefreshAuthorizationPolicyProtocolService
.newReflectiveBlockingService(refreshAuthPolicyXlator);
RefreshUserMappingsProtocolServerSideTranslatorPB refreshUserMappingXlator =
new RefreshUserMappingsProtocolServerSideTranslatorPB(this);
BlockingService refreshUserMappingService = RefreshUserMappingsProtocolService
.newReflectiveBlockingService(refreshUserMappingXlator);
RefreshCallQueueProtocolServerSideTranslatorPB refreshCallQueueXlator =
new RefreshCallQueueProtocolServerSideTranslatorPB(this);
BlockingService refreshCallQueueService = RefreshCallQueueProtocolService
.newReflectiveBlockingService(refreshCallQueueXlator);
GenericRefreshProtocolServerSideTranslatorPB genericRefreshXlator =
new GenericRefreshProtocolServerSideTranslatorPB(this);
BlockingService genericRefreshService = GenericRefreshProtocolService
.newReflectiveBlockingService(genericRefreshXlator);
GetUserMappingsProtocolServerSideTranslatorPB getUserMappingXlator =
new GetUserMappingsProtocolServerSideTranslatorPB(this);
BlockingService getUserMappingService = GetUserMappingsProtocolService
.newReflectiveBlockingService(getUserMappingXlator);
HAServiceProtocolServerSideTranslatorPB haServiceProtocolXlator =
new HAServiceProtocolServerSideTranslatorPB(this);
BlockingService haPbService = HAServiceProtocolService
.newReflectiveBlockingService(haServiceProtocolXlator);
TraceAdminProtocolServerSideTranslatorPB traceAdminXlator =
new TraceAdminProtocolServerSideTranslatorPB(this);
BlockingService traceAdminService = TraceAdminService
.newReflectiveBlockingService(traceAdminXlator);
WritableRpcEngine.ensureInitialized();
InetSocketAddress serviceRpcAddr = nn.getServiceRpcServerAddress(conf);
if (serviceRpcAddr != null) {
String bindHost = nn.getServiceRpcServerBindHost(conf);
if (bindHost == null) {
bindHost = serviceRpcAddr.getHostName();
}
LOG.info("Service RPC server is binding to " + bindHost + ":" +
serviceRpcAddr.getPort());
int serviceHandlerCount =
conf.getInt(DFS_NAMENODE_SERVICE_HANDLER_COUNT_KEY,
DFS_NAMENODE_SERVICE_HANDLER_COUNT_DEFAULT);
this.serviceRpcServer = new RPC.Builder(conf)
.setProtocol(
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB.class)
.setInstance(clientNNPbService)
.setBindAddress(bindHost)
.setPort(serviceRpcAddr.getPort()).setNumHandlers(serviceHandlerCount)
.setVerbose(false)
.setSecretManager(namesystem.getDelegationTokenSecretManager())
.build();
// Add all the RPC protocols that the namenode implements
DFSUtil.addPBProtocol(conf, HAServiceProtocolPB.class, haPbService,
serviceRpcServer);
DFSUtil.addPBProtocol(conf, NamenodeProtocolPB.class, NNPbService,
serviceRpcServer);
DFSUtil.addPBProtocol(conf, DatanodeProtocolPB.class, dnProtoPbService,
serviceRpcServer);
DFSUtil.addPBProtocol(conf, RefreshAuthorizationPolicyProtocolPB.class,
refreshAuthService, serviceRpcServer);
DFSUtil.addPBProtocol(conf, RefreshUserMappingsProtocolPB.class,
refreshUserMappingService, serviceRpcServer);
// We support Refreshing call queue here in case the client RPC queue is full
DFSUtil.addPBProtocol(conf, RefreshCallQueueProtocolPB.class,
refreshCallQueueService, serviceRpcServer);
DFSUtil.addPBProtocol(conf, GenericRefreshProtocolPB.class,
genericRefreshService, serviceRpcServer);
DFSUtil.addPBProtocol(conf, GetUserMappingsProtocolPB.class,
getUserMappingService, serviceRpcServer);
DFSUtil.addPBProtocol(conf, TraceAdminProtocolPB.class,
traceAdminService, serviceRpcServer);
// Update the address with the correct port
InetSocketAddress listenAddr = serviceRpcServer.getListenerAddress();
serviceRPCAddress = new InetSocketAddress(
serviceRpcAddr.getHostName(), listenAddr.getPort());
nn.setRpcServiceServerAddress(conf, serviceRPCAddress);
} else {
serviceRpcServer = null;
serviceRPCAddress = null;
}
InetSocketAddress rpcAddr = nn.getRpcServerAddress(conf);
String bindHost = nn.getRpcServerBindHost(conf);
if (bindHost == null) {
bindHost = rpcAddr.getHostName();
}
LOG.info("RPC server is binding to " + bindHost + ":" + rpcAddr.getPort());
this.clientRpcServer = new RPC.Builder(conf)
.setProtocol(
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB.class)
.setInstance(clientNNPbService).setBindAddress(bindHost)
.setPort(rpcAddr.getPort()).setNumHandlers(handlerCount)
.setVerbose(false)
.setSecretManager(namesystem.getDelegationTokenSecretManager()).build();
// Add all the RPC protocols that the namenode implements
DFSUtil.addPBProtocol(conf, HAServiceProtocolPB.class, haPbService,
clientRpcServer);
DFSUtil.addPBProtocol(conf, NamenodeProtocolPB.class, NNPbService,
clientRpcServer);
DFSUtil.addPBProtocol(conf, DatanodeProtocolPB.class, dnProtoPbService,
clientRpcServer);
DFSUtil.addPBProtocol(conf, RefreshAuthorizationPolicyProtocolPB.class,
refreshAuthService, clientRpcServer);
DFSUtil.addPBProtocol(conf, RefreshUserMappingsProtocolPB.class,
refreshUserMappingService, clientRpcServer);
DFSUtil.addPBProtocol(conf, RefreshCallQueueProtocolPB.class,
refreshCallQueueService, clientRpcServer);
DFSUtil.addPBProtocol(conf, GenericRefreshProtocolPB.class,
genericRefreshService, clientRpcServer);
DFSUtil.addPBProtocol(conf, GetUserMappingsProtocolPB.class,
getUserMappingService, clientRpcServer);
DFSUtil.addPBProtocol(conf, TraceAdminProtocolPB.class,
traceAdminService, clientRpcServer);
// set service-level authorization security policy
if (serviceAuthEnabled =
conf.getBoolean(
CommonConfigurationKeys.HADOOP_SECURITY_AUTHORIZATION, false)) {
clientRpcServer.refreshServiceAcl(conf, new HDFSPolicyProvider());
if (serviceRpcServer != null) {
serviceRpcServer.refreshServiceAcl(conf, new HDFSPolicyProvider());
}
}
// The rpc-server port can be ephemeral... ensure we have the correct info
InetSocketAddress listenAddr = clientRpcServer.getListenerAddress();
clientRpcAddress = new InetSocketAddress(
rpcAddr.getHostName(), listenAddr.getPort());
nn.setRpcServerAddress(conf, clientRpcAddress);
minimumDataNodeVersion = conf.get(
DFSConfigKeys.DFS_NAMENODE_MIN_SUPPORTED_DATANODE_VERSION_KEY,
DFSConfigKeys.DFS_NAMENODE_MIN_SUPPORTED_DATANODE_VERSION_DEFAULT);
// Set terse exception whose stack trace won't be logged
this.clientRpcServer.addTerseExceptions(SafeModeException.class,
FileNotFoundException.class,
HadoopIllegalArgumentException.class,
FileAlreadyExistsException.class,
InvalidPathException.class,
ParentNotDirectoryException.class,
UnresolvedLinkException.class,
AlreadyBeingCreatedException.class,
QuotaExceededException.class,
RecoveryInProgressException.class,
AccessControlException.class,
InvalidToken.class,
LeaseExpiredException.class,
NSQuotaExceededException.class,
DSQuotaExceededException.class,
AclException.class,
FSLimitException.PathComponentTooLongException.class,
FSLimitException.MaxDirectoryItemsExceededException.class,
UnresolvedPathException.class);
}
这里的重点在第127行,调用RPC的Builder类的build方法来创建RPC服务。这与之前文档分析的Journalnode启动RPC服务的方法相同。
然后是特殊服务的启动,在namenode的构造方法有以下代码:
这里会调用的state的enterState方法来进入该namenode的角色。state是由initialize方法调用的createHAState方法创建的,这个方法内容如下:
protected HAState createHAState(StartupOption startOpt) {
if (!haEnabled || startOpt == StartupOption.UPGRADE
|| startOpt == StartupOption.UPGRADEONLY) {
return ACTIVE_STATE;
} else {
return STANDBY_STATE;
}
}
从代码中可看出,在高可用的情况先返回的都是STANDBY_STATE状态,这里的STANDBY_STATE的值的定义如下:
public static final HAState STANDBY_STATE = new StandbyState();
所以上文的enterState方法实际调用的是这里的StandbyState类的方法,该方法内容如下:
@Override
public void enterState(HAContext context) throws ServiceFailedException {
try {
context.startStandbyServices();
} catch (IOException e) {
throw new ServiceFailedException("Failed to start standby services", e);
}
}
这里就一句代码,即第4行调用的context的startStandbyServices方法,这个方法的内容如下:
@Override
public void startStandbyServices() throws IOException {
try {
namesystem.startStandbyServices(conf);
} catch (Throwable t) {
doImmediateShutdown(t);
}
}
这里是继续掉用startStandbyServices方法,这个方法内容如下:
void startStandbyServices(final Configuration conf) throws IOException {
LOG.info("Starting services required for standby state");
if (!getFSImage().editLog.isOpenForRead()) {
// During startup, we're already open for read.
getFSImage().editLog.initSharedJournalsForRead();
}
blockManager.setPostponeBlocksFromFuture(true);
// Disable quota checks while in standby.
dir.disableQuotaChecks();
editLogTailer = new EditLogTailer(this, conf);
editLogTailer.start();
if (standbyShouldCheckpoint) {
standbyCheckpointer = new StandbyCheckpointer(conf, this);
standbyCheckpointer.start();
}
}
这里的重点是第12行启动了EditLogTailer服务,这个服务主要负责从journalnode中读取editlog日志并执行(目的:及时同步active namenode的数据)。然后是第15行启动的standbyCheckpointer服务,这个服务负责在指定时间执行checkpointer方法,将此时的namenode的状态写入到FSImage文件中。