/**
 * Server-side implementation for processing state alignment info in
 * requests.
 * For Observer it compares the client and the server states and determines
 * if it makes sense to wait until the server catches up with the client
 * state. If not the server throws RetriableException so that the client
 * could retry the call according to the retry policy with another Observer
 * or the Active NameNode.
 *
 * @param header The RPC request header.
 * @param clientWaitTime time in milliseconds indicating how long client
 *    waits for the server response. It is used to verify if the client's
 *    state is too far ahead of the server's
 * @return the minimum of the state ids of the client or the server.
 * @throws RetriableException if Observer is too far behind.
 */
@Override
public long receiveRequestState(RpcRequestHeaderProto header,
    long clientWaitTime) throws IOException {
  if (!header.hasStateId() &&
      HAServiceState.OBSERVER.equals(namesystem.getState())) {
    // This could happen if client configured with non-observer proxy provider
    // (e.g., ConfiguredFailoverProxyProvider) is accessing a cluster with
    // observers. In this case, we should let the client failover to the
    // active node, rather than potentially serving stale result (client
    // stateId is 0 if not set).
    throw new StandbyException("Observer Node received request without "
        + "stateId. This mostly likely is because client is not configured "
        + "with " + ObserverReadProxyProvider.class.getSimpleName());
  }
  long serverStateId = getLastSeenStateId();
  long clientStateId = header.getStateId();
  FSNamesystem.LOG.trace("Client State ID= {} and Server State ID= {}",
      clientStateId, serverStateId);
  if (clientStateId > serverStateId &&
      HAServiceState.ACTIVE.equals(namesystem.getState())) {
    FSNamesystem.LOG.warn("The client stateId: {} is greater than "
        + "the server stateId: {} This is unexpected. "
        + "Resetting client stateId to server stateId",
        clientStateId, serverStateId);
    return serverStateId;
  }
  if (HAServiceState.OBSERVER.equals(namesystem.getState()) &&
      clientStateId - serverStateId >
      ESTIMATED_TRANSACTIONS_PER_SECOND
          * TimeUnit.MILLISECONDS.toSeconds(clientWaitTime)
          * ESTIMATED_SERVER_TIME_MULTIPLIER) {
    throw new RetriableException(
        "Observer Node is too far behind: serverStateId = "
            + serverStateId + " clientStateId = " + clientStateId);
  }
  return clientStateId;
}
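This method is called from org.apache.hadoop.ipc.Server.Connection#processRpcRequest. It runs on the server side, before the RPC request is handed over for execution: it reads the client's stateId from the request header, decides which stateId is valid, and the returned id is attached to the request, which is then passed on to the Router or the NameNode for processing.
Before returning a stateId, the method checks three conditions: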
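- When the header carries no stateId and the server handling the request is an Observer, the method throws StandbyException, reporting that the Observer node received a request without a stateId. The likely cause is that the client is not configured with ObserverReadProxyProvider.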
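- The method reads the latest stateId from the NameNode and the client's stateId from the header. If the client's stateId is greater than the server's and the server is in the ACTIVE state, the method returns immediately, and the value returned is the server's stateId. The code logs this case as unexpected, but the Active node holds the latest state, so it can still serve the query. The condition is:
clientStateId > serverStateId && HAServiceState.ACTIVE.equals(namesystem.getState())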
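- With the same two stateIds in hand, if the server is in the OBSERVER state, the client's stateId is greater than the server's, and the gap in transaction ids exceeds 10000 * 20 * 0.8 = 160000 (using the default 20-second wait time), the Observer's metadata is far from having caught up with the Active NameNode. The method throws a retriable exception right away: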
throw new RetriableException(
"Observer Node is too far behind: serverStateId = "
+ serverStateId + " clientStateId = " + clientStateId);
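The three checks above can be sketched as a standalone decision function. This is only an illustration of the branch order, not the Hadoop implementation: the class `StateAlignmentSketch`, the enums `Role` and `Outcome`, and the `decide` method are hypothetical names, and the exceptions are replaced by enum values so the example runs without Hadoop on the classpath.

```java
public class StateAlignmentSketch {
    // Hypothetical stand-in for HAServiceState (only the two states used here).
    enum Role { ACTIVE, OBSERVER }

    // Hypothetical outcomes standing in for the exceptions and return values.
    enum Outcome {
        STANDBY_EXCEPTION,    // Observer received a request without a stateId
        RETURN_SERVER_ID,     // Active is behind the client: reset to server id
        RETRIABLE_EXCEPTION,  // Observer is too far behind the client
        RETURN_CLIENT_ID      // normal case
    }

    static Outcome decide(boolean hasStateId, long clientStateId,
                          long serverStateId, Role role, long maxLag) {
        // Check 1: no stateId in the header and the server is an Observer.
        if (!hasStateId && role == Role.OBSERVER) {
            return Outcome.STANDBY_EXCEPTION;
        }
        // Check 2: client is ahead of an Active server.
        if (clientStateId > serverStateId && role == Role.ACTIVE) {
            return Outcome.RETURN_SERVER_ID;
        }
        // Check 3: Observer lags the client by more than the allowed gap.
        if (role == Role.OBSERVER && clientStateId - serverStateId > maxLag) {
            return Outcome.RETRIABLE_EXCEPTION;
        }
        return Outcome.RETURN_CLIENT_ID;
    }

    public static void main(String[] args) {
        long maxLag = 160_000L; // assumed threshold, matching the text above
        System.out.println(decide(false, 0, 500, Role.OBSERVER, maxLag));
        System.out.println(decide(true, 900, 500, Role.ACTIVE, maxLag));
        System.out.println(decide(true, 200_000, 500, Role.OBSERVER, maxLag));
        System.out.println(decide(true, 600, 500, Role.OBSERVER, maxLag));
    }
}
```

Note that the order matters: a request without a stateId is rejected on an Observer before any comparison, because an unset stateId reads as 0 and would otherwise look like a client that is never ahead.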
The formula 10000 * 20 * 0.8 = 160000 comes from the following code:
/**
 * Estimated number of journal transactions a typical NameNode can execute
 * per second. The number is used to estimate how long a client's
 * RPC request will wait in the call queue before the Observer catches up
 * with its state id.
 */
private static final long ESTIMATED_TRANSACTIONS_PER_SECOND = 10000L;
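This constant is the estimated number of journal transactions a typical NameNode can execute per second. It is used to estimate how long a client's RPC request will wait in the call queue before the Observer catches up with the client's state id.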
/**
 * The client wait time on an RPC request is composed of
 * the server execution time plus the communication time.
 * This is an expected fraction of the total wait time spent on
 * server execution.
 */
private static final float ESTIMATED_SERVER_TIME_MULTIPLIER = 0.8f;
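For an RPC request, about 80% of the client's wait time is expected to be spent on server execution; the rest is communication time.
The remaining factor, 20 seconds, is the clientWaitTime. It appears to come from the server's maxIdleTime, which is twice the configured connection idle time (10 s by default):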
this.maxIdleTime = 2 * conf.getInt(
    CommonConfigurationKeysPublic.IPC_CLIENT_CONNECTION_MAXIDLETIME_KEY,
    CommonConfigurationKeysPublic.IPC_CLIENT_CONNECTION_MAXIDLETIME_DEFAULT);
/** Default value for IPC_CLIENT_CONNECTION_MAXIDLETIME_KEY */
public static final int IPC_CLIENT_CONNECTION_MAXIDLETIME_DEFAULT = 10000; // 10s
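To make the arithmetic concrete, the threshold can be recomputed outside Hadoop. The sketch below copies the two constant values from the snippets above and assumes that the doubled default idle time (2 * 10000 ms) is what reaches the method as clientWaitTime. The class `ObserverLagThreshold` and the method `maxLag` are hypothetical names used only for this illustration.

```java
import java.util.concurrent.TimeUnit;

public class ObserverLagThreshold {
    // Values copied from the constants quoted above.
    static final long ESTIMATED_TRANSACTIONS_PER_SECOND = 10000L;
    static final float ESTIMATED_SERVER_TIME_MULTIPLIER = 0.8f;

    // Same expression as in receiveRequestState: long * long * float,
    // so the result is a float.
    static float maxLag(long clientWaitTimeMs) {
        return ESTIMATED_TRANSACTIONS_PER_SECOND
            * TimeUnit.MILLISECONDS.toSeconds(clientWaitTimeMs)
            * ESTIMATED_SERVER_TIME_MULTIPLIER;
    }

    public static void main(String[] args) {
        int maxIdleTimeDefaultMs = 10000;                   // 10 s default
        long clientWaitTimeMs = 2L * maxIdleTimeDefaultMs;  // assumed: 20 s
        System.out.println(maxLag(clientWaitTimeMs));       // prints 160000.0
    }
}
```

Because `TimeUnit.MILLISECONDS.toSeconds` truncates toward zero, a wait time below 1000 ms yields a threshold of 0, in which case any Observer that is behind the client at all would trigger the retry.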