How Failover and High Availability Work in Hadoop and YARN

High-availability parameters

| Configuration Properties | Description |
| --- | --- |
| hadoop.zk.address | Address of the ZK-quorum. Used both for the state-store and embedded leader-election. |
| yarn.resourcemanager.ha.enabled | Enable RM HA. |
| yarn.resourcemanager.ha.rm-ids | List of logical IDs for the RMs. e.g., “rm1,rm2”. |
| yarn.resourcemanager.hostname.rm-id | For each rm-id, specify the hostname the RM corresponds to. Alternately, one could set each of the RM’s service addresses. |
| yarn.resourcemanager.address.rm-id | For each rm-id, specify host:port for clients to submit jobs. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id. |
| yarn.resourcemanager.scheduler.address.rm-id | For each rm-id, specify scheduler host:port for ApplicationMasters to obtain resources. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id. |
| yarn.resourcemanager.resource-tracker.address.rm-id | For each rm-id, specify host:port for NodeManagers to connect. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id. |
| yarn.resourcemanager.admin.address.rm-id | For each rm-id, specify host:port for administrative commands. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id. |
| yarn.resourcemanager.webapp.address.rm-id | For each rm-id, specify host:port of the RM web application corresponds to. You do not need this if you set yarn.http.policy to HTTPS_ONLY. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id. |
| yarn.resourcemanager.webapp.https.address.rm-id | For each rm-id, specify host:port of the RM https web application corresponds to. You do not need this if you set yarn.http.policy to HTTP_ONLY. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id. |
| yarn.resourcemanager.ha.id | Identifies the RM in the ensemble. This is optional; however, if set, admins have to ensure that all the RMs have their own IDs in the config. |
| yarn.resourcemanager.ha.automatic-failover.enabled | Enable automatic failover; By default, it is enabled only when HA is enabled. |
| yarn.resourcemanager.ha.automatic-failover.embedded | Use embedded leader-elector to pick the Active RM, when automatic failover is enabled. By default, it is enabled only when HA is enabled. |
| yarn.resourcemanager.cluster-id | Identifies the cluster. Used by the elector to ensure an RM doesn’t take over as Active for another cluster. |
| yarn.client.failover-proxy-provider | The class to be used by Clients, AMs and NMs to failover to the Active RM. |
| yarn.client.failover-no-ha-proxy-provider | The class to be used by Clients, AMs and NMs to failover to the Active RM, when not running in HA mode |
| yarn.client.failover-max-attempts | The max number of times FailoverProxyProvider should attempt failover. |
| yarn.client.failover-sleep-base-ms | The sleep base (in milliseconds) to be used for calculating the exponential delay between failovers. |
| yarn.client.failover-sleep-max-ms | The maximum sleep time (in milliseconds) between failovers. |
| yarn.client.failover-retries | The number of retries per attempt to connect to a ResourceManager. |
| yarn.client.failover-retries-on-socket-timeouts | The number of retries per attempt to connect to a ResourceManager on socket timeouts. |
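
To make the rm-id suffix convention concrete, here is a minimal sketch in plain Java that uses a Map in place of Hadoop's Configuration class. The hostnames, the cluster ID, the port 18032, and the helpers exampleConf, resolveAddress, and resolveAll are illustrative assumptions and not part of Hadoop; 8032 is the default port of yarn.resourcemanager.address. The sketch only shows the precedence rule from the table: an explicit yarn.resourcemanager.address.rm-id wins over the hostname.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RmHaConfigSketch {
    // Default port of yarn.resourcemanager.address.
    static final int DEFAULT_RM_PORT = 8032;

    /** Builds an illustrative HA configuration; every value here is made up. */
    static Map<String, String> exampleConf() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("yarn.resourcemanager.ha.enabled", "true");
        conf.put("yarn.resourcemanager.cluster-id", "cluster1");
        conf.put("yarn.resourcemanager.ha.rm-ids", "rm1,rm2");
        conf.put("yarn.resourcemanager.hostname.rm1", "master1.example.com");
        conf.put("yarn.resourcemanager.hostname.rm2", "master2.example.com");
        // Explicit service address for rm2 only; it takes precedence over the hostname.
        conf.put("yarn.resourcemanager.address.rm2", "master2.example.com:18032");
        return conf;
    }

    /** Hypothetical helper: resolves the client-facing address for one rm-id. */
    static String resolveAddress(Map<String, String> conf, String rmId) {
        String explicit = conf.get("yarn.resourcemanager.address." + rmId);
        if (explicit != null) {
            return explicit;
        }
        String host = conf.get("yarn.resourcemanager.hostname." + rmId);
        if (host == null) {
            throw new IllegalArgumentException("No address or hostname for " + rmId);
        }
        return host + ":" + DEFAULT_RM_PORT;
    }

    /** Resolves the client-facing address of every RM listed in ha.rm-ids. */
    static List<String> resolveAll(Map<String, String> conf) {
        List<String> addresses = new ArrayList<>();
        for (String rmId : conf.get("yarn.resourcemanager.ha.rm-ids").split(",")) {
            addresses.add(resolveAddress(conf, rmId.trim()));
        }
        return addresses;
    }

    public static void main(String[] args) {
        System.out.println(resolveAll(exampleConf()));
    }
}
```

In a real deployment these keys would normally live in yarn-site.xml; the Map is used here only so that the example runs without Hadoop on the classpath.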

How failover works

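Calls to the YARN and Hadoop client APIs usually go through a proxy class. Whenever a method call on the proxy fails, the proxy either retries it or fails over to another node. These proxies are created through RetryProxy#create; for example, RMProxy, ServerProxy, and NameNodeProxies all use it to create their client proxies.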
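
The proxy-layer idea can be sketched with the JDK's own java.lang.reflect.Proxy. This is not Hadoop's RetryProxy or RetryInvocationHandler; the ClusterClient interface, the FailoverHandler class, and the create method below are hypothetical stand-ins that only illustrate how an invocation handler can catch a failed call and repeat it against the next target.

```java
import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.List;

final class FailoverProxySketch {
    /** Hypothetical client protocol, standing in for a real RM protocol interface. */
    interface ClusterClient {
        String submit(String job) throws IOException;
    }

    /** Repeats a failed call against the next target, up to maxFailovers switches. */
    static final class FailoverHandler implements InvocationHandler {
        private final List<ClusterClient> targets;
        private final int maxFailovers;
        private int current = 0; // index of the target currently believed to be active

        FailoverHandler(List<ClusterClient> targets, int maxFailovers) {
            this.targets = targets;
            this.maxFailovers = maxFailovers;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            int failovers = 0;
            while (true) {
                try {
                    return method.invoke(targets.get(current), args);
                } catch (InvocationTargetException e) {
                    Throwable cause = e.getCause();
                    // In this sketch only IOExceptions trigger a failover.
                    if (!(cause instanceof IOException) || failovers >= maxFailovers) {
                        throw cause;
                    }
                    failovers++;
                    current = (current + 1) % targets.size(); // switch to the next target
                }
            }
        }
    }

    /** Wraps the targets in a dynamic proxy that hides the failover from the caller. */
    static ClusterClient create(List<ClusterClient> targets, int maxFailovers) {
        return (ClusterClient) Proxy.newProxyInstance(
                ClusterClient.class.getClassLoader(),
                new Class<?>[] {ClusterClient.class},
                new FailoverHandler(targets, maxFailovers));
    }

    public static void main(String[] args) throws IOException {
        ClusterClient standby = job -> { throw new IOException("rm1 is not active"); };
        ClusterClient active = job -> "accepted " + job + " on rm2";
        ClusterClient client = create(List.of(standby, active), 2);
        System.out.println(client.submit("wordcount"));
    }
}
```

The caller only ever sees the ClusterClient interface, which is the point of the pattern: retry and failover logic stays in one handler instead of being repeated at every call site. The sketch is not thread-safe and does not sleep between attempts.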

The steps below walk through the RMProxy used by YarnClient (Hadoop 2 unless noted otherwise):

  1. Create the RMProxy

    RMProxy#createRMProxy

  2. Create the failover proxy provider (FailoverProxyProvider), which switches nodes when a failure occurs. The default is ConfiguredRMFailoverProxyProvider; the class is set through yarn.client.failover-proxy-provider

    RMProxy#createRMFailoverProxyProvider

  3. Create the retry policy for failed calls. It is driven by yarn.resourcemanager.connect.max-wait.ms, yarn.resourcemanager.connect.retry-interval.ms, yarn.client.failover-sleep-base-ms, yarn.client.failover-sleep-max-ms, and yarn.client.failover-max-attempts

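    If yarn.client.failover-max-attempts is not configured, the maximum number of failover attempts is yarn.resourcemanager.connect.max-wait.ms / yarn.client.failover-sleep-base-ms (the sleep base itself falls back to yarn.resourcemanager.connect.retry-interval.ms when it is not set).

    If yarn.client.failover-max-attempts is 0, no failover takes place.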

    RMProxy#createRetryPolicy

  4. Initialize the high-availability configuration from yarn.resourcemanager.ha.rm-ids and yarn.resourcemanager.address.rm-id

    ConfiguredRMFailoverProxyProvider#init

  5. Invoke the method through the proxy layer

    RetryInvocationHandler#invoke

    (Hadoop 3) RetryInvocationHandler.Call#invokeOnce

  6. Decide whether to retry

    RetryPolicy#shouldRetry

  7. Fail over to the other RM when the method call throws an exception

    ConfiguredRMFailoverProxyProvider#performFailover

    (Hadoop 3) RetryInvocationHandler.ProxyDescriptor#failover
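
The arithmetic behind steps 3 and 7 can be sketched in a few lines of plain Java. This is a simplified illustration, not Hadoop's RetryPolicies code: the class and method names are hypothetical, a negative value stands in for "yarn.client.failover-max-attempts is not set", and the backoff omits details such as random jitter that a real policy may add. The worked numbers assume the default values of yarn.resourcemanager.connect.max-wait.ms (900000) and yarn.resourcemanager.connect.retry-interval.ms (30000), with neither failover sleep value set, so both fall back to the retry interval.

```java
final class RetryPolicySketch {
    /**
     * Number of failovers allowed. A negative "configured" value stands for
     * "yarn.client.failover-max-attempts is not set" (an assumption of this sketch).
     */
    static int maxFailoverAttempts(int configured, long connectMaxWaitMs, long failoverSleepMs) {
        if (configured >= 0) {
            return configured; // 0 means: never fail over
        }
        return (int) (connectMaxWaitMs / failoverSleepMs);
    }

    /** Capped exponential delay before the n-th failover (n >= 1), without jitter. */
    static long backoffMs(long sleepBaseMs, long sleepMaxMs, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        // Bound the shift so the doubled value stays small for realistic inputs.
        int shift = Math.min(n - 1, 20);
        return Math.min(sleepBaseMs << shift, sleepMaxMs);
    }

    /** Round-robin choice of the next RM index among the configured rm-ids. */
    static int nextRmIndex(int current, int rmCount) {
        return (current + 1) % rmCount;
    }

    public static void main(String[] args) {
        String[] rmIds = {"rm1", "rm2"};
        // Defaults: 900000 ms max wait, 30000 ms sleep (falls back to the retry interval).
        int attempts = maxFailoverAttempts(-1, 900_000L, 30_000L);
        System.out.println("max failover attempts: " + attempts);

        int current = 0;
        for (int n = 1; n <= 4; n++) {
            current = nextRmIndex(current, rmIds.length);
            System.out.println("failover " + n + " -> " + rmIds[current]
                    + " after " + backoffMs(1_000L, 5_000L, n) + " ms");
        }
    }
}
```

With two RMs the round-robin step simply alternates between rm1 and rm2, which is why a client that keeps failing ends up probing each RM in turn until the attempt budget runs out.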
