[ZooKeeper] Failure Handling

Why understand ZooKeeper's failure handling?

Life would be so much easier if failures never happened. Of course, without failures, much of the need for ZooKeeper would also go away. To effectively use ZooKeeper it is important to understand the kinds of failures that happen and how to handle them.

The takeaway: to use ZooKeeper effectively, you need to know how failures arise and how to handle them.

ZooKeeper exposes two classes of failures (recoverable and unrecoverable)

ZooKeeper exposes two classes of failures: recoverable and unrecoverable.

Recoverable failures are transient and should be considered relatively normal—things happen. Brief network hiccups and server failures can cause these kinds of failures. Developers should write their code so that their applications keep running in spite of these failures.
Unrecoverable failures are much more problematic. These kinds of failures cause the ZooKeeper handle to become inoperable. The easiest and most common way to deal with this kind of failure is to exit the application. Examples of causes of this class of failure are session timeouts, network outages for longer than the session timeout, and authentication failures.

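In short, ZooKeeper failures fall into two classes: recoverable and unrecoverable.
Recoverable failures are short-lived and fairly normal. Brief network glitches and server failures can cause them, so application code should be written to keep running through them.
Unrecoverable failures are harder to deal with. They leave the ZooKeeper handle unusable, and the simplest, most common response is to exit the application. Typical causes are session timeouts, network outages that outlast the session timeout, and authentication failures.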
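The split between the two classes can be sketched as a small retry helper. This is only an illustration: `RecoverableFailure`, `UnrecoverableFailure`, and `call_with_retry` are hypothetical names made up for this example, not part of any ZooKeeper client library, and a real client would raise its own exception types.

```python
import time


# Hypothetical stand-ins for the two failure classes described above.
class RecoverableFailure(Exception):
    """Transient problem, for example a lost connection."""


class UnrecoverableFailure(Exception):
    """Session expired or authentication failed; the handle is unusable."""


def call_with_retry(operation, max_attempts=5, delay_seconds=0.0):
    """Run operation, retrying transient failures and propagating fatal ones."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RecoverableFailure:
            # Give up only after the last allowed attempt.
            if attempt == max_attempts:
                raise
            time.sleep(delay_seconds)
        # UnrecoverableFailure is deliberately not caught: the caller is
        # expected to shut down (or rebuild its handle) when it sees one.


# An operation that fails twice with a transient error, then succeeds.
attempts = []


def flaky_read():
    attempts.append(1)
    if len(attempts) < 3:
        raise RecoverableFailure("connection loss")
    return "data"


result = call_with_retry(flaky_read)


def expired_read():
    raise UnrecoverableFailure("session expired")


try:
    call_with_retry(expired_read)
    fatal_propagated = False
except UnrecoverableFailure:
    fatal_propagated = True
```

Blind retries are only safe for operations that can be repeated without harm. After a connection loss the client cannot tell whether the request was applied, so a non-idempotent operation should first check whether it already took effect.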

Recoverable failures

A typical cause of Disconnected events and ConnectionLossExceptions is a ZooKeeper server failure.


Figure 5-5 illustrates the corner case that causes us to miss the creation event of a watched znode. In this example, the client is watching for the creation of /event. However, just as /event is created by another client, the watching client loses its connection to ZooKeeper. During this time the other client deletes /event, so when the watching client reconnects to ZooKeeper and reregisters its watch, the ZooKeeper server no longer has the /event znode. Thus, when it processes the registered watches and sees the watch for /event, and sees that there is no node called /event, it simply reregisters the watch, causing the client to miss the creation event for /event. Because of this corner case, you should try to avoid watching for the creation event of a znode. If you do watch for a creation event it should be for a long-lived znode; otherwise, this corner case can bite you.
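The corner case can be reproduced with a toy model. The sketch below is not the real ZooKeeper protocol: `TinyServer` and `reregister_exists_watch` are hypothetical names, and the only rule modeled is the one described above, namely that on reconnect the server fires the watch if the znode exists and silently reregisters it otherwise.

```python
class TinyServer:
    """Toy model of znodes and existence watches (not real ZooKeeper)."""

    def __init__(self):
        self.znodes = set()

    def create(self, path):
        self.znodes.add(path)

    def delete(self, path):
        self.znodes.discard(path)

    def reregister_exists_watch(self, path):
        """Called when a client reconnects and resubmits its watch.

        Returns the event delivered to the client, or None if the watch
        is simply registered again because the znode does not exist.
        """
        return "NodeCreated" if path in self.znodes else None


# Short-lived znode: created and deleted while the watcher is disconnected.
server = TinyServer()
server.create("/event")   # happens just as the watcher loses its connection
server.delete("/event")   # still disconnected
missed = server.reregister_exists_watch("/event")   # watcher reconnects

# Long-lived znode: it still exists when the watcher comes back.
server = TinyServer()
server.create("/event")
seen = server.reregister_exists_watch("/event")

print(missed, seen)
```

In the first run the watcher never learns that /event existed; in the second it gets the creation event late but does get it, which is why the text recommends watching only long-lived znodes for creation.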

Unrecoverable failures

1. At t1, c1 becomes unresponsive due to overload and stops communicating with ZooKeeper. It has queued up changes to the external resource but has not yet received the CPU cycles to send them.
2. At t2, ZooKeeper declares c1’s session with ZooKeeper dead. At this time it also deletes all ephemeral nodes associated with c1’s sessions, including the ephemeral node that it created to become the master.
3. At t3, c2 becomes the master.
4. At t4, c2 changes the state of the external resource.
5. At t5, c1’s overload subsides and it sends its queued changes to the external resource.
6. At t6, c1 is able to reconnect to ZooKeeper, finds out that its session has expired, and relinquishes mastership. Unfortunately, the damage has been done: at time t5, changes were made to the external resource, resulting in corruption.
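The six steps above can be replayed against a toy resource to see where the damage happens. Everything here is an assumption made for illustration: `UnfencedResource`, the `/master` path, and the string payloads are invented, and the dictionary merely stands in for ZooKeeper's ephemeral znode.

```python
class UnfencedResource:
    """External resource that accepts every write, whoever sends it."""

    def __init__(self):
        self.history = []

    def write(self, client, value):
        self.history.append((client, value))


# c1 holds mastership through an ephemeral znode (modeled as a dict entry).
ephemeral_owner = {"/master": "c1"}
resource = UnfencedResource()

# t1: c1 is overloaded; its changes are queued but not yet sent.
queued_by_c1 = ["c1: stale change"]

# t2: c1's session expires and its ephemeral znode is deleted.
del ephemeral_owner["/master"]

# t3: c2 creates the znode and becomes the master.
ephemeral_owner["/master"] = "c2"

# t4: c2 changes the external resource.
resource.write("c2", "c2: new change")

# t5: c1 wakes up and flushes its queue, unaware it is no longer master.
for change in queued_by_c1:
    resource.write("c1", change)

# t6: c1 learns its session expired, but the stale write already landed.
last_writer = resource.history[-1][0]
print(ephemeral_owner["/master"], last_writer)
```

The resource ends up with c1's write on top of c2's even though c2 is the master, because nothing on the resource side can tell a current master from a stale one.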

Figure 5-7 shows how this technique solves the scenario of Figure 5-6. When c1 becomes the leader at time t1, the creation zxid of the /leader znode is 3 (in reality, the zxid would be a much larger number). It supplies the creation zxid as the fencing token to connect with the database. Later, when c1 becomes unresponsive due to overload, ZooKeeper declares c1 as failed and c2 becomes the new leader at time t2. c2 uses 4 as its fencing token because the /leader znode it created has a creation zxid of 4. At time t3, c2 starts making requests to the database using its fencing token. Now when c1’s request arrives at the database at time t4, it is rejected because its fencing token (3) is lower than the highest-seen fencing token (4), thus avoiding corruption.
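The check the database performs can be sketched in a few lines. This is a minimal model under stated assumptions: `FencedResource` and its `write` method are hypothetical, the tokens 3 and 4 are the creation zxids from the example above, and a real resource would also have to persist the highest token it has seen.

```python
class FencedResource:
    """Toy external resource that rejects requests carrying a stale token."""

    def __init__(self):
        self.highest_token = -1   # highest fencing token seen so far
        self.state = None

    def write(self, token, value):
        # A token lower than the highest one seen belongs to an old leader.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.state = value
        return True


db = FencedResource()

# t1: c1 is leader; the creation zxid of its /leader znode is 3.
c1_first = db.write(3, "c1: initial request")

# t2-t3: c2 takes over with creation zxid 4 and starts making requests.
c2_ok = db.write(4, "c2: change")

# t4: c1's delayed request arrives with the old token and is rejected.
c1_stale = db.write(3, "c1: queued change")

print(c1_first, c2_ok, c1_stale, db.state)
```

The creation zxid works as a token because each new /leader znode is created in a later transaction than the previous one, so a newer leader always presents a larger number.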


