[Big Data Study] ZooKeeper

Time in ZooKeeper
ZooKeeper tracks time multiple ways:

  • Zxid
    Every change to the ZooKeeper state receives a stamp in the form of a zxid (ZooKeeper Transaction Id). This exposes the total ordering of all changes to ZooKeeper. Each change will have a unique zxid and if zxid1 is smaller than zxid2 then zxid1 happened before zxid2.
  • Version numbers
    Every change to a node will cause an increase to one of the version numbers of that node. The three version numbers are version (number of changes to the data of a znode), cversion (number of changes to the children of a znode), and aversion (number of changes to the ACL of a znode).
  • Ticks
    When using multi-server ZooKeeper, servers use ticks to define timing of events such as status uploads, session timeouts, connection timeouts between peers, etc. The tick time is only indirectly exposed through the minimum session timeout (2 times the tick time); if a client requests a session timeout less than the minimum session timeout, the server will tell the client that the session timeout is actually the minimum session timeout.
  • Real time
    ZooKeeper doesn’t use real time, or clock time, at all except to put timestamps into the stat structure on znode creation and znode modification.
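The tick rule above can be sketched as a simple clamp. This is only an illustration: the function name and parameters are hypothetical, and the 20-tick upper bound is an assumption based on ZooKeeper's documented default maximum (the text above states only the 2-tick minimum).

```python
def negotiate_session_timeout(requested_ms, tick_time_ms=2000,
                              min_ticks=2, max_ticks=20):
    """Clamp a client's requested session timeout to the allowed range.

    min_ticks=2 follows the text above; max_ticks=20 is an assumed default.
    """
    lower = min_ticks * tick_time_ms
    upper = max_ticks * tick_time_ms
    return max(lower, min(requested_ms, upper))


# A request below the minimum is raised to 2 * tickTime.
print(negotiate_session_timeout(1000))
# A request inside the range is returned unchanged.
print(negotiate_session_timeout(10000))
```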

ZooKeeper Stat Structure
The Stat.java structure for each znode in ZooKeeper is made up of the following fields:

private long czxid;
private long mzxid;
private long ctime;
private long mtime;
private int version;
private int cversion;
private int aversion;
private long ephemeralOwner;
private int dataLength;
private int numChildren;
private long pzxid;

czxid

The zxid of the change that caused this znode to be created.

mzxid

The zxid of the change that last modified this znode.

pzxid

The zxid of the change that last modified children of this znode.

ctime

The time in milliseconds from epoch when this znode was created.

mtime

The time in milliseconds from epoch when this znode was last modified.

version

The number of changes to the data of this znode.

cversion

The number of changes to the children of this znode.

aversion

The number of changes to the ACL of this znode.

ephemeralOwner

The session id of the owner of this znode if the znode is an ephemeral node. If it is not an ephemeral node, it will be zero.

dataLength

The length of the data field of this znode.

numChildren

The number of children of this znode.
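To make the field descriptions concrete, here is a toy model of how the counters move. It is a sketch, not ZooKeeper code: the helper names are hypothetical, ctime and mtime are left out, and the update rules are my reading of the definitions above.

```python
from dataclasses import dataclass


@dataclass
class Stat:
    czxid: int = 0
    mzxid: int = 0
    pzxid: int = 0
    version: int = 0
    cversion: int = 0
    aversion: int = 0
    ephemeral_owner: int = 0  # 0 means "not an ephemeral node"
    data_length: int = 0
    num_children: int = 0


def on_create(zxid, data, session_id=0):
    """Stat for a freshly created znode; session_id is non-zero for ephemerals."""
    return Stat(czxid=zxid, mzxid=zxid, pzxid=zxid,
                ephemeral_owner=session_id, data_length=len(data))


def on_set_data(stat, zxid, data):
    """A data change bumps version and records the modifying zxid."""
    stat.mzxid = zxid
    stat.version += 1
    stat.data_length = len(data)


def on_child_added(stat, zxid):
    """A change to the children bumps cversion and pzxid, but not mzxid."""
    stat.pzxid = zxid
    stat.cversion += 1
    stat.num_children += 1


parent = on_create(zxid=1, data=b"")
on_child_added(parent, zxid=2)
on_set_data(parent, zxid=3, data=b"abc")
print(parent)
```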

Semantics of Watches
We can set watches with the three calls that read the state of ZooKeeper: exists, getData, and getChildren. The following list details the events that a watch can trigger and the calls that enable them:

Created event:

Enabled with a call to exists.

Deleted event:

Enabled with a call to exists, getData, and getChildren.

Changed event:

Enabled with a call to exists and getData.

Child event:

Enabled with a call to getChildren.
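The list above is a small lookup table, and inverting it answers "which events can a watch set by this call deliver?". A minimal sketch, using the event labels from the list rather than the client library's constant names:

```python
# Event -> read calls that can set a watch for it (taken from the list above).
WATCH_EVENTS = {
    "Created": {"exists"},
    "Deleted": {"exists", "getData", "getChildren"},
    "Changed": {"exists", "getData"},
    "Child": {"getChildren"},
}


def events_enabled_by(call):
    """Return the events a watch set through `call` can trigger, sorted by name."""
    return sorted(event for event, calls in WATCH_EVENTS.items() if call in calls)


for call in ("exists", "getData", "getChildren"):
    print(call, events_enabled_by(call))
```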

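What is the split-brain problem?
Split-brain usually occurs when nodes in a cluster cannot reach one another: the cluster splits into smaller groups, each group elects its own master, and the original cluster ends up with several masters at once.

Does ZK have a split-brain problem?
No. Start with ZooKeeper's election rule: electing a leader requires (available nodes) > (total nodes) / 2. Note that this is >, not ≥. Put another way, you become president only if more than half of the voters vote for you. Since at most one partition can hold more than half of the nodes, at most one partition can elect a leader.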
https://blog.csdn.net/u010476994/article/details/79806041
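The majority rule can be checked with a few lines of arithmetic. This is a sketch of the rule as stated above, with hypothetical function names; it does not model the election protocol itself.

```python
def has_quorum(reachable, ensemble_size):
    """Strict majority: more than half of the ensemble, not exactly half."""
    return reachable > ensemble_size / 2


def partitions_that_can_elect(partition_sizes):
    """Given the sizes of the network partitions, return those holding a quorum."""
    total = sum(partition_sizes)
    return [size for size in partition_sizes if has_quorum(size, total)]


# A 5-node ensemble split 3/2: only the 3-node side can elect a leader.
print(partitions_that_can_elect([3, 2]))
# A 4-node ensemble split 2/2: neither side can, so no second leader appears.
print(partitions_that_can_elect([2, 2]))
```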

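Why does a ZK ensemble usually have an odd number of nodes?

  1. To keep a split-brain partition from making the cluster unavailable: with an even number of nodes, a split into two equal halves leaves neither half with a majority.
  2. For the same fault tolerance, an odd number of nodes uses fewer resources: 3 nodes and 4 nodes both tolerate exactly one failure.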
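The resource argument follows directly from the strict-majority rule: an ensemble of n nodes stays available as long as more than n/2 nodes are up. A small sketch (the function name is hypothetical):

```python
def tolerated_failures(ensemble_size):
    """Largest number of nodes that can fail while a strict majority survives."""
    return (ensemble_size - 1) // 2


# Adding a node to an odd-sized ensemble does not buy any extra tolerance.
for n in range(3, 8):
    print(n, "nodes tolerate", tolerated_failures(n), "failure(s)")
```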

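How does ZK guarantee the ordering of concurrent writes?
All write requests are forwarded to the leader. Followers talk to the leader over a dedicated port, which is specified in the cluster configuration. In zoo.cfg:
# Port that listens for client connections
clientPort=4180
server.A=B:C:D

	A is a number identifying the server (its server id);
	B is the server's IP address;
	C is the port this server uses to exchange information with the ensemble's Leader; in other words, it is the port through which the leader receives forwarded write requests.
	D is the port the servers use to communicate with one another during leader election.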

server.0=127.0.0.1:8880:7770
server.1=127.0.0.1:8881:7771
server.2=127.0.0.1:8882:7772
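To show how the A, B, C and D fields line up, here is a minimal parser for the three-field form used above. It is an illustrative sketch with hypothetical key names; real zoo.cfg entries may carry extra suffixes that it does not handle.

```python
def parse_server_line(line):
    """Split 'server.A=B:C:D' into its id, host, quorum port and election port."""
    key, value = line.strip().split("=", 1)
    server_id = int(key.split(".", 1)[1])               # A
    host, quorum_port, election_port = value.split(":")  # B, C, D
    return {
        "id": server_id,
        "host": host,
        "quorum_port": int(quorum_port),      # C: follower <-> leader traffic
        "election_port": int(election_port),  # D: leader election traffic
    }


print(parse_server_line("server.1=127.0.0.1:8881:7771"))
```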

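Once all requests have been forwarded to the leader, the leader maintains, in a synchronized way, a single monotonically increasing global write transaction id, the zxid (a znode's stat records it as mzxid when the znode is modified). For example:
[Figure: znode data]

Every write request is assigned a globally unique zxid. The leader then forwards the write to the followers. Before applying it, a follower compares the request's zxid with its current global id and performs the write only if (request zxid) - (current global id) = 1.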
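The follower-side check described above can be modelled in a few lines. This is a toy sketch of the rule exactly as the paragraph states it, not of ZooKeeper's actual broadcast protocol; the class and method names are hypothetical.

```python
class Follower:
    """Applies writes only in strict zxid order (difference of exactly 1)."""

    def __init__(self, last_zxid=0):
        self.last_zxid = last_zxid
        self.log = []

    def apply(self, zxid, op):
        # Reject anything that is not the immediate successor of the last write.
        if zxid - self.last_zxid != 1:
            return False
        self.log.append((zxid, op))
        self.last_zxid = zxid
        return True


f = Follower()
print(f.apply(1, "create /a"))  # accepted: 1 - 0 == 1
print(f.apply(3, "set /a"))     # rejected: zxid 2 has not been applied yet
print(f.apply(2, "create /b"))  # accepted
```

In this model an out-of-order write is simply refused, so every follower that applies the writes ends up with the same sequence the leader assigned.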

Performance
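Is ZooKeeper high-performance? Research by the ZooKeeper development team at Yahoo indicates that it is.
It performs best when reads far outnumber writes, because a write involves synchronizing state across all nodes and is comparatively slow. Reads far outnumbering writes is the typical case for a coordination service.
According to the figure below from the official site, peak write throughput is reached with 3 nodes in the ensemble, at roughly 24,000 requests per second. The more nodes in a ZK ensemble, the lower the overall write performance, because followers have to sync from the leader; in general, the more nodes, the greater the read capacity.
[Figure: ZooKeeper Throughput as the Read-Write Ratio Varies]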

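Reading the chart, grade-school style:

  1. With 3 ZK nodes, peak write throughput is about 24,000 requests per second.
  2. With 3 ZK nodes, peak read throughput is about 90,000 requests per second.
  3. In general, the more nodes, the slower each write and the fewer writes completed per second.
  4. In general, the more nodes, the greater the read capacity; ZK's high performance shows mostly on reads.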

Reference
https://zookeeper.apache.org/
https://zookeeper.apache.org/doc/current/zookeeperOver.html#fg_zkPerfRW
https://zookeeper.apache.org/doc/r3.4.9/zookeeperProgrammers.html#ch_zkGuarantees
https://blog.csdn.net/maozhr720/article/details/76737499
https://blog.csdn.net/cadem/article/details/80359270
https://blog.csdn.net/donggua6/article/details/39940397
