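Import ZooKeeper into IDEA, then configure the startup parameters as shown in the figure below.
Three Applications are configured here: zk1, zk2, and zk3.
QuorumPeerMain is ZooKeeper's startup class. Its Javadoc tells us the following: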
/**
*
* <h2>Configuration file</h2>
*
* When the main() method of this class is used to start the program, the first
* argument is used as a path to the config file, which will be used to obtain
* configuration information. This file is a Properties file, so keys and
* values are separated by equals (=) and the key/value pairs are separated
* by new lines. The following is a general summary of keys used in the
* configuration file. For full details on this see the documentation in
* docs/index.html
* <ol>
* <li>dataDir - The directory where the ZooKeeper data is stored.</li>
* <li>dataLogDir - The directory where the ZooKeeper transaction log is stored.</li>
* <li>clientPort - The port used to communicate with clients.</li>
* <li>tickTime - The duration of a tick in milliseconds. This is the basic
* unit of time in ZooKeeper.</li>
* <li>initLimit - The maximum number of ticks that a follower will wait to
* initially synchronize with a leader.</li>
* <li>syncLimit - The maximum number of ticks that a follower will wait for a
* message (including heartbeats) from the leader.</li>
* <li>server.<i>id</i> - This is the host:port[:port] that the server with the
* given id will use for the quorum protocol.</li>
* </ol>
* In addition to the config file. There is a file in the data directory called
* "myid" that contains the server id as an ASCII decimal value.
*
*/
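The main method is the entry point, and its argument is the path to the config file. The Javadoc mainly describes what the basic settings in the config file do. One of the three config files is shown below.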
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=E:\\IdeaProjects\\zookeeper\\zk1\\data
dataLogDir=E:\\IdeaProjects\\zookeeper\\zk1\\log
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=localhost:2888:3888
server.2=localhost:2899:3899
server.3=localhost:2877:3877
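Because the config file is an ordinary Java Properties file, its shape can be checked with `java.util.Properties` alone. The sketch below is not ZooKeeper's own `QuorumPeerConfig` parser; the class `ConfigSketch` and its methods are names I made up for illustration. It loads the text, collects the `server.<id>` entries, and splits each value into host and the two ports.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of ZooKeeper.
class ConfigSketch {

    // Loads Properties-formatted text, the same format the config file uses.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Collects every "server.<id>" key into a sorted map of id -> "host:port:port".
    static Map<Long, String> servers(Properties props) {
        Map<Long, String> result = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith("server.")) {
                long id = Long.parseLong(key.substring("server.".length()));
                result.put(id, props.getProperty(key).trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
                "tickTime=2000",
                "clientPort=2181",
                "server.1=localhost:2888:3888",
                "server.2=localhost:2899:3899",
                "server.3=localhost:2877:3877");
        Properties props = load(text);
        System.out.println("tickTime = " + props.getProperty("tickTime"));
        for (Map.Entry<Long, String> e : servers(props).entrySet()) {
            String[] parts = e.getValue().split(":");
            System.out.println("server " + e.getKey() + ": host=" + parts[0]
                    + " port1=" + parts[1] + " port2=" + parts[2]);
        }
    }
}
```

One practical consequence of the Properties format: a backslash is an escape character, so a Windows path in dataDir needs doubled backslashes or forward slashes.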
Click the debug button to start stepping through ZooKeeper's startup.
public static void main(String[] args) {
    QuorumPeerMain main = new QuorumPeerMain();
    try {
        main.initializeAndRun(args);
    } catch (IllegalArgumentException e) {
        LOG.error("Invalid arguments, exiting abnormally", e);
        LOG.info(USAGE);
        System.err.println(USAGE);
        System.exit(2);
    } catch (ConfigException e) {
        LOG.error("Invalid config, exiting abnormally", e);
        System.err.println("Invalid config, exiting abnormally");
        System.exit(2);
    } catch (Exception e) {
        LOG.error("Unexpected exception, exiting abnormally", e);
        System.exit(1);
    }
    LOG.info("Exiting normally");
    System.exit(0);
}
The main function first creates a QuorumPeerMain object and then calls main.initializeAndRun(args).
protected void initializeAndRun(String[] args)
    throws ConfigException, IOException
{
    QuorumPeerConfig config = new QuorumPeerConfig();
    if (args.length == 1) {
        config.parse(args[0]);
    }
    // Start and schedule the the purge task
    DatadirCleanupManager purgeMgr = new DatadirCleanupManager(config
            .getDataDir(), config.getDataLogDir(), config
            .getSnapRetainCount(), config.getPurgeInterval());
    purgeMgr.start();
    if (args.length == 1 && config.servers.size() > 0) {
        runFromConfig(config);
    } else {
        LOG.warn("Either no config or no quorum defined in config, running "
                + " in standalone mode");
        // there is only server in the quorum -- run as standalone
        ZooKeeperServerMain.main(args);
    }
}
The code is explained below.
QuorumPeerConfig config = new QuorumPeerConfig();
if (args.length == 1) {
    config.parse(args[0]);
}
This creates a QuorumPeerConfig object, which holds the parameter values parsed from the config file.
// Start and schedule the the purge task
DatadirCleanupManager purgeMgr = new DatadirCleanupManager(config
        .getDataDir(), config.getDataLogDir(), config
        .getSnapRetainCount(), config.getPurgeInterval());
purgeMgr.start();
This creates a DatadirCleanupManager object, whose job is to clean up the data directories, and then calls its start method.
public void start() {
    if (PurgeTaskStatus.STARTED == purgeTaskStatus) {
        LOG.warn("Purge task is already running.");
        return;
    }
    // Don't schedule the purge task with zero or negative purge interval.
    if (purgeInterval <= 0) {
        LOG.info("Purge task is not scheduled.");
        return;
    }
    timer = new Timer("PurgeTask", true);
    TimerTask task = new PurgeTask(dataLogDir, snapDir, snapRetainCount);
    timer.scheduleAtFixedRate(task, 0, TimeUnit.HOURS.toMillis(purgeInterval));
    purgeTaskStatus = PurgeTaskStatus.STARTED;
}
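Looking at start: the autopurge.purgeInterval line is commented out in the config file, so purgeInterval is 0 and start returns early without scheduling any purge task.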
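The guard-then-schedule pattern in start can be tried in isolation with `java.util.Timer`. This is only a sketch under my own assumptions: `PurgeSchedulerSketch` is a made-up class, it takes the interval in milliseconds rather than hours, and the purge work is an arbitrary `Runnable` instead of ZooKeeper's PurgeTask.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for illustration; not ZooKeeper's DatadirCleanupManager.
class PurgeSchedulerSketch {
    private final long intervalMillis;
    private Timer timer;

    PurgeSchedulerSketch(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    // Returns true only if a periodic task was actually scheduled.
    synchronized boolean start(Runnable purge) {
        if (timer != null) {
            return false; // already running
        }
        if (intervalMillis <= 0) {
            return false; // zero or negative interval: never schedule
        }
        timer = new Timer("PurgeTaskSketch", true); // daemon thread, as in the original
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                purge.run();
            }
        }, 0, intervalMillis);
        return true;
    }

    synchronized void shutdown() {
        if (timer != null) {
            timer.cancel();
            timer = null;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Interval 0 mirrors this walkthrough: nothing is scheduled.
        System.out.println(new PurgeSchedulerSketch(0).start(() -> { }));

        // A positive interval schedules the task; the first run has no initial delay.
        CountDownLatch firstRun = new CountDownLatch(1);
        PurgeSchedulerSketch scheduler =
                new PurgeSchedulerSketch(TimeUnit.HOURS.toMillis(1));
        System.out.println(scheduler.start(firstRun::countDown));
        firstRun.await(5, TimeUnit.SECONDS);
        scheduler.shutdown();
    }
}
```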
if (args.length == 1 && config.servers.size() > 0) {
    runFromConfig(config);
} else {
    LOG.warn("Either no config or no quorum defined in config, running "
            + " in standalone mode");
    // there is only server in the quorum -- run as standalone
    ZooKeeperServerMain.main(args);
}
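This decides between standalone mode and cluster mode. The config file defines three server.N entries, so here it is cluster mode, and execution enters runFromConfig.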
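The branch condition is small enough to restate as a pure function. `StartupModeSketch` and its `Mode` enum are names invented here; the method only mirrors the `args.length == 1 && config.servers.size() > 0` test quoted above and does nothing else that ZooKeeper does at this point.

```java
// Hypothetical helper for illustration; it only mirrors the condition in initializeAndRun.
class StartupModeSketch {
    enum Mode { QUORUM, STANDALONE }

    // argCount: number of command-line arguments; serverCount: parsed server.N entries.
    static Mode choose(int argCount, int serverCount) {
        return (argCount == 1 && serverCount > 0) ? Mode.QUORUM : Mode.STANDALONE;
    }

    public static void main(String[] args) {
        System.out.println(choose(1, 3)); // config path given, three server.N entries
        System.out.println(choose(1, 0)); // config path given, no server.N entries
        System.out.println(choose(2, 0)); // more than one argument, no config parsed
    }
}
```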
public void runFromConfig(QuorumPeerConfig config) throws IOException {
    try {
        ManagedUtil.registerLog4jMBeans();
    } catch (JMException e) {
        LOG.warn("Unable to register log4j JMX control", e);
    }
    LOG.info("Starting quorum peer");
    try {
        ServerCnxnFactory cnxnFactory = ServerCnxnFactory.createFactory();
        cnxnFactory.configure(config.getClientPortAddress(),
                config.getMaxClientCnxns());
        quorumPeer = getQuorumPeer();
        quorumPeer.setQuorumPeers(config.getServers());
        quorumPeer.setTxnFactory(new FileTxnSnapLog(
                new File(config.getDataLogDir()),
                new File(config.getDataDir())));
        quorumPeer.setElectionType(config.getElectionAlg());
        quorumPeer.setMyid(config.getServerId());
        quorumPeer.setTickTime(config.getTickTime());
        quorumPeer.setInitLimit(config.getInitLimit());
        quorumPeer.setSyncLimit(config.getSyncLimit());
        quorumPeer.setQuorumListenOnAllIPs(config.getQuorumListenOnAllIPs());
        quorumPeer.setCnxnFactory(cnxnFactory);
        quorumPeer.setQuorumVerifier(config.getQuorumVerifier());
        quorumPeer.setClientPortAddress(config.getClientPortAddress());
        quorumPeer.setMinSessionTimeout(config.getMinSessionTimeout());
        quorumPeer.setMaxSessionTimeout(config.getMaxSessionTimeout());
        quorumPeer.setZKDatabase(new ZKDatabase(quorumPeer.getTxnFactory()));
        quorumPeer.setLearnerType(config.getPeerType());
        quorumPeer.setSyncEnabled(config.getSyncEnabled());
        // sets quorum sasl authentication configurations
        quorumPeer.setQuorumSaslEnabled(config.quorumEnableSasl);
        if (quorumPeer.isQuorumSaslAuthEnabled()) {
            quorumPeer.setQuorumServerSaslRequired(config.quorumServerRequireSasl);
            quorumPeer.setQuorumLearnerSaslRequired(config.quorumLearnerRequireSasl);
            quorumPeer.setQuorumServicePrincipal(config.quorumServicePrincipal);
            quorumPeer.setQuorumServerLoginContext(config.quorumServerLoginContext);
            quorumPeer.setQuorumLearnerLoginContext(config.quorumLearnerLoginContext);
        }
        quorumPeer.setQuorumCnxnThreadsSize(config.quorumCnxnThreadsSize);
        quorumPeer.initialize();
        quorumPeer.start();
        quorumPeer.join();
    } catch (InterruptedException e) {
        // warn, but generally this is ok
        LOG.warn("Quorum Peer interrupted", e);
    }
}
The code is explained below.
ServerCnxnFactory cnxnFactory = ServerCnxnFactory.createFactory();
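This creates a ServerCnxnFactory object, which handles incoming client connections and requests. ServerCnxnFactory is an abstract class. Here its subclass NIOServerCnxnFactory is instantiated, which handles requests with Java's native NIO API. NettyServerCnxnFactory can be instantiated instead to handle requests with Netty.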
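Choosing between two implementations of an abstract factory at startup is commonly done by reading a class name from a system property and instantiating it reflectively; as far as I can tell from the source, createFactory follows this pattern with the `zookeeper.serverCnxnFactory` property and NIO as the default. The sketch below shows the pattern only. Every name in it (`CnxnFactorySketch`, `Factory`, `NioFactory`, `NettyFactory`, and the property `sketch.cnxnFactory`) is hypothetical.

```java
// Hypothetical classes for illustration; they only mimic the shape of ServerCnxnFactory.
class CnxnFactorySketch {
    static final String PROPERTY = "sketch.cnxnFactory"; // assumed name, not ZooKeeper's

    abstract static class Factory {
        abstract String transport();
    }

    static class NioFactory extends Factory {
        @Override
        String transport() {
            return "nio";
        }
    }

    static class NettyFactory extends Factory {
        @Override
        String transport() {
            return "netty";
        }
    }

    // Instantiates the class named by the system property, falling back to the NIO default.
    static Factory createFactory() {
        String name = System.getProperty(PROPERTY, NioFactory.class.getName());
        try {
            return (Factory) Class.forName(name).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(createFactory().transport());
        System.setProperty(PROPERTY, NettyFactory.class.getName());
        System.out.println(createFactory().transport());
    }
}
```

The benefit of this design is that the caller, here runFromConfig, never names a concrete subclass, so switching transports needs no code change.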
cnxnFactory.configure(config.getClientPortAddress(),
        config.getMaxClientCnxns());
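Next, cnxnFactory is configured with the client port address and the maximum number of client connections. The configure method also creates a ZooKeeperThread to handle client requests.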
quorumPeer = getQuorumPeer();
This creates the QuorumPeer object. Its Javadoc reads:
/**
* This class manages the quorum protocol. There are three states this server
* can be in:
* <ol>
* <li>Leader election - each server will elect a leader (proposing itself as a
* leader initially).</li>
* <li>Follower - the server will synchronize with the leader and replicate any
* transactions.</li>
* <li>Leader - the server will process requests and forward them to followers.
* A majority of followers must log the request before it can be accepted.
* </ol>
*
* This class will setup a datagram socket that will always respond with its
* view of the current leader. The response will take the form of:
*
* <pre>
* int xid;
*
* long myid;
*
* long leader_id;
*
* long leader_zxid;
* </pre>
*
* The request for the current leader will consist solely of an xid: int xid;
*/
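The Javadoc shows that this class manages the quorum protocol and tracks which state the server is currently in. There are three states:
1. Leader election: the servers elect a leader, each proposing itself at first.
2. Follower: the server synchronizes with the leader and replicates its transactions.
3. Leader: the server processes requests and forwards them to the followers.
The class also sets up a datagram socket that answers every request with this server's view of the current leader. The response contains:
1. xid: the identifier carried by the request (the request consists solely of an xid); it is not a transaction ID.
2. myid: the ID of this server.
3. leader_id: the ID of the server it currently regards as the leader.
4. leader_zxid: the transaction ID (zxid) of that leader.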
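The four fields add up to 4 + 8 + 8 + 8 = 28 bytes. Below is a minimal sketch of that layout with `java.nio.ByteBuffer`, following only the field order given in the Javadoc. `LeaderViewSketch` is a made-up class, and the wire format actually used by the QuorumPeer source may differ from this.

```java
import java.nio.ByteBuffer;

// Hypothetical codec for illustration; field order follows the Javadoc quoted above.
class LeaderViewSketch {
    static final int RESPONSE_BYTES = Integer.BYTES + 3 * Long.BYTES; // 4 + 24 = 28

    // Packs the four fields in order: xid, myid, leader_id, leader_zxid.
    static byte[] encode(int xid, long myid, long leaderId, long leaderZxid) {
        return ByteBuffer.allocate(RESPONSE_BYTES)
                .putInt(xid)
                .putLong(myid)
                .putLong(leaderId)
                .putLong(leaderZxid)
                .array();
    }

    // Reads the fields back in the same order; xid is widened to long for convenience.
    static long[] decode(byte[] packet) {
        ByteBuffer buf = ByteBuffer.wrap(packet);
        return new long[] { buf.getInt(), buf.getLong(), buf.getLong(), buf.getLong() };
    }

    public static void main(String[] args) {
        byte[] packet = encode(7, 2L, 3L, 0x100000005L);
        long[] fields = decode(packet);
        System.out.println("bytes=" + packet.length
                + " xid=" + fields[0]
                + " myid=" + fields[1]
                + " leader_id=" + fields[2]
                + " leader_zxid=0x" + Long.toHexString(fields[3]));
    }
}
```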
quorumPeer.setTxnFactory(new FileTxnSnapLog(
        new File(config.getDataLogDir()),
        new File(config.getDataDir())));
quorumPeer.setElectionType(config.getElectionAlg());
.......
quorumPeer.setQuorumCnxnThreadsSize(config.quorumCnxnThreadsSize);
After the QuorumPeer object is created, a series of parameters is set on it.
quorumPeer.start();
Then start is called to begin running.
public synchronized void start() {
    loadDataBase();
    cnxnFactory.start();
    startLeaderElection();
    super.start();
}
Let's walk through the code of start.
loadDataBase();
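start first calls loadDataBase(), which loads the ZKDatabase. ZKDatabase is the in-memory database that holds the server state: the sessions, the DataTree, and the committed transaction log. The loading itself is done by ZKDatabase.loadDataBase(), shown below.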
/**
* load the database from the disk onto memory and also add
* the transactions to the committedlog in memory.
* @return the last valid zxid on disk
* @throws IOException
*/
public long loadDataBase() throws IOException {
    long zxid = snapLog.restore(dataTree, sessionsWithTimeouts, commitProposalPlaybackListener);
    initialized = true;
    return zxid;
}
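The idea behind restore, as the Javadoc describes it, is to bring the disk state into memory and report the last valid zxid. A common way to do this is to load a snapshot and then replay the logged transactions that are newer than it. The toy model below sketches that idea only. `RestoreSketch` and `Txn` are hypothetical, the "tree" is a plain map, and the log is assumed to be sorted by zxid; none of this is ZooKeeper's FileTxnSnapLog code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model for illustration; not ZooKeeper's FileTxnSnapLog or DataTree.
class RestoreSketch {

    // One logged write: set `key` to `value` at transaction id `zxid`.
    static final class Txn {
        final long zxid;
        final String key;
        final String value;

        Txn(long zxid, String key, String value) {
            this.zxid = zxid;
            this.key = key;
            this.value = value;
        }
    }

    // Copies the snapshot into `tree`, replays newer log entries in order,
    // and returns the last zxid that was applied.
    static long restore(Map<String, String> tree, Map<String, String> snapshot,
                        long snapshotZxid, List<Txn> log) {
        tree.putAll(snapshot);
        long lastZxid = snapshotZxid;
        for (Txn txn : log) {
            if (txn.zxid <= snapshotZxid) {
                continue; // already reflected in the snapshot
            }
            tree.put(txn.key, txn.value);
            lastZxid = txn.zxid;
        }
        return lastZxid;
    }

    public static void main(String[] args) {
        Map<String, String> snapshot = new HashMap<>();
        snapshot.put("/a", "1");
        List<Txn> log = new ArrayList<>();
        log.add(new Txn(4, "/a", "0")); // older than the snapshot, skipped
        log.add(new Txn(6, "/a", "2"));
        log.add(new Txn(7, "/b", "x"));

        Map<String, String> tree = new HashMap<>();
        long zxid = restore(tree, snapshot, 5, log);
        System.out.println("last zxid = " + zxid + ", /a = " + tree.get("/a")
                + ", /b = " + tree.get("/b"));
    }
}
```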
Some of my understanding of ZooKeeper: startup flow analysis. To be continued.