1、 The client and server communicate over TCP. Beyond TCP's own connection setup, the protocol requires no handshake on connection or disconnection.
2、 The server has a configurable maximum request size. Any request that exceeds this limit results in the socket being disconnected.
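The size limit can be sketched as a check on a length-prefixed frame. This is a minimal illustration, assuming each request is preceded by a 4-byte big-endian size field; the names MAX_REQUEST_BYTES, frame_request, read_request and RequestTooLarge are hypothetical and not part of any Kafka client library.

```python
import struct

# Hypothetical limit; on a real broker this value comes from configuration.
MAX_REQUEST_BYTES = 100 * 1024 * 1024


class RequestTooLarge(Exception):
    """Raised where a real server would close the socket."""


def frame_request(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    return struct.pack(">i", len(payload)) + payload


def read_request(buf: bytes, max_bytes: int = MAX_REQUEST_BYTES) -> bytes:
    """Return the payload of one framed request, enforcing the size limit."""
    if len(buf) < 4:
        raise ValueError("incomplete size prefix")
    (size,) = struct.unpack(">i", buf[:4])
    if size < 0 or size > max_bytes:
        # The notes say the server disconnects the socket in this case.
        raise RequestTooLarge(f"request of {size} bytes exceeds {max_bytes}")
    if len(buf) < 4 + size:
        raise ValueError("incomplete request body")
    return buf[4:4 + size]
```

The check happens before the body is read, so an oversized request can be rejected without buffering it.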
Partitioning and bootstrapping
1、 A client can query any broker for metadata. The metadata lists which topics exist, how many partitions each topic has, and which brokers host those partitions.
2、 The client does not need to poll the Kafka cluster to check whether this metadata is stale. It can keep using its cached metadata until it hits one of the following errors:
(1) a socket error indicating that the client cannot communicate with a particular broker,
(2) an error code in the response to a request indicating that this broker no longer hosts the partition for which data was requested.
Note: a request for a particular partition sent to the wrong broker will result in the NotLeaderForPartition error code (described below).
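The cache-until-error behaviour above can be sketched as follows. This is an illustration under stated assumptions, not a real client: MetadataCache, send_with_refresh, and the fetch_metadata and send callables are hypothetical, and the NotLeaderForPartition error code is modelled as a Python exception.

```python
class NotLeaderForPartition(Exception):
    """Stand-in for the NotLeaderForPartition error code."""


class MetadataCache:
    """Caches the partition-to-broker mapping fetched from any broker."""

    def __init__(self, fetch_metadata):
        # fetch_metadata is a hypothetical callable returning
        # {(topic, partition): broker}.
        self._fetch = fetch_metadata
        self._leaders = fetch_metadata()

    def leader_for(self, topic, partition):
        return self._leaders[(topic, partition)]

    def refresh(self):
        self._leaders = self._fetch()


def send_with_refresh(cache, topic, partition, send, max_attempts=2):
    """Send using cached metadata; refresh only after one of the two errors."""
    for _ in range(max_attempts):
        broker = cache.leader_for(topic, partition)
        try:
            return send(broker, topic, partition)
        except (ConnectionError, NotLeaderForPartition):
            # (1) socket error, or (2) broker no longer hosts the partition.
            cache.refresh()
    raise RuntimeError(f"no reachable leader for {topic}-{partition}")
```

No background polling is involved: the metadata is fetched once up front and again only when a request fails in one of the two ways listed.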
Partitioning Strategies
1、 Purposes of partitioning:
a) It balances data and request load over brokers.
b) It serves as a way to divvy up processing among consumer processes while allowing local state and preserving order within the partition. We call this semantic partitioning.
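2、 Partitioning algorithms:
a) Round robin.
b) If there are many more producers than brokers, each producer can pick a single partition (and therefore a single broker) at random and write only to it. This greatly reduces the number of TCP connections, because each producer maintains just one connection.
c) Hash the message key.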
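The three strategies above can be sketched as small partition-choosing functions. This is a minimal sketch with hypothetical names; zlib.crc32 is used here only as a convenient deterministic hash and is an assumption, not necessarily the hash function a real Kafka client uses.

```python
import itertools
import random
import zlib


def round_robin_partitioner(num_partitions):
    """a) Cycle through partitions 0, 1, ..., n-1, 0, 1, ..."""
    counter = itertools.count()

    def choose(key=None):
        return next(counter) % num_partitions

    return choose


def single_random_partitioner(num_partitions, rng=random):
    """b) Pick one partition at random once and always return it."""
    chosen = rng.randrange(num_partitions)

    def choose(key=None):
        return chosen

    return choose


def key_hash_partitioner(num_partitions):
    """c) Map each key to a fixed partition by hashing it."""

    def choose(key: bytes):
        # crc32 is a stand-in hash chosen for this sketch.
        return zlib.crc32(key) % num_partitions

    return choose
```

Strategies a) and b) only spread load, while c) sends every message with the same key to the same partition, which is what makes per-key ordering and local consumer state possible.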
Batching
A single produce request may contain data to append to many partitions, and a single fetch request may pull data from many partitions at once.
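As a rough illustration of batching on the produce side, pending messages can be grouped by partition into one request body. The function name build_produce_request and the dictionary layout are hypothetical; the real wire format is more involved.

```python
from collections import defaultdict


def build_produce_request(records):
    """Group (topic, partition, message) tuples into one request body.

    Returns {(topic, partition): [messages in arrival order]}.
    """
    batch = defaultdict(list)
    for topic, partition, message in records:
        batch[(topic, partition)].append(message)
    return dict(batch)
```

For example, three pending messages spread over two partitions end up in a single request with two entries, and messages for the same partition keep their original order.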