Config holds the configuration settings. Below is the source of KafkaConfig with my own annotations; if you spot any mistakes, please point them out.
public class KafkaConfig implements Serializable {
    /** An interface; its implementations are ZkHosts and StaticHosts. **/
    public final BrokerHosts hosts;
    public final String topic; // Kafka topic name
    public final String clientId; // pick a unique ID of your own
    public int fetchSizeBytes = 1024 * 1024; // number of bytes read from Kafka per fetch; this field shows up again in the fetchMessage method of KafkaUtils
    public int socketTimeoutMs = 10000; // timeout for the consumer's connection to the Kafka server
    public int fetchMaxWait = 10000;
    public int bufferSizeBytes = 1024 * 1024; // consumer-side buffer size
    public MultiScheme scheme = new RawMultiScheme(); // the Scheme that defines how emitted data is serialized and deserialized; a later post covers it in detail
    public boolean forceFromStart = false; // used together with startOffsetTime; defaults to false, and has to be set to true once startOffsetTime is set
    public long startOffsetTime = kafka.api.OffsetRequest.EarliestTime(); // -2: start from the head of the Kafka log; -1: start from the latest message; 0: none, start from the offset stored in ZK
    public long maxOffsetBehind = Long.MAX_VALUE; // Kafka reads a batch of offsets into a list each time; when the ZK offset minus the locally saved commitOffset exceeds this value, commitOffset is reset to the current ZK offset (see PartitionManager)
    public boolean useStartOffsetTimeIfOffsetOutOfRange = true;
    public int metricsTimeBucketSizeInSecs = 60;

    public KafkaConfig(BrokerHosts hosts, String topic) {
        this(hosts, topic, kafka.api.OffsetRequest.DefaultClientId());
    }

    public KafkaConfig(BrokerHosts hosts, String topic, String clientId) {
        this.hosts = hosts;
        this.topic = topic;
        this.clientId = clientId;
    }
}
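The way forceFromStart, startOffsetTime and maxOffsetBehind interact is easier to follow in code. What follows is only a minimal, self-contained sketch of the decision as the comments above describe it. It is not the storm-kafka implementation, which lives in PartitionManager and may differ between versions. The class StartOffsetSketch, its method names and its parameters are all hypothetical; only the sentinel values -2 (earliest) and -1 (latest) come from the source above.

```java
class StartOffsetSketch {
    // Sentinel values, matching the comment on startOffsetTime.
    static final long EARLIEST_TIME = -2L; // start from the head of the log
    static final long LATEST_TIME = -1L;   // start from the newest message

    /**
     * Hypothetical helper: pick the offset a consumer starts reading from.
     *
     * @param forceFromStart    mirrors KafkaConfig.forceFromStart
     * @param offsetAtStartTime offset the broker would report for startOffsetTime
     * @param zkOffset          offset previously committed to ZK, or null if none
     */
    static long chooseStartOffset(boolean forceFromStart, long offsetAtStartTime, Long zkOffset) {
        // With forceFromStart, or with nothing stored in ZK, fall back to startOffsetTime.
        if (forceFromStart || zkOffset == null) {
            return offsetAtStartTime;
        }
        // Otherwise resume from the offset stored in ZK.
        return zkOffset;
    }

    /**
     * Hypothetical helper: if the committed offset lags the current offset by
     * more than maxOffsetBehind, skip forward to the current offset.
     */
    static long applyMaxOffsetBehind(long committed, long current, long maxOffsetBehind) {
        if (current - committed > maxOffsetBehind) {
            return current;
        }
        return committed;
    }

    public static void main(String[] args) {
        // Assumed sample values, for illustration only.
        System.out.println(chooseStartOffset(false, 100L, 42L));
        System.out.println(applyMaxOffsetBehind(10L, 1000L, 500L));
    }
}
```

With the default maxOffsetBehind of Long.MAX_VALUE, the second check in this sketch never triggers, so the committed offset is always kept.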
SpoutConfig extends KafkaConfig:
public class SpoutConfig extends KafkaConfig implements Serializable {
    public List<String> zkServers = null; // list of ZK hosts, given as plain IPs (xxx.xxx.xxx.xxx); these are the ZK servers, used later for leader election
    public Integer zkPort = null; // ZK port, usually 2181
    public String zkRoot = null; // ZK path under which the consumer's consumption metadata is stored; you choose it yourself
    public String id = null; // unique id
    public long stateUpdateIntervalMs = 2000; // interval at which consumed offsets are committed to ZK

    public SpoutConfig(BrokerHosts hosts, String topic, String zkRoot, String id) {
        super(hosts, topic);
        this.zkRoot = zkRoot;
        this.id = id;
    }
}
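To make the roles of zkRoot and id more concrete, here is a small sketch of how a per-partition ZK node path could be derived from them. The layout zkRoot/id/partition_N is an assumption for illustration, and the class ZkPathSketch and its method are hypothetical; check PartitionManager in your storm-kafka version for the layout it actually uses.

```java
class ZkPathSketch {
    /**
     * Hypothetical helper: build the ZK node path under which the offset
     * of one partition would be stored.
     * ASSUMPTION: the layout is zkRoot + "/" + id + "/partition_" + partition.
     */
    static String committedPath(String zkRoot, String id, int partition) {
        return zkRoot + "/" + id + "/partition_" + partition;
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        System.out.println(committedPath("/kafka-spout", "my-spout-id", 0));
    }
}
```

Under this assumed layout, two spouts that share the same zkRoot and id would write to the same nodes, which is why id needs to be unique.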
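These config parameters are used throughout the classes covered later. To create a KafkaSpout, the constructor argument you need is a SpoutConfig.
The next section explains how to build a KafkaSpout.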