3.2 Producer Configs
Below is the configuration of the Java producer:
bootstrap.servers | A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping; this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form host1:port1,host2:port2,.... Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically, e.g. through machine failure or migration), this list need not contain the full set of servers (you may want more than one, though, in case a server is down). | list | | | high |
key.serializer | Serializer class for key that implements the Serializer interface. | class | | | high |
value.serializer | Serializer class for value that implements the Serializer interface. | class | | | high |
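The three settings above (bootstrap.servers plus the two serializers) are the minimum a producer needs. A minimal sketch of assembling them with plain java.util.Properties; the broker addresses are placeholders, and the resulting Properties object would normally be handed to the KafkaProducer constructor:

```java
import java.util.Properties;

public class MinimalProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker list; listing more than one host guards against
        // the bootstrap connection failing if a single server is down.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092");
        // Built-in serializers shipped with the Java client.
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Normally: Producer<String, String> p = new KafkaProducer<>(props);
        System.out.println(props.getProperty("bootstrap.servers"));
    }
}
```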
acks | The number of acknowledgments the producer requires the leader to have received before considering a request complete. This controls the durability of records that are sent. The following settings are allowed: acks=0 (the producer will not wait for any acknowledgment from the server at all), acks=1 (the leader will write the record to its local log and respond without awaiting full acknowledgement from all followers), and acks=all or acks=-1 (the leader will wait for the full set of in-sync replicas to acknowledge the record). | string | 1 | [all, -1, 0, 1] | high |
buffer.memory | The total bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are sent faster than they can be delivered to the server the producer will block for max.block.ms after which it will throw an exception. This setting should correspond roughly to the total memory the producer will use, but is not a hard bound since not all memory the producer uses is used for buffering. Some additional memory will be used for compression (if compression is enabled) as well as for maintaining in-flight requests. | long | 33554432 | [0,...] | high |
compression.type | The compression type for all data generated by the producer. The default is none (i.e. no compression). Valid values are none, gzip, snappy, or lz4. Compression is of full batches of data, so the efficacy of batching will also impact the compression ratio (more batching means better compression). | string | none | | high |
retries | Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error. Allowing retries without setting max.in.flight.requests.per.connection to 1 will potentially change the ordering of records: if two batches are sent to a single partition, and the first fails and is retried but the second succeeds, then the records in the second batch may appear first. | int | 0 | [0,...,2147483647] | high |
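The interaction between acks, retries, and ordering described above is usually resolved by combining acks=all with a non-zero retry count and a single in-flight request per connection. The values below are illustrative, not recommendations:

```java
import java.util.Properties;

public class DurableProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("acks", "all");   // wait for the full set of in-sync replicas
        props.put("retries", "3");  // illustrative; the default is 0 (no retries)
        // With retries enabled, capping in-flight requests at 1 prevents a
        // retried batch from being appended after a later, successful batch.
        props.put("max.in.flight.requests.per.connection", "1");
        System.out.println(props.getProperty("acks"));
    }
}
```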
ssl.key.password | The password of the private key in the key store file. This is optional for client. | password | null | | high |
ssl.keystore.location | The location of the key store file. This is optional for client and can be used for two-way authentication for client. | string | null | | high |
ssl.keystore.password | The store password for the key store file. This is optional for client and only needed if ssl.keystore.location is configured. | password | null | | high |
ssl.truststore.location | The location of the trust store file. | string | null | | high |
ssl.truststore.password | The password for the trust store file. | password | null | | high |
batch.size | The producer will attempt to batch records together into fewer requests whenever multiple records are being sent to the same partition. This helps performance on both the client and the server. This configuration controls the default batch size in bytes. No attempt will be made to batch records larger than this size. Requests sent to brokers will contain multiple batches, one for each partition with data available to be sent. A small batch size will make batching less common and may reduce throughput (a batch size of zero will disable batching entirely). A very large batch size may use memory a bit more wastefully, as we will always allocate a buffer of the specified batch size in anticipation of additional records. | int | 16384 | [0,...] | medium |
client.id | An id string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included in server-side request logging. | string | "" | | medium |
connections.max.idle.ms | Close idle connections after the number of milliseconds specified by this config. | long | 540000 | | medium |
linger.ms | The producer groups together any records that arrive in between request transmissions into a single batched request. Normally this occurs only under load when records arrive faster than they can be sent out. However in some circumstances the client may want to reduce the number of requests even under moderate load. This setting accomplishes this by adding a small amount of artificial delay; that is, rather than immediately sending out a record the producer will wait for up to the given delay to allow other records to be sent so that the sends can be batched together. This can be thought of as analogous to Nagle's algorithm in TCP. This setting gives the upper bound on the delay for batching: once we get batch.size worth of records for a partition it will be sent immediately regardless of this setting; however, if we have fewer than this many bytes accumulated for this partition we will wait for up to the specified time for more records to show up. This setting defaults to 0 (i.e. no delay). Setting linger.ms=5, for example, would have the effect of reducing the number of requests sent but would add up to 5ms of latency to records sent in the absence of load. | long | 0 | [0,...] | medium |
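batch.size and linger.ms work together: the first caps how large a batch may grow, the second caps how long the producer waits to fill it. A throughput-oriented sketch (all values here are illustrative, not recommendations):

```java
import java.util.Properties;

public class BatchingProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("batch.size", "32768");        // 32 KB batches (default is 16384)
        props.put("linger.ms", "5");             // wait up to 5 ms to fill a batch
        props.put("compression.type", "snappy"); // compression is applied per batch
        props.put("buffer.memory", "33554432");  // 32 MB total buffering (the default)
        System.out.println(props.getProperty("linger.ms"));
    }
}
```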
max.block.ms | The configuration controls how long KafkaProducer.send() and KafkaProducer.partitionsFor() will block. These methods can be blocked either because the buffer is full or because metadata is unavailable. Blocking in the user-supplied serializers or partitioner will not be counted against this timeout. | long | 60000 | [0,...] | medium |
max.request.size | The maximum size of a request in bytes. This is also effectively a cap on the maximum record size. Note that the server has its own cap on record size which may be different from this. This setting will limit the number of record batches the producer will send in a single request to avoid sending huge requests. | int | 1048576 | [0,...] | medium |
partitioner.class | Partitioner class that implements the Partitioner interface. | class | class org.apache.kafka.clients.producer.internals.DefaultPartitioner | | medium |
receive.buffer.bytes | The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used. | int | 32768 | [-1,...] | medium |
request.timeout.ms | The configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted. | int | 30000 | [0,...] | medium |
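The two client-side timeouts above play different roles: max.block.ms bounds how long send() or partitionsFor() may block inside the client, while request.timeout.ms bounds how long the client waits for a broker to answer a request already sent. A sketch (both values are the documented defaults):

```java
import java.util.Properties;

public class TimeoutProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("max.block.ms", "60000");       // cap on blocking in send()/partitionsFor()
        props.put("request.timeout.ms", "30000"); // cap on waiting for a broker response
        System.out.println(props.getProperty("request.timeout.ms"));
    }
}
```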
sasl.kerberos.service.name | The Kerberos principal name that Kafka runs as. This can be defined either in Kafka's JAAS config or in Kafka's config. | string | null | | medium |
sasl.mechanism | SASL mechanism used for client connections. This may be any mechanism for which a security provider is available. GSSAPI is the default mechanism. | string | GSSAPI | | medium |
security.protocol | Protocol used to communicate with brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL. | string | PLAINTEXT | | medium |
send.buffer.bytes | The size of the TCP send buffer (SO_SNDBUF) to use when sending data. If the value is -1, the OS default will be used. | int | 131072 | [-1,...] | medium |
ssl.enabled.protocols | The list of protocols enabled for SSL connections. | list | [TLSv1.2, TLSv1.1, TLSv1] | | medium |
ssl.keystore.type | The file format of the key store file. This is optional for client. | string | JKS | | medium |
ssl.protocol | The SSL protocol used to generate the SSLContext. Default setting is TLS, which is fine for most cases. Allowed values in recent JVMs are TLS, TLSv1.1 and TLSv1.2. SSL, SSLv2 and SSLv3 may be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities. | string | TLS | | medium |
ssl.provider | The name of the security provider used for SSL connections. Default value is the default security provider of the JVM. | string | null | | medium |
ssl.truststore.type | The file format of the trust store file. | string | JKS | | medium |
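The SSL settings above combine into a client-side security configuration along these lines; every path and password here is a placeholder, and the keystore entries are only needed when brokers require two-way (client) authentication:

```java
import java.util.Properties;

public class SslProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("security.protocol", "SSL");
        // Trust store: used to verify the brokers' certificates (placeholder path/password).
        props.put("ssl.truststore.location", "/etc/kafka/client.truststore.jks");
        props.put("ssl.truststore.password", "changeit");
        // Key store: only needed for client authentication (placeholder values).
        props.put("ssl.keystore.location", "/etc/kafka/client.keystore.jks");
        props.put("ssl.keystore.password", "changeit");
        props.put("ssl.key.password", "changeit");
        System.out.println(props.getProperty("security.protocol"));
    }
}
```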
timeout.ms | The configuration controls the maximum amount of time the server will wait for acknowledgments from followers to meet the acknowledgment requirements the producer has specified with the acks configuration. If the requested number of acknowledgments is not met when the timeout elapses an error will be returned. This timeout is measured on the server side and does not include the network latency of the request. | int | 30000 | [0,...] | medium |
block.on.buffer.full | When our memory buffer is exhausted we must either stop accepting new records (block) or throw errors. By default this setting is false and the producer will no longer throw a BufferExhaustedException but instead will use the max.block.ms value to block, after which it will throw a TimeoutException. Setting this property to true will set max.block.ms to Long.MAX_VALUE. Also, if this property is set to true, the parameter metadata.fetch.timeout.ms is no longer honored. This parameter is deprecated and will be removed in a future release. Parameter max.block.ms should be used instead. | boolean | false | | low |
interceptor.classes | A list of classes to use as interceptors. Implementing the ProducerInterceptor interface allows you to intercept (and possibly mutate) the records received by the producer before they are published to the Kafka cluster. By default, there are no interceptors. | list | null | | low |
max.in.flight.requests.per.connection | The maximum number of unacknowledged requests the client will send on a single connection before blocking. Note that if this setting is set to be greater than 1 and there are failed sends, there is a risk of message re-ordering due to retries (i.e., if retries are enabled). | int | 5 | [1,...] | low |
metadata.fetch.timeout.ms | The first time data is sent to a topic we must fetch metadata about that topic to know which servers host the topic's partitions. This config specifies the maximum time, in milliseconds, for this fetch to succeed before throwing an exception back to the client. | long | 60000 | [0,...] | low |
metadata.max.age.ms | The period of time in milliseconds after which we force a refresh of metadata even if we haven't seen any partition leadership changes, to proactively discover any new brokers or partitions. | long | 300000 | [0,...] | low |
metric.reporters | A list of classes to use as metrics reporters. Implementing the MetricReporter interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics. | list | [] | | low |
metrics.num.samples | The number of samples maintained to compute metrics. | int | 2 | [1,...] | low |
metrics.sample.window.ms | The window of time a metrics sample is computed over. | long | 30000 | [0,...] | low |
reconnect.backoff.ms | The amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all requests sent by the client to the broker. | long | 50 | [0,...] | low |
retry.backoff.ms | The amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios. | long | 100 | [0,...] | low |
sasl.kerberos.kinit.cmd | Kerberos kinit command path. | string | /usr/bin/kinit | | low |
sasl.kerberos.min.time.before.relogin | Login thread sleep time between refresh attempts. | long | 60000 | | low |
sasl.kerberos.ticket.renew.jitter | Percentage of random jitter added to the renewal time. | double | 0.05 | | low |
sasl.kerberos.ticket.renew.window.factor | Login thread will sleep until the specified window factor of time from last refresh to ticket's expiry has been reached, at which time it will try to renew the ticket. | double | 0.8 | | low |
ssl.cipher.suites | A list of cipher suites. This is a named combination of authentication, encryption, MAC and key exchange algorithm used to negotiate the security settings for a network connection using the TLS or SSL network protocol. By default all the available cipher suites are supported. | list | null | | low |
ssl.endpoint.identification.algorithm | The endpoint identification algorithm to validate server hostname using server certificate. | string | null | | low |
ssl.keymanager.algorithm | The algorithm used by key manager factory for SSL connections. Default value is the key manager factory algorithm configured for the Java Virtual Machine. | string | SunX509 | | low |
ssl.secure.random.implementation | The SecureRandom PRNG implementation to use for SSL cryptography operations. | string | null | | low |
ssl.trustmanager.algorithm | The algorithm used by trust manager factory for SSL connections. Default value is the trust manager factory algorithm configured for the Java Virtual Machine. | string | PKIX | | low |
For those interested in the legacy Scala producer configs, information can be found here.