3.2 Producer Configs
Below is the configuration of the Java producer:
bootstrap.servers | A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping; this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form host1:port1,host2:port2,.... Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically, e.g. through machine failure or migration), this list need not contain the full set of servers (you may want more than one, though, in case a server is down). | list | | | high |
key.serializer | Serializer class for key that implements the Serializer interface. | class | | | high |
value.serializer | Serializer class for value that implements the Serializer interface. | class | | | high |
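The three settings above (bootstrap.servers plus the two serializers) are the minimum a producer needs. A minimal sketch of assembling them with plain java.util.Properties; the broker addresses are placeholders, and the resulting Properties object would normally be handed to the KafkaProducer constructor:

```java
import java.util.Properties;

public class MinimalProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker list; listing more than one host guards against
        // the bootstrap connection failing if a single server is down.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092");
        // Built-in serializers shipped with the Java client.
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Normally: Producer<String, String> p = new KafkaProducer<>(props);
        System.out.println(props.getProperty("bootstrap.servers"));
    }
}
```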
acks | The number of acknowledgments the producer requires the leader to have received before considering a request complete. This controls the durability of records that are sent. The following settings are allowed: acks=0 (the producer will not wait for any acknowledgment from the server at all), acks=1 (the leader will write the record to its local log and respond without awaiting full acknowledgement from all followers), and acks=all or acks=-1 (the leader will wait for the full set of in-sync replicas to acknowledge the record). | string | 1 | [all, -1, 0, 1] | high |
buffer.memory | The total bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are sent faster than they can be delivered to the server the producer will block for max.block.ms after which it will throw an exception. This setting should correspond roughly to the total memory the producer will use, but is not a hard bound since not all memory the producer uses is used for buffering. Some additional memory will be used for compression (if compression is enabled) as well as for maintaining in-flight requests. | long | 33554432 | [0,...] | high |
compression.type | The compression type for all data generated by the producer. The default is none (i.e. no compression). Valid values are none, gzip, snappy, or lz4. Compression is of full batches of data, so the efficacy of batching will also impact the compression ratio (more batching means better compression). | string | none | | high |
retries | Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error. Allowing retries without setting max.in.flight.requests.per.connection to 1 will potentially change the ordering of records: if two batches are sent to a single partition, and the first fails and is retried but the second succeeds, then the records in the second batch may appear first. | int | 0 | [0,...,2147483647] | high |
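The interaction between acks, retries, and ordering described above is usually resolved by combining acks=all with a non-zero retry count and a single in-flight request per connection. The values below are illustrative, not recommendations:

```java
import java.util.Properties;

public class DurableProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("acks", "all");   // wait for the full set of in-sync replicas
        props.put("retries", "3");  // illustrative; the default is 0 (no retries)
        // With retries enabled, capping in-flight requests at 1 prevents a
        // retried batch from being appended after a later, successful batch.
        props.put("max.in.flight.requests.per.connection", "1");
        System.out.println(props.getProperty("acks"));
    }
}
```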
ssl.key.password | The password of the private key in the key store file. This is optional for client. | password | null | | high |
ssl.keystore.location | The location of the key store file. This is optional for client and can be used for two-way authentication for client. | string | null | | high |
ssl.keystore.password | The store password for the key store file. This is optional for client and only needed if ssl.keystore.location is configured. | password | null | | high |
ssl.truststore.location | The location of the trust store file. | string | null | | high |
ssl.truststore.password | The password for the trust store file. | password | null | | high |
batch.size | The producer will attempt to batch records together into fewer requests whenever multiple records are being sent to the same partition. This helps performance on both the client and the server. This configuration controls the default batch size in bytes. No attempt will be made to batch records larger than this size. Requests sent to brokers will contain multiple batches, one for each partition with data available to be sent. A small batch size will make batching less common and may reduce throughput (a batch size of zero will disable batching entirely). A very large batch size may use memory a bit more wastefully, as we will always allocate a buffer of the specified batch size in anticipation of additional records. | int | 16384 | [0,...] | medium |
client.id | An id string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included in server-side request logging. | string | "" | | medium |
connections.max.idle.ms | Close idle connections after the number of milliseconds specified by this config. | long | 540000 | | medium |
linger.ms | The producer groups together any records that arrive in between request transmissions into a single batched request. Normally this occurs only under load when records arrive faster than they can be sent out. However in some circumstances the client may want to reduce the number of requests even under moderate load. This setting accomplishes this by adding a small amount of artificial delay; that is, rather than immediately sending out a record the producer will wait for up to the given delay to allow other records to be sent so that the sends can be batched together. This can be thought of as analogous to Nagle's algorithm in TCP. This setting gives the upper bound on the delay for batching: once we get batch.size worth of records for a partition it will be sent immediately regardless of this setting; however, if we have fewer than this many bytes accumulated for this partition we will wait for up to the specified time for more records to show up. This setting defaults to 0 (i.e. no delay). Setting linger.ms=5, for example, would have the effect of reducing the number of requests sent but would add up to 5ms of latency to records sent in the absence of load. | long | 0 | [0,...] | medium |
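batch.size and linger.ms work together: the first caps how large a batch may grow, the second caps how long the producer waits to fill it. A throughput-oriented sketch (all values here are illustrative, not recommendations):

```java
import java.util.Properties;

public class BatchingProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("batch.size", "32768");        // 32 KB batches (default is 16384)
        props.put("linger.ms", "5");             // wait up to 5 ms to fill a batch
        props.put("compression.type", "snappy"); // compression is applied per batch
        props.put("buffer.memory", "33554432");  // 32 MB total buffering (the default)
        System.out.println(props.getProperty("linger.ms"));
    }
}
```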
max.block.ms | The configuration controls how long KafkaProducer.send() and KafkaProducer.partitionsFor() will block. These methods can be blocked either because the buffer is full or because metadata is unavailable. Blocking in the user-supplied serializers or partitioner will not be counted against this timeout. | long | 60000 | [0,...] | medium |
max.request.size | The maximum size of a request in bytes. This is also effectively a cap on the maximum record size. Note that the server has its own cap on record size which may be different from this. This setting will limit the number of record batches the producer will send in a single request to avoid sending huge requests. | int | 1048576 | [0,...] | medium |
partitioner.class | Partitioner class that implements the Partitioner interface. | class | class org.apache.kafka.clients.producer.internals.DefaultPartitioner | | medium |
receive.buffer.bytes | The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used. | int | 32768 | [-1,...] | medium |
request.timeout.ms | The configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted. | int | 30000 | [0,...] | medium |
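The two client-side timeouts above play different roles: max.block.ms bounds how long send() or partitionsFor() may block inside the client, while request.timeout.ms bounds how long the client waits for a broker to answer a request already sent. A sketch (both values are the documented defaults):

```java
import java.util.Properties;

public class TimeoutProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("max.block.ms", "60000");       // cap on blocking in send()/partitionsFor()
        props.put("request.timeout.ms", "30000"); // cap on waiting for a broker response
        System.out.println(props.getProperty("request.timeout.ms"));
    }
}
```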
sasl.kerberos.service.name | The Kerberos principal name that Kafka runs as. This can be defined either in Kafka's JAAS config or in Kafka's config. | string | null | | medium |
sasl.mechanism | SASL mechanism used for client connections. This may be any mechanism for which a security provider is available. GSSAPI is the default mechanism. | string | GSSAPI | | medium |
security.protocol | Protocol used to communicate with brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL. | string | PLAINTEXT | | medium |
send.buffer.bytes | The size of the TCP send buffer (SO_SNDBUF) to use when sending data. If the value is -1, the OS default will be used. | int | 131072 | [-1,...] | medium |
ssl.enabled.protocols | The list of protocols enabled for SSL connections. | list | [TLSv1.2, TLSv1.1, TLSv1] | | medium |
ssl.keystore.type | The file format of the key store file. This is optional for client. | string | JKS | | medium |
ssl.protocol | The SSL protocol used to generate the SSLContext. Default setting is TLS, which is fine for most cases. Allowed values in recent JVMs are TLS, TLSv1.1 and TLSv1.2. SSL, SSLv2 and SSLv3 may be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities. | string | TLS | | medium |
ssl.provider | The name of the security provider used for SSL connections. Default value is the default security provider of the JVM. | string | null | | medium |
ssl.truststore.type | The file format of the trust store file. | string | JKS | | medium |
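The SSL settings above combine into a client-side security configuration along these lines; every path and password here is a placeholder, and the keystore entries are only needed when brokers require two-way (client) authentication:

```java
import java.util.Properties;

public class SslProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("security.protocol", "SSL");
        // Trust store: used to verify the brokers' certificates (placeholder path/password).
        props.put("ssl.truststore.location", "/etc/kafka/client.truststore.jks");
        props.put("ssl.truststore.password", "changeit");
        // Key store: only needed for client authentication (placeholder values).
        props.put("ssl.keystore.location", "/etc/kafka/client.keystore.jks");
        props.put("ssl.keystore.password", "changeit");
        props.put("ssl.key.password", "changeit");
        System.out.println(props.getProperty("security.protocol"));
    }
}
```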
timeout.ms | The configuration controls the maximum amount of time the server will wait for acknowledgments from followers to meet the acknowledgment requirements the producer has specified with the acks configuration. If the requested number of acknowledgments is not met when the timeout elapses an error will be returned. This timeout is measured on the server side and does not include the network latency of the request. | int | 30000 | [0,...] | medium |
block.on.buffer.full | When our memory buffer is exhausted we must either stop accepting new records (block) or throw errors. By default this setting is false and the producer will no longer throw a BufferExhaustedException but instead will use the max.block.ms value to block, after which it will throw a TimeoutException. Setting this property to true will set max.block.ms to Long.MAX_VALUE. Also, if this property is set to true, the parameter metadata.fetch.timeout.ms is no longer honored. This parameter is deprecated and will be removed in a future release. Parameter max.block.ms should be used instead. | boolean | false | | low |
interceptor.classes | A list of classes to use as interceptors. Implementing the ProducerInterceptor interface allows you to intercept (and possibly mutate) the records received by the producer before they are published to the Kafka cluster. By default, there are no interceptors. | list | null | | low |
max.in.flight.requests.per.connection | The maximum number of unacknowledged requests the client will send on a single connection before blocking. Note that if this setting is set to be greater than 1 and there are failed sends, there is a risk of message re-ordering due to retries (i.e., if retries are enabled). | int | 5 | [1,...] | low |
metadata.fetch.timeout.ms | The first time data is sent to a topic we must fetch metadata about that topic to know which servers host the topic's partitions. This config specifies the maximum time, in milliseconds, for this fetch to succeed before throwing an exception back to the client. | long | 60000 | [0,...] | low |
metadata.max.age.ms | The period of time in milliseconds after which we force a refresh of metadata even if we haven't seen any partition leadership changes, to proactively discover any new brokers or partitions. | long | 300000 | [0,...] | low |
metric.reporters | A list of classes to use as metrics reporters. Implementing the MetricReporter interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics. | list | [] | | low |
metrics.num.samples | The number of samples maintained to compute metrics. | int | 2 | [1,...] | low |
metrics.sample.window.ms | The window of time a metrics sample is computed over. | long | 30000 | [0,...] | low |
reconnect.backoff.ms | The amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all requests sent by the client to the broker. | long | 50 | [0,...] | low |
retry.backoff.ms | The amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios. | long | 100 | [0,...] | low |
sasl.kerberos.kinit.cmd | Kerberos kinit command path. | string | /usr/bin/kinit | | low |
sasl.kerberos.min.time.before.relogin | Login thread sleep time between refresh attempts. | long | 60000 | | low |
sasl.kerberos.ticket.renew.jitter | Percentage of random jitter added to the renewal time. | double | 0.05 | | low |
sasl.kerberos.ticket.renew.window.factor | Login thread will sleep until the specified window factor of time from last refresh to ticket's expiry has been reached, at which time it will try to renew the ticket. | double | 0.8 | | low |
ssl.cipher.suites | A list of cipher suites. This is a named combination of authentication, encryption, MAC and key exchange algorithm used to negotiate the security settings for a network connection using the TLS or SSL network protocol. By default all the available cipher suites are supported. | list | null | | low |
ssl.endpoint.identification.algorithm | The endpoint identification algorithm to validate server hostname using server certificate. | string | null | | low |
ssl.keymanager.algorithm | The algorithm used by key manager factory for SSL connections. Default value is the key manager factory algorithm configured for the Java Virtual Machine. | string | SunX509 | | low |
ssl.secure.random.implementation | The SecureRandom PRNG implementation to use for SSL cryptography operations. | string | null | | low |
ssl.trustmanager.algorithm | The algorithm used by trust manager factory for SSL connections. Default value is the trust manager factory algorithm configured for the Java Virtual Machine. | string | PKIX | | low |
For those interested in the legacy Scala producer configs, information can be found here.