Producer Configurations

These topics provides configuration parameters available for Confluent Platform. The parameters are organized by order of importance, ranked from high to low.

key.serializer

Serializer class for key that implements the org.apache.kafka.common.serialization.Serializer interface.

  • Type: class<
  • Importance: high

value.serializer

Serializer class for value that implements the org.apache.kafka.common.serialization.Serializer interface.

  • Type: class<
  • Importance: high

acks

The number of acknowledgments the producer requires the leader to have received before considering a request complete. This controls the durability of records that are sent. The following settings are allowed:

  • acks=0 If set to zero then the producer will not wait for any acknowledgment from the server at all. The record will be immediately added to the socket buffer and considered sent. No guarantee can be made that the server has received the record in this case, and the retries configuration will not take effect (as the client won’t generally know of any failures). The offset given back for each record will always be set to -1.
  • acks=1 This will mean the leader will write the record to its local log but will respond without awaiting full acknowledgement from all followers. In this case should the leader fail immediately after acknowledging the record but before the followers have replicated it then the record will be lost.
  • acks=all This means the leader will wait for the full set of in-sync replicas to acknowledge the record. This guarantees that the record will not be lost as long as at least one in-sync replica remains alive. This is the strongest available guarantee. This is equivalent to the acks=-1 setting.
  • Type: string
  • Default: 1
  • Valid Values: [all, -1, 0, 1]
  • Importance: high

bootstrap.servers

A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping&mdash;this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form host1:port1,host2:port2,.... Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically), this list need not contain the full set of servers (you may want more than one, though, in case a server is down).

buffer.memory

The total bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are sent faster than they can be delivered to the server the producer will block for max.block.ms after which it will throw an exception.

This setting should correspond roughly to the total memory the producer will use, but is not a hard bound since not all memory the producer uses is used for buffering. Some additional memory will be used for compression (if compression is enabled) as well as for maintaining in-flight requests.

  • Type: long
  • Default: 33554432
  • Valid Values: [0,...]
  • Importance: high

compression.type

The compression type for all data generated by the producer. The default is none (i.e. no compression). Valid values are none, gzip, snappy, or lz4. Compression is of full batches of data, so the efficacy of batching will also impact the compression ratio (more batching means better compression).

  • Type: string
  • Default: none
  • Importance: high

retries

Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error. Allowing retries without setting max.in.flight.requests.per.connection to 1 will potentially change the ordering of records because if two batches are sent to a single partition, and the first fails and is retried but the second succeeds, then the records in the second batch may appear first.

  • Type: int
  • Default: 0
  • Valid Values: [0,...,2147483647]
  • Importance: high

ssl.key.password

The password of the private key in the key store file. This is optional for client.

  • Type: password
  • Default: null
  • Importance: high

ssl.keystore.location

The location of the key store file. This is optional for client and can be used for two-way authentication for client.

  • Type: string
  • Default: null
  • Importance: high

ssl.keystore.password

The store password for the key store file. This is optional for client and only needed if ssl.keystore.location is configured.

  • Type: password
  • Default: null
  • Importance: high

ssl.truststore.location

The location of the trust store file.

  • Type: string
  • Default: null
  • Importance: high

ssl.truststore.password

The password for the trust store file. If a password is not set access to the truststore is still available, but integrity checking is disabled.

  • Type: password
  • Default: null
  • Importance: high

batch.size

The producer will attempt to batch records together into fewer requests whenever multiple records are being sent to the same partition. This helps performance on both the client and the server. This configuration controls the default batch size in bytes.

No attempt will be made to batch records larger than this size.

Requests sent to brokers will contain multiple batches, one for each partition with data available to be sent.

A small batch size will make batching less common and may reduce throughput (a batch size of zero will disable batching entirely). A very large batch size may use memory a bit more wastefully as we will always allocate a buffer of the specified batch size in anticipation of additional records.

  • Type: int
  • Default: 16384
  • Valid Values: [0,...]
  • Importance: medium

client.id

An ID string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included in server-side request logging.

  • Type: string
  • Default: “”
  • Importance: medium

connections.max.idle.ms

Close idle connections after the number of milliseconds specified by this config.

  • Type: long
  • Default: 540000
  • Importance: medium

linger.ms

The producer groups together any records that arrive in between request transmissions into a single batched request. Normally this occurs only under load when records arrive faster than they can be sent out. However in some circumstances the client may want to reduce the number of requests even under moderate load. This setting accomplishes this by adding a small amount of artificial delay&mdash;that is, rather than immediately sending out a record the producer will wait for up to the given delay to allow other records to be sent so that the sends can be batched together. This can be thought of as analogous to Nagle’s algorithm in TCP. This setting gives the upper bound on the delay for batching: once we get batch.size worth of records for a partition it will be sent immediately regardless of this setting, however if we have fewer than this many bytes accumulated for this partition we will ‘linger’ for the specified time waiting for more records to show up. This setting defaults to 0 (i.e. no delay). Setting linger.ms=5, for example, would have the effect of reducing the number of requests sent but would add up to 5ms of latency to records sent in the absence of load.

  • Type: long
  • Default: 0
  • Valid Values: [0,...]
  • Importance: medium

max.block.ms

The configuration controls how long KafkaProducer.send() and KafkaProducer.partitionsFor() will block.These methods can be blocked either because the buffer is full or metadata unavailable.Blocking in the user-supplied serializers or partitioner will not be counted against this timeout.

  • Type: long
  • Default: 60000
  • Valid Values: [0,...]
  • Importance: medium

max.request.size

The maximum size of a request in bytes. This setting will limit the number of record batches the producer will send in a single request to avoid sending huge requests. This is also effectively a cap on the maximum record batch size. Note that the server has its own cap on record batch size which may be different from this.

  • Type: int
  • Default: 1048576
  • Valid Values: [0,...]
  • Importance: medium

partitioner.class

Partitioner class that implements the org.apache.kafka.clients.producer.Partitioner interface.

  • Type: class
  • Default: org.apache.kafka.clients.producer.internals.DefaultPartitioner
  • Importance: medium

receive.buffer.bytes

The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used.

  • Type: int
  • Default: 32768
  • Valid Values: [-1,...]
  • Importance: medium

request.timeout.ms

The configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted. This should be larger than replica.lag.time.max.ms (a broker configuration) to reduce the possibility of message duplication due to unnecessary producer retries.

  • Type: int
  • Default: 30000
  • Valid Values: [0,...]
  • Importance: medium

sasl.jaas.config

JAAS login context parameters for SASL connections in the format used by JAAS configuration files. JAAS configuration file format is described <a href=”http://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/tutorials/LoginConfigFile.html“>here</a>. The format for the value is: ‘<loginModuleClass> <controlFlag> (<optionName>=<optionValue>)*;’

  • Type: password
  • Default: null
  • Importance: medium

sasl.kerberos.service.name

The Kerberos principal name that Kafka runs as. This can be defined either in Kafka’s JAAS config or in Kafka’s config.

  • Type: string
  • Default: null
  • Importance: medium

sasl.mechanism

SASL mechanism used for client connections. This may be any mechanism for which a security provider is available. GSSAPI is the default mechanism.

  • Type: string
  • Default: GSSAPI
  • Importance: medium

security.protocol

Protocol used to communicate with brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL.

  • Type: string
  • Default: PLAINTEXT
  • Importance: medium

send.buffer.bytes

The size of the TCP send buffer (SO_SNDBUF) to use when sending data. If the value is -1, the OS default will be used.

  • Type: int
  • Default: 131072
  • Valid Values: [-1,...]
  • Importance: medium

ssl.enabled.protocols

The list of protocols enabled for SSL connections.

  • Type: list
  • Default: TLSv1.2,TLSv1.1,TLSv1
  • Importance: medium

ssl.keystore.type

The file format of the key store file. This is optional for client.

  • Type: string
  • Default: JKS
  • Importance: medium

The SSL protocol used to generate the SSLContext. Default setting is TLS, which is fine for most cases. Allowed values in recent JVMs are TLS, TLSv1.1 and TLSv1.2. SSL, SSLv2 and SSLv3 may be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities.

  • Type: string
  • Default: TLS
  • Importance: medium

ssl.provider

The name of the security provider used for SSL connections. Default value is the default security provider of the JVM.

  • Type: string
  • Default: null
  • Importance: medium

ssl.truststore.type

The file format of the trust store file.

  • Type: string
  • Default: JKS
  • Importance: medium

enable.idempotence

When set to ‘true’, the producer will ensure that exactly one copy of each message is written in the stream. If ‘false’, producer retries due to broker failures, etc., may write duplicates of the retried message in the stream. Note that enabling idempotence requires max.in.flight.requests.per.connection to be less than or equal to 5, retries to be greater than 0 and acks must be ‘all’. If these values are not explicitly set by the user, suitable values will be chosen. If incompatible values are set, a ConfigException will be thrown.

  • Type: boolean
  • Default: false
  • Importance: low

interceptor.classes

A list of classes to use as interceptors. Implementing the org.apache.kafka.clients.producer.ProducerInterceptor interface allows you to intercept (and possibly mutate) the records received by the producer before they are published to the Kafka cluster. By default, there are no interceptors.

max.in.flight.requests.per.connection

The maximum number of unacknowledged requests the client will send on a single connection before blocking. Note that if this setting is set to be greater than 1 and there are failed sends, there is a risk of message re-ordering due to retries (i.e., if retries are enabled).

  • Type: int
  • Default: 5
  • Valid Value: [1,...]
  • Importance: low

metadata.max.age.ms

The period of time in milliseconds after which we force a refresh of metadata even if we haven’t seen any partition leadership changes to proactively discover any new brokers or partitions.

  • Type: long
  • Default: 300000
  • Valid Values: [0,...]
  • Importance: low

metric.reporters

A list of classes to use as metrics reporters. Implementing the org.apache.kafka.common.metrics.MetricsReporter interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics.

metrics.num.samples

The number of samples maintained to compute metrics.

  • Type: int
  • Default: 2
  • Valid Values: [1,...]
  • Importance: low

metrics.recording.level

The highest recording level for metrics.

  • Type: string
  • Default: INFO
  • Valid Values: [INFO, DEBUG]
  • Importance: low

metrics.sample.window.ms

The window of time a metrics sample is computed over.

  • Type: long
  • Default: 30000
  • Valid Values: [0,...]
  • Importance: low

reconnect.backoff.max.ms

The maximum amount of time in milliseconds to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host will increase exponentially for each consecutive connection failure, up to this maximum. After calculating the backoff increase, 20% random jitter is added to avoid connection storms.

  • Type: long
  • Default: 1000
  • Valid Values: [0,...]
  • Importance: low

reconnect.backoff.ms

The base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker.

  • Type: long
  • Default: 50
  • Valid Values: [0,...]
  • Importance: low

retry.backoff.ms

The amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios.

  • Type: long
  • Default: 100
  • Valid Values: [0,...]
  • Importance: low

sasl.kerberos.kinit.cmd

Kerberos kinit command path.

  • Type: string
  • Default: /usr/bin/kinit
  • Importance: low

sasl.kerberos.min.time.before.relogin

Login thread sleep time between refresh attempts.

  • Type: long
  • Default: 60000
  • Importance: low

sasl.kerberos.ticket.renew.jitter

Percentage of random jitter added to the renewal time.

  • Type: double
  • Default: 0.05
  • Importance: low

sasl.kerberos.ticket.renew.window.factor

Login thread will sleep until the specified window factor of time from last refresh to ticket’s expiry has been reached, at which time it will try to renew the ticket.

  • Type: double
  • Default: 0.8
  • Importance: low

ssl.cipher.suites

A list of cipher suites. This is a named combination of authentication, encryption, MAC and key exchange algorithm used to negotiate the security settings for a network connection using TLS or SSL network protocol. By default all the available cipher suites are supported.

  • Type: list
  • Default: null
  • Importance: low

ssl.endpoint.identification.algorithm

The endpoint identification algorithm to validate server hostname using server certificate.

  • Type: string
  • Default: null
  • Importance: low

ssl.keymanager.algorithm

The algorithm used by key manager factory for SSL connections. Default value is the key manager factory algorithm configured for the Java Virtual Machine.

  • Type: string
  • Default: SunX509
  • Importance: low

ssl.secure.random.implementation

The SecureRandom PRNG implementation to use for SSL cryptography operations.

  • Type: string
  • Default: null
  • Importance: low

ssl.trustmanager.algorithm

The algorithm used by trust manager factory for SSL connections. Default value is the trust manager factory algorithm configured for the Java Virtual Machine.

  • Type: string
  • Default: PKIX
  • Importance: low

transaction.timeout.ms

The maximum amount of time in ms that the transaction coordinator will wait for a transaction status update from the producer before proactively aborting the ongoing transaction.If this value is larger than the transaction.max.timeout.ms setting in the broker, the request will fail with a InvalidTransactionTimeout error.

  • Type: int
  • Default: 60000
  • Importance: low

transactional.id

The TransactionalId to use for transactional delivery. This enables reliability semantics which span multiple producer sessions since it allows the client to guarantee that transactions using the same TransactionalId have been completed prior to starting any new transactions. If no TransactionalId is provided, then the producer is limited to idempotent delivery. Note that enable.idempotence must be enabled if a TransactionalId is configured. The default is null, which means transactions cannot be used. Note that transactions requires a cluster of at least three brokers by default what is the recommended setting for production; for development you can change this, by adjusting broker setting transaction.state.log.replication.factor.

  • Type: string
  • Default: null
  • Valid Values: non-empty string
  • Importance: low