Log and Network Metrics

This topic describes JMX metrics for Kafka logs and network components. These metrics are useful for monitoring storage and network performance in your Kafka cluster.

For information about how to configure JMX, see Configure JMX for Monitoring.

Log metrics

The following metrics are available for monitoring logs, log cleaners, and log cleaner managers.

A log directory is a directory on disk that contains one or more Kafka log segments. When a Kafka broker starts up, it registers all of the log directories that it finds on its local disk with the log manager. The log manager is responsible for managing the creation, deletion, and cleaning of log segments across all registered log directories.

cleaner-recopy-percent

MBean: kafka.log:type=LogCleaner,name=cleaner-recopy-percent

A metric to track the recopy rate of each thread’s last cleaning.

compacted-partition-bytes

MBean: kafka.log:type=LogCleanerManager,name=compacted-partition-bytes

A gauge metric that provides compacted data in each partition.

compacted-partition-local-bytes

MBean: kafka.log:type=LogCleanerManager,name=compacted-partition-local-bytes

A gauge metric that provides local compacted data in each partition. Confluent Server only.

compacted-partition-tiered-bytes

MBean: kafka.log:type=LogCleanerManager,name=compacted-partition-tiered-bytes

A gauge metric that provides compacted data in each partition when using tiered storage. Confluent Server only.

DeadThreadCount

MBean: kafka.log:type=LogCleaner,name=DeadThreadCount

Provides the number of dead log cleaner threads for the process.

LogDirectoryOffline

MBean: kafka.log:logDirectory="{/log-directory}",type=LogManager,name=LogDirectoryOffline

A metric that indicates if a log directory is offline (1) or online (0).

LogEndOffset

MBean: kafka.log:type=Log,name=LogEndOffset

The offset at which the next message will be appended to a partition, which is one greater than the offset of the last message. Use with LogStartOffset to calculate the current message count for the partition.

LogFlushRateAndTimeMs

MBean: kafka.log:type=LogFlushStats,name=LogFlushRateAndTimeMs

Log flush rate and time in milliseconds.

LogStartOffset

MBean: kafka.log:type=Log,name=LogStartOffset

The offset of the first message in a partition. Use with LogEndOffset to calculate the current message count for the partition.
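
The calculation described above can be sketched in Python. This is a minimal illustration, not part of Kafka or Confluent Platform: the function names and the sample readings are hypothetical, and it assumes LogEndOffset reports the offset at which the next message will be appended, so that the difference from LogStartOffset is the number of offsets the partition spans. On compacted or transactional topics the result is an upper bound, because some offsets no longer hold a consumer-visible record.

```python
from typing import Dict, Tuple


def estimate_message_count(log_start_offset: int, log_end_offset: int) -> int:
    """Estimate how many offsets one partition currently spans."""
    if log_end_offset < log_start_offset:
        raise ValueError("LogEndOffset must not be smaller than LogStartOffset")
    return log_end_offset - log_start_offset


def estimate_topic_message_count(partition_offsets: Dict[int, Tuple[int, int]]) -> int:
    """Sum the per-partition estimates to get a figure for the whole topic."""
    return sum(
        estimate_message_count(start, end)
        for start, end in partition_offsets.values()
    )


# Hypothetical gauge readings: partition -> (LogStartOffset, LogEndOffset)
sample_offsets = {0: (100, 250), 1: (0, 40), 2: (75, 75)}
topic_total = estimate_topic_message_count(sample_offsets)
print(topic_total)  # 150 + 40 + 0 = 190
```

Because both gauges are reported per partition, a topic-level figure requires summing across every partition of the topic, as the second function does.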

max-buffer-utilization-percent

MBean: kafka.log:type=LogCleaner,name=max-buffer-utilization-percent

A metric to track the maximum utilization of any thread’s buffer in the last cleaning.

max-clean-time-secs

MBean: kafka.log:type=LogCleaner,name=max-clean-time-secs

A metric to track the maximum cleaning time for the last cleaning from each thread.

max-compaction-delay-secs

MBean: kafka.log:type=LogCleaner,name=max-compaction-delay-secs

A metric that provides the maximum delay, in seconds, that the log cleaner will wait before compacting a log segment.

max-dirty-percent

MBean: kafka.log:type=LogCleanerManager,name=max-dirty-percent

A gauge metric that provides the maximum percentage of the log segment that can contain dirty messages before the log cleaner will begin cleaning it.

NumLogSegments

MBean: kafka.log:type=Log,name=NumLogSegments

The number of log segments that currently exist for a given partition.

OfflineLogDirectoryCount

MBean: kafka.log:type=LogManager,name=OfflineLogDirectoryCount

A metric that describes the number of log directories that are registered with the log manager, but are currently offline.

OffsetIndexAppendTimeMs

MBean: kafka.log:type=SegmentStats,name=OffsetIndexAppendTimeMs

The time in milliseconds to append an entry to the log segment offset index. The offset index maps from logical offsets to physical file positions. Available on Confluent Server only.

remainingLogsToRecover

MBean: kafka.log:type=LogManager,name=remainingLogsToRecover

The number of remaining logs for each log directory to be recovered. This metric provides an overview of the recovery progress for a given log directory.

remainingSegmentsToRecover

MBean: kafka.log:type=LogManager,name=remainingSegmentsToRecover

The number of remaining segments assigned to the currently active recovery thread.

SegmentAppendTimeMs

MBean: kafka.log:type=SegmentStats,name=SegmentAppendTimeMs

The time in milliseconds to append a record to the log segment. Available on Confluent Server only.

Size

MBean: kafka.log:type=Log,name=Size

A metric for the total size in bytes of all log segments that belong to a given partition.

time-since-last-run-ms

MBean: kafka.log:type=LogCleanerManager,name=time-since-last-run-ms

A gauge metric that provides the time, in milliseconds, since the log cleaner last ran.

TimestampIndexAppendTimeMs

MBean: kafka.log:type=SegmentStats,name=TimestampIndexAppendTimeMs

The time in milliseconds to append an entry to the log segment timestamp index. Available on Confluent Server only.

uncleanable-bytes

MBean: kafka.log:logDirectory="{/log-directory}",type=LogCleanerManager,name=uncleanable-bytes

A metric that provides information about the total number of bytes in log segments for a specified log directory that cannot be cleaned by the log cleaner due to retention policy or other configuration settings.

uncleanable-partitions-count

MBean: kafka.log:logDirectory="{/log-directory}",type=LogCleanerManager,name=uncleanable-partitions-count

A gauge metric that tracks the number of partitions marked as uncleanable for each log directory.

Network metrics

Kafka brokers can generate a lot of network traffic because they collect and distribute data for processing. You can use the following metrics to help evaluate whether your Kafka deployment is communicating efficiently.

DeprecatedRequestsPerSec

MBean: kafka.network:type=RequestMetrics,name=DeprecatedRequestsPerSec,request={apiName},version={apiVersion},clientSoftwareName={clientSoftwareName},clientSoftwareVersion={clientSoftwareVersion}

The rate of requests that use deprecated API versions. The MBean name identifies the request API, the request API version, the client software name, and the client software version. This is an important metric to track because support for deprecated clients will be removed in Confluent Platform 8.0. Note that this metric is only exposed if a client has used a deprecated API since the server’s last restart.

ErrorsPerSec

MBean: kafka.network:type=RequestMetrics,name=ErrorsPerSec,request={requestType},error={error-code}

Indicates the number of errors in responses counted per-request-type, per-error-code. If a response contains multiple errors, all are counted. error=NONE indicates successful responses. Available for request types: InitProducerId, Metadata, ApiVersions, DescribeCluster, Fetch, DescribeConfigs, FindCoordinator, ListGroups, CreateTopics, UpdateMetadata, ListOffsets, LeaderAndIsr, Produce, AllocateProducerIds.
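
As a rough illustration of how these counts might be used, the following Python sketch computes the share of non-NONE entries for a single request type. The function name and the sample counts are hypothetical. Because a response with several errors is counted once per error, the result is a share of counted error codes, not strictly a share of responses.

```python
from typing import Dict


def error_fraction(counts_by_error: Dict[str, float]) -> float:
    """Return the share of counted entries whose error code is not NONE."""
    total = sum(counts_by_error.values())
    if total == 0:
        return 0.0  # nothing counted yet, so report no errors
    failures = sum(
        count for code, count in counts_by_error.items() if code != "NONE"
    )
    return failures / total


# Hypothetical ErrorsPerSec counts for request=Produce, keyed by the error tag
produce_counts = {
    "NONE": 980.0,
    "NOT_LEADER_OR_FOLLOWER": 15.0,
    "REQUEST_TIMED_OUT": 5.0,
}
produce_error_fraction = error_fraction(produce_counts)
print(f"{produce_error_fraction:.1%}")  # 20 of 1000 entries are errors
```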

ExpiredConnectionsKilledCount

MBean: kafka.network:type=SocketServer,name=ExpiredConnectionsKilledCount

The total number of connections disconnected, across all processors, due to a client not re-authenticating and then using the connection beyond its expiration time for anything other than re-authentication. Ideally 0 when re-authentication is enabled, implying there are no longer any older, pre-2.2.0 clients connecting to this broker.

LocalTimeMs

MBean: kafka.network:type=RequestMetrics,name=LocalTimeMs,request={Produce|FetchConsumer|FetchFollower}

Time in milliseconds the request is processed at the leader.

MessageConversionsTimeMs

MBean: kafka.network:type=RequestMetrics,name=MessageConversionsTimeMs,request={Produce|Fetch}

Time in milliseconds spent on message format conversions.

NetworkProcessorAvgIdlePercent

MBean: kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent

The average fraction of time the network processor threads are idle. Values are between 0 (all resources are used) and 1 (all resources are available). Because a low value means the threads are saturated, this should ideally be greater than 0.3.

RemoteTimeMs

MBean: kafka.network:type=RequestMetrics,name=RemoteTimeMs,request={requestType}

Time in milliseconds the request waits for the follower. This is non-zero for produce requests when acks=-1 or acks=all. Available for: Produce, FetchConsumer, FetchFollower, TxnOffsetCommit.

RequestBytes

MBean: kafka.network:type=RequestMetrics,name=RequestBytes,request={requestType}

Request size in bytes. Available for request type: OffsetCommit.

RequestQueueSize

MBean: kafka.network:type=RequestChannel,name=RequestQueueSize

Size of the request queue. A congested request queue will not be able to process incoming or outgoing requests.

RequestQueueTimeMs

MBean: kafka.network:type=RequestMetrics,name=RequestQueueTimeMs,request={Produce|FetchConsumer|FetchFollower}

Time in milliseconds that the request waits in the request queue.

RequestsPerSec

MBean: kafka.network:type=RequestMetrics,name=RequestsPerSec,request={requestType},version=([0-9]+)

The request rate. version refers to the API version of the request type. To get the total count for a specific request type, make sure that JMX monitoring tools aggregate across different versions. Available for request types: Produce, FetchConsumer, FetchFollower, DescribeCluster, ListOffsets, Metadata, UpdateMetadata, LeaderAndIsr, Fetch.
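
The aggregation across versions can be sketched as follows. This Python example is an assumption-laden illustration, not a Kafka API: the helper names and the sample values are hypothetical, and the parser assumes that no key property value contains a comma, which holds for the RequestsPerSec names shown above.

```python
from collections import defaultdict
from typing import Dict


def parse_key_properties(mbean_name: str) -> Dict[str, str]:
    """Split 'domain:k1=v1,k2=v2' into a dict of its key properties."""
    _domain, _sep, props = mbean_name.partition(":")
    return dict(item.split("=", 1) for item in props.split(","))


def total_by_request_type(counts: Dict[str, float]) -> Dict[str, float]:
    """Add up RequestsPerSec values over all versions of each request type."""
    totals = defaultdict(float)
    for name, count in counts.items():
        props = parse_key_properties(name)
        if props.get("name") == "RequestsPerSec":
            totals[props["request"]] += count
    return dict(totals)


# Hypothetical counts collected from a broker, keyed by full MBean name
prefix = "kafka.network:type=RequestMetrics,name=RequestsPerSec"
sample_counts = {
    prefix + ",request=Produce,version=8": 300.0,
    prefix + ",request=Produce,version=9": 1200.0,
    prefix + ",request=FetchConsumer,version=12": 800.0,
}
request_totals = total_by_request_type(sample_counts)
print(request_totals)
```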

ResponseQueueSize

MBean: kafka.network:type=RequestChannel,name=ResponseQueueSize

Size of the response queue. The response queue is unbounded. A congested response queue can result in delayed response times and memory pressure on the broker.

ResponseSendTimeMs

MBean: kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request={Produce|FetchConsumer|FetchFollower}

Time in milliseconds to send the response.

TemporaryMemoryBytes

MBean: kafka.network:type=RequestMetrics,name=TemporaryMemoryBytes,request={requestType}

Temporary memory, in bytes, used for message format conversions and decompression. Available for request types: Produce, Fetch.

ThrottleTimeMs

MBean: kafka.network:type=RequestMetrics,name=ThrottleTimeMs,request={requestType}

The time in milliseconds that requests are throttled due to quota enforcement. Available for request types: Produce, Fetch, RequestQuotas, OffsetFetch.

TotalTimeMs

MBean: kafka.network:type=RequestMetrics,name=TotalTimeMs,request={requestType}

The total time in milliseconds to serve the specified request. Available for request types: DescribeCluster, JoinGroup, LeaveGroup, ListOffsets, Metadata, OffsetCommit, AddOffsetsToTxn, AddPartitionsToTxn, DescribeTransactions, EndTxn, ListTransactions, WriteTxnMarkers, UpdateFeatures, LeaderAndIsr.
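
One way to use TotalTimeMs alongside the per-stage timings in this section is to find the stage that accounts for the largest share of the total. The Python sketch below is only an illustration: the function name and sample values are hypothetical, and it assumes that you have collected comparable readings (for example, the mean of each metric) for the same request type. The stages listed in this topic are not guaranteed to add up exactly to TotalTimeMs.

```python
from typing import Dict, Tuple


def dominant_stage(total_ms: float, stages_ms: Dict[str, float]) -> Tuple[str, float]:
    """Return the slowest stage and its share of the total request time."""
    if total_ms <= 0:
        raise ValueError("total_ms must be positive")
    if not stages_ms:
        raise ValueError("stages_ms must not be empty")
    name = max(stages_ms, key=lambda stage: stages_ms[stage])
    return name, stages_ms[name] / total_ms


# Hypothetical mean timings for request=Produce, in milliseconds
sample_stages = {
    "RequestQueueTimeMs": 2.0,
    "LocalTimeMs": 8.0,
    "RemoteTimeMs": 35.0,
    "ThrottleTimeMs": 0.0,
    "ResponseSendTimeMs": 5.0,
}
slowest, share = dominant_stage(50.0, sample_stages)
print(slowest, f"{share:.0%}")
```

A large RemoteTimeMs share for produce requests points at follower replication, while a large RequestQueueTimeMs share points at a congested request queue.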