Troubleshoot Confluent Gateway Docker Deployments

Use this page to collect diagnostics and resolve common issues when running Confluent Gateway with Docker. Start with the log-capture and debug-logging steps that follow, then consult Known and common issues for symptom-based resolutions.

Capture Confluent Gateway logs manually

The Confluent Gateway container logs capture most initialization failures, including network timeouts and credential errors. Inspect them first using the following Docker commands:

# View all Gateway logs
docker logs gateway

# Filter for errors (2>&1 also sends the container's stderr output through grep)
docker logs gateway 2>&1 | grep -i error

# Filter for license-related messages
docker logs gateway 2>&1 | grep -i license
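To keep a copy of the logs for later comparison or for a support request, a small wrapper along these lines could write both output streams to a file. This is a sketch only: the function name save_gateway_logs and the default file name are illustrative and not part of Confluent Gateway.

```shell
# Save a container's logs to a file and print how many lines mention "error".
# The function name and default file name are illustrative.
save_gateway_logs() {
  container="${1:-gateway}"
  outfile="${2:-gateway-logs.txt}"
  # docker logs replays the container's stdout and stderr on separate
  # streams, so merge them before writing the file.
  docker logs --timestamps "$container" > "$outfile" 2>&1 || return 1
  # grep -c exits non-zero when the count is 0; that is not a failure here.
  grep -ci error "$outfile" || true
}
```

For example, save_gateway_logs gateway gateway-startup.txt writes the file and prints the number of matching lines.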

Enable debug logging to view stack traces

The DEBUG log level captures detailed logs, including stack traces, to help diagnose complex Confluent Gateway issues.

The GATEWAY_ROOT_LOG_LEVEL environment variable controls the global logging level for Confluent Gateway. Set it to DEBUG in your docker-compose.yaml file, then recreate the container (for example, with docker compose up -d) so that the new value takes effect:

gateway:
  environment:
    GATEWAY_ROOT_LOG_LEVEL: "DEBUG"

Known and common issues

Confluent Gateway container exits or restarts repeatedly

Symptoms

  • The Confluent Gateway container terminates after startup.

  • Running the docker ps command shows the container is repeatedly restarting.

  • Running the docker compose up command shows that the service failed to start.

  • Logs contain configuration parsing errors, license validation failures, or stack traces.

  • Logs report an unrecognized configuration field, such as UnrecognizedPropertyException: Unrecognized field "tls", which indicates a misspelled or misplaced key.

Likely causes

  • Invalid or incomplete Confluent Gateway configuration file.

  • Misconfigured environment variables in the docker-compose.yaml file.

Resolution

  1. Inspect the logs for configuration and license errors: Identify messages indicating failed YAML parsing, unknown fields, or unrecognized license keys.

    docker logs gateway
    docker logs gateway 2>&1 | grep -i license
    
  2. Verify the configuration source: Confluent Gateway reads its configuration from one of the following, in order of precedence: the GATEWAY_CONFIG environment variable (inline YAML), the GATEWAY_CONFIG_TEMPLATE file, or the GATEWAY_CONFIG_FILE (default /etc/gateway/gateway-config.yaml). If you mount a configuration file, confirm that the volumes section in your docker-compose.yaml file maps it to the expected path in the container.

  3. Validate the configuration file structure:

    • Ensure that the streamingDomains section defines at least one streaming domain and a corresponding bootstrap server.

    • Ensure that the routes section defines at least one valid route definition.

  4. Validate Enterprise mode configuration: If you are using the Enterprise license, confirm that the GATEWAY_LICENSES environment variable contains valid and unexpired license keys.
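The checks above can be supported with docker inspect, which also works on a container that has already exited. The following sketch assumes the container is named gateway. The function names are illustrative, and the variable names are the ones listed in step 2 and step 4.

```shell
# Print the container's state, last exit code, and restart count.
gateway_state() {
  docker inspect \
    --format 'status={{.State.Status}} exit={{.State.ExitCode}} restarts={{.RestartCount}}' \
    "${1:-gateway}"
}

# List which Gateway configuration variables are set on the container.
# Only the names are printed, because the values can hold inline YAML or
# license keys.
gateway_config_sources() {
  docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "${1:-gateway}" \
    | grep -E '^GATEWAY_(CONFIG|CONFIG_TEMPLATE|CONFIG_FILE|LICENSES)=' \
    | cut -d= -f1
}
```

If more than one configuration variable is listed, the order of precedence in step 2 decides which one is used. Separately, docker compose config --quiet checks the syntax of docker-compose.yaml itself. It does not validate the Confluent Gateway configuration.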

Clients can’t connect to the Confluent Gateway endpoint

Symptoms

  • Client connections to the Confluent Gateway endpoint either time out or stall indefinitely during metadata requests.

  • Clients can directly connect to the Kafka cluster, but connections fail when routed through the Confluent Gateway.

Likely causes

  • The Confluent Gateway container can’t reach the Kafka bootstrap servers due to DNS, firewall, or network issues.

  • Clients are incorrectly configured to connect to broker addresses directly instead of the Confluent Gateway route endpoint.

  • The routes.endpoint value doesn’t match the host and port published by the Docker container.

  • The bootstrapServerId is mismatched across different sections of the configuration.

Resolution

  1. Confirm Docker port mappings: If the Confluent Gateway route endpoint is defined as gateway.example.com:19092, ensure that your docker-compose.yaml file publishes that port. For example:

    gateway:
      ports:
        - "19092:19092"
    
  2. Test Gateway to Kafka connectivity:

    • From within the container network or host where Confluent Gateway runs, test connectivity to the Kafka bootstrap endpoints defined in streamingDomains.kafkaCluster.bootstrapServers[].endpoint.

    • Inspect DNS resolution, firewall rules, and network routing.

  3. Review logs for authentication issues: Inspect logs for repeated connection attempts or messages such as "terminated during authentication". If these messages appear, verify your authentication and SSL/TLS configuration settings. For more information on security configuration, see Configure Security for Confluent Private Cloud Gateway.

  4. Verify broker endpoint resolution: When using the host-based broker identification strategy (routes.brokerIdentificationStrategy.type: host), verify that clients resolve individual broker addresses to the Confluent Gateway’s network load balancer (NLB). If your broker address pattern is defined as b$(nodeId).gateway.example.com, ensure that your DNS provider is configured to route these subdomains correctly.

    Required DNS configuration:

    To support dynamic broker identification, you must configure a wildcard DNS record pointing to your Confluent Gateway NLB:

    Record type     Host or pattern           Target
    A               gateway.example.com       <gateway-nlb-host>
    A (Wildcard)    *.gateway.example.com     <gateway-nlb-host>

    Common misconfigurations to avoid:

    Avoid the following settings, which prevent clients from connecting to the brokers:

    • Invalid endpoint binding: routes.endpoint set to 0.0.0.0

      Result: The Confluent Gateway advertises broker endpoints as localhost or 0.0.0.0 to the client. The client attempts to connect to its own local machine instead of the Confluent Gateway.

    • Unresolvable host patterns: routes.brokerIdentificationStrategy.pattern set to b$(nodeId).<unresolvable-host>

      Result: While the Confluent Gateway might start successfully, clients encounter UnknownHostException because the generated broker addresses don’t exist in your DNS registry.

    • Missing wildcard records: If you only map the root domain (gateway.example.com) without the wildcard (*), individual broker lookups like b0.gateway.example.com fail.
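The connectivity test in step 2 and the DNS checks in step 4 can be scripted roughly as follows. This is a sketch under several assumptions: a Linux host with getent and a netcat build that supports -z and -w, broker node IDs that start at 0, and the b$(nodeId).gateway.example.com pattern used above. The function names are illustrative.

```shell
# Return 0 if the name resolves on this machine (Linux getent assumed).
resolves() {
  getent hosts "$1" > /dev/null
}

# Probe the bootstrap name and the first N broker names for a pattern such
# as b$(nodeId).gateway.example.com. Assumes node IDs start at 0.
check_broker_dns() {
  domain="$1"
  count="${2:-3}"
  failed=0
  id=0
  names="$domain"
  while [ "$id" -lt "$count" ]; do
    names="$names b$id.$domain"
    id=$((id + 1))
  done
  for name in $names; do
    if resolves "$name"; then
      echo "ok   $name"
    else
      echo "FAIL $name"
      failed=1
    fi
  done
  return "$failed"
}

# Return 0 if a TCP connection to host and port opens within five seconds.
port_open() {
  nc -z -w 5 "$1" "$2"
}
```

Run check_broker_dns gateway.example.com 3 on a client machine, because the client is what resolves the broker names. Run port_open with a Kafka bootstrap host and port from the host or container network where Confluent Gateway runs.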

Protocol or SSL/TLS mismatch

Symptoms

  • Client logs or Confluent Gateway logs show errors such as frame size exceptions or TLS handshake failures.

  • Clients use SASL_SSL or SSL but the route is configured as plaintext, or vice versa.

Likely causes

  • The client’s security.protocol doesn’t match the configured security for the route.

  • TLS-encrypted traffic is directed to a plaintext route endpoint, or plaintext traffic is directed to an SSL-only route.

Resolution

  1. Confirm the route configuration. If you change it, restart the Confluent Gateway service and verify that client connectivity is restored.

    • Inspect the gateway.routes[].security.ssl settings in the Confluent Gateway configuration file.

    • If security.ssl is omitted, the route expects plaintext traffic.

    • If security.ssl is defined, configure the client to use an SSL protocol, such as SASL_SSL or SSL.

  2. Align client configuration:

    • For unencrypted connections, set the client protocol to PLAINTEXT.

    • When using SASL authentication without TLS, set the client protocol to SASL_PLAINTEXT.

    • For TLS routing, set the client protocol to SSL or SASL_SSL and configure the truststore and keystore as required.
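If you are unsure whether a route endpoint expects TLS, you can probe it with the OpenSSL command-line client. The sketch below assumes openssl is installed on the client machine. The function name probe_tls is illustrative. A failed probe can't distinguish a plaintext route from an unreachable one, and against a plaintext listener the probe may wait until the server closes the connection.

```shell
# Report whether an endpoint completes a TLS handshake.
# $1 is host:port, for example gateway.example.com:19092.
probe_tls() {
  # Reading stdin from /dev/null makes s_client exit once the handshake
  # has finished instead of waiting for interactive input.
  if openssl s_client -connect "$1" < /dev/null > /dev/null 2>&1; then
    echo "$1: TLS handshake succeeded; use SSL or SASL_SSL"
  else
    echo "$1: no TLS handshake; route is plaintext or unreachable"
  fi
}
```

The probe only checks the transport layer. It doesn't validate the certificate chain or any SASL settings.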

Authentication swap not behaving as expected

Symptoms

  • Clients using authentication swap fail to authenticate with the backend cluster.

  • Logs report errors related to secret store lookups, Java Authentication and Authorization Service (JAAS) configuration failures, or callback handler initialization.

  • Backend brokers report invalid credentials despite the client providing valid local certificates.

Likely causes

  • Route security isn’t configured with auth: swap.

  • The secretStore name referenced in the Confluent Gateway configuration is incorrect or missing in the routes section.

  • Secret store provider type is incorrect, or the credentials required to access the store are misconfigured.

Resolution

  1. Verify route swap configuration: Ensure the defined routes perform authentication swapping.

    gateway:
      routes:
        - name: <route_name>
          security:
            auth: swap
            swapConfig: <swap-config>
    
  2. Avoid hard-coded broker credentials for cluster authentication: Don’t embed static broker credentials in the Confluent Gateway configuration. For swap scenarios, credentials must be resolved dynamically from a supported secret store, such as HashiCorp Vault or a local file.

  3. Validate your secret store configuration:

    • Provider types are case-sensitive. Verify that you are using the exact casing required. For example, use Azure, not AZURE.

    • Confirm the Confluent Gateway has the necessary permissions to read secrets from the provider.

    • Ensure the Confluent Gateway can reach the secret store’s API endpoint over the network.

    For secret store configuration details, see Configure Security for Confluent Private Cloud Gateway.

OAuth token endpoint not allowed

Symptoms

The Confluent Gateway fails to fetch OAuth tokens, and logs report a security restriction similar to: ... is not allowed. Update system property 'org.apache.kafka.sasl.oauthbearer.allowed.urls' to allow https://...

Likely causes

For security purposes, the Confluent Gateway sends outbound OAuth token requests only to URLs on an allowlist. The token URL configured for your identity provider isn’t on that allowlist.

Resolution

  1. Add the OAuth token endpoint to the GATEWAY_OPTS environment variable in the docker-compose.yaml file:

    gateway:
      environment:
        GATEWAY_OPTS: "-Dorg.apache.kafka.sasl.oauthbearer.allowed.urls=<tokenEndpointUri>"
    

    Replace <tokenEndpointUri> with your identity provider’s OAuth token endpoint URL. For example, if you are using Microsoft Entra ID (formerly Azure AD), the token endpoint typically follows the pattern: https://login.microsoftonline.com/<tenantId>/oauth2/v2.0/token.

  2. Restart the Confluent Gateway service to apply the configuration changes. Check the logs to confirm that the "is not allowed" error no longer appears during the authentication handshake.

License and route limit issues

Symptoms

  • The Confluent Gateway fails to initialize additional routes.

  • Logs report warnings about the license or about reaching the route limit.

Likely causes

Trial mode limits Confluent Gateway to a maximum of four routes. To configure more routes, an Enterprise license is required.

Resolution

  1. Understand license modes: For a comparison of Trial and Enterprise modes, see License modes.

  2. Apply an Enterprise license: To enable more than four routes, add your license key to the GATEWAY_LICENSES environment variable in your docker-compose.yaml file. For configuration details and examples, see Enterprise license configuration (optional).

  3. Inspect the license status: After restarting the service, run docker logs gateway 2>&1 | grep -i license and verify that the Confluent Gateway has recognized the license.

Container fails to start on a read-only file system

Symptoms

  • The Confluent Gateway container fails to start, and logs show an error such as Unable to create file /usr/logs/gateway.log or java.io.IOException: Read-only file system.

  • Logs report log4j RollingFileAppender initialization failures.

Likely causes

Confluent Gateway writes log files to /usr/logs and its rendered configuration to /etc/gateway. Running with a read-only root file system, for example with the --read-only flag or a read-only volume mount, blocks these writes and prevents the container from starting.

Resolution

  1. Confirm that the error references a path that Confluent Gateway must write to, such as the log directory (/usr/logs) or the configuration directory (/etc/gateway).

  2. Mount writable volumes for /usr/logs and /etc/gateway. If you run the container with --read-only, add a writable tmpfs or volume mount for these paths.
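To see which writable mounts the container currently has, docker inspect can list them. The following is a sketch that assumes the container is named gateway. The function names are illustrative, and the check only matters when the container runs with a read-only root file system. It looks for a mount at exactly /usr/logs and /etc/gateway, so a mount of a single file inside those directories isn't counted.

```shell
# List the container's mount destinations with a true/false writable flag.
# tmpfs mounts are reported separately by Docker and are always writable.
gateway_mounts() {
  docker inspect --format \
    '{{range .Mounts}}{{.Destination}} {{.RW}}{{println}}{{end}}{{range $p, $o := .HostConfig.Tmpfs}}{{$p}} true{{println}}{{end}}' \
    "${1:-gateway}"
}

# Report whether /usr/logs and /etc/gateway each have a writable mount.
check_writable_paths() {
  mounts="$(gateway_mounts "${1:-gateway}")" || return 2
  missing=0
  for path in /usr/logs /etc/gateway; do
    if printf '%s\n' "$mounts" | grep -qxF "$path true"; then
      echo "ok      $path"
    else
      echo "missing $path"
      missing=1
    fi
  done
  return "$missing"
}
```

To confirm that the root file system is read-only in the first place, run docker inspect --format '{{.HostConfig.ReadonlyRootfs}}' gateway.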