Troubleshoot SSO for Control Center (Legacy) on Confluent Platform
Important
To use SSO with Confluent Control Center (Legacy), your installation must use Confluent Platform version 7.5 or later. SSO for Confluent Control Center (Legacy) using OIDC cannot be used with a combination of on-premises Confluent Platform clusters, where your Confluent Control Center (Legacy) is self-managed, and Confluent Cloud clusters, which use SAML for SSO.
If you are having trouble with SSO using OIDC, review the following common issues encountered during configuration and possible solutions.
If you are still having trouble after reviewing the issues and solutions, contact Confluent Support.
Common issues
The following sections might be relevant to you:
Misconfiguration of identity provider endpoints in Confluent Platform
Misconfiguration of client credentials in Confluent Platform
Misconfiguration of redirect callback URLs (to Confluent Platform) in the identity provider
Misconfiguration of claims
Session management problems
Connectivity issues between the identity provider and Confluent Platform
Misconfiguration of identity provider endpoints in Confluent Platform
The following are common misconfigurations of the identity provider endpoint settings in the Confluent Platform cluster. The endpoint settings are Authorize, Token, Issuer, and JWKS.
Token endpoint misconfiguration
Problem
Confluent Platform is unable to request a token from the identity provider because the token endpoint is misconfigured.
Here’s an example of the error message:
/* Example error when token endpoint is misconfigured */
{
"status_code": 500,
"message": "java.lang.RuntimeException: Got bad request status from IdP: {\"error\":\"invalid_request\",\"error_description\":\"AADSTS900023: Specified tenant identifier '0893715bxxx-959b-4906-a185-2789e1ead045' is neither a valid DNS name, nor a valid external domain.\\r\\nTrace ID: 907463ce-d595-4dcc-a89b-4098d5d61600\\r\\nCorrelation ID: f8c6cf9c-938d-411a-9ccc-de6d69114e03\\r\\nTimestamp: 2023-06-20 09:31:58Z\",\"error_codes\":[900023],\"timestamp\":\"2023-06-20 09:31:58Z\",\"trace_id\":\"907463ce-d595-4dcc-a89b-4098d5d61600\",\"correlation_id\":\"f8c6cf9c-938d-411a-9ccc-de6d69114e03\",\"error_uri\":\"https://login.microsoftonline.com/error?code=900023\"}"
}
Solution
The token configuration needs to be fixed.
Ansible Playbooks for Confluent Platform: sso_token_uri
Confluent for Kubernetes: tokenBaseEndpointUri
Additional details
Use OpenID Connect metadata discovery URIs to verify the correct endpoints for your identity provider. To get details on the endpoints for your identity provider, check their documentation. See 1 - Establish trust between the IdP and Confluent Platform for more information about endpoints for Okta, Microsoft Entra ID (Azure Active Directory), and Keycloak.
The error is visible on the web browser during authentication and also is output to the MDS log files for your Confluent Platform cluster. By default, the log location is /tmp/kafka-logs, but you can modify the location using the log.dirs value in your broker configuration files or using the Java system property while starting the Confluent Platform cluster with -Dkafka.logs.dir.
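As a rough illustration of the discovery check described above, the following Python sketch downloads an identity provider's OpenID Connect discovery document and compares it with the values you configured. The field names (authorization_endpoint, token_endpoint, issuer, jwks_uri) and the /.well-known/openid-configuration path come from the OpenID Connect Discovery standard. The function names and the shape of the configured dictionary are assumptions made for this example and are not part of Confluent Platform.

```python
import json
import urllib.request

# Standard OpenID Connect Discovery field names mapped to the labels used in this guide.
DISCOVERY_FIELDS = {
    "authorization_endpoint": "Authorize",
    "token_endpoint": "Token",
    "issuer": "Issuer",
    "jwks_uri": "JWKS",
}


def fetch_discovery_document(issuer_url, timeout=10):
    """Download <issuer>/.well-known/openid-configuration and parse it as JSON."""
    url = issuer_url.rstrip("/") + "/.well-known/openid-configuration"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def find_endpoint_mismatches(configured, discovery):
    """Return {label: (configured_value, advertised_value)} for each mismatch.

    `configured` is a hypothetical dict keyed by discovery field name that holds
    the values you copied from your Ansible or CFK configuration.
    """
    mismatches = {}
    for field, label in DISCOVERY_FIELDS.items():
        if field in configured and configured[field] != discovery.get(field):
            mismatches[label] = (configured[field], discovery.get(field))
    return mismatches
```

To use it, pass the issuer URL of your identity provider to fetch_discovery_document, then pass the result and your configured values to find_endpoint_mismatches. An empty result means the configured endpoints match what the identity provider advertises.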
Issuer endpoint misconfiguration
Problem
Confluent Platform is unable to verify the token authenticity because the token issuer endpoint is incorrectly configured.
Here’s an example of the error message:
/* Example error when issuer is misconfigured */
{
"status_code": 403,
"message": "org.jose4j.jwt.consumer.InvalidJwtException: JWT (claims->{\"aud\":\"429a995a-de64-469a-b11d-69ca4344fdc2\",\"iss\":\"https://login.microsoftonline.com/0893715b-959b-4906-a185-2789e1ead045/v2.0\",\"iat\":1687245493,\"nbf\":1687245493,\"exp\":1687249393,\"groups\":[\"99b49608-15fc-48f8-a7c7-d1d4d7ff03de\",\"e8c31aa4-be6f-4d92-b888-7d595dc3f42e\"],\"rh\":\"0.ARsAW3GTCJuVBkmhhSeJ4erQRUozdOYN3ihLqe2mKChc58QbAHs.\",\"sub\":\"uW5lpf2zSoJ9K6O6hruUnx4LulcNUGoKR_viwsw010w\",\"tid\":\"0893715b-959b-4906-a185-2789e1ead045\",\"uti\":\"TZeBoX7yCE2LiIFqZyYVAA\",\"ver\":\"2.0\",\"wids\":[\"b79fbf4d-3ef9-4689-8143-76b194e85509\"]}) rejected due to invalid claims or other invalid content. Additional details: [[12] Issuer (iss) claim value (https://login.microsoftonline.com/0893715b-959b-4906-a185-2789e1ead045/v2.0) doesn't match expected value of https://login.microsoftonline.com/]",
"type": "CLIENT_ERROR"
}
Solution
The issuer configuration needs to be fixed.
Ansible Playbooks for Confluent Platform: sso_issuer_url
Confluent for Kubernetes: issuer
Additional details
Use OpenID Connect metadata discovery URIs to verify the correct endpoints for your identity provider. To get details on the endpoints for your identity provider, check their documentation. See 1 - Establish trust between the IdP and Confluent Platform for more information about endpoints for Okta, Microsoft Entra ID (Azure Active Directory), and Keycloak.
The error is visible on the web browser during authentication and also is output to the MDS log files for your Confluent Platform cluster. By default, the log location is /tmp/kafka-logs, but you can modify the location using the log.dirs value in your broker configuration files or using the Java system property while starting the Confluent Platform cluster with -Dkafka.logs.dir.
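The example error above shows the iss claim in the token differing from the expected issuer value. A minimal Python sketch of that comparison follows. It decodes the token payload without verifying the signature, so it is suitable only for inspecting a token while troubleshooting. The function names are hypothetical, and the exact-match comparison is an assumption based on the error text.

```python
import base64
import json


def decode_jwt_claims(token):
    """Decode the payload segment of a JWT without verifying the signature."""
    payload_segment = token.split(".")[1]
    # Restore the base64url padding that JWTs strip off.
    padding = "=" * (-len(payload_segment) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_segment + padding))


def issuer_matches(token, expected_issuer):
    """True only if the token's iss claim equals the configured issuer exactly."""
    return decode_jwt_claims(token).get("iss") == expected_issuer
```

In the example error, the configured issuer is https://login.microsoftonline.com/ while the token carries the full tenant-specific URL ending in /v2.0, so a check like this returns False until the configured value is set to the full issuer URL.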
JWKS endpoint misconfiguration
Problem
Confluent Platform is unable to verify the authenticity of the token because the JWKS URI is misconfigured, or because the keys used to verify the token are expired or not updated in the identity provider.
Here’s an example of the error message:
/* Example error when JWKS uri does not contain keys required to verify JWT */
{
"status_code": 403,
"message": "org.jose4j.jwt.consumer.InvalidJwtException: JWT processing failed. Additional details: [[17] Unable to process JOSE object (cause: org.jose4j.lang.UnresolvableKeyException: Unable to find a suitable verification key for JWS w/ header ........",
"type": "CLIENT_ERROR"
}
Solution
The JWKS configuration needs to be fixed.
Ansible Playbooks for Confluent Platform: sso_jwks_uri
Confluent for Kubernetes: jwksEndpointUri
Also, check to see if the signing keys are updated and not expired.
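One way to check whether the JWKS document contains the key that signed a token is to compare the kid value in the token header with the kid values in the JWKS keys array. The following Python sketch does that, assuming you have already downloaded the JWKS document as a dictionary. The function names are hypothetical, and the sketch does not verify signatures or key expiry.

```python
import base64
import json


def decode_jwt_header(token):
    """Decode the header segment of a JWT (no signature verification)."""
    header_segment = token.split(".")[0]
    # Restore the base64url padding that JWTs strip off.
    padding = "=" * (-len(header_segment) % 4)
    return json.loads(base64.urlsafe_b64decode(header_segment + padding))


def jwks_has_signing_key(token, jwks):
    """True if the JWKS document lists a key whose kid matches the token header."""
    kid = decode_jwt_header(token).get("kid")
    if kid is None:
        return False
    return any(key.get("kid") == kid for key in jwks.get("keys", []))
```

If this returns False for a freshly issued token, the configured JWKS URI probably points to the wrong key set, or the identity provider has rotated its signing keys.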
Misconfiguration of client credentials in Confluent Platform
The following are common misconfigurations of client credentials in the Confluent Platform cluster for client configurations: Client ID and Client Secret.
Client ID misconfiguration
Problem
You see a 400 Bad Request error during authentication. The identity provider
is unable to recognize the client application for Confluent Platform because the client ID is
misconfigured in the request.
Here’s an example of the error message:
In the browser window, you see a 400 Bad Request error and an error message like this:
"Your request resulted in an error."
Solution
Fix the client ID value. For example, in Ansible Playbooks for Confluent Platform, fix the value of sso_client_id
in the inventory file.
Additional details
The error is visible on the web browser during authentication and also is output
to the MDS log files for your Confluent Platform cluster. By default, the log location is
/tmp/kafka-logs, but you can modify the location using the log.dirs
value in your broker configuration files or using the Java system property while
starting the Confluent Platform cluster with -Dkafka.logs.dir.
Client secret misconfiguration
Problem
The identity provider is unable to authenticate because the client secret configured is incorrect.
Here’s an example of an error message when the client secret is wrong for Microsoft Entra ID (Azure Active Directory):
java.lang.RuntimeException: Failed to retrieve tokens from IDP with status: 401. Error: {"error":"invalid_client","error_description":"AADSTS7000215: Invalid client secret provided. Ensure the secret being sent in the request is the client secret value, not the client secret ID, for a secret added to app '429a995a-de64-469a-b11d-69ca4344fdc2'.
Trace ID: bfb6ad53-766f-440d-ab71-7d1cbe6a1100
Correlation ID: f7ba524c-038f-4009-aa80-099730a8f79c
Timestamp: 2023-06-20 08:16:23Z","error_codes":[7000215],"timestamp":"2023-06-20 08:16:23Z","trace_id":"bfb6ad53-766f-440d-ab71-7d1cbe6a1100","correlation_id":"f7ba524c-038f-4009-aa80-099730a8f79c","error_uri":"https://login.microsoftonline.com/error?code=7000215"}
Solution
Fix the client secret value. For example, in Ansible Playbooks for Confluent Platform, it’s the value of sso_client_password
in the inventory file.
Additional details
The error is visible on the web browser during authentication and also is output
to the MDS log files for your Confluent Platform cluster. By default, the log location is
/tmp/kafka-logs, but you can modify the location using the log.dirs
value in your broker configuration files or using the Java system property
while starting the Confluent Platform cluster with -Dkafka.logs.dir.
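When searching the log files, it can help to pull the identity provider's JSON error out of the surrounding Java exception text. The following Python sketch extracts the first JSON object embedded in a message like the one shown above. The function name is hypothetical, and the sketch assumes the JSON object starts at the first opening brace in the message.

```python
import json


def extract_idp_error(message):
    """Return the first JSON object embedded in an error message, or None."""
    start = message.find("{")
    if start == -1:
        return None
    # strict=False tolerates raw newlines inside string values, which appear
    # in some identity provider error descriptions.
    decoder = json.JSONDecoder(strict=False)
    try:
        error, _end = decoder.raw_decode(message, start)
    except json.JSONDecodeError:
        return None
    return error
```

For the client secret example above, the extracted object has an error value of invalid_client, which points to the client credentials and not to the endpoints.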
Misconfiguration of redirect callback URLs (to Confluent Platform) in the identity provider
The following misconfiguration occurs when the identity provider is unable to redirect back to Confluent Platform or Confluent Control Center (Legacy) because the redirect URI is incorrect.
Redirect URI misconfiguration
Problem
The redirect URI in the client application is misconfigured.
Here’s an example of the error message:
The browser window displays a 400 Bad Request and an error message like this:
“Error: The redirect_uri parameter must be a Login Redirect URI in the
client app settings.”
Solution
Make sure that the redirect_uri you’re using in your authentication request
is exactly the same as the URI you’ve set up in your client application settings
at the identity provider.
Update the redirect URI in the client application with:
https://<c3-host-name>:<c3-port-number>/api/metadata/security/1.0/oidc/authorization-code/callback
Replace the placeholders with the Confluent Control Center (Legacy) hostname and port number.
For guidance on how to update, the following articles might be helpful:
Microsoft Entra ID (Azure Active Directory): See How to change redirect_uri for Azure AD [Stack Overflow]
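Because the redirect URI must match exactly, it can be useful to build the expected value from the hostname and port and compare it with what is registered in the identity provider. The following Python sketch shows this, using the callback path given above. The function names and the example hostname and port are hypothetical.

```python
CALLBACK_PATH = "/api/metadata/security/1.0/oidc/authorization-code/callback"


def expected_redirect_uri(c3_host, c3_port):
    """Build the callback URL from the Control Center (Legacy) host and port."""
    return f"https://{c3_host}:{c3_port}{CALLBACK_PATH}"


def redirect_uri_registered(c3_host, c3_port, registered_uris):
    """True if the expected callback URL appears verbatim in the registered list."""
    return expected_redirect_uri(c3_host, c3_port) in registered_uris


# Example with a hypothetical host and port:
print(expected_redirect_uri("c3.example.com", 9021))
```

The comparison is a plain string match on purpose. Differences such as http versus https, a missing port, or a trailing slash are typical reasons for the identity provider to reject the request.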
Misconfiguration of claims
The following are common issues with either the subject (sub) claim name or
the groups (groups) claim name in configurations.
Subject (sub) claim misconfiguration
Problem
Confluent Platform is unable to use the token provisioned by the identity provider because:
The sub claim is missing in the identity token.
The value of the sub claim is empty or null.
The value of the sub claim is not interpretable. For example, the identity token contains an array of strings for the sub claim value instead of a simple string. The sub claim value should be a string.
Here’s an example of an error message when the sub claim is missing in the
identity token:
{
"status_code": 403,
"message": "io.confluent.tokenapi.exceptions.InvalidTokenException: myclaim(sub claim) not present",
"type": "CLIENT_ERROR"
}
Solution
The configured sub claim name in the Ansible Playbooks for Confluent Platform or Confluent for Kubernetes inventory file should be an interpretable unique key to identify the user.
Check the identity provider for the correct claim name that uniquely identifies the user, and configure sso_sub_claim in Ansible Playbooks for Confluent Platform or subClaimName in Confluent for Kubernetes accordingly.
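The three failure cases listed in the problem description can be checked against a decoded set of token claims. The following Python sketch is one way to do that. It assumes the claims are already available as a dictionary, and the function name and message wording are hypothetical and do not match Confluent Platform's own error messages.

```python
def check_subject_claim(claims, claim_name="sub"):
    """Return a description of the problem, or None if the claim is usable."""
    if claim_name not in claims:
        return f"{claim_name} claim is missing"
    value = claims[claim_name]
    if value is None or value == "":
        return f"{claim_name} claim is empty or null"
    if not isinstance(value, str):
        return f"{claim_name} claim must be a string, got {type(value).__name__}"
    return None
```

Pass the claim name you configured as the second argument to check a claim other than sub.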
Groups (groups) claim misconfiguration
Problem
The groups claim name is configured incorrectly.
Here’s an example of the error message:
{
"error_type": "TypeMismatch",
"message": "groups is not a List. Actual type: String"
}
Solution
Check the identity provider for the correct claim name that returns all the groups a user is a member of.
Ansible Playbooks for Confluent Platform: Change the value of sso_groups_claim.
Confluent for Kubernetes: Change the value of groupsClaimName.
The groups claim value in the identity token should be a list of groups the user is a member of.
The claim name in Confluent Platform configurations in Ansible Playbooks for Confluent Platform and Confluent for Kubernetes is configured as groups by default.
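The type mismatch in the example error can be reproduced against a decoded set of token claims. The following Python sketch checks that the groups claim is a list. The function name and message wording are hypothetical, and the sketch assumes the claims are already available as a dictionary.

```python
def check_groups_claim(claims, claim_name="groups"):
    """Return a description of the problem, or None if the claim is a list."""
    if claim_name not in claims:
        return f"{claim_name} claim is missing"
    value = claims[claim_name]
    if not isinstance(value, list):
        return f"{claim_name} is not a list, actual type: {type(value).__name__}"
    return None
```

A single group name sent as a plain string, as in the example error, fails this check even though the claim is present.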
Session management problems
The following are common causes of session management issues:
The refresh tokens are not enabled in the identity provider for the Confluent Platform client application.
The refresh token expiration is configured as too low on the identity provider.
The ID token expiration or session renewal interval values are too low.
Unexpected session behaviors because refresh token is not enabled in the identity provider
| ID token expiration (included in the identity token provisioned by the identity provider) | Renewal interval | Absolute session expiration | Session gets renewed after 80% of session token expiry limit is past | Session cannot be extended after X mins and the user needs to login again |
|---|---|---|---|---|
| 1440 minutes | 15 minutes | 360 minutes | 12 minutes, 24 minutes, … But new additional role assignments or user state changes won’t be reflected. | X = 360 minutes |
| 240 minutes | 15 minutes | 360 minutes | 12 minutes, 24 minutes, … But new additional role assignments or user state changes won’t be reflected. | X = 240 minutes |
If you find that your modifications are causing issues, revert to the default values, which should work for most cases.
Session renewal interval (confluent.oidc.session.token.expiry.ms)
Default value: 900000 milliseconds (15 minutes)
Absolute session expiration (confluent.oidc.session.max.timeout.ms)
Default value: 21600000 milliseconds (360 minutes or 6 hours)
Additional details
When refresh tokens are not enabled in the identity provider for the client application, the setup might even fail with 403, 404, or 500 HTTP request errors in some identity providers.
Even if the setup is working, the role bindings would not be updated during session renewal.
When the refresh token is unavailable, the absolute session expiration limit primarily affects the time to enforce the re-login of the user.
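The timing in the table above can be worked through with a little arithmetic. The following Python sketch computes the renewal times at 80% of the renewal interval, and the point at which the user must log in again when no refresh token is available. Taking the smaller of the ID token expiration and the absolute session expiration is an inference from the two table rows, not a documented formula, and the function names are hypothetical.

```python
def renewal_times_ms(renewal_interval_ms=900_000, session_limit_ms=21_600_000, percent=80):
    """List the times (in ms) at which a session renewal would be attempted.

    Renewal happens after `percent` of the renewal interval has passed, and this
    sketch stops listing renewals once the session limit is reached.
    """
    step = renewal_interval_ms * percent // 100
    return list(range(step, session_limit_ms, step))


def relogin_after_minutes(id_token_expiry_min, absolute_expiry_min):
    """X in the table: the smaller of the two expirations (inferred from the rows)."""
    return min(id_token_expiry_min, absolute_expiry_min)


first_two = [t // 60_000 for t in renewal_times_ms()[:2]]
print(first_two)                         # minutes of the first two renewals
print(relogin_after_minutes(1440, 360))  # first table row
print(relogin_after_minutes(240, 360))   # second table row
```

With the default values, the first two renewals fall at 12 and 24 minutes, and the two table rows give X values of 360 and 240 minutes.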
Refresh token is invalid or expired
Problem
The refresh token is enabled, but the expiration is configured too low.
Here’s an example of an error message:
java.lang.RuntimeException: Got bad request status from IdP:
{
"error": "invalid_grant",
"error_description": "The refresh token is invalid or expired."
}
Solution
Verify the refresh token expiration configured on the identity provider and increase it if it is too low.
Connectivity issues between the identity provider and Confluent Platform
The following are common issues with connectivity between the identity provider and Confluent Platform:
The identity provider is down or unreachable.
The identity provider is reachable, but the client application is deleted or deactivated in the identity provider.
Another network connectivity issue exists between Confluent Platform and the identity provider because of restrictive firewall rules or other reasons.
Identity provider is unreachable or down
Problem
The identity provider is unreachable from your Confluent Platform cluster.
Here are a couple of examples of the error messages:
{
"status_code": 500,
"message": "java.lang.RuntimeException: Failed to retrieve tokens from IDP with status:404"
}
{
"status_code": 500,
"message": "javax.ws.rs.ProcessingException: java.net.ConnectException: Connection refused (Connection refused)"
}
Solution
Check if there is a healthy network connection between the Confluent Platform cluster and the identity provider.
Check for blocking firewall rules or other network restrictions between the Confluent Platform cluster and the identity provider.
Additional details
The error is visible on the web browser during authentication and also is output
to the MDS log files for your Confluent Platform cluster. By default, the log location is
/tmp/kafka-logs, but you can modify the location using the log.dirs
value in your broker configuration files or using the Java system property
while starting the Confluent Platform cluster with -Dkafka.logs.dir.
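A quick way to separate network problems from configuration problems is to test whether a TCP connection to the identity provider can be opened from a Confluent Platform host. The following Python sketch does this for the host and port of a given endpoint URL. It checks only TCP reachability, not TLS or the HTTP response, so it helps with Connection refused errors but not with a 404 status. The function name is hypothetical.

```python
import socket
from urllib.parse import urlsplit


def can_reach(url, timeout=5.0):
    """Try to open a TCP connection to the host and port of an http(s) URL."""
    parts = urlsplit(url)
    # Fall back to the default port for the scheme when the URL has none.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Run the check from the host where MDS runs, using the token endpoint URL you configured. A False result suggests a firewall rule, DNS problem, or an identity provider outage.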
Client application unreachable
Problem
You see an invalid_request error from the identity provider because the
client application is unreachable.
Here’s an example of an error message:
{
"status_code": 500,
"message": "invalid_request"
}
Solution
Check in the identity provider if the application is available and activated.