Azure AD - An error occurred logging in Server error while authenticating

This document (000021599) is provided subject to the disclaimer at the end of this document.

Environment

  • Rancher versions: 2.9.1, 2.9.2, 2.9.3
  • Auth provider: Azure AD

Situation

After upgrading to Rancher 2.9.1 through 2.9.3 with Azure AD as the main auth provider, users eventually become unable to log in and receive the following message:

An error occurred logging in Server error while authenticating

This happens because the local Rancher cluster holds a secret named `azuread-access-token` in the `cattle-global-data` namespace, and user login information is appended to it every time a user logs in. Over time the secret grows until it reaches the Kubernetes maximum secret size of 1 MiB (1048576 bytes).

Note: Once the secret reaches this limit, Rancher can no longer update it, and that is when users become unable to log in to Rancher. To verify its size, run either of the following commands. The first decodes the stored data and counts the bytes (it assumes the secret holds a single data key); the second reads the size that kubectl reports:

kubectl get secret azuread-access-token -n cattle-global-data -o jsonpath="{.data.*}" | base64 -d | wc -c

or

kubectl describe secret azuread-access-token -n cattle-global-data | grep bytes
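
To make the size check easier to repeat, the byte count can be compared against the 1048576-byte limit. The following is a minimal sketch, not a supported tool: the function names `usage_percent` and `check_secret_size` and the 80% warning threshold are assumptions chosen for illustration.

```shell
#!/bin/sh
# Kubernetes rejects a Secret whose data exceeds 1048576 bytes.
MAX_SECRET_BYTES=1048576

# usage_percent BYTES
# Prints how full a Secret of BYTES bytes is, as a whole-number percentage.
usage_percent() {
  echo $(( $1 * 100 / MAX_SECRET_BYTES ))
}

# check_secret_size BYTES [THRESHOLD]
# Prints a status line and returns 1 when usage is at or above THRESHOLD
# percent (default 80, an arbitrary value picked for this sketch).
check_secret_size() {
  pct=$(usage_percent "$1")
  threshold=${2:-80}
  if [ "$pct" -ge "$threshold" ]; then
    echo "WARNING: azuread-access-token is at ${pct}% of the size limit"
    return 1
  fi
  echo "OK: azuread-access-token is at ${pct}% of the size limit"
}

# Example usage against a live cluster (assumes a single data key):
#   size=$(kubectl get secret azuread-access-token -n cattle-global-data \
#            -o jsonpath="{.data.*}" | base64 -d | wc -c)
#   check_secret_size "$size"
```

A non-zero return code makes the function usable from a cron job or monitoring probe, so the workaround can be applied before logins start failing.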

Resolution

As a workaround, delete the azuread-access-token secret in the cattle-global-data namespace, then verify that it is gone. The secret is recreated the next time a user logs in to Rancher, and its size should then be much smaller.
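
The workaround can be sketched as the following shell function, which deletes the secret and then confirms it no longer exists. The function name `reset_token_secret` is hypothetical; the kubectl subcommands and the `--ignore-not-found` flag are standard, and kubectl is assumed to point at the local Rancher cluster.

```shell
#!/bin/sh
NAMESPACE=cattle-global-data
SECRET=azuread-access-token

# reset_token_secret
# Deletes the secret, then checks that a follow-up "get" no longer finds it.
reset_token_secret() {
  # --ignore-not-found keeps the delete from failing if the secret is already gone.
  kubectl delete secret "$SECRET" -n "$NAMESPACE" --ignore-not-found || return 1

  # "kubectl get" exits non-zero when the secret does not exist.
  if kubectl get secret "$SECRET" -n "$NAMESPACE" >/dev/null 2>&1; then
    echo "secret $SECRET still exists in $NAMESPACE"
    return 1
  fi
  echo "secret $SECRET deleted from $NAMESPACE"
}

# Example usage:
#   reset_token_secret
```

If a user logs in between the delete and the check, the secret may already have been recreated, in which case the function reports that it still exists even though the workaround succeeded. Checking the size of the secret afterwards settles the question.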

In the official patch, we changed the behavior so that a new client, which does not use the cache, is created for every token authentication. This patch will be included in Rancher 2.10 and backported to 2.9, specifically in versions >= 2.9.4.

Cause

The azuread-access-token secret fills up with user login information until it hits the Kubernetes maximum secret size. The Azure client login was using the access token cache, which led to additional tokens being cached.

Status

Reported to Engineering

Additional Information

In the local Rancher cluster, the logs of the Rancher pods may contain errors like the following:


[ERROR] API error response 500 for POST /v3-public/azureADProviders/azuread?action=login. Cause: getting OID from AuthCode: error updating secret azuread-access-token: Secret "azuread-access-token" is invalid: data: Too long: must have at most 1048576 bytes
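
To confirm that the error above is present, the Rancher pod logs can be filtered for it. This is a minimal sketch: the function name `count_token_errors` is hypothetical, and the `cattle-system` namespace and `app=rancher` label in the usage comment are assumptions about a typical Rancher installation that are not stated in this document.

```shell
#!/bin/sh
# count_token_errors
# Reads log lines on stdin and prints how many of them report the
# azuread-access-token secret being rejected as too long.
count_token_errors() {
  # grep -c exits non-zero when the count is 0; "|| true" keeps the
  # function's exit status clean while still printing the count.
  grep -c 'azuread-access-token.*Too long' || true
}

# Example usage (namespace and label are assumptions, adjust as needed):
#   kubectl logs -n cattle-system -l app=rancher --tail=-1 | count_token_errors
```

A count greater than zero indicates that login attempts are failing because the secret has reached the size limit.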

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.