@jestradaMS
Contributor

Description

This pull request adds retry logic for 401 Unauthorized responses so that E2E test initialization can recover from transient authentication failures. It wraps the existing authentication flow in new components that invalidate cached tokens and automatically retry failed requests, which improves test reliability when token acquisition is flaky. The main changes are grouped as follows:

Authentication Retry Mechanism:

  • Added RetryableCredentialProvider, a wrapper around ICredentialProvider that can invalidate cached tokens so that a fresh token is acquired after a failure. (RetryableCredentialProvider.cs, TestFhirServer.cs)
  • Added RetryAuthenticationHttpMessageHandler, a delegating handler that retries HTTP requests on 401 responses with exponential backoff, invalidating tokens as needed. (RetryAuthenticationHttpMessageHandler.cs, TestFhirServer.cs)
  • Integrated the retry mechanism into the E2E test infrastructure by updating TestFhirServer to use the new handler and provider for authentication, replacing the previous direct use of AuthenticationHttpMessageHandler. (TestFhirServer.cs, TestFhirClient.cs)
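
For reviewers who want the shape of the retry flow without opening the diff, here is a minimal sketch of a delegating handler that retries on 401 with exponential backoff. It is not the code in this PR: `ITokenInvalidator`, `RetryOn401Handler`, `FlakyAuthStub`, and the constructor parameters are hypothetical names chosen for illustration, and the sketch assumes the inner handler attaches the bearer token and that the request body (if any) can be sent again.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the invalidation hook on the credential provider.
public interface ITokenInvalidator
{
    void Invalidate();
}

// Retries on 401 with exponential backoff. Assumes the inner handler attaches
// the bearer token, so a retry after Invalidate() picks up a fresh one.
public sealed class RetryOn401Handler : DelegatingHandler
{
    private readonly ITokenInvalidator _tokens;
    private readonly int _maxRetries;
    private readonly TimeSpan _baseDelay;

    public RetryOn401Handler(
        ITokenInvalidator tokens, HttpMessageHandler inner, int maxRetries, TimeSpan baseDelay)
        : base(inner)
    {
        _tokens = tokens;
        _maxRetries = maxRetries;
        _baseDelay = baseDelay;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        for (int attempt = 0; ; attempt++)
        {
            HttpResponseMessage response = await base.SendAsync(request, cancellationToken);

            // Return anything that is not a 401, or the final 401 once retries run out.
            if (response.StatusCode != HttpStatusCode.Unauthorized || attempt >= _maxRetries)
            {
                return response;
            }

            response.Dispose();
            _tokens.Invalidate(); // drop the cached token before the next attempt

            // Backoff schedule: baseDelay, 2x, 4x, ...
            long ticks = _baseDelay.Ticks * (1L << attempt);
            await Task.Delay(TimeSpan.FromTicks(ticks), cancellationToken);
        }
    }
}

// Demo-only stub: answers 401 a fixed number of times, then 200.
public sealed class FlakyAuthStub : HttpMessageHandler, ITokenInvalidator
{
    private int _remaining401s;

    public FlakyAuthStub(int failures) => _remaining401s = failures;

    public int Calls { get; private set; }

    public int Invalidations { get; private set; }

    public void Invalidate() => Invalidations++;

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        Calls++;
        HttpStatusCode status = _remaining401s-- > 0 ? HttpStatusCode.Unauthorized : HttpStatusCode.OK;
        return Task.FromResult(new HttpResponseMessage(status));
    }
}
```

With `new FlakyAuthStub(2)` as both the inner handler and the invalidator and `maxRetries` set to 3, a single request makes three calls to the stub, invalidates the token twice, and ends with a 200. Keeping the retry in its own handler, outside the handler that attaches the token, is what lets the rest of the test code depend only on `HttpMessageHandler`.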

Error Handling Improvements:

  • Added AuthenticationWarmupException, a custom exception to fail the entire test suite quickly if authentication warmup fails, avoiding hundreds of redundant test failures. (AuthenticationWarmupException.cs)
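
One way to get this fail-fast behavior is to run the warmup once and let every later caller observe the same failure. The sketch below illustrates that idea only; `WarmupFailedException`, `AuthWarmupGate`, and `acquireToken` are hypothetical names, and the PR's AuthenticationWarmupException may be wired in differently.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical exception type standing in for the PR's warmup exception.
public sealed class WarmupFailedException : Exception
{
    public WarmupFailedException(string message, Exception inner)
        : base(message, inner)
    {
    }
}

// Runs the warmup at most once; every caller awaits the same cached task,
// so a failed warmup is reported immediately instead of being re-attempted.
public sealed class AuthWarmupGate
{
    private readonly Lazy<Task> _warmup;

    public AuthWarmupGate(Func<Task> acquireToken)
    {
        _warmup = new Lazy<Task>(() => RunAsync(acquireToken));
    }

    public Task EnsureAsync() => _warmup.Value;

    private static async Task RunAsync(Func<Task> acquireToken)
    {
        try
        {
            await acquireToken();
        }
        catch (Exception ex)
        {
            // Wrap the cause so test output points at warmup, not at each test.
            throw new WarmupFailedException("Authentication warmup failed; aborting the test run.", ex);
        }
    }
}
```

A shared fixture could call `EnsureAsync()` before creating clients, so that a broken token endpoint surfaces as one clearly named failure instead of a separate 401 in every test.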

Project Structure and Typing:

  • Updated the project file to include the new authentication and retry classes. (Microsoft.Health.Fhir.Shared.Tests.E2E.projitems)
  • Changed method signatures and usages to accept the more general HttpMessageHandler instead of the specific AuthenticationHttpMessageHandler, increasing flexibility. (TestFhirClient.cs, TestFhirServer.cs)

These changes collectively make the E2E test suite more resilient to transient authentication issues and easier to maintain.

Related issues

Addresses [issue #].

Testing

Describe how this change was tested.

FHIR Team Checklist

  • Update the title of the PR to be succinct and less than 65 characters
  • Add a milestone to the PR for the sprint that it is merged (i.e. add S47)
  • Tag the PR with the type of update: Bug, Build, Dependencies, Enhancement, New-Feature or Documentation
  • Tag the PR with Open source, Azure API for FHIR (CosmosDB or common code) or Azure Healthcare APIs (SQL or common code) to specify where this change is intended to be released.
  • Tag the PR with Schema Version backward compatible or Schema Version backward incompatible or Schema Version unchanged if this adds or updates Sql script which is/is not backward compatible with the code.
  • When changing or adding behavior, if your code modifies the system design or changes design assumptions, please create and include an ADR.
  • CI is green before merge
  • Review squash-merge requirements

Semver Change (docs)

Patch|Skip|Feature|Breaking (reason)

@jestradaMS jestradaMS requested a review from a team as a code owner December 11, 2025 23:04
@jestradaMS
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jestradaMS
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jestradaMS
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jestradaMS
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jestradaMS
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).
