@amirejaz amirejaz commented Nov 25, 2025

Summary

This PR implements automatic discovery and resolution of MCPExternalAuthConfig resources for backend MCPServers in the VirtualMCPServer controller. The feature enables VirtualMCPServer to automatically discover and apply external authentication configurations (e.g., OAuth2 Token Exchange, Header Injection) from referenced MCPServer resources without requiring manual configuration.

Changes

Core Implementation

  • discoverExternalAuthConfigs: Discovers ExternalAuthConfig from MCPServer resources in the group and adds them to the outgoing auth configuration
  • buildOutgoingAuthConfig: Builds OutgoingAuthConfig by discovering ExternalAuthConfig from MCPServer resources when using "discovered" source mode
  • convertExternalAuthConfigToStrategy: Converts MCPExternalAuthConfig CRD resources to the internal BackendAuthStrategy format using typed structures (from PR #2797, "Refactor/typed backend auth strategy"), handling token exchange and header injection configurations
  • convertBackendAuthConfigToVMCP: Converts inline BackendAuthConfig from CRD spec to BackendAuthStrategy, supporting both direct references and ExternalAuthConfigRef references
  • Updated discoverBackends: Now uses the resolved OutgoingAuthConfig (including discovered external auth configs) when creating the UnifiedBackendDiscoverer
  • Updated ensureVmcpConfigConfigMap: Ensures that the fully resolved OutgoingAuthConfig (including discovered external auth configs) is written to the ConfigMap consumed by VirtualMCPServer pods

Secret Management

  • discoverExternalAuthConfigSecrets: Discovers ExternalAuthConfigs from MCPServers and returns environment variables for their client secrets (used in discovered mode)
  • discoverInlineExternalAuthConfigSecrets: Discovers ExternalAuthConfigs referenced in inline Backends and returns environment variables for their client secrets
  • getExternalAuthConfigSecretEnvVar: Returns an environment variable for the client secret from an ExternalAuthConfig (for token exchange with ClientSecretRef)
  • buildOutgoingAuthEnvVars: Builds environment variables for outgoing auth secrets and mounts them in the deployment

Features

  • Two source modes supported:

    • discovered: Automatically discover auth configs from all referenced MCPServers
    • inline: Use only explicitly specified auth configs (existing behavior)
  • Inline overrides in discovered mode: When using discovered mode, you can still provide inline overrides in the Backends map, which take precedence over discovered configs

  • Token Exchange support: Full support for OAuth2 Token Exchange (RFC 8693) including:

    • Token URL, client ID, audience, scopes
    • Client secret references (via Kubernetes Secrets, automatically mounted as environment variables)
    • Subject token type normalization
  • Header Injection support: Support for header injection authentication with:

    • Header name and value configuration
    • Secret value resolution from Kubernetes Secrets
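
The subject token type normalization listed above could look roughly like the following. This is a minimal sketch, not the PR's code: the helper name `normalizeSubjectTokenType`, the accepted short names, and the default for an empty value are all assumptions. Only the URN values are taken from RFC 8693.

```go
package main

import (
	"fmt"
	"strings"
)

// Token type identifiers defined by RFC 8693.
const (
	tokenTypeAccessToken = "urn:ietf:params:oauth:token-type:access_token"
	tokenTypeIDToken     = "urn:ietf:params:oauth:token-type:id_token"
	tokenTypeJWT         = "urn:ietf:params:oauth:token-type:jwt"
)

// normalizeSubjectTokenType is a hypothetical helper. It accepts either a
// short name or a full URN and returns the full RFC 8693 URN.
func normalizeSubjectTokenType(s string) (string, error) {
	switch strings.TrimSpace(s) {
	case "", "access_token", tokenTypeAccessToken:
		// Assumption: an empty value defaults to access_token.
		return tokenTypeAccessToken, nil
	case "id_token", tokenTypeIDToken:
		return tokenTypeIDToken, nil
	case "jwt", tokenTypeJWT:
		return tokenTypeJWT, nil
	default:
		return "", fmt.Errorf("unsupported subject token type %q", s)
	}
}

func main() {
	for _, in := range []string{"access_token", "jwt", tokenTypeIDToken, "bogus"} {
		out, err := normalizeSubjectTokenType(in)
		fmt.Printf("%q -> %q (err: %v)\n", in, out, err)
	}
}
```

Rejecting unknown values instead of passing them through keeps a typo in the CRD from surfacing only later as a token endpoint error.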

Testing

  • Unit tests: Comprehensive test coverage in virtualmcpserver_externalauth_test.go (918 lines):

    • TestConvertExternalAuthConfigToStrategy: Tests conversion logic with various configurations (token exchange, header injection)
    • TestBuildOutgoingAuthConfig: Tests building OutgoingAuthConfig in discovered and inline modes
    • TestConvertBackendAuthConfigToVMCP: Tests inline config conversion
    • TestDiscoverBackendsWithExternalAuthConfigIntegration: Integration test for end-to-end discovery flow
  • E2E tests: Existing e2e tests in test/e2e/thv-operator/virtualmcp/ cover discovered mode scenarios

How It Works

  1. When a VirtualMCPServer is created/updated, the controller discovers all referenced MCPServers from the specified groups

  2. For each MCPServer with an ExternalAuthConfigRef, the controller:

    • Fetches the MCPExternalAuthConfig resource
    • Converts it to a typed BackendAuthStrategy (using authtypes.BackendAuthStrategy from PR #2797, "Refactor/typed backend auth strategy")
    • Adds it to the OutgoingAuthConfig for that backend
  3. The resolved configuration (including discovered auth configs) is written to the vmcp ConfigMap with proper nested YAML structure (e.g., token_exchange: { ... })

  4. Client secrets from ExternalAuthConfigs are automatically discovered and mounted as environment variables in the VirtualMCPServer deployment

  5. The VirtualMCPServer pod consumes the ConfigMap and uses the auth strategies when communicating with backends
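
Steps 1 to 3 above can be sketched as a loop over the servers in the group. This is an illustrative sketch only: the types `server`, `authConfig`, and `strategy` and the functions `discover` and `resolve` are simplified stand-ins invented for the example, the in-memory map replaces the Kubernetes API lookup, and header injection is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Simplified stand-ins, not the operator's real CRD types.
type server struct {
	Name          string
	AuthConfigRef string // empty when the server has no ExternalAuthConfigRef
}

type authConfig struct {
	Type     string // e.g. "tokenExchange"
	TokenURL string
}

type strategy struct {
	Type     string
	TokenURL string
}

var errNotFound = errors.New("auth config not found")

// resolve fetches the referenced config and converts it to a strategy.
func resolve(ref string, configs map[string]authConfig) (strategy, error) {
	cfg, ok := configs[ref]
	if !ok {
		return strategy{}, fmt.Errorf("%w: %s", errNotFound, ref)
	}
	if cfg.Type != "tokenExchange" {
		return strategy{}, fmt.Errorf("unsupported type %q", cfg.Type)
	}
	return strategy{Type: "token_exchange", TokenURL: cfg.TokenURL}, nil
}

// discover resolves one strategy per backend. Backends whose config is
// missing or invalid are logged and skipped, as the PR notes describe.
func discover(servers []server, configs map[string]authConfig) map[string]strategy {
	backends := make(map[string]strategy)
	for _, s := range servers {
		if s.AuthConfigRef == "" {
			continue // nothing to discover for this server
		}
		st, err := resolve(s.AuthConfigRef, configs)
		if err != nil {
			log.Printf("skipping backend %s: %v", s.Name, err)
			continue
		}
		backends[s.Name] = st
	}
	return backends
}

func main() {
	servers := []server{
		{Name: "backend-1", AuthConfigRef: "backend-1-auth-config"},
		{Name: "backend-2", AuthConfigRef: "does-not-exist"},
		{Name: "backend-3"},
	}
	configs := map[string]authConfig{
		"backend-1-auth-config": {Type: "tokenExchange", TokenURL: "https://oauth.example.com/token"},
	}
	fmt.Println(discover(servers, configs))
}
```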

Example Usage

# MCPExternalAuthConfig defines token exchange configuration
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPExternalAuthConfig
metadata:
  name: backend-1-auth-config
spec:
  type: tokenExchange
  tokenExchange:
    tokenUrl: https://oauth.example.com/token
    clientId: my-client-id
    clientSecretRef:
      name: oauth-secret
      key: client-secret
    audience: backend-service
    scopes: [read, write]

---
# MCPServer references the ExternalAuthConfig
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: backend-1
spec:
  groupRef: my-group
  externalAuthConfigRef:
    name: backend-1-auth-config
  # ... other spec fields

---
# VirtualMCPServer automatically discovers and applies the auth config
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: VirtualMCPServer
metadata:
  name: my-vmcp
spec:
  groupRef:
    name: my-group
  outgoingAuth:
    source: discovered  # Automatically discover from MCPServers

Notes

  • Client secrets are automatically discovered and mounted as environment variables in the deployment for both discovered and inline modes
  • The implementation gracefully handles missing or invalid MCPExternalAuthConfig resources by logging and skipping affected backends
  • Inline configurations in the Backends map always take precedence over discovered configurations when both are present
  • This PR integrates with PR #2797 ("Refactor/typed backend auth strategy"), which introduced typed BackendAuthStrategy structures, ensuring the YAML marshaling that the vmcp CLI expects
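
The precedence rule in the notes above (inline entries win over discovered ones) amounts to a two-pass map merge. The sketch below is an assumption about the shape of that logic, not the PR's code: `mergeBackends` is a hypothetical helper, and strategies are reduced to plain strings.

```go
package main

import "fmt"

// mergeBackends is a hypothetical helper. It copies the discovered
// strategies first and then applies the inline ones, so an inline entry
// replaces a discovered entry with the same backend name.
func mergeBackends(discovered, inline map[string]string) map[string]string {
	merged := make(map[string]string, len(discovered)+len(inline))
	for name, s := range discovered {
		merged[name] = s
	}
	for name, s := range inline {
		merged[name] = s
	}
	return merged
}

func main() {
	discovered := map[string]string{
		"backend-1": "token_exchange",
		"backend-2": "token_exchange",
	}
	inline := map[string]string{
		"backend-1": "header_injection", // explicit override
	}
	fmt.Println(mergeBackends(discovered, inline))
}
```

Building a new map rather than mutating `discovered` keeps the discovered set available for logging or status reporting.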

Related Issues

Fixes the issue where ExternalAuthConfig was not being resolved and used by the VirtualMCPServer controller to populate backend authentication configurations.

Testing

  • Unit tests pass
  • Linting passes
  • Verified ConfigMap contains discovered auth configs with correct nested YAML structure
  • Verified secret mounting works for discovered ExternalAuthConfigs
  • Verified both source modes work correctly (discovered, inline)
  • Verified inline overrides work in discovered mode

Large PR Justification

This PR implements a complete, atomic feature for automatic discovery and application of ExternalAuthConfig from MCPServers. The feature requires both discovery logic and secret management to function together—they cannot work independently. The PR includes 918 lines of comprehensive tests verifying end-to-end functionality, and integrates with PR #2797's typed BackendAuthStrategy changes. Splitting would create incomplete intermediate states that don't work, require duplicate logic, and delay integration testing. The size reflects comprehensive functionality and extensive test coverage, not code bloat.

@github-actions github-actions bot added the size/XL Extra large PR: 1000+ lines changed label Nov 25, 2025
codecov bot commented Nov 25, 2025

Codecov Report

❌ Patch coverage is 33.90558% with 154 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.46%. Comparing base (7420133) to head (a716ccc).
⚠️ Report is 2 commits behind head on main.

Files with missing lines                                Patch %   Lines
...perator/controllers/virtualmcpserver_deployment.go   9.84%     119 Missing ⚠️
...perator/controllers/virtualmcpserver_controller.go   68.96%    19 Missing and 8 partials ⚠️
...perator/controllers/virtualmcpserver_vmcpconfig.go   42.85%    4 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2726      +/-   ##
==========================================
- Coverage   56.55%   56.46%   -0.09%     
==========================================
  Files         320      320              
  Lines       30874    31097     +223     
==========================================
+ Hits        17460    17559      +99     
- Misses      11911    12026     +115     
- Partials     1503     1512       +9     


@amirejaz amirejaz requested review from jhrozek and yrobla November 25, 2025 11:56
@amirejaz

implements #2704

@github-actions github-actions bot removed the size/XL Extra large PR: 1000+ lines changed label Nov 27, 2025
@github-actions github-actions bot left a comment


Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Nov 27, 2025
@amirejaz amirejaz marked this pull request as draft November 27, 2025 14:12
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Nov 27, 2025
@amirejaz amirejaz force-pushed the vmcp-external-auth-discovery branch from 00b55a4 to a716ccc Compare December 1, 2025 15:44
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Dec 1, 2025
@amirejaz amirejaz force-pushed the vmcp-external-auth-discovery branch from a716ccc to 3da9d52 Compare December 1, 2025 16:15
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Dec 1, 2025
@amirejaz amirejaz marked this pull request as ready for review December 1, 2025 16:21
@jhrozek jhrozek left a comment


I think there are a couple of issues that need to be resolved prior to merging.

// OutgoingAuthSourceDiscovered indicates that auth configs should be automatically discovered from MCPServers
OutgoingAuthSourceDiscovered = "discovered"
// OutgoingAuthSourceInline indicates that auth configs should be explicitly specified
OutgoingAuthSourceInline = "inline"

We don't have to solve this in this PR, but the constant appears to be unused. In general, I think we should codify what the different auth sources should mean and do.


// For header injection, resolve secrets from Kubernetes
if externalAuthConfig.Spec.Type == mcpv1alpha1.ExternalAuthTypeHeaderInjection {
	strategy, err = converter.ResolveSecrets(ctx, externalAuthConfig, r.Client, namespace, strategy)

I'm sorry, but I think this is wrong. What should happen instead is that we set header_value_env and mount the secrets into the deployment as env vars. The reason is that we don't want secrets in a ConfigMap.

I think we might want to remove the option to set the values directly from both the runtime and the CRD and rely exclusively on secrets...
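
A minimal sketch of the approach suggested here, assuming the strategy stores only the name of an environment variable and the process reads the secret at runtime. The struct `headerInjection`, its field names, and the helper `resolveHeaderValue` are hypothetical. Only the idea of a `header_value_env` setting comes from the comment above.

```go
package main

import (
	"fmt"
	"os"
)

// headerInjection is a hypothetical strategy shape. It holds the name of an
// environment variable rather than the secret itself, so it is safe to
// serialize into a ConfigMap.
type headerInjection struct {
	HeaderName     string
	HeaderValueEnv string
}

// resolveHeaderValue reads the secret through lookup, which has the same
// signature as os.LookupEnv so that tests can substitute a fake.
func resolveHeaderValue(cfg headerInjection, lookup func(string) (string, bool)) (string, error) {
	if cfg.HeaderValueEnv == "" {
		return "", fmt.Errorf("header %q: no env var configured", cfg.HeaderName)
	}
	v, ok := lookup(cfg.HeaderValueEnv)
	if !ok || v == "" {
		return "", fmt.Errorf("header %q: env var %s is not set", cfg.HeaderName, cfg.HeaderValueEnv)
	}
	return v, nil
}

func main() {
	// The env var name below is made up for the example.
	cfg := headerInjection{HeaderName: "X-API-Key", HeaderValueEnv: "BACKEND_1_API_KEY"}
	if _, err := resolveHeaderValue(cfg, os.LookupEnv); err != nil {
		fmt.Println("cannot inject header:", err)
		return
	}
	fmt.Println("header value resolved from the environment")
}
```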

}

// Use the standard env var name from the converter
envVarName := "TOOLHIVE_TOKEN_EXCHANGE_CLIENT_SECRET"

Do I read it correctly that if multiple backends use token exchange, they will clobber the single value? I think we should use some backend prefix or suffix.
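
One way to derive a distinct variable per backend, as a sketch only: the helper `clientSecretEnvVar` and its naming scheme are assumptions, and only the base name `TOOLHIVE_TOKEN_EXCHANGE_CLIENT_SECRET` comes from the snippet under review.

```go
package main

import (
	"fmt"
	"strings"
)

const envPrefix = "TOOLHIVE_TOKEN_EXCHANGE_CLIENT_SECRET"

// clientSecretEnvVar is a hypothetical helper. It upper-cases the backend
// name and replaces every character outside A-Z and 0-9 with an underscore,
// then appends the result to the shared prefix.
func clientSecretEnvVar(backend string) string {
	var b strings.Builder
	for _, r := range strings.ToUpper(backend) {
		if (r >= 'A' && r <= 'Z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
		} else {
			b.WriteByte('_')
		}
	}
	return envPrefix + "_" + b.String()
}

func main() {
	for _, name := range []string{"backend-1", "backend-2"} {
		fmt.Println(name, "=>", clientSecretEnvVar(name))
	}
}
```

Note that this mapping is not injective: names that differ only in punctuation, such as `backend-1` and `backend.1`, would still collide, so a real implementation would need a stricter scheme or a collision check.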


for _, workloadName := range workloadNames {
	mcpServer := &mcpv1alpha1.MCPServer{}
	if err := r.Get(ctx, types.NamespacedName{Name: workloadName, Namespace: vmcp.Namespace}, mcpServer); err != nil {

I wonder if it would be easier to do a full List and then filter by names rather than doing N Get calls for the names. But that's an optimization for later, maybe. Worth a comment with a TODO for now?
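
The filtering half of that suggestion, sketched without the Kubernetes client so it stays self-contained. The type `mcpServer` and the helper `filterByName` are stand-ins. In the controller, the `all` slice would come from one List call in the namespace instead of N Get calls.

```go
package main

import "fmt"

// mcpServer is a simplified stand-in for the real MCPServer type.
type mcpServer struct {
	Name string
}

// filterByName keeps the servers whose name appears in names. Building a
// set first makes the pass linear in len(all) + len(names).
func filterByName(all []mcpServer, names []string) []mcpServer {
	wanted := make(map[string]struct{}, len(names))
	for _, n := range names {
		wanted[n] = struct{}{}
	}
	var out []mcpServer
	for _, s := range all {
		if _, ok := wanted[s.Name]; ok {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	all := []mcpServer{{Name: "backend-1"}, {Name: "unrelated"}, {Name: "backend-2"}}
	fmt.Println(filterByName(all, []string{"backend-1", "backend-2"}))
}
```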


// Get workload names from the group
workloadDiscoverer := workloads.NewK8SDiscovererWithClient(r.Client, vmcp.Namespace)
workloadNames, err := workloadDiscoverer.ListWorkloadsInGroup(ctx, vmcp.Spec.GroupRef.Name)

This could absolutely be a follow-up issue, but note that listing the workloads in a group at two different moments can give you two different sets (here and in ensureVmcpConfigConfigMap). I would prefer to do one List and pass the results around.
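
A sketch of the "list once, pass the result around" idea. All names are hypothetical (`lister`, `groupSnapshot`, `takeSnapshot`, and the two consumers), and the lister is an in-memory stand-in for the workload discoverer.

```go
package main

import "fmt"

// lister is a stand-in for the call that lists workloads in a group.
type lister func(group string) ([]string, error)

// groupSnapshot holds the result of a single listing so that every consumer
// in one reconcile pass sees the same set of workloads.
type groupSnapshot struct {
	Group     string
	Workloads []string
}

func takeSnapshot(list lister, group string) (groupSnapshot, error) {
	names, err := list(group)
	if err != nil {
		return groupSnapshot{}, fmt.Errorf("listing workloads in group %s: %w", group, err)
	}
	return groupSnapshot{Group: group, Workloads: names}, nil
}

// Both consumers take the snapshot instead of listing again. They return
// copies so that neither can modify the shared slice.
func authBackends(s groupSnapshot) []string {
	return append([]string(nil), s.Workloads...)
}

func configMapBackends(s groupSnapshot) []string {
	return append([]string(nil), s.Workloads...)
}

func main() {
	calls := 0
	list := func(string) ([]string, error) {
		calls++
		return []string{"backend-1", "backend-2"}, nil
	}
	snap, err := takeSnapshot(list, "my-group")
	if err != nil {
		panic(err)
	}
	fmt.Println(authBackends(snap), configMapBackends(snap), "list calls:", calls)
}
```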

if err != nil {
	return nil, fmt.Errorf("failed to convert default auth config: %w", err)
}
outgoing.Default = defaultStrategy

I think continuing in a degraded mode is fine, but it made me realize we might want to expose conditions for observability in vMCP. Let's file an issue (I can do it if you prefer).
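
What such a condition could look like, as a sketch under stated assumptions. The `condition` struct only mirrors a few fields of `metav1.Condition` to avoid external dependencies, and the condition type `AuthConfigsResolved` and both reason strings are invented for the example. A real controller would more likely use `metav1.Condition` together with `meta.SetStatusCondition` from k8s.io/apimachinery.

```go
package main

import "fmt"

// condition mirrors a few fields of metav1.Condition.
type condition struct {
	Type    string
	Status  string // "True", "False" or "Unknown"
	Reason  string
	Message string
}

// setCondition inserts c, or replaces the existing condition of the same type.
func setCondition(conds []condition, c condition) []condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

// authCondition reports degraded mode when any backend was skipped.
// The type and reason names are hypothetical.
func authCondition(skipped []string) condition {
	if len(skipped) == 0 {
		return condition{Type: "AuthConfigsResolved", Status: "True", Reason: "AllResolved"}
	}
	return condition{
		Type:    "AuthConfigsResolved",
		Status:  "False",
		Reason:  "BackendsSkipped",
		Message: fmt.Sprintf("auth config not applied for backends: %v", skipped),
	}
}

func main() {
	conds := setCondition(nil, authCondition([]string{"backend-2"}))
	fmt.Printf("%+v\n", conds)
}
```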

if err != nil {
	ctxLogger := log.FromContext(ctx)
	ctxLogger.V(1).Info("Failed to get Default ExternalAuthConfig secret, continuing without it",
		"error", err)

Same comment about conditions: continuing in a degraded mode is fine, but the admin should know about it.

@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Dec 2, 2025