Fix GitHub App token expiry on Windows static agents #1516

Open
pruthviraja wants to merge 6 commits into jenkinsci:master from pruthviraja:fix/windows-credential-manager-token-refresh

@pruthviraja

Summary

Fixes #1515

On Windows, Git credential helpers (wincred / git-credential-manager) cache GitHub App installation tokens in Windows Credential Manager (WCM) and serve them directly on subsequent Git operations, bypassing GIT_ASKPASS entirely. This causes "Authentication failed" errors roughly every hour on permanent Windows nodes, even though Jenkins correctly generates a fresh token.

Root cause: PR #291 fixed token refresh for Linux static agents via DelegatingGitHubAppCredentials. On Linux, Git always calls GIT_ASKPASS to get credentials. On Windows, Git checks Windows Credential Manager first — if a cached entry exists it never calls GIT_ASKPASS, so the expired cached token is used instead of the fresh one Jenkins prepared.

Observed pattern from the field:

Build 7  ✅ 1:19 PM  (token generated)
Build 14 ❌ 2:34 PM  (~1hr 15min later — token expired in WCM)
Build 15 ✅ 2:35 PM  (immediate retry — fresh token)
Build 16 ❌ 4:12 PM  (~1hr 37min later — token expired again)
Build 17 ✅ 4:12 PM  (immediate retry — fresh token)

Fix

Inside DelegatingGitHubAppCredentials.getPassword() (runs on the agent JVM), after obtaining a fresh token, detect Windows via os.name and run:

cmdkey /delete:git:https://<host>
cmdkey /delete:LegacyGenericCredential:https://<host>

This evicts the stale cached entry so Git falls through to GIT_ASKPASS and uses the fresh token Jenkins provides. Both the modern (git:) and legacy (LegacyGenericCredential:) key formats are cleared to cover all Windows Git credential helper variants.

The clearing is:

  • No-op when WCM has no entry for that host (cmdkey exits 1, silently ignored)
  • Skipped entirely on non-Windows agents (Linux, macOS)
  • Harmless on ephemeral agents, where WCM is empty at startup and cmdkey /delete is a no-op (exit 1)
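
For reviewers who want to see the shape of the eviction step without opening the diff, here is a minimal standalone sketch. It is not the code in this PR: the class name WcmCacheClearSketch and the methods isWindows, targetsFor, clearIfWindows and runCmdkeyDelete are hypothetical, and the os.name value is passed in as a parameter so the logic can be exercised on any OS. Only the cmdkey /delete invocation and the two key formats are taken from the description above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Consumer;

/** Illustrative sketch only; every name here except cmdkey is hypothetical. */
final class WcmCacheClearSketch {

    /** Replaceable hook so tests can record keys instead of spawning cmdkey. */
    static Consumer<String> cleaner = WcmCacheClearSketch::runCmdkeyDelete;

    /** True when the given os.name value identifies a Windows JVM. */
    static boolean isWindows(String osName) {
        return osName != null && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** Builds the modern and legacy credential target names for a Git host. */
    static List<String> targetsFor(String host) {
        List<String> targets = new ArrayList<>();
        targets.add("git:https://" + host);
        targets.add("LegacyGenericCredential:https://" + host);
        return targets;
    }

    /** Evicts both entries, but only when osName identifies Windows. */
    static void clearIfWindows(String osName, String host) {
        if (!isWindows(osName)) {
            return; // Linux and macOS agents: nothing to do
        }
        for (String target : targetsFor(host)) {
            cleaner.accept(target);
        }
    }

    /** Runs "cmdkey /delete:<target>"; a non-zero exit (no such entry) is not treated as an error. */
    static void runCmdkeyDelete(String target) {
        try {
            Process p = new ProcessBuilder("cmdkey", "/delete:" + target)
                    .redirectErrorStream(true)
                    .start();
            p.getInputStream().readAllBytes(); // drain output so the child cannot block
            int exit = p.waitFor();
            System.out.println("cmdkey /delete:" + target + " (exit: " + exit + ")");
        } catch (IOException e) {
            // For example cmdkey is not on PATH: report it and let the build continue
            System.err.println("Could not run cmdkey: " + e.getMessage());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

On an agent the caller would pass System.getProperty("os.name") as the first argument. Keeping the process call behind a Consumer is what lets the unit tests swap in a recording stub.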

Changes

  • GitHubAppCredentials.java

    • CLEAR_WINDOWS_CREDENTIAL_MANAGER_CACHE — public flag (default true), disableable from script console without redeployment
    • windowsCredentialCleaner — replaceable Consumer<String> so tests never invoke cmdkey
    • deriveGitHostFromApiUri() — maps https://api.github.com → github.com; passes GHE host through unchanged
    • clearWindowsCredentialManagerCache() — clears both WCM key formats
    • DelegatingGitHubAppCredentials.apiUri — stored in plaintext (not sensitive) for host derivation on the agent
    • DelegatingGitHubAppCredentials.getPassword() — calls cache clearing after token refresh, outside the synchronized block
  • GithubAppCredentialsWindowsAgentTest.java (new)

    • Unit tests for deriveGitHostFromApiUri() (standard GitHub, GHE, port, malformed URI, empty)
    • Unit tests for clearWindowsCredentialManagerCache() key format and flag behaviour
    • Uses a recording stub for windowsCredentialCleaner — runs on any OS, never invokes cmdkey
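
The host derivation listed above can be sketched roughly as follows. This is an illustration, not the method in the diff: the class GitHostSketch and the method deriveGitHost are hypothetical names, and two behaviours are assumptions on my part rather than something the description states, namely returning null for empty or malformed input and keeping an explicit port in the result.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Illustrative sketch only; names and edge-case behaviour are assumptions. */
final class GitHostSketch {

    /**
     * Maps a GitHub API URI to the host Git talks to.
     * Returns null when no host can be derived (assumed behaviour).
     */
    static String deriveGitHost(String apiUri) {
        if (apiUri == null || apiUri.trim().isEmpty()) {
            return null;
        }
        try {
            URI uri = new URI(apiUri.trim());
            String host = uri.getHost();
            if (host == null) {
                return null; // for example a value without a scheme
            }
            if (host.equalsIgnoreCase("api.github.com")) {
                return "github.com"; // public GitHub: API host differs from Git host
            }
            int port = uri.getPort();
            // GHE: the API lives on the same host as Git, so pass it through
            return port == -1 ? host : host + ":" + port;
        } catch (URISyntaxException e) {
            return null; // malformed input
        }
    }
}
```

With these assumptions, https://ghe.example.com/api/v3 (a made-up GHE address) yields ghe.example.com, and the credential keys are then built from that host.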

Verification

Tested on GKE Jenkins controller with a permanent Windows agent (Git 2.51.0.windows.2). Agent log after triggering token refresh:

Cleared Windows Credential Manager entry: git:https://github.com (exit: 0)
Cleared Windows Credential Manager entry: LegacyGenericCredential:https://github.com (exit: 1)

Exit 0 = entry evicted. Exit 1 on LegacyGenericCredential = key not present (normal — this Windows agent uses the modern git: format only). Both checkouts succeeded after the 60s stale threshold was crossed.

Backwards compatibility

| Scenario | Behaviour |
| --- | --- |
| Linux / macOS agents | os.name check fails, so no cmdkey call; zero change |
| Ephemeral Windows agents | WCM is empty at pod startup; cmdkey /delete is a no-op (exit 1) |
| GHE instances | deriveGitHostFromApiUri extracts GHE host correctly |
| cmdkey not on PATH | Exception caught, logged at WARNING, build continues |
| Opt-out | Set GitHubAppCredentials.CLEAR_WINDOWS_CREDENTIAL_MANAGER_CACHE=false via system property or script console |
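
To make the opt-out concrete, here is a small sketch of how a mutable static flag gates the clearing, and how a test can disable it temporarily and restore it in a finally block. The nested Creds class is a hypothetical stand-in for the plugin class; only the flag name CLEAR_WINDOWS_CREDENTIAL_MANAGER_CACHE and the two key formats come from this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Illustrative sketch only; Creds is a stand-in, not the plugin class. */
final class OptOutFlagSketch {

    static final class Creds {
        /** Mutable on purpose so it can be flipped at runtime. */
        static boolean CLEAR_WINDOWS_CREDENTIAL_MANAGER_CACHE = true;

        /** Replaceable hook; the default here does nothing. */
        static Consumer<String> cleaner = key -> { };

        static void clearCache(String host) {
            if (!CLEAR_WINDOWS_CREDENTIAL_MANAGER_CACHE) {
                return; // opted out: leave the credential store untouched
            }
            cleaner.accept("git:https://" + host);
            cleaner.accept("LegacyGenericCredential:https://" + host);
        }
    }

    /** Runs clearCache with the flag forced to 'enabled', then restores prior state. */
    static List<String> clearWithFlag(boolean enabled, String host) {
        boolean savedFlag = Creds.CLEAR_WINDOWS_CREDENTIAL_MANAGER_CACHE;
        Consumer<String> savedCleaner = Creds.cleaner;
        List<String> recorded = new ArrayList<>();
        try {
            Creds.CLEAR_WINDOWS_CREDENTIAL_MANAGER_CACHE = enabled;
            Creds.cleaner = recorded::add; // recording stub, never invokes cmdkey
            Creds.clearCache(host);
        } finally {
            // Restore so later tests see the original flag and cleaner
            Creds.CLEAR_WINDOWS_CREDENTIAL_MANAGER_CACHE = savedFlag;
            Creds.cleaner = savedCleaner;
        }
        return recorded;
    }
}
```

The save and restore pattern matters because the flag is process-wide: a test that leaves it disabled would silently change the behaviour of every test that runs after it.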

@pruthviraja
Author

Hi maintainers — could someone please trigger CI and review this?
This fixes a real production issue on Windows static agents where GitHub App
tokens expire every ~1 hr. Happy to address any feedback.

@pruthviraja force-pushed the fix/windows-credential-manager-token-refresh branch from adb23b8 to 951c238 on April 23, 2026 13:24

On Windows, git credential helpers (wincred / git-credential-manager)
cache GitHub App installation tokens in Windows Credential Manager and
serve them directly on subsequent git operations, bypassing GIT_ASKPASS
entirely. This causes authentication failures every ~1 hour on permanent
Windows nodes even though Jenkins correctly generates a fresh token.

Fix: inside DelegatingGitHubAppCredentials.getPassword() (which runs on
the agent JVM), detect when running on a Windows agent and call
  cmdkey /delete:git:https://<host>
  cmdkey /delete:LegacyGenericCredential:https://<host>
immediately after obtaining a (possibly refreshed) token. This evicts
the stale cached entry so that git falls through to GIT_ASKPASS and
uses the fresh token Jenkins is about to provide, rather than the
expired token Windows Credential Manager has cached from a prior build.

The clearing is a no-op when the Credential Manager has no entry for
that host (cmdkey exits 1, which is silently ignored). It is skipped
entirely on non-Windows agents (Linux, macOS) and on ephemeral agents
where Windows Credential Manager is empty at startup anyway.

Also adds:
- deriveGitHostFromApiUri(): maps https://api.github.com -> github.com
  and passes GHE host through unchanged.
- clearWindowsCredentialManagerCache(): package-private for testing,
  uses a replaceable Consumer<String> so tests never invoke cmdkey.
- CLEAR_WINDOWS_CREDENTIAL_MANAGER_CACHE flag (default true): allows
  the behaviour to be disabled from the Jenkins script console if
  needed without redeploying the plugin.
- GithubAppCredentialsWindowsAgentTest: unit tests for all helper
  methods, run on any OS via a recording stub for the cleaner.

Verified on GKE Jenkins controller with a permanent Windows agent:
  git:https://github.com (exit: 0)          <- entry evicted
  LegacyGenericCredential:https://github.com (exit: 1) <- not present (normal)
Both checkouts succeeded after the 60s stale threshold was crossed.

Fixes: jenkinsci#1515

SpotBugs does not flag Consumer<String> fields as MS_SHOULD_BE_FINAL
so the suppression annotation is redundant and triggers
US_USELESS_SUPPRESSION_ON_FIELD.

The test verifies a strict log message sequence using contains().
On Windows CI agents our fix correctly fires cmdkey /delete, producing
'Cleared Windows Credential Manager entry' log lines that interleave
with the expected sequence and break the assertion.

Disable CLEAR_WINDOWS_CREDENTIAL_MANAGER_CACHE for the duration of
this test (save + restore in the existing finally block) since the test
is exercising token-refresh logic, not credential manager behaviour.
Windows Credential Manager clearing is covered by
GithubAppCredentialsWindowsAgentTest.

@pruthviraja force-pushed the fix/windows-credential-manager-token-refresh branch from 3bce4c7 to 6490841 on April 23, 2026 19:28
@pruthviraja requested a review from a team as a code owner on May 5, 2026 15:53

Development

Successfully merging this pull request may close these issues.

GitHub App token not refreshed on Windows permanent/static agents - builds fail every ~1hr with "Invalid username or token"
