Date: May 14, 2026 Author: Austin Chima Severity: Critical — Complete loss of core functionality Duration: ~4 days (May 10 – May 14, 2026) Status: Resolved and deployed
PRPilot is a GitHub App that automatically reviews pull requests using AI (Google Gemini). It runs as a serverless API on Google Cloud Run. For approximately 4 days, the bot was unable to post any review comments on pull requests, despite successfully receiving and processing webhook events.
The root cause was a premature disposal of an RSA cryptographic key in the JWT generation method. This caused an ObjectDisposedException every time the bot attempted to authenticate with GitHub's API, silently killing the review pipeline.
The fix was a 3-line change to the key lifecycle management. After deployment, the bot immediately began posting reviews under its own bot identity (gatemark[bot]).
┌──────────────┐ webhook ┌────────────────┐ enqueue ┌───────────────────┐
│ GitHub │ ──────────────► │ PRPilot API │ ─────────────► │ ReviewBackground │
│ (PR opened) │ POST /api/ │ (Cloud Run) │ Channel<T> │ Worker │
└──────────────┘ github/webhooks └────────────────┘ └────────┬──────────┘
│
┌───────────────────┘
▼
┌──────────────────────────┐
│ GitHubAppAuthenticator │
│ │
│ 1. Generate JWT │
│ 2. Exchange for token │
│ 3. Create GitHub client │
└──────────┬───────────────┘
│
┌────────────────┘
▼
┌─────────────────────────────────┐
│ GitHubClientAdapter │
│ │
│ • GetPullRequestDiffAsync() │──► GitHub API (read diff)
│ • PostPullRequestCommentAsync()│──► GitHub API (write comment)
└─────────────────────────────────┘
│
diff goes to │
▼
┌─────────────────────────────────┐
│ GeminiCodeReviewService │
│ • GetReviewAsync(diff) │──► Gemini API (AI review)
└─────────────────────────────────┘
GitHub Apps don't use personal access tokens (PATs). They use a two-step authentication process:
- Generate a JWT — The App creates a short-lived JSON Web Token signed with its RSA private key. This JWT proves "I am App #3662925."
- Exchange JWT for Installation Token — The App sends the JWT to GitHub's API, which returns a short-lived installation token scoped to the repos where the App is installed.
- Use the Installation Token — All API calls (read diffs, post comments) use this token. Comments posted with it appear as
gatemark[bot], not as any human user.
What we knew:
- The webhook was reaching Cloud Run (health endpoint responded, logs showed event processing)
- No review comments were appearing on any PRs
- The only AI-style review comment ever found (on PR #1) was posted under the user's personal account, not a bot
Key question: Is the system failing silently, or is it not running at all?
How to check: Production logs on Cloud Run. Previous sessions found Octokit.AuthorizationException: Requires authentication in the logs — confirming the system was running but failing at the GitHub API call step.
Tip
Debugging Lesson #1: Always check production logs first. Don't assume the system isn't running. A "silent failure" (the system runs but produces no output) is a different problem than "the system won't start."
I examined the test suite and found a critical problem: the tests couldn't prove the App auth worked, because they were testing a completely different auth path.
| Test Type | Auth Method | What It Proves |
|---|---|---|
Unit Tests (PRPilot.UnitTests) |
Mocked IGitHubAppAuthenticator |
Business logic works, auth is skipped entirely |
Integration Tests (GitHubClientAdapterTests) |
Personal Access Token (PAT) | The adapter can talk to GitHub as the user, not as the bot |
| Missing | GitHub App JWT → Installation Token | Nothing tested this path |
The integration tests used:
_client.Credentials = new Credentials(_githubToken); // PAT = posts as YOUSo when integration tests passed and posted a comment, it appeared under austinchima — not gatemark[bot]. The passing tests created a false sense of confidence.
Important
Debugging Lesson #2: Verify that your tests actually test the code path that's failing. If your production system uses Auth Method A but your tests use Auth Method B, passing tests tell you nothing about the production failure.
Instead of guessing, I wrote new integration tests that exercise the actual production auth flow:
// This is what production does — NOT what the old tests did
var authenticator = CreateAuthenticator(); // uses App ID + Private Key
var client = await authenticator.CreateInstallationClientAsync(installationId);
await client.PostPullRequestCommentAsync(repo, prNumber, comment);First run results:
✅ CreateInstallationClientAsync_ValidCredentials_ReturnsClient [458 ms]
✅ AppAuth_CanFetchPullRequestDiff [635 ms]
❌ AppAuth_CanPostCommentAsBotIdentity [4 ms] ← CRASH
❌ FullPipeline_AppAuth_FetchDiff_PostComment [4 ms] ← CRASH
The error:
System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'System.Security.Cryptography.RSABCrypt'.
Tip
Debugging Lesson #3: If you can't reproduce it, you can't fix it. The single most valuable thing I did was write a test that exercised the exact same code path as production. The bug appeared immediately.
The stack trace pointed directly at the problem:
at System.Security.Cryptography.RSABCrypt.GetKey()
at ...AsymmetricSignatureProvider.Sign(...)
at ...JsonWebTokenHandler.CreateToken(...)
at PRPilot.Infrastructure.GitHubAppAuthenticator.GenerateJwt() ← HERE
Here's the buggy code:
private string GenerateJwt()
{
using var rsa = RSA.Create(); // ← PROBLEM: disposed at method exit
rsa.ImportFromPem(pem);
var securityKey = new RsaSecurityKey(rsa); // ← holds a REFERENCE to rsa
var credentials = new SigningCredentials(securityKey, SecurityAlgorithms.RsaSha256);
var handler = new JsonWebTokenHandler();
return handler.CreateToken(descriptor); // ← uses rsa through securityKey
}This is the subtle part. Each test creates a new GitHubAppAuthenticator instance with a new MemoryCache. So each test calls CreateInstallationClientAsync() which calls GenerateJwt() fresh.
The using var rsa statement means: "dispose this RSA object when the current scope ends." In the GenerateJwt() method, the RSA object gets disposed after handler.CreateToken() returns. So the first call in a fresh process usually works — the token is created before disposal.
But here's the catch: the RsaSecurityKey and SigningCredentials wrap the RSA object by reference, and the underlying cryptographic provider may retain lazy references to the key's native handle. On the second invocation (in the same process), the .NET cryptographic provider pool can reuse internal state from the previous (now-disposed) RSA object, triggering the ObjectDisposedException.
This is a race condition between disposal and the crypto provider's internal caching. It's non-deterministic — sometimes the first call fails too, depending on GC timing and provider pooling.
Call 1: RSA.Create() → Sign → Dispose → ✅ (token was created before disposal finalized)
Call 2: RSA.Create() → Sign → 💥 ObjectDisposedException
(crypto provider pool references a disposed handle from Call 1)
Important
Debugging Lesson #4: using statements are not always safe. If an object wraps your resource by reference (not by copy), disposing the original destroys the wrapper's reference too. This is especially dangerous with cryptographic keys, database connections, and streams.
The fix ensures the RsaSecurityKey gets its own independent RSA key that nobody else will dispose:
private string GenerateJwt()
{
- using var rsa = RSA.Create();
var pem = _privateKey.Contains("\\n") ? _privateKey.Replace("\\n", "\n") : _privateKey;
+
+ var rsa = RSA.Create();
rsa.ImportFromPem(pem);
- var securityKey = new RsaSecurityKey(rsa);
+ // Export the key parameters and dispose the original safely
+ var rsaParameters = rsa.ExportParameters(includePrivateParameters: true);
+ rsa.Dispose();
+
+ // Create a fresh, standalone RSA key from the exported parameters
+ var signingRsa = RSA.Create(rsaParameters);
+ var securityKey = new RsaSecurityKey(signingRsa);
var credentials = new SigningCredentials(securityKey, SecurityAlgorithms.RsaSha256);Why this works:
rsa.ExportParameters()copies the key material into a plainRSAParametersstruct (a value type — no references)- We dispose the original
rsasafely (we don't need it anymore) RSA.Create(rsaParameters)creates a completely new, independent RSA instanceRsaSecurityKeywraps this new instance — and since nothing else holds a reference to it, nothing can dispose it prematurely
Re-ran all 4 App auth tests:
✅ CreateInstallationClientAsync_ValidCredentials_ReturnsClient [458 ms]
✅ AppAuth_CanFetchPullRequestDiff [564 ms]
✅ AppAuth_CanPostCommentAsBotIdentity [793 ms] ← FIXED
✅ FullPipeline_AppAuth_FetchDiff_PostComment [8 s] ← FIXED
Verified on GitHub: Two comments appeared on PR #7 from gatemark Bot with the [Bot] badge — one test comment and one full AI code review.
Full test suite:
✅ PRPilot.Core.Tests: 9/9 passed
✅ PRPilot.Infrastructure.Tests: 7/7 passed
✅ PRPilot.UnitTests: 10/10 passed
✅ New App Auth Tests: 4/4 passed
❌ Old PAT Tests: 2 failed (expired PAT — pre-existing, unrelated)
- Built container image via Google Cloud Build
- Deployed revision
prpilot-api-00037-qs4to Cloud Run - Health endpoint confirmed:
{"status":"Looking good over here!"}
| Factor | How It Hid the Bug |
|---|---|
| Wrong test coverage | Integration tests used PATs, not App auth — they passed but proved nothing about the failing code path |
| Silent production failure | The ReviewBackgroundWorker caught the exception and logged it, but didn't crash the service. The webhook returned 200 OK to GitHub, so no delivery failures appeared |
| Non-deterministic timing | The first JWT generation sometimes succeeded depending on GC timing, making it look like "auth works but something else is wrong" |
| Misleading earlier fix | A previous session changed AuthenticationType.Oauth to AuthenticationType.Bearer, which was also necessary but masked the deeper RSA disposal bug |
| No end-to-end test existed | Nobody had ever tested: JWT → installation token → post comment in a single test |
-
Start with the architecture — "I built a GitHub App that uses a serverless API on Cloud Run to automatically review PRs using AI."
-
Describe the symptom — "The bot received webhooks and processed them, but never posted comments. Production logs showed authentication failures."
-
Explain the debugging methodology:
- Checked production logs → found the auth exception
- Audited the test suite → found tests used a different auth path (PATs vs App auth)
- Wrote targeted integration tests that exercised the real production code path
- Reproduced the exact error locally on the first run
-
Explain the root cause — "The RSA private key used for JWT signing was wrapped in a
usingstatement. TheRsaSecurityKeyclass holds a reference to the RSA object, not a copy. When theusingblock disposed the RSA key, the security key's reference became invalid, causing anObjectDisposedExceptionon subsequent signing operations." -
Explain the fix — "I exported the RSA parameters to a value-type struct, disposed the original key safely, then created a new standalone RSA instance from those parameters. The
RsaSecurityKeynow owns a key that no other code can dispose." -
Emphasize the process improvements:
- Added 4 new integration tests that cover the real App auth flow
- These tests now serve as a regression guard — if auth breaks again, CI catches it
- Object lifecycle management in C# (
IDisposable,using, reference vs. value semantics) - Cryptographic key management (RSA, JWT signing, key material export)
- OAuth-style authentication flows (JWT → token exchange → scoped API access)
- Serverless debugging (Cloud Run logs, silent failures, health checks)
- Test strategy gaps (unit tests with mocks vs. integration tests that prove real behavior)
- Systematic debugging (don't guess → reproduce → fix → verify → deploy)
- Integration tests must cover the production auth path — not just a substitute (PAT vs App auth)
- Avoid
usingon objects that are wrapped by reference — especially crypto keys, streams, and DB connections - Add alerting for background worker failures — the bot should notify (e.g., Slack, email) when it fails to post a review, not just log it
- End-to-end smoke test after deploy — trigger a test PR automatically after each deployment to confirm the full pipeline works
| File | Change |
|---|---|
| GitHubAppAuthenticator.cs | Fixed RSA key disposal in GenerateJwt() |
| GitHubAppAuthenticatorTests.cs | Added 4 new end-to-end App auth integration tests |
| PRPilot.IntegrationTests.csproj | Added project reference to PRPilot.Infrastructure |