fix(claude-auth): dedupe OAuth refresh and honor 429 backoff by codeg-dev · Pull Request #2568 · router-for-me/CLIProxyAPI

codeg-dev · 2026-04-06T08:18:51Z

Summary

- dedupe concurrent Claude OAuth refresh requests per refresh token with singleflight
- honor Retry-After for refresh endpoint 429 responses and block immediate replays
- route the Claude executor through the retry-aware refresh path and add Claude auth regression coverage

Validation

- go test ./internal/auth/claude
- go test ./internal/runtime/executor -run TestApplyClaudeHeaders
- verified the patched Linux binary on a live CLIProxyAPI deployment serving Claude requests successfully

AI authorship: prepared with OpenCode / Sisyphus assistance.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

gemini-code-assist

Code Review

This pull request enhances the Claude token refresh logic by implementing request deduplication using singleflight and a backoff mechanism for rate-limited (429) responses. It also introduces structured error handling to distinguish between retryable and non-retryable failures. Feedback includes addressing a potential data race in the state reset function, preventing a memory leak in the refresh block map, improving context handling in the single-flight logic, and refining error reporting during retries.

gemini-code-assist · 2026-04-06T08:20:24Z

internal/auth/claude/anthropic_auth.go

+func resetClaudeRefreshState() {
+	claudeRefreshMu.Lock()
+	defer claudeRefreshMu.Unlock()
+	claudeRefreshBlock = make(map[string]time.Time)
+	claudeRefreshGroup = singleflight.Group{}
+}


Reassigning the global claudeRefreshGroup in resetClaudeRefreshState while other goroutines might be calling Do on it in RefreshTokens creates a data race. singleflight.Group is a struct containing internal state (maps and mutexes), and reinitializing it without synchronization is unsafe. Since this function appears to be used for testing, consider using a pointer for the group and protecting its access/reassignment with a mutex, or avoiding the reset entirely by using unique keys in tests.
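
One way to read the pointer-plus-mutex suggestion is sketched below. To stay within the standard library, `flightGroup` is a stand-in for `singleflight.Group` that only counts calls; the names `currentGroup` and `resetRefreshState` are likewise hypothetical. The idea is that callers read the pointer under the lock and the reset swaps in a fresh value, so the old struct is never overwritten in place while in use.

```go
package main

import (
	"fmt"
	"sync"
)

// flightGroup is a stand-in for singleflight.Group (an assumption made so the
// sketch needs only the standard library). It just counts calls to Do.
type flightGroup struct {
	mu    sync.Mutex
	calls int
}

func (g *flightGroup) Do(key string, fn func() (any, error)) (any, error) {
	g.mu.Lock()
	g.calls++
	g.mu.Unlock()
	return fn()
}

var (
	refreshMu    sync.Mutex
	refreshGroup = &flightGroup{}
)

// currentGroup returns the active group pointer while holding the lock, so a
// concurrent reset cannot race with the read.
func currentGroup() *flightGroup {
	refreshMu.Lock()
	defer refreshMu.Unlock()
	return refreshGroup
}

// resetRefreshState swaps in a fresh group. Goroutines that already hold the
// old pointer keep using the old, still-valid group.
func resetRefreshState() {
	refreshMu.Lock()
	defer refreshMu.Unlock()
	refreshGroup = &flightGroup{}
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = currentGroup().Do("token", func() (any, error) { return "ok", nil })
		}()
	}
	resetRefreshState() // may run while callers are in flight
	wg.Wait()
	fmt.Println("done")
}
```

The alternative the review mentions, unique keys per test, avoids the reset altogether and may be simpler if the function exists only for tests.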

gemini-code-assist · 2026-04-06T08:20:24Z

internal/auth/claude/anthropic_auth.go

+var (
+	claudeRefreshGroup singleflight.Group
+	claudeRefreshMu    sync.Mutex
+	claudeRefreshBlock = make(map[string]time.Time)


The claudeRefreshBlock map grows indefinitely as it stores a block timestamp for every refresh token that encounters a 429 error. There is no mechanism to prune expired entries, and entries are only removed on a successful refresh. In a long-running proxy handling many unique tokens, this will lead to a gradual memory leak. Consider using a TTL-based cache or implementing a periodic cleanup routine to remove entries where the block time has passed.

gemini-code-assist · 2026-04-06T08:20:24Z

internal/auth/claude/anthropic_auth.go

+	result, err, _ := claudeRefreshGroup.Do(refreshToken, func() (interface{}, error) {
+		return o.refreshTokensSingleFlight(context.WithoutCancel(ctx), refreshToken)
+	})
+	if err != nil {
+		return nil, err
+	}
+	tokenData, ok := result.(*ClaudeTokenData)
+	if !ok || tokenData == nil {
+		return nil, fmt.Errorf("token refresh failed: invalid single-flight result")
+	}
+	return tokenData, nil


singleflight.Do blocks the calling goroutine until the underlying function returns, even if the caller's context is canceled. In a high-concurrency environment, this can lead to goroutine accumulation if upstream refreshes are slow and clients disconnect. It is better to use DoChan and a select statement to respect ctx.Done(), while allowing the refresh to continue in the background via context.WithoutCancel.

ch := claudeRefreshGroup.DoChan(refreshToken, func() (interface{}, error) {
	return o.refreshTokensSingleFlight(context.WithoutCancel(ctx), refreshToken)
})
select {
case <-ctx.Done():
	return nil, ctx.Err()
case res := <-ch:
	if res.Err != nil {
		return nil, res.Err
	}
	tokenData, ok := res.Val.(*ClaudeTokenData)
	if !ok || tokenData == nil {
		return nil, fmt.Errorf("token refresh failed: invalid single-flight result")
	}
	return tokenData, nil
}

gemini-code-assist · 2026-04-06T08:20:24Z

internal/auth/claude/anthropic_auth.go

+		if !isClaudeRefreshRetryable(err) {
+			break
+		}
 	}

 	return nil, fmt.Errorf("token refresh failed after %d attempts: %w", maxRetries, lastErr)


When a non-retryable error occurs (like a 401 or a 429 block), the loop breaks immediately, but the final error message still claims the failure happened 'after %d attempts' (using the maximum retry count). This is misleading to the user. It would be better to return the error immediately if it's not retryable, or accurately report the number of attempts made.

Suggested change

-		if !isClaudeRefreshRetryable(err) {
-			break
-		}
-	}
-
-	return nil, fmt.Errorf("token refresh failed after %d attempts: %w", maxRetries, lastErr)
+		if !isClaudeRefreshRetryable(err) {
+			return nil, err
+		}
+	}
+
+	return nil, fmt.Errorf("token refresh failed after %d attempts: %w", maxRetries, lastErr)

codeg-dev and others added 2 commits April 6, 2026 17:16

fix(claude-auth): dedupe OAuth refresh and honor 429 backoff

6431cec

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

fix(executor): route Claude refresh through retry-aware auth

29e32aa

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

gemini-code-assist bot reviewed Apr 6, 2026

codeg-dev wants to merge 2 commits into router-for-me:main from codeg-dev:fix/claude-refresh-429-backoff
