
feat(temporal): anti-drift injection for all provider requests#2574

Open
dwillitzer wants to merge 8 commits into router-for-me:main from dwillitzer:feat/temporal-anti-drift

Conversation

@dwillitzer

Summary

  • Adds temporal anti-drift system that injects <temporal> tags into every outbound request to all providers
  • Format-aware injection: Claude format (top-level system array) vs OpenAI format (messages array) — no 400 errors
  • Single injection point in conductor covers all providers (Antigravity, Claude, GLM, Qwen, OpenRouter) without per-executor changes

Why This Matters

In long agent sessions, AI models hallucinate dates — wrong days, wrong weeks, wrong months. This cascades into incorrect tool calls, wrong file paths, and stale context decisions. By injecting fresh temporal context on every request, models stay grounded in the current date/time.

Pattern validated in pi-rs where TemporalDriftDetector found real drift in sessions >20 turns.

Changes

  • New: internal/temporal/temporal.go — temporal tag builder + format-aware payload injection (~120 lines)
  • Modified: sdk/cliproxy/auth/conductor.go — injects before Execute() and ExecuteStream()
  • Zero PII — only UTC timestamps, no user data

Test plan

  • go build -o cliproxyapi ./cmd/server/ — compilation succeeds
  • Claude API requests no longer return 400 (format-aware system injection)
  • OpenAI-format requests receive temporal system message in messages array
  • All providers route correctly with injection enabled
  • Smoke test: curl -s -X POST localhost:8317/v1/chat/completions -H "Authorization: Bearer $KEY" -H "Content-Type: application/json" -d '{"model":"claude-sonnet-4-6","messages":[{"role":"user","content":"what day is it"}],"max_tokens":100}'

🤖 Generated with Claude Code

…quests

Injects `<temporal>` tags into every outbound request to prevent AI models
from hallucinating dates in long sessions. Format-aware: injects into
Claude's top-level `system` array and OpenAI's `messages` array separately.

Ported from pi-rs TemporalDriftDetector pattern. Single injection point
in conductor covers all providers (Antigravity, Claude, GLM, Qwen,
OpenRouter) without per-executor changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces model health monitoring, OpenRouter metadata enrichment, and a temporal injection feature that adds current date and time context to LLM requests. It also expands model metadata in API responses and updates routing logic for Claude-related User-Agents. Feedback focuses on thread-safety issues in the new temporal package, specifically recommending the use of atomic operations for the global request counter to prevent race conditions.

}

// requestCounter tracks how many requests have been seen for interval-based injection.
var requestCounter uint64

high

The global variable requestCounter is not thread-safe. In a concurrent environment like a web server, this will lead to race conditions when multiple requests are processed simultaneously. Consider using sync/atomic to increment the counter safely.

if cfg.InjectInterval <= 0 {
return true
}
requestCounter++

high

The increment operation on requestCounter is not atomic. Use atomic.AddUint64(&requestCounter, 1) to ensure thread safety.

Suggested change
requestCounter++
atomic.AddUint64(&requestCounter, 1)

return true
}
requestCounter++
return requestCounter%uint64(cfg.InjectInterval) == 0

high

The read operation on requestCounter is not atomic. Use atomic.LoadUint64(&requestCounter) to ensure thread safety.

Suggested change
return requestCounter%uint64(cfg.InjectInterval) == 0
return atomic.LoadUint64(&requestCounter)%uint64(cfg.InjectInterval) == 0


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 71a6b44de3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +84 to +85
system.ForEach(func(_, v gjson.Result) bool {
newSystem = append(newSystem, v.Value())

P1 Badge Preserve scalar system prompts during Claude injection

This rewrite path is entered whenever a top-level system key exists, but injectIntoClaudeSystem only preserves existing entries by iterating system.ForEach and never normalizes scalar system values. The repo still supports string system prompts elsewhere (e.g. Claude executor handles system strings), so payloads like {"system":"...","messages":[...]} can have their original system instruction dropped or reshaped when temporal injection runs, changing model behavior.


Comment on lines +103 to +104
var newMessages []any
newMessages = append(newMessages, temporalMsg)

P1 Badge Avoid prepending temporal message before Imagen prompt parse

Prepending the temporal message at messages[0] breaks the Imagen compatibility path that derives prompt text from the first non-empty messages.#.content entry (internal/runtime/executor/gemini_vertex_executor.go loop at lines 125-130). After this change, messages-format Imagen requests can use the injected temporal block as the prompt instead of the user’s prompt, producing incorrect image generations.


…t, and RefreshModels logic

- Use sync/atomic for requestCounter to prevent data races in concurrent server
- Handle scalar (string) system field in Claude payloads instead of silently dropping
- Use plain string content for OpenAI system messages for broad provider compatibility
- Fix RefreshModels lastRefresh logic — use actual timestamp instead of overwriting with time.Now()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review


Reviewed commit: 115cb5f0ab


Comment on lines +70 to +75
// OpenAI format: messages array exists
if messages := gjson.GetBytes(payload, "messages"); messages.IsArray() {
return injectIntoMessages(payload, messages, text)
}

return payload

P2 Badge Inject temporal tags for Gemini-style payloads

InjectIntoPayload only handles top-level system and messages, then returns the payload unchanged for other schemas. Native Gemini/Vertex requests use contents payloads, so this path skips temporal injection entirely and anti-drift is not applied on those provider routes, despite this change advertising provider-wide request injection.


globalRegistry := registry.GetGlobalRegistry()

// Get all available models with full metadata
models := globalRegistry.GetAvailableModels("openai")

P2 Badge Report health from all registered models

GetModelsHealth starts from GetAvailableModels("openai"), which excludes models whose clients are fully suspended/unavailable. That causes /models/health to omit exactly the models operators need to debug during outages, contradicting the handler’s stated goal of reporting all registered models.


Imagen and image-gen models use a different request format where
convertToImagenRequest extracts the prompt from messages[0].content.
Prepending a temporal system message would break prompt extraction.

Adds IsImageModel() check and passes model name through InjectIntoPayload.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review


Reviewed commit: 50ed89fcbd



// Inject temporal context into the request payload
if temporal.ShouldInject(temporal.DefaultConfig()) {
req.Payload = temporal.InjectIntoPayload(req.Payload, req.Model)

P1 Badge Resolve model alias before deciding temporal injection

This injects temporal text based on req.Model before alias/model-pool resolution, so the image-model safeguard can be bypassed when users call an aliased route model (for example, img) that later resolves to an Imagen upstream model. In that case InjectIntoPayload sees a non-image alias and prepends a system message, then executeMixedOnce rewrites to the Imagen model (execReq.Model = upstreamModel), and Imagen prompt extraction can consume the injected block instead of the user prompt. Fresh evidence: alias remapping happens only after this line, so the prior Imagen fix does not apply to aliased image routes.


…lth endpoint

Two fixes from PR review round 2:

1. Move temporal injection from Execute/ExecuteStream outer methods into
   the inner execution loops after model resolution. This prevents aliased
   image models (e.g. route "img" → upstream "imagen-3.0-generate-002")
   from bypassing the IsImageModel safeguard.

2. Add GetAllRegisteredModels registry method and use it in GetModelsHealth
   so the /models/health endpoint shows ALL models including suspended
   ones — operators need visibility into unhealthy models during outages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review


Reviewed commit: efb9735bfa


enhancedModels = append(enhancedModels, enhancedModel)
}
// Build the sources map indicating where context_length came from
sources := registry.BuildModelSources(globalRegistry)

P0 Badge Define newly referenced OpenRouter registry functions

This change introduces calls to registry.BuildModelSources, registry.GetOpenRouterLastRefresh, registry.TriggerOpenRouterRefresh, and registry.StartOpenRouterEnrichment, but no corresponding definitions are added in the internal/registry package. That leaves these references unresolved at build time, so server/management packages cannot compile until the symbols are implemented (or the calls are removed).


Comment on lines +83 to +85
// OpenAI format: messages array exists
if messages := gjson.GetBytes(payload, "messages"); messages.IsArray() {
return injectIntoMessages(payload, messages, text)

P2 Badge Inject temporal tags for OpenAI Responses payloads

InjectIntoPayload only injects when a top-level system or messages field is present, and otherwise returns the body unchanged. Requests sent through /v1/responses use the Responses schema (input/instructions) and are forwarded as-is, so temporal anti-drift is skipped for that endpoint family despite this feature being applied in the central conductor path.


…tadata

Polls OpenRouter's public /api/v1/models endpoint to fetch context_length
for models that lack this data in the static catalog. Runs on startup
and every 24h with manual refresh trigger via management API.

This file was missing from previous commits, causing the P0 build failure
flagged by codex-connector (unresolved symbols in model_definitions.go).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review


Reviewed commit: e8ed002894


Comment on lines +180 to +182
if info := registry.GetModelInfo(modelID, ""); info != nil {
info.ContextLength = ctxLen
enrichedCount++

P1 Badge Update registry entries instead of cloned model info

enrichModelsFromOpenRouter calls GetModelInfo and assigns info.ContextLength, but GetModelInfo returns a cloned ModelInfo, so this write is discarded and never updates r.models. That makes enrichment effectively a no-op even though enrichedCount is incremented and logs say models were enriched, leaving downstream model metadata unchanged.


lastRefresh := registry.GetOpenRouterLastRefresh()
c.JSON(http.StatusOK, gin.H{
"status": "refreshed",
"enriched_count": count,

P2 Badge Return true enriched count in refresh response

RefreshModels publishes enriched_count from TriggerOpenRouterRefresh, but that function returns the size of the OpenRouter cache (len(contextLength)), not how many registered models were enriched in this refresh. This can report misleadingly large counts (or wrong error semantics when the cache is empty due to a fetch failure), so operators cannot trust the endpoint output.



@luispater luispater left a comment


Summary:
I found two blocking issues in this change set.

Key findings:

  • Blocking: the temporal injector is enabled unconditionally via temporal.DefaultConfig() in the execution path, so every proxied request now gets a hidden system prompt with no user opt-in and no configuration escape hatch. That changes request semantics for all providers and breaks API compatibility rather than preserving it.
  • Blocking: internal/registry/openrouter_enrichment.go adds a new http.Client{Timeout: 30 * time.Second} / context.WithTimeout fetch path. This repo explicitly restricts new timeout-driven network behavior to a small set of documented exceptions, and this enrichment path is not one of them.

Test plan:

  • Reviewed the temporal injection flow and the new registry enrichment loop plus management endpoints.


@luispater luispater left a comment


Summary:
I like the goal, but this PR currently mixes the temporal injection feature with unrelated registry/management work and introduces a couple of blockers.

Blocking findings:

  • sdk/cliproxy/auth/conductor.go hard-codes temporal.DefaultConfig() at the injection call sites instead of wiring the feature to runtime config, which makes this always-on request mutation impossible to disable or roll back if it causes provider regressions.
  • internal/registry/openrouter_enrichment.go adds a new long-lived background HTTP poller with its own timeout/refresh policy. That is a separate runtime/network behavior change from the temporal feature and should be reviewed independently instead of landing as part of this PR.

Test plan:

  • Reviewed the temporal injection path, the new management endpoints, and the OpenRouter enrichment worker.

Addresses codex-connector review: temporal injection was enabled unconditionally
with no opt-in or escape hatch. Now reads temporal settings from
the app's runtime config, so users can disable or tune injection
via config.yaml:

  temporal:
    enabled: true
    inject_interval: 0

Falls back to DefaultConfig() when no config is loaded.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review


Reviewed commit: 8fb9745510


func (m *Manager) temporalConfig() temporal.Config {
cfg, _ := m.runtimeConfig.Load().(*internalconfig.Config)
if cfg != nil {
return cfg.Temporal

P1 Badge Fall back to temporal defaults when config omits section

temporalConfig() returns cfg.Temporal whenever a runtime config is loaded, but Config.Temporal.Enabled defaults to false when the temporal block is absent in YAML. Because SetConfig always stores a non-nil config, ShouldInject sees Enabled=false and skips injection for all normal deployments unless users explicitly add temporal.enabled: true, which silently disables the feature this commit introduces.


Two fixes:

1. Temporal tag now includes local timezone fields (local_day,
   local_date, local_time, local_zone) alongside UTC so models
   can match user expectations across timezones.

2. Config.Temporal is now a pointer — nil (YAML omitted) falls back
   to DefaultConfig() (enabled, every request). Previously the bool
   zero-value silently disabled injection when the block was absent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review


Reviewed commit: 956ce7f040


Comment on lines +1234 to +1235
if temporal.ShouldInject(m.temporalConfig()) {
execReq.Payload = temporal.InjectIntoPayload(execReq.Payload, upstreamModel)

P2 Badge Apply temporal injection in CountTokens path

Temporal rewriting is added for Execute/ExecuteStream here, but executeCountMixedOnce still sends the original payload to CountTokens. Endpoints that preflight with token-count APIs (for example Claude/Gemini count routes via ExecuteCountWithAuthManager) will undercount because the injected temporal system block is missing there, so a request can pass counting and then fail real execution with token/context limits.


…ched count

Two fixes:

1. enrichModelsFromOpenRouter was writing ContextLength to a cloned
   ModelInfo from GetModelInfo(), which is silently discarded. Added
   SetModelContextLength() method that writes to the live registration
   under its own write lock.

2. TriggerOpenRouterRefresh now returns the actual enriched model count
   (how many models got new context_length data) instead of the total
   OpenRouter cache size.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review


Reviewed commit: 33f0a789b3


Comment on lines +285 to +286
if cfg != nil && cfg.Temporal != nil {
return *cfg.Temporal

P2 Badge Merge temporal defaults when config block is partially set

temporalConfig() returns *cfg.Temporal as-is whenever the block exists, so omitted YAML fields keep Go zero-values. A config like temporal: { inject_interval: 5 } leaves Enabled=false, causing ShouldInject to skip all requests even though defaults are documented as enabled-by-default; this silently disables the feature for partial configs instead of applying defaults plus overrides.


Comment on lines +191 to +192
for modelID, ctxLen := range openRouterContextLengths {
openRouterStore.contextLength[modelID] = ctxLen

P2 Badge Cache OpenRouter source using local model IDs

The enrichment cache stores keys from the OpenRouter response (openRouterContextLengths), but source lookup later is done with local registry IDs via GetOpenRouterContextLengthSource(modelID) in BuildModelSources. For non-exact matches (the same function explicitly supports cases like local gemini-3.1-pro matched from provider-prefixed IDs), source attribution will miss and /models/health reports provider/static even when context length came from OpenRouter.


Comment on lines +137 to +138
// Get all registered models and enrich those lacking context_length
allModels := registry.GetAvailableModels("openai")

P2 Badge Include unavailable models in enrichment pass

The enrichment loop uses GetAvailableModels("openai"), which excludes models whose clients are currently suspended/unavailable, but this function is intended to enrich registered models lacking context_length. During outages or quota suspension, those models are skipped entirely and remain unenriched until they become available again, undermining the management health metadata this change introduces.

