elizaos-plugins · odilitime · Feb 14, 2026 · Feb 18, 2026 · Feb 18, 2026 · Feb 18, 2026
diff --git a/.gitignore b/.gitignore
@@ -2,4 +2,7 @@ node_modules
 .turbo
 dist
 .env
-.elizadb-test
+.elizadb-test
+
+# prr state file (auto-generated)
+.pr-resolver-state.json
diff --git a/.prr/lessons.md b/.prr/lessons.md
@@ -0,0 +1,23 @@
+# PRR Lessons Learned
+
+> This file is auto-generated by [prr](https://github.com/elizaOS/prr).
+> It contains lessons learned from PR review fixes to help improve future fix attempts.
+> You can edit this file manually or let prr update it.
+> To share lessons across your team, commit this file to your repo.
+
+## File-Specific Lessons
+
+### src/index.ts
+
+- Fix for src/index.ts:349 - tool modified wrong files (src/banner.ts), need to modify src/index.ts
+- Fix for src/index.ts:349 - Looking at the code, I can see that the streaming test at line 333-373 **already has stream: true set** at line 345.
+
+### src/utils/rate-limiter.ts
+
+- Fix for src/utils/rate-limiter.ts - The current code at lines 396-401 already has a proper fix that checks for NaN: typescript
+- Fix for src/utils/rate-limiter.ts - RESULT: ALREADY_FIXED — The code at lines 396-401 already includes a proper NaN check: retrySeconds !
+- Fix for src/utils/rate-limiter.ts - The review comment mentions that parseFloat on retry-after can return NaN and the || undefined fallback never triggers, but the current code at lines 396-401 already has a proper fix: typescript
+- Fix for src/utils/rate-limiter.ts - I can see that the code at lines 296-306 already has a proper fix that checks for NaN and handles zero values correctly: typescript
+- Fix for src/utils/rate-limiter.ts:248 - tool modified wrong files (src/banner.ts, src/models/embedding.ts), need to modify src/utils/rate-limiter.ts
+- Fix for src/utils/rate-limiter.ts:248 - The cleanup condition must account for map updates during concurrent calls—identity checks fail when other acquire() calls mutate the map before cleanup runs.
+- Fix for src/utils/rate-limiter.ts:248 - The cleanup condition compares a stale promise reference against a freshly-created derived promise — they're guaranteed different objects on every subsequent call.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,74 @@
+# Changelog
+
+All notable changes to `@elizaos/plugin-openai` are documented in this file.
+Format based on [Keep a Changelog](https://keepachangelog.com/). Newest entries first.
+
+## [Unreleased]
+
+### Added
+
+- **Billing 429 detection and fail-fast behavior** (`src/utils/rate-limiter.ts`)
+  - New `QuotaExceededError` class for permanent billing failures (distinct from transient `RateLimitError`).
+  - `extractRateLimitInfo()` now returns `isBillingError: boolean` to distinguish quota exhaustion from rate limiting.
+  - Billing errors detected via OpenAI's `error.code === "insufficient_quota"` (most reliable) with keyword fallback ("quota", "billing").
+  - `withRateLimit()` now fails immediately on billing 429s instead of wasting 30+ seconds on 5 retries.
+  - **WHY**: OpenAI uses HTTP 429 for both "wait a minute" (rate limit) and "add credits" (quota). Before this change, quota errors would retry pointlessly, wasting time and filling logs. Now they fail instantly with a clear billing URL.
+
+- **Enhanced `throwIfRateLimited` for body inspection** (`src/utils/rate-limiter.ts`)
+  - Changed from synchronous to async to read 429 response bodies.
+  - Uses `response.clone().json()` to inspect error structure without consuming the original response.
+  - Throws `QuotaExceededError` for billing 429s, `RateLimitError` for rate-limit 429s.
+  - Updated all 5 call sites (`embedding.ts`, `image.ts` ×2, `audio.ts` ×2) to `await throwIfRateLimited(response)`.
+  - **WHY**: Raw-fetch handlers (embeddings, images, audio) need to distinguish billing from rate-limit 429s. Reading the body is the only reliable way to detect OpenAI's quota exhaustion errors.
+
+- **Startup configuration banner** (`src/banner.ts`, `src/init.ts`)
+  - Displays compact config table on initialization showing API key (masked), base URL, models, and their status (set/default).
+  - Always includes direct link to OpenAI billing dashboard.
+  - **WHY**: Configuration issues are the #1 source of runtime errors. The banner makes misconfigurations immediately obvious (e.g., missing key, wrong base URL, unexpected model) without requiring users to dig through env vars or character settings.
+
+- **Automatic tier detection** (`src/utils/rate-limiter.ts`, model handlers)
+  - New `logTierOnce(response)` function extracts RPM/TPM limits from `x-ratelimit-limit-*` headers.
+  - Logs account tier info after first successful API call (one-shot, zero cost).
+  - Integrated into all raw-fetch handlers (`embedding.ts`, `image.ts`, `audio.ts`).
+  - Silently skips if headers missing (Azure, Ollama, etc.).
+  - **WHY**: Users often don't know their OpenAI tier, which determines quota limits. Tier doesn't change during runtime, so logging it once from a response we're already processing has zero cost and helps users understand their limits.
+
+- **Daemon-based rate limiting** (`src/utils/rate-limiter.ts`)
+  - Process-level singleton that persists across agent reinitializations within the same Node process.
+  - Per-category sliding-window RPM tracking (embeddings, chat, images, audio) — mirrors how OpenAI actually measures rate limits.
+  - Exponential backoff with jitter on 429 errors, using the `Retry-After` header when available for optimal timing.
+  - `withRateLimit(category, fn)` wrapper for transparent retry on rate-limited API calls.
+  - `acquireRateLimit(category)` for throttling without retry (used for streaming where transparent retry isn't possible).
+  - `throwIfRateLimited(response)` for raw-fetch handlers to convert 429 responses into typed `RateLimitError` before generic error handling.
+  - Configurable via `OPENAI_RATE_LIMIT_RPM` (global RPM override) and `OPENAI_RATE_LIMIT_MAX_RETRIES` (retry count override).
+
+- **Forward/backward compatibility shims** (`src/types/index.ts`)
+  - `StreamingTextParams` — extends `GenerateTextParams` with `stream` and `onStreamChunk` for older `@elizaos/core` versions that don't export these.
+  - `TextStreamResult` — local definition of the streaming result type for older cores.
+  - These shims are structurally identical to the newer core types, so they become redundant (but harmless) when the core is upgraded.
+
+- **Synthetic embedding fast-paths** (`src/models/embedding.ts`)
+  - Returns synthetic vectors (no API call) for `null` params, empty text, and the `"test"` string.
+  - The `"test"` fast-path is specifically for the core runtime's `ensureEmbeddingDimension()` probe, which sends `{ text: "test" }` at startup to discover vector length. Since we already know the dimension from `OPENAI_EMBEDDING_DIMENSIONS`, the API call was wasteful and consumed rate-limit budget at the worst time (startup, when other initialization calls may be in-flight).
+
+### Changed
+
+- **Non-blocking plugin initialization** (`src/init.ts`)
+  - Removed the eager `GET /models` API key validation fetch that ran during `init()`.
+  - **WHY**: The validation raced with the core's `ensureEmbeddingDimension()` embedding probe for the same rate-limit budget. The validation would succeed (consuming a slot), then the embedding probe would get 429'd — triggering up to 5 retries with exponential backoff, adding 30+ seconds to startup. A bad API key surfaces on the first real model call anyway, so the eager check provided no actionable value.
+  - Now performs synchronous config presence checks only.
+
+- **DRY model registration** (`src/index.ts`)
+  - Replaced 11 redundant lambda wrappers (`async (runtime, params) => handler(runtime, params)`) with direct function references (`handleTextEmbedding`, `handleTextSmall`, etc.). Eliminated ~70 lines of boilerplate with zero behavioral change.
+  - Applied `as unknown as NonNullable<Plugin["models"]>` type assertion to work around a TypeScript contravariance issue (TS2418) in older `@elizaos/core` versions where the `Plugin.models` type has an incompatible intersection with a string index signature.
+
+- **Rate-limited model handlers**
+  - `embedding.ts` — wrapped in `withRateLimit("embeddings", ...)` with `throwIfRateLimited` for 429 detection.
+  - `text.ts` — `acquireRateLimit("chat")` for streaming, `withRateLimit("chat", ...)` for non-streaming.
+  - `object.ts` — wrapped in `withRateLimit("chat", ...)`.
+  - `image.ts` — generation wrapped in `withRateLimit("images", ...)`, description in `withRateLimit("chat", ...)`.
+  - `audio.ts` — both TTS and transcription wrapped in `withRateLimit("audio", ...)`.
+
+## [1.6.1] - 2025-01-01
+
+_Baseline version before rate limiting and compatibility changes. No changelog entries recorded for prior versions._
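
Note on the billing-429 entries in the changelog above: the following is a minimal TypeScript sketch of how a 429 body might be classified as quota exhaustion versus ordinary rate limiting. It is not the code in `src/utils/rate-limiter.ts`. The class names `QuotaExceededError` and `RateLimitError`, the `insufficient_quota` code, the "quota"/"billing" keyword fallback, and the use of `response.clone().json()` come from the changelog; the helper names `isBillingBody`, `parseRetryAfter`, and `throwIfRateLimitedSketch`, the error messages, and the `retryAfterSeconds` field are assumptions made for illustration.

```typescript
// Hypothetical stand-ins for the plugin's error classes (shapes assumed).
class RateLimitError extends Error {
  retryAfterSeconds: number | undefined;
  constructor(message: string, retryAfterSeconds?: number) {
    super(message);
    this.name = "RateLimitError";
    this.retryAfterSeconds = retryAfterSeconds;
  }
}

class QuotaExceededError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "QuotaExceededError";
  }
}

// True when a parsed 429 body looks like a billing/quota failure.
// Checks error.code first, then falls back to keywords in error.message.
function isBillingBody(body: unknown): boolean {
  if (typeof body !== "object" || body === null) return false;
  const error = (body as { error?: { code?: unknown; message?: unknown } }).error;
  if (typeof error !== "object" || error === null) return false;
  if (error.code === "insufficient_quota") return true;
  const message = typeof error.message === "string" ? error.message.toLowerCase() : "";
  return message.includes("quota") || message.includes("billing");
}

// Parses a Retry-After header given in seconds. Returns undefined for a
// missing, non-numeric (including HTTP-date), or negative value, while
// keeping a legitimate 0.
function parseRetryAfter(header: string | null): number | undefined {
  if (header === null) return undefined;
  const seconds = Number.parseFloat(header);
  return Number.isFinite(seconds) && seconds >= 0 ? seconds : undefined;
}

// Converts a 429 response into a typed error; does nothing for other statuses.
// The body is read from a clone so the original response stays unconsumed.
async function throwIfRateLimitedSketch(response: Response): Promise<void> {
  if (response.status !== 429) return;
  let body: unknown = null;
  try {
    body = await response.clone().json();
  } catch {
    // Non-JSON body: treat as an ordinary rate limit.
  }
  if (isBillingBody(body)) {
    throw new QuotaExceededError("OpenAI quota exhausted; retrying will not help");
  }
  throw new RateLimitError(
    "OpenAI rate limit hit",
    parseRetryAfter(response.headers.get("retry-after")),
  );
}
```

In a retry wrapper along the lines of `withRateLimit()`, a caller could rethrow `QuotaExceededError` immediately and apply backoff only to `RateLimitError`, which matches the fail-fast behavior the changelog describes.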