fix: gracefully truncate conversation history when context window is …#545

Open
HasanAlsaafen wants to merge 1 commit into rowboatlabs:main from HasanAlsaafen:feat/context-window-truncatio

HasanAlsaafen commented May 9, 2026

fix: gracefully truncate conversation history when context window is exceeded

Closes #514

Problem

When a conversation grows long, Rowboat passes the full message history to the
model API unchanged. If the history exceeds the model's context window limit the
API returns a hard 400 bad_request_error (context window exceeds limit) and
the error is surfaced directly to the user with no recovery.

Solution

Add a lightweight truncateMessagesToFit() utility
(apps/x/packages/core/src/agents/context-utils.ts) that is called inside
streamLlm() — the single callsite that builds the payload sent to the AI SDK —
before convertFromMessages() runs.
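The ordering described above (trim first, convert second) can be sketched as follows. This is only an illustration: buildModelInput, the Message shape, and the Truncate and Convert types are hypothetical names introduced here, and the real streamLlm() and convertFromMessages() signatures in runtime.ts may differ.

```typescript
// Hypothetical, simplified message shape (not the actual Rowboat type).
type Message = { role: string; content: string };

// Stand-in types for truncateMessagesToFit() and convertFromMessages().
type Truncate = (messages: Message[]) => Message[];
type Convert<T> = (messages: Message[]) => T;

// Hypothetical helper showing the call order inside the payload-building step:
// the history is trimmed before conversion, so the converter never sees
// messages that were dropped to fit the budget.
function buildModelInput<T>(
  history: Message[],
  truncate: Truncate,
  convert: Convert<T>,
): T {
  const fitted = truncate(history);
  return convert(fitted);
}
```

Passing the two steps in as parameters keeps the sketch independent of the real implementations; in the PR itself the call is made directly inside streamLlm().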

How it works

  1. System messages are always preserved — they carry agent instructions and
    are never dropped regardless of length.
  2. Oldest non-system messages are dropped first — the most recent context is
    the most valuable, so we walk the history newest → oldest and stop when the
    token budget is exhausted.
  3. Tool-result / tool-call pairing is respected — if dropping old messages
    leaves a tool (tool-result) message at the head of the kept list, it is
    also dropped, because the AI SDK requires every tool-result to have a
    preceding tool-call message.
  4. Token counting is heuristic: we use ceil(charCount / 4), a common
    approximation of roughly four characters per token for English text.
    Rounding up keeps the estimate slightly conservative, although it can
    under-count for code or non-Latin scripts. A tokenizer library would be
    more precise but adds a dependency; the heuristic is sufficient for
    truncation purposes.
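
The four rules above can be sketched roughly as below. This is not the code from context-utils.ts. It is a minimal reconstruction under stated assumptions: the Message shape is simplified to a role and a string content, the budget is applied to non-system messages only, and system messages are emitted first in the result.

```typescript
// Assumed, simplified message shape; the real types are likely richer.
type Role = "system" | "user" | "assistant" | "tool";
interface Message {
  role: Role;
  content: string;
}

const DEFAULT_TOKEN_BUDGET = 80_000; // default stated in this PR

// Rule 4: heuristic token count, about four characters per token, rounded up.
function estimateTokens(msg: Message): number {
  return Math.ceil(msg.content.length / 4);
}

function truncateMessagesToFit(
  messages: Message[],
  budget: number = DEFAULT_TOKEN_BUDGET,
): Message[] {
  // Rule 1: system messages are always kept.
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");

  // Rule 2: walk newest -> oldest and stop at the first message that
  // would push the running total over the budget.
  const kept: Message[] = [];
  let used = 0;
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i]);
    if (used + cost > budget) break;
    kept.unshift(rest[i]);
    used += cost;
  }

  // Rule 3: a tool result at the head has lost its tool call, so drop it.
  while (kept.length > 0 && kept[0].role === "tool") {
    kept.shift();
  }

  return [...system, ...kept];
}
```

Stopping at the first message that does not fit, rather than skipping it and continuing, keeps the retained history contiguous, which matches the "stop when the token budget is exhausted" wording.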

Default budget

80,000 tokens. This fits within the context windows of the large-context
models currently supported by Rowboat (Claude 3.x, GPT-4o, Gemini 1.5, etc.)
while leaving headroom for the system prompt and the model's output. Models
with a smaller window, such as the original 8K-context Llama 3 releases, would
need a lower budget.
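
As a quick worked check of what the default means under the ceil(charCount / 4) heuristic, the arithmetic can be written out as below. The constant and function names are illustrative and are not taken from the PR's source.

```typescript
// Illustrative names; only the 80,000 default and the /4 heuristic come from the PR text.
const DEFAULT_TOKEN_BUDGET = 80_000;
const CHARS_PER_TOKEN = 4;

function estimateTokensForText(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// The budget corresponds to about 320,000 characters of history.
const approxMaxChars = DEFAULT_TOKEN_BUDGET * CHARS_PER_TOKEN;

// A single 400,000-character message is estimated at 100,000 tokens,
// which on its own already exceeds the budget.
const oversizedEstimate = estimateTokensForText("A".repeat(400_000));

console.log(approxMaxChars);                           // 320000
console.log(oversizedEstimate);                        // 100000
console.log(oversizedEstimate > DEFAULT_TOKEN_BUDGET); // true
```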

Files changed

| File | Change |
| --- | --- |
| apps/x/packages/core/src/agents/context-utils.ts | New: truncateMessagesToFit() utility |
| apps/x/packages/core/src/agents/runtime.ts | Apply truncation inside streamLlm() before sending to AI SDK |

Testing

The utility has no external dependencies and can be exercised with a simple
Node script:

import assert from "node:assert/strict";
import { truncateMessagesToFit } from "./context-utils.js";

const msgs = [
  { role: "system",    content: "You are a helpful assistant." },
  { role: "user",      content: "A".repeat(400_000) }, // ~100k tokens, over budget
  { role: "assistant", content: "Here is my reply." },
  { role: "user",      content: "Follow-up question." },
];

const result = truncateMessagesToFit(msgs);
// The system message is preserved, the oversized oldest message is dropped,
// and the two most recent messages are retained.
// node:assert throws on failure, whereas console.assert only logs a warning.
assert.equal(result[0].role, "system");
assert.equal(result.length, 3); // system + last assistant + last user

HasanAlsaafen force-pushed the feat/context-window-truncatio branch from e25e0c2 to 6e0db9b on May 9, 2026 18:39

Development

Successfully merging this pull request may close these issues.

context window exceeds limit