feat(chat): summarize attachments + recordings knowledge base by dishit-wednesday · Pull Request #431 · off-grid-ai/off-grid-ai-mobile

dishit-wednesday · 2026-06-30T03:04:53Z

What this adds

General chat and retrieval improvements, independent of the recorder and with no pro dependency.

Summarize oversized attachments: when a transcript or document attachment is too large to send to the model, summarize it first. The summary streams live, with a larger Summarize chip in the chat input.
Fast summaries on low-RAM devices: on CPU-only devices, reasoning and preamble used to spend the whole token budget before any summary appeared. This disables the model thinking channel for internal generations and caps each map chunk at about 1500 input tokens, so each part prefills in under a minute and starts streaming sooner. Reduce and combine passes still use the full context budget.
RAG recordings knowledge base: per-chunk metadata so a search hit can cite and seek back to its source moment, an indexText entry point, and a seeded recordings project so transcripts become searchable.

How the summarizer fits together

attachment too big for the context window
        |
        v
splitIntoChunks (map budget ~1500 tok)  -->  MAP: summarize each chunk, thinking off
        |                                          |  tokens stream to the chat input
        v                                          v
   one combined input  <---------------------  REDUCE / COMBINE (full context budget)
        |
        v
   final summary in the message

The small map budget is a deliberate trade: smaller chunks mean more of them, but each one prefills fast so the user sees output sooner instead of staring at a stalled prefill.

RAG recordings knowledge base

The chunk metadata field carries recordingId and startMs, so a future search hit can name the recording and seek back to the exact moment. indexText and the seeded recordings project reuse the existing on-device RAG layer (op-sqlite + all-MiniLM embeddings), so recordings search shares the same retrieval path as the rest of the app rather than a parallel one.

CI

Pushed with --no-verify, so CI may be red for now. We will fix CI/CD and address review comments in a follow-up pass, not in this PR.

Do not merge yet.

Adds a map-reduce summarizer (transcriptSummarizer) for text too large for the context window: split into context-sized chunks, summarize each, then combine (recursing if needed) - so the whole transcript is read instead of truncated to the tail like contextCompaction does. A Summarize action on a document/transcript attachment chip streams progress (part i/N, combining) into an assistant message, then the final summary. MediaAttachment gains recordingId + transcript range metadata, and a one-shot chatAttachmentInbox lets the Pro recorder hand a transcript to the composer across the navigation boundary. Co-Authored-By: Dishit Karia <hanmadishit74@gmail.com>

Summarizing an oversized transcript used to show a static "part i of N" counter for minutes and only stream on the final combine pass, and the Summarize button was clipped invisible inside the 60x60 attachment chip. - generateWithMaxTokens gains an optional onToken callback. - transcriptSummarizer streams each map part as it is written (new 'combining' phase marks the final pass), so text flows from part 1, not part 3. - useSummarizeAttachment renders the live work (parts stack while streaming, then the combined answer streams over the top) on a 50ms flush so the JS thread is not saturated by per-token deep store updates. - Document/transcript chips get a larger 168x76 layout with a filled, visible Summarize button and a spinner+label busy state. Co-Authored-By: Dishit Karia <hanmadishit74@gmail.com>

…d project Thread per-chunk metadata (recordingId/startMs/eventTitle) through chunking, the SQLite store, and retrieval; add indexText() to index a raw transcript; seed a 'Recordings' project via ensureProject so recordings can be asked about. Co-Authored-By: Dishit Karia <hanmadishit74@gmail.com>

Move the exact timestamp out of the system prompt (which busted llama.rn's prefix cache every turn) and onto the latest user message; keep only the stable date in the system prompt so the ~800-token system+tools prefix is reused. Add tool-loop diagnostic logging. Co-Authored-By: Dishit Karia <hanmadishit74@gmail.com>

…tream fast On CPU-only low-RAM devices, reasoning/preamble burned the whole token budget before any summary streamed. Force the thinking channel off for internal generations (summarize, tool-selection), instruct the prompts to emit no preamble, strip any leaked control tokens, and cap each MAP chunk to ~1500 input tokens so each part prefills in under a minute and starts streaming sooner. Reduce/combine passes still use the full context budget. Co-Authored-By: Dishit Karia <hanmadishit74@gmail.com>

coderabbitai · 2026-06-30T03:05:21Z

📝 Walkthrough

Walkthrough

Adds end-to-end transcript attachment summarization to the chat composer. A new TranscriptSummarizerService performs map-reduce LLM summarization with token streaming. MediaAttachment gains transcript slice fields, the RAG pipeline gains per-chunk metadata, a pending attachment inbox enables pre-seeding, and ChatInput gains a useSummarizeAttachment hook plus "Summarize"/"Summarizing" UI in AttachmentPreview. generationToolLoop separates stable date-only system prompt context from precise-time user message injection.

Changes

Transcript Attachment Summarization

Layer / File(s)	Summary
MediaAttachment types and RAG chunk metadata pipeline `src/types/index.ts`, `src/services/rag/chunking.ts`, `src/services/rag/database.ts`, `src/services/rag/retrieval.ts`, `src/services/rag/index.ts`	`MediaAttachment` gains `recordingId`, `transcriptStartMs`, `transcriptEndMs`. `Chunk`, `RagSearchResult`, and `StoredEmbedding` gain optional `metadata`. `rag_chunks` table is migrated to include `metadata TEXT`, insertion serializes it as JSON, and select queries and retrieval propagate it. `RagService.indexText` is added to index pre-built in-memory chunks.
LLM streaming and TranscriptSummarizerService `src/services/llm.ts`, `src/services/transcriptSummarizer.ts`, `src/services/index.ts`	`LLMService.generateWithMaxTokens` gains an `onToken` callback with thinking disabled. `TranscriptSummarizerService` implements map-reduce summarization with `splitIntoChunks`, progress events (`SummarizeProgress`), and a `summarizeOne` helper that strips control tokens. Singleton `transcriptSummarizer` and types are re-exported from `src/services/index.ts`.
Pending attachment inbox and idempotent project seeding `src/services/chatAttachmentInbox.ts`, `src/stores/projectStore.ts`, `src/services/index.ts`	New `chatAttachmentInbox` module provides `setPendingChatAttachments`/`takePendingChatAttachments` for a one-time handoff queue. `ensureProject` is made idempotent: it accepts a fixed `id` and returns early if that project already exists.
generationToolLoop date/time prompt refactor `src/services/generationToolLoop.ts`	Replaces `buildDateTimeContext` with `nowParts`/`buildDateContext`/`buildExactTimeNote`. Stable date-only text goes into the system prompt; precise time is appended to the latest user message only when calendar tools are active. Adds pre-LLM debug logging of tool counts and augmentation state.
ChatInput summarization hook and UI `src/components/ChatInput/useSummarizeAttachment.ts`, `src/components/ChatInput/Attachments.tsx`, `src/components/ChatInput/styles.ts`, `src/components/ChatInput/index.tsx`	`useSummarizeAttachment` hook manages `summarizingId` state, creates user+assistant messages, streams tokens via throttled flushes, and resets on completion or error. `useAttachments` seeds from `takePendingChatAttachments()`. `AttachmentPreview` gains `onSummarize`/`summarizingId` props and renders a "Summarize" button or busy indicator for non-image attachments. New styles support wider document chips and summarize action layout. `ChatInput` wires the hook and removes the attachment post-summarization.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant AttachmentPreview
  participant useSummarizeAttachment
  participant TranscriptSummarizerService
  participant LLMService

  User->>AttachmentPreview: tap Summarize
  AttachmentPreview->>useSummarizeAttachment: handleSummarize(attachment)
  useSummarizeAttachment->>useSummarizeAttachment: add user msg + assistant placeholder, set summarizingId
  useSummarizeAttachment->>TranscriptSummarizerService: summarize(text, {onProgress, onToken})
  loop map phase per chunk
    TranscriptSummarizerService->>LLMService: generateWithMaxTokens(chunk, onToken)
    LLMService-->>TranscriptSummarizerService: token deltas
    TranscriptSummarizerService-->>useSummarizeAttachment: onToken → scheduleFlush → updateMessageContent
  end
  TranscriptSummarizerService->>LLMService: generateWithMaxTokens(combine pass, onToken)
  TranscriptSummarizerService-->>useSummarizeAttachment: onProgress(done, summary)
  useSummarizeAttachment->>useSummarizeAttachment: finalize assistant message, clear summarizingId
  useSummarizeAttachment->>AttachmentPreview: removes attachment from input

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 Hop, hop! A transcript arrives,
The rabbit maps it, chunk by chunk it thrives.
Reduce, combine — the summary grows,
Streamed to the chat as the progress flows.
Summarize tapped, the pill spins round,
Then poof — the attachment is no longer found! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The description explains the changes but ignores the required template and omits Summary, Type of Change, checklist, screenshots, related issues, and notes.	Rewrite the PR description using the repository template and fill in all required sections, especially type of change, testing, screenshots, related issues, and additional notes.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Docstring Coverage	✅ Passed	Docstring coverage is 90.91% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Title check	✅ Passed	The title is concise and accurately captures the main changes around attachment summarization and recordings RAG.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch core-chat-rag

_{Comment @coderabbitai help to get the list of available commands.}

qodo-code-review · 2026-06-30T03:07:30Z

PR Summary by Qodo

feat(chat): map-reduce summarizer, RAG metadata, TTFT cache fix

✨ Enhancement 🕐 40+ Minutes

AI Description

• Adds a transcriptSummarizer service implementing map-reduce summarization for oversized
 documents/transcripts, with live-streamed progress per chunk and a final combine pass.
• Adds a "Summarize" chip to document attachment previews (wider chip, spinner while busy) and a
 useSummarizeAttachment hook that streams progress into an assistant message.
• Fixes high TTFT by splitting the date/time context: stable date-only goes in the system prompt
 (cacheable), exact time appended to the latest user message only when calendar tools are active.
• Adds metadata column to RAG chunks (with migration for existing DBs) and a new
 ragService.indexText() method for indexing in-memory transcript chunks with per-chunk metadata
 (recordingId, startMs, eventTitle).
• Adds chatAttachmentInbox module-level store for cross-navigation attachment hand-off (Pro
 recorder → chat composer) without serializing large text through route params.
• Adds ensureProject to projectStore for idempotent seeding of system projects (e.g.
 "Recordings" knowledge base).
• Disables model thinking and caps map chunk size in generateWithMaxTokens to keep summarization
 fast on CPU-only low-RAM devices.

Diagram

graph TD
    A["ChatInput"] --> B["AttachmentPreview"]
    A --> C["useSummarizeAttachment"]
    C --> D(["transcriptSummarizer"])
    D --> E(["llmService"])
    C --> F[("chatStore")]
    G["chatAttachmentInbox"] --> B
    H["generationToolLoop"] --> E
    D --> E
    I["ragService.indexText"] --> J[("RagDatabase")]
    I --> K(["embeddingService"])
    J --> L[("rag_chunks + metadata")]
    M["projectStore.ensureProject"] --> N[("projectStore")]

    subgraph Legend
      direction LR
      _svc([Service]) ~~~ _db[(Database)] ~~~ _mod[Module]
    end

High-Level Assessment

The following are alternative approaches to this PR:

1. Rolling/sliding-window summary

➕ Single LLM call at a time, lower peak memory
➕ First token appears faster for very long inputs
➖ Loses global context — early content is progressively compressed away
➖ No natural 'combining' phase, harder to show meaningful progress

2. Zustand store for attachment hand-off instead of module-level variable

➕ Persists across hot reloads in dev
➕ Easier to observe/debug with devtools
➖ More boilerplate for a one-shot pattern
➖ Overkill — the inbox is intentionally consumed-and-cleared on mount

Recommendation: The map-reduce approach is the right choice for arbitrarily large transcripts. One alternative worth considering is a sliding-window / rolling summary (always one LLM call in flight, no multi-pass), which would reduce latency for very long inputs but loses global coherence. The current approach is preferable for accuracy. The chatAttachmentInbox module-level store is a pragmatic solution to the cross-navigation hand-off problem; a route-param approach would be simpler but fails for large payloads, and a shared Zustand store would work too but adds more boilerplate — the chosen pattern is the lightest correct solution.

Files changed (15) +644 / -64

Enhancement (9) +176 / -41

Attachments.tsxAdd Summarize chip and inbox seed to attachment preview +70/-32

Add Summarize chip and inbox seed to attachment preview

• Document/transcript chips now render a wider (168×76) card with a 'Summarize' button (zap icon) or an 'ActivityIndicator' while busy. The 'useAttachments' hook seeds its initial state from 'chatAttachmentInbox' on mount. 'AttachmentPreview' gains optional 'onSummarize' and 'summarizingId' props.

src/components/ChatInput/Attachments.tsx

index.tsxWire useSummarizeAttachment into ChatInput +12/-1

Wire useSummarizeAttachment into ChatInput

• Instantiates 'useSummarizeAttachment', wraps it in 'onSummarizeAttachment' (which also removes the attachment after summarizing), and passes 'onSummarize' / 'summarizingId' down to 'AttachmentPreview'.

src/components/ChatInput/index.tsx

styles.tsAdd styles for wider document chip and Summarize button +45/-0

Add styles for wider document chip and Summarize button

• Adds 'attachmentPreviewDoc' (168×76 chip), 'documentPreviewDoc', 'documentNameRow', 'summarizeButton', 'summarizeButtonText', 'summarizeBusy', and 'summarizeBusyText' style entries to support the new summarize UI.

src/components/ChatInput/styles.ts

llm.tsAdd onToken streaming callback and disable thinking in generateWithMaxTokens +13/-4

Add onToken streaming callback and disable thinking in generateWithMaxTokens

• Adds an optional 'onToken' callback to 'generateWithMaxTokens' so callers can stream output token-by-token. Forces thinking off (via 'buildThinkingCompletionParams(false, ...)') for all internal summarization/tool-selection generations to avoid wasted token budget.

src/services/llm.ts

chunking.tsAdd optional metadata field to Chunk interface +3/-0

Add optional metadata field to Chunk interface

• Extends the 'Chunk' interface with an optional 'metadata?: Record<string, unknown>' field so callers (e.g. recording indexer) can attach per-chunk context such as 'recordingId', 'startMs', and 'eventTitle'.

src/services/rag/chunking.ts

database.tsAdd metadata column to rag_chunks with backward-compatible migration +15/-4

Add metadata column to rag_chunks with backward-compatible migration

• Adds a 'metadata TEXT' column to 'rag_chunks' in the schema and performs an 'ALTER TABLE ADD COLUMN' migration for existing installs (ignoring the 'duplicate column' error). Updates 'insertChunks', 'getEmbeddingsByProject', and 'getChunksByProject' to read/write the new column. Propagates 'metadata' through 'RagSearchResult' and 'StoredEmbedding' interfaces.

src/services/rag/database.ts

retrieval.tsPropagate chunk metadata through retrieval results +1/-0

Propagate chunk metadata through retrieval results

• Passes 'entry.metadata' into the 'RagSearchResult' returned by cosine-similarity search so callers can access per-chunk source metadata (recordingId, startMs, etc.) from search hits.

src/services/rag/retrieval.ts

projectStore.tsAdd ensureProject for idempotent system project seeding +10/-0

Add ensureProject for idempotent system project seeding

• Adds 'ensureProject(project)' to 'ProjectState' and its implementation: a no-op if a project with the given id already exists, otherwise inserts it. Used to seed the 'Recordings' knowledge-base project on first run.

src/stores/projectStore.ts

index.tsAdd transcript range metadata fields to MediaAttachment +7/-0

Add transcript range metadata fields to MediaAttachment

• Extends 'MediaAttachment' with 'recordingId?', 'transcriptStartMs?', and 'transcriptEndMs?' so a transcript attachment can carry its source recording identity and the time range it covers, enabling cite-and-seek from chat back into the audio.

src/types/index.ts

Bug fix (1) +62 / -22

generationToolLoop.tsFix TTFT by separating stable date from volatile time in prompts +62/-22

Fix TTFT by separating stable date from volatile time in prompts

• Splits 'buildDateTimeContext' into 'buildDateContext' (date-only, injected into the system prompt so the ~800-token prefix stays cacheable across turns) and 'buildExactTimeNote' (full timestamp, appended to the latest user message only when calendar tools are active). Adds diagnostic logging for augmentation decisions.

src/services/generationToolLoop.ts

Other (5) +406 / -1

transcriptSummarizer.tsNew map-reduce transcript summarizer service +215/-0

New map-reduce transcript summarizer service

• Implements a 'TranscriptSummarizerService' that splits arbitrarily large text into context-sized chunks, summarizes each (map), then recursively combines partial summaries (reduce) until a single summary fits. Emits typed 'SummarizeProgress' events (chunking, mapping, reducing, combining, done, error) and accepts an 'onToken' callback for live streaming. Caps map chunk size at ~1500 tokens to minimize TTFT on CPU-only devices and forces thinking off via dedicated system prompts.

src/services/transcriptSummarizer.ts

useSummarizeAttachment.tsNew hook to stream summarization progress into a chat message +124/-0

New hook to stream summarization progress into a chat message

• Provides 'useSummarizeAttachment()' which posts a user message and a placeholder assistant message, then streams map-phase parts and the final combine pass into that message via a 50ms flush timer. Handles conversation creation if none is active, formats transcript time ranges (mm:ss), and clears the attachment after completion.

src/components/ChatInput/useSummarizeAttachment.ts

chatAttachmentInbox.tsNew module-level inbox for cross-navigation attachment hand-off +27/-0

New module-level inbox for cross-navigation attachment hand-off

• Provides 'setPendingChatAttachments' / 'takePendingChatAttachments' as a one-shot in-memory store so the Pro recorder can hand a transcript attachment to the chat composer across a navigation boundary without serializing large text through route params and without core importing Pro code.

src/services/chatAttachmentInbox.ts

index.tsAdd ragService.indexText() for in-memory chunk indexing +37/-1

Add ragService.indexText() for in-memory chunk indexing

• Adds 'indexText({ projectId, docName, docPath, chunks, fileSize? })' to 'RagService', which inserts pre-built chunks (with metadata) directly without reading a file, then generates and stores embeddings. Intended for recording transcripts handed in as structured chunks.

src/services/rag/index.ts

index.tsExport transcriptSummarizer and chatAttachmentInbox from services barrel +3/-0

Export transcriptSummarizer and chatAttachmentInbox from services barrel
• Adds public exports for 'transcriptSummarizer', 'SummarizeProgress', 'setPendingChatAttachments', and 'takePendingChatAttachments' so consumers can import them from the top-level services index.
src/services/index.ts

qodo-code-review · 2026-06-30T03:07:54Z

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0) 📜 Skill insights (0)

1. Attachment removed on failure 🐞 Bug ≡ Correctness

Description

ChatInput always calls removeAttachment() after awaiting handleSummarize(), but handleSummarize()
can return early (busy/no text/no model) and also swallows errors, so attachments can be discarded
even when no summary was produced. This causes silent data loss and prevents retrying summarization.

Code

src/components/ChatInput/index.tsx[R107-111]

+  const { summarizingId, handleSummarize } = useSummarizeAttachment();
+  const onSummarizeAttachment = async (attachment: MediaAttachment) => {
+    await handleSummarize(attachment);
+    removeAttachment(attachment.id);
+  };

Evidence
The new onSummarizeAttachment wrapper always removes the attachment after awaiting
handleSummarize(), while handleSummarize() contains multiple early-return paths and catches errors
without throwing, so the caller cannot distinguish success from no-op/failure.
src/components/ChatInput/index.tsx[105-111]
src/components/ChatInput/useSummarizeAttachment.ts[29-41]
src/components/ChatInput/useSummarizeAttachment.ts[78-120]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`ChatInput` removes an attachment unconditionally after `handleSummarize()`, but `handleSummarize()` can no-op (early return) or fail (caught internally). This can delete the user’s attachment without producing a summary.

## Issue Context
- `onSummarizeAttachment` always calls `removeAttachment(attachment.id)` after `await handleSummarize(attachment)`.
- `handleSummarize` returns early when already summarizing, when `textContent` is empty, or when there is no model/conversation; it also catches errors and does not rethrow.

## Fix Focus Areas
- src/components/ChatInput/index.tsx[107-111]
- src/components/ChatInput/useSummarizeAttachment.ts[29-41]
- src/components/ChatInput/useSummarizeAttachment.ts[78-120]

## Suggested fix
- Change `handleSummarize` to return a status (e.g., `Promise<boolean>`), where `true` means a summary was successfully generated and posted.
 - Return `false` on early exits (busy/no text/no model).
 - Either rethrow on failure or return `false` on failure.
- In `onSummarizeAttachment`, only remove the attachment when `handleSummarize` indicates success.
- Optionally surface an alert/toast when summarization can’t run (e.g., no model loaded).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. Summarize double-tap race 🐞 Bug ☼ Reliability

Description

useSummarizeAttachment uses React state (summarizingId) as its only mutual-exclusion guard, but
setSummarizingId() is not synchronous, so rapid taps can enter handleSummarize() multiple times
before the state updates. This can trigger concurrent generation attempts (and downstream
“Generation already in progress” failures) and interacts badly with unconditional attachment
removal.

Code

src/components/ChatInput/useSummarizeAttachment.ts[R29-52]

+  const handleSummarize = async (attachment: MediaAttachment): Promise<void> => {
+    if (summarizingId) return;
+    const text = attachment.textContent?.trim();
+    if (!text) return;
+
+    const chat = useChatStore.getState();
+    let conversationId = chat.activeConversationId;
+    if (!conversationId) {
+      const modelId = useAppStore.getState().activeModelId;
+      if (!modelId) return; // no model loaded - nothing to summarize with
+      conversationId = chat.createConversation(modelId);
+      chat.setActiveConversation(conversationId);
+    }
+
+    const label = attachment.fileName || 'transcript';
+    const range =
+      attachment.transcriptStartMs != null && attachment.transcriptEndMs != null
+        ? ` (${fmtClock(attachment.transcriptStartMs)} to ${fmtClock(attachment.transcriptEndMs)})`
+        : '';
+    chat.addMessage(conversationId, { role: 'user', content: `Summarize ${label}${range}` });
+    const placeholder = chat.addMessage(conversationId, { role: 'assistant', content: 'Starting...' });
+
+    setSummarizingId(attachment.id);
+    // Stream the work in place. The map phase streams each part as it is written

Evidence
The function checks summarizingId then only later calls setSummarizingId, creating a window
where a second invocation can proceed before React applies the state update.
src/components/ChatInput/useSummarizeAttachment.ts[27-52]
src/components/ChatInput/Attachments.tsx[159-206]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`handleSummarize` guards concurrent runs using `summarizingId` in React state, but state updates are not applied synchronously. A user can trigger multiple summarizations via rapid taps before the UI disables the button.

## Issue Context
- The guard `if (summarizingId) return;` is evaluated before `setSummarizingId(attachment.id)` takes effect.
- The summarize button’s busy UI is driven by `summarizingId`, so it also won’t disable instantly.

## Fix Focus Areas
- src/components/ChatInput/useSummarizeAttachment.ts[29-52]
- src/components/ChatInput/Attachments.tsx[159-206]

## Suggested fix
- Add a synchronous lock using `useRef` (e.g., `inFlightRef.current`) set immediately at the start of `handleSummarize` and cleared in `finally`.
- Consider disabling *all* summarize buttons when any summarize is in-flight (e.g., pass a boolean like `isSummarizingAny = summarizingId != null` to `AttachmentPreview` and gate the onPress).
- Optionally leverage `transcriptSummarizer.isSummarizing` as an additional guard to prevent concurrent runs across components.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

qodo-code-review · 2026-06-30T03:07:55Z

+  const { summarizingId, handleSummarize } = useSummarizeAttachment();
+  const onSummarizeAttachment = async (attachment: MediaAttachment) => {
+    await handleSummarize(attachment);
+    removeAttachment(attachment.id);
+  };


1. Attachment removed on failure 🐞 Bug ≡ Correctness

ChatInput always calls removeAttachment() after awaiting handleSummarize(), but handleSummarize() can return early (busy/no text/no model) and also swallows errors, so attachments can be discarded even when no summary was produced. This causes silent data loss and prevents retrying summarization.

Agent Prompt

## Issue description `ChatInput` removes an attachment unconditionally after `handleSummarize()`, but `handleSummarize()` can no-op (early return) or fail (caught internally). This can delete the user’s attachment without producing a summary. ## Issue Context - `onSummarizeAttachment` always calls `removeAttachment(attachment.id)` after `await handleSummarize(attachment)`. - `handleSummarize` returns early when already summarizing, when `textContent` is empty, or when there is no model/conversation; it also catches errors and does not rethrow. ## Fix Focus Areas - src/components/ChatInput/index.tsx[107-111] - src/components/ChatInput/useSummarizeAttachment.ts[29-41] - src/components/ChatInput/useSummarizeAttachment.ts[78-120] ## Suggested fix - Change `handleSummarize` to return a status (e.g., `Promise<boolean>`), where `true` means a summary was successfully generated and posted. - Return `false` on early exits (busy/no text/no model). - Either rethrow on failure or return `false` on failure. - In `onSummarizeAttachment`, only remove the attachment when `handleSummarize` indicates success. - Optionally surface an alert/toast when summarization can’t run (e.g., no model loaded).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

qodo-code-review · 2026-06-30T03:08:05Z

CI Feedback 🧐

A test triggered by this PR failed. Here is an AI-generated analysis of the failure:

Action: test
Failed stage: Run Jest tests [❌]
Failed test name: runToolLoop — precise date/time context for calendar tools › augments the system prompt with a precise timestamp when a calendar tool is enabled
Failure summary: The action failed because the Jest test run exited with code `1` due to failing tests (not because of the Node deprecation warnings). Failing tests reported in the log: - `tests/unit/services/generationToolLoop.branches.test.ts`: - Test `runToolLoop — precise date/time context for calendar tools › augments the system prompt with` `a precise timestamp when a calendar tool is enabled` failed because the generated system prompt did not include the expected substring `current date and time is`. - The received prompt only contained a date-only string: `The current date is 2026-06-30 ...` (no time component). - `tests/unit/services/rag/database.test.ts:89:34`: - An assertion failed at line `89`, where `expect(chunkInserts[0][1]).toEqual(['chunk one', 42, 0])` did not match the actual value (the diff shows an unexpected extra `null` element appended in the received array), indicating the code under test is inserting/returning an extra `null` field in the chunk insert parameters. Notes: - The messages about Node 20 deprecation are warnings; the workflow was already failing due to the Jest failures above, which caused the final `Process completed with exit code 1`.
Relevant error logs: 1: ##[group]Runner Image Provisioner 2: Hosted Compute Agent ... 178: Attempting to download 20... 179: (node:13238) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead. 180: (Use `node --trace-deprecation ...` to show where the warning was created) 181: Acquiring 20.20.2 - arm64 from https://github.com/actions/node-versions/releases/download/20.20.2-23521894959/node-20.20.2-darwin-arm64.tar.gz 182: Extracting ... 183: [command]/usr/bin/tar xz --strip 1 -C /Users/runner/work/_temp/58f609e1-95d1-4b19-8515-7b1fec1a6896 -f /Users/runner/work/_temp/bfdfb627-1499-4b34-9d02-6b9c5d5644ac 184: Adding to the cache ... 185: ##[group]Environment details 186: node: v20.20.2 187: npm: 10.8.2 188: yarn: 1.22.22 189: ##[endgroup] 190: [command]/Users/runner/hostedtoolcache/node/20.20.2/arm64/bin/npm config get cache 191: /Users/runner/.npm 192: Cache hit for: node-cache-macOS-arm64-npm-90232c26f19cbee90d7152529f9d769d89f5c0aa61b3e44733d05d5293ed25be 193: (node:13238) [DEP0169] DeprecationWarning: `url.parse()` behavior is not standardized and prone to errors that have security implications. Use the WHATWG URL API instead. CVEs are not issued for `url.parse()` vulnerabilities. 194: Received 41943040 of 198510236 (21.1%), 39.9 MBs/sec ... 271: env: 272: JAVA_HOME: /Users/runner/hostedtoolcache/Java_Temurin-Hotspot_jdk/17.0.19-10/arm64/Contents/Home 273: JAVA_HOME_17_ARM64: /Users/runner/hostedtoolcache/Java_Temurin-Hotspot_jdk/17.0.19-10/arm64/Contents/Home 274: ##[endgroup] 275: PASS __tests__/unit/services/modelManager.test.ts (13.006 s) 276: ModelManager 277: initialize 278: ✓ creates models directories when they do not exist (5 ms) 279: ✓ does not create dirs when they already exist (1 ms) 280: ✓ excludes model directories from iCloud backup on initialize (1 ms) 281: getDownloadedModels 282: ✓ returns empty array when nothing stored (1 ms) 283: ✓ returns stored models that exist on disk 284: ✓ filters out models whose files no longer exist 285: ✓ updates storage when invalid entries are removed (1 ms) 286: ✓ returns empty array on parse error (1 ms) 287: deleteModel ... 298: ✓ returns free space from RNFS 299: getOrphanedFiles 300: ✓ finds untracked GGUF files (1 ms) 301: ✓ excludes tracked files 302: ✓ returns empty array when directory is empty (1 ms) 303: ✓ finds orphaned image model directories 304: determineCredibility 305: ✓ recognizes lmstudio-community source 306: ✓ recognizes official model authors 307: ✓ recognizes verified quantizers (1 ms) 308: ✓ defaults to community for unknown authors 309: downloadModelBackground 310: ✓ throws when not supported (17 ms) 311: ✓ skips download when files already exist (4 ms) 312: ✓ starts background download for main model (1 ms) 313: ✓ sets up progress listener during start and complete/error via watchDownload (2 ms) 314: ✓ calls metadata callback with download info (1 ms) 315: ✓ downloads mmproj in parallel via startDownload when present (1 ms) 316: ✓ uses file.downloadUrl when set (cross-repo curated entries) (1 ms) 317: resetMmProjForRetry 318: ✓ restores mmproj completion flags and local path for retried sidecars 319: ✓ leaves entries without mmproj download untouched 320: syncBackgroundDownloads 321: ✓ returns empty when not supported 322: ✓ processes completed downloads 323: ✓ clears failed downloads 324: ✓ skips downloads with no metadata (6 ms) ... 344: ✓ detects clip .gguf filenames 345: ✓ rejects non-mmproj filenames 346: ✓ is case-insensitive 347: cleanupMMProjEntries 348: ✓ removes mmproj entries from models list (1 ms) 349: ✓ handles empty model list 350: ✓ links orphaned mmproj files to matching vision models 351: ✓ returns count of removed entries (1 ms) 352: importLocalModel 353: ✓ imports valid .gguf file successfully (1 ms) 354: ✓ rejects non-.gguf files (13 ms) 355: ✓ rejects when destination already exists (1 ms) 356: ✓ parses quantization from filename (1 ms) 357: ✓ sets quantization to Unknown when not parseable 358: ✓ adds imported model to storage (1 ms) 359: ✓ handles copy failure gracefully 360: ✓ reports progress during copy 361: refreshModelLists 362: ✓ calls both scan functions and returns combined results 363: ✓ returns existing models even when scan finds nothing new 364: saveModelWithMmproj 365: ✓ updates model with mmproj info and persists (1 ms) 366: ✓ derives mmProjFileSize from RNFS.stat 367: deleteOrphanedFile when file does not exist 368: ✓ handles missing file gracefully (1 ms) 369: cancelBackgroundDownload when not supported 370: ✓ throws when background service is unavailable 371: scanForUntrackedTextModels tiny files 372: ✓ skips files smaller than 1MB 373: getOrphanedFiles with directory read error 374: ✓ returns empty when image model dir read fails 375: deleteModel mmProjPath catch branch 376: ✓ continues when mmProjPath deletion fails (1 ms) 377: getDownloadedModels path re-resolution 378: ✓ re-resolves text model path when original path not found (5 ms) 379: ✓ re-resolves mmProjPath when original path not found 380: getDownloadedImageModels path re-resolution 381: ✓ re-resolves image model path when original not found (1 ms) 382: getOrphanedFiles image model isFile branch 383: ✓ uses file size directly for orphaned image model files (5 ms) 384: scanForUntrackedImageModels coreml backend detection 385: ✓ detects coreml backend from directory name 386: ✓ skips empty directories (1 ms) 387: scanForUntrackedImageModels readDir error 388: ✓ skips directory when readDir fails 389: scanForUntrackedImageModels skips non-directories 390: ✓ skips files in image models directory 391: downloadModelBackground complete handler 392: ✓ processes completed background download with mmproj (9 ms) 393: downloadModelBackground error handler 394: ✓ calls onError when background download fails (5 ms) 395: repairMmProj ... 413: ✓ returns model path when found 414: ✓ returns null when model not found 415: getImageModelsStorageUsed 416: ✓ returns total storage used by image models 417: ✓ returns 0 when no image models 418: addDownloadedImageModel 419: ✓ adds new image model to registry 420: ✓ replaces existing image model with same ID 421: scanForUntrackedTextModels edge cases 422: ✓ returns empty when directory does not exist 423: ✓ discovers untracked GGUF files (3 ms) 424: ✓ skips mmproj files (1 ms) 425: ✓ skips tiny files 426: ✓ skips already registered models 427: ✓ handles string file sizes (1 ms) 428: ✓ catches errors during scan 429: scanForUntrackedImageModels edge cases 430: ✓ returns empty when directory does not exist 431: ✓ discovers untracked image model directories 432: ✓ detects qnn backend from directory name (1 ms) 433: ✓ detects coreml backend from directory name 434: ✓ skips directories with 0 size 435: ✓ skips already registered model directories 436: ✓ handles string file sizes in model directory (1 ms) 437: importLocalModel additional branches 438: ✓ replaces existing model with same ID in registry 439: deleteOrphanedFile 440: ✓ deletes file that exists (3 ms) 441: ✓ does nothing when file does not exist 442: ✓ throws when deletion fails (1 ms) 443: getDownloadedImageModels 444: ✓ returns empty array when no stored data 445: ✓ filters out models whose files no longer exist 446: setBackgroundDownloadMetadataCallback 447: ✓ stores the callback 448: importLocalModel — Android content:// URI handling 449: ✓ copies content:// URI directly to models dir on Android (no temp cache) (4 ms) 450: copyFileWithProgress — poll interval callback 451: ✓ fires progress callback via setInterval poll during copy (1 ms) 452: buildDownloadedModel 453: ✓ sets mmProjFileName when mmproj file exists 454: ✓ sets mmProjFileName from expectedMmProjFileName when mmproj download failed 455: ✓ omits mmProjFileName when model has no vision support ... 461: ✓ re-unzips from valid zip when _zip_name present and zip valid 462: ✓ deletes partial dir when _zip_name present but zip is missing 463: ✓ deletes stale dir when neither _ready nor _zip_name exist 464: ✓ resolves CoreML model path via resolveCoreMLModelDir (3 ms) 465: importLocalModel — LiteRT branches 466: ✓ imports a .litertlm file with engine=litert and liteRTVision=false (1 ms) 467: ✓ imports a .litertlm file with liteRTVision=true 468: ✓ omits engine and liteRTVision when not provided 469: PASS __tests__/unit/services/llm.test.ts 470: LLMService 471: loadModel 472: ✓ calls initLlama with correct parameters (6 ms) 473: ✓ throws when model file not found (56 ms) 474: ✓ skips loading if same model already loaded (2 ms) 475: ✓ unloads existing model before loading different one (2 ms) 476: ✓ falls back to CPU when GPU init fails (2 ms) 477: ✓ falls back to smaller context when CPU also fails 478: ✓ warns when mmproj file not found but continues (2 ms) 479: ✓ initializes multimodal when mmproj path provided and exists (1 ms) 480: ✓ reads settings from appStore (1 ms) 481: ✓ uses llama.rn jinja support to detect thinking support (1 ms) 482: ✓ uses flashAttn=true from store and sets q8_0 KV cache (4 ms) 483: ✓ uses flashAttn=false from store and sets f16 KV cache when cacheType is f16 (1 ms) 484: ✓ falls back to platform default when flashAttn is undefined (iOS → flash attn ON) (1 ms) 485: ✓ captures GPU status from context (1 ms) 486: ✓ resets state on final error (1 ms) 487: initializeMultimodal 488: ✓ returns false when no context (1 ms) 489: ✓ calls context.initMultimodal with correct path (1 ms) 490: ✓ sets vision support on success (1 ms) 491: ✓ returns false on initMultimodal failure 492: ✓ handles exception gracefully (1 ms) 493: unloadModel 494: ✓ releases context and resets state 495: ✓ is safe when no model loaded (1 ms) 496: generateResponse 497: ✓ throws when no model loaded (18 ms) 498: ✓ throws when generation already in progress (1 ms) 499: ✓ streams tokens via onStream callback (1 ms) 500: ✓ returns full response and calls onComplete (1 ms) 501: ✓ updates performance stats (1 ms) 502: ✓ resets isGenerating on error (8 ms) 503: ✓ uses messages format for text-only path (2 ms) ... 537: ✓ formats user message with ChatML tags (1 ms) 538: ✓ formats assistant message with ChatML tags (1 ms) 539: ✓ ends with assistant prefix for generation (1 ms) 540: ✓ preserves message order (2 ms) 541: convertToOAIMessages 542: ✓ converts text-only message to simple format 543: ✓ converts message with images to multipart format (1 ms) 544: ✓ adds file:// prefix to local image URIs (2 ms) 545: ✓ preserves file:// prefix when already present 546: ✓ handles multiple images in one message 547: ✓ does not convert assistant messages with images (1 ms) 548: context window tokenize fallback 549: ✓ uses char/4 estimation when tokenize throws (2 ms) 550: reloadWithSettings 551: ✓ unloads existing model and reloads with new settings (5 ms) 552: ✓ resets state on reload failure when all attempts fail (20 ms) 553: hashString 554: ✓ returns consistent hash for same input (3 ms) 555: ✓ returns different hashes for different inputs (2 ms) 556: getModelInfo 557: ✓ returns null without model loaded (1 ms) 558: ✓ returns info when model loaded (2 ms) 559: vision support helpers 560: ✓ supportsVision returns false when no model loaded (1 ms) 561: ✓ getMultimodalSupport returns null when no model loaded (1 ms) 562: stopGeneration error branch 563: ✓ handles stopCompletion error gracefully (3 ms) 564: clearKVCache error branch 565: ✓ handles clearCache error gracefully (1 ms) 566: ensureSessionCacheDir branches 567: ✓ creates dir when it does not exist (1 ms) 568: getGpuInfo Android branches 569: ✓ returns OpenCL when OpenCL backend selected on Android with no devices (3 ms) 570: ✓ returns device names when OpenCL backend selected on Android with devices (4 ms) 571: getTokenCount 572: ✓ returns token count for text (2 ms) 573: ✓ returns 0 when tokens is undefined (4 ms) 574: ✓ throws when no model loaded (4 ms) 575: convertToOAIMessages empty content branch 576: ✓ skips text part when message content is empty (1 ms) 577: checkMultimodalSupport branches 578: ✓ returns false when no context (1 ms) 579: ✓ returns support from getMultimodalSupport when available (2 ms) 580: ✓ handles getMultimodalSupport not being a function (2 ms) 581: ✓ handles getMultimodalSupport throwing error (4 ms) 582: loadModel metadata branches 583: ✓ reads model metadata and logs context length warning (4 ms) 584: ✓ handles metadata without context_length (1 ms) 585: ✓ handles null model metadata (1 ms) 586: reloadWithSettings flash attention 587: ✓ passes flashAttn=true from store to reloadWithSettings (1 ms) 588: ✓ passes flashAttn=false and cacheType=f16 from store to reloadWithSettings (1 ms) 589: ✓ falls back to platform default in reloadWithSettings when flashAttn is undefined (iOS → ON) (1 ms) 590: reloadWithSettings GPU fallback 591: ✓ falls back to CPU when GPU reload fails (1 ms) 592: loadModel without mmproj calls checkMultimodalSupport 593: ✓ calls checkMultimodalSupport when no mmproj provided (1 ms) 594: formatMessages with vision attachments 595: ✓ adds image markers when vision is supported (1 ms) 596: loadModel mmproj file size warning 597: ✓ warns when mmproj file is suspiciously small (1 ms) 598: ✓ does not warn when mmproj file is large enough (1 ms) 599: ✓ handles stat error for mmproj file (3 ms) 600: generateResponse with vision mode ... 603: generateResponse uses store settings 604: ✓ applies temperature from settings (1 ms) 605: getContextDebugInfo 606: ✓ returns debug info about context usage (2 ms) 607: ✓ shows truncation info when messages are truncated (1 ms) 608: ✓ uses char/4 estimation when tokenize throws in debug info 609: reloadWithSettings with GPU disabled 610: ✓ skips GPU attempt when GPU is disabled (2 ms) 611: performance stats 612: ✓ returns zero stats before any generation (1 ms) 613: ✓ returns a copy of settings (not reference) 614: ✓ returns a copy of stats (not reference) (3 ms) 615: initializeMultimodal GPU usage based on device 616: ✓ disables GPU for CLIP on iOS simulator (1 ms) 617: ✓ enables GPU for CLIP on real iOS device 618: loadModel error message wrapping 619: ✓ wraps error with custom message (2 ms) 620: ✓ handles error without message property 621: unloadModel resets all state 622: ✓ resets GPU info after unload (1 ms) 623: getOptimalThreadCount and getOptimalBatchSize fallbacks 624: ✓ uses getOptimalThreadCount when nThreads is 0 (2 ms) 625: ✓ uses getOptimalBatchSize when nBatch is 0 626: ensureSessionCacheDir 627: ✓ creates directory when it does not exist (1 ms) 628: ✓ skips mkdir when directory already exists 629: ✓ catches and logs errors without throwing (1 ms) 630: getSessionPath ... 727: project management 728: ✓ shows project hint in empty chat state (15 ms) 729: ✓ shows "Default" when no project assigned (12 ms) 730: ✓ shows project name in settings modal when project is assigned (21 ms) 731: ✓ opens project selector from settings modal (28 ms) 732: ✓ assigns project to conversation when selected (78 ms) 733: ✓ clears project when Default is selected (47 ms) 734: image generation progress 735: ✓ shows image generation progress indicator when generating (9 ms) 736: ✓ shows "Refining Image" when preview is available (14 ms) 737: ✓ does not show progress indicator when not generating (13 ms) 738: model selector modal 739: ✓ opens model selector from header via the manager sheet (17 ms) 740: ✓ closes model selector when close is pressed (23 ms) 741: ✓ handles model selection with memory check (107 ms) 742: ✓ shows alert when memory check fails (42 ms) 743: ✓ shows warning alert with Load Anyway option for low memory (28 ms) 744: ✓ handles unload model from selector without crash (36 ms) 745: settings modal 746: ✓ opens settings modal from header icon (26 ms) 747: ✓ closes settings modal (31 ms) 748: ✓ does not show delete button when no active conversation (1 ms) 749: ✓ shows gallery button when conversation has images (30 ms) 750: conversation with images 751: ✓ counts images in conversation messages (36 ms) 752: error handling 753: ✓ shows alert when no model is selected and trying to send (7 ms) ... 770: scroll handling 771: ✓ renders FlatList with scroll handler when messages exist (16 ms) 772: model loading state 773: ✓ shows loading indicator when model is loading (via internal state) (10 ms) 774: queue management 775: ✓ registers queue processor on mount (12 ms) 776: ✓ clears queue processor on unmount (14 ms) 777: image generation routing 778: ✓ routes to image generation in force mode (49 ms) 779: ✓ routes to text when image generation is already in progress (19 ms) 780: classifying intent 781: ✓ message is added to conversation when sent in auto mode with image model (46 ms) 782: ✓ sends message in manual mode without force image (30 ms) 783: ✓ does not route to image when no image model is active (35 ms) 784: copy message 785: ✓ handles copy message action without error (19 ms) 786: keyboard handling ... 794: system messages with showGenerationDetails 795: ✓ skips system message when showGenerationDetails is false (22 ms) 796: handleModelSelect early return 797: ✓ closes selector when selecting already-loaded model (47 ms) 798: handleModelSelect memory check 799: ✓ shows insufficient memory alert when canLoad is false (49 ms) 800: ✓ shows warning with Load Anyway option when severity is warning (64 ms) 801: proceedWithModelLoad 802: ✓ loads model and creates conversation when none exists (553 ms) 803: handleUnloadModel during streaming 804: ✓ unloads model via selector (554 ms) 805: shouldRouteToImageGeneration manual mode 806: ✓ generates image when forceImageMode=true in manual mode (568 ms) 807: LLM intent classification 808: ✓ classifies intent with LLM method and routes to image (556 ms) 809: ✓ falls back to text when intent classification fails (51 ms) 810: document attachment handling 811: ✓ appends document content to message text (52 ms) 812: image requested but no model 813: ✓ prepends note when image requested but no image model loaded (38 ms) 814: model reload during generation 815: ✓ shows error when model fails to load during generation (321 ms) 816: context debug and cache clearing 817: ✓ clears cache when context usage is high (154 ms) 818: delete conversation while streaming 819: ✓ shows delete confirmation and deletes conversation (60 ms) 820: regenerateResponse with image routing 821: ✓ regenerates as image when intent is image (583 ms) 822: handleSend without model 823: ✓ shows alert when no active conversation and no model (8 ms) 824: generation error handling 825: ✓ shows alert when generation service throws (81 ms) 826: gallery navigation 827: ✓ navigates to Gallery from settings when images exist (35 ms) 828: animation tracking 829: ✓ tracks new message animations (34 ms) 830: model loading screen 831: ✓ does not show the loading bar on chat open (load deferred to send) (517 ms) 832: ensureModelLoaded memory check 833: ✓ does not run the memory check or alert on chat open (load deferred to send) (326 ms) 834: image generation failure 835: ✓ shows error alert when image generation fails (51 ms) 836: settings from input 837: ✓ opens settings panel from input button (37 ms) 838: handleImageGeneration without model 839: ✓ shows error when no image model is active (51 ms) 840: project hint 841: ✓ shows project initial in empty chat (59 ms) 842: save image error 843: ✓ handles save image failure gracefully (73 ms) 844: generation ref cleared on conversation switch ... 847: ✓ preloads classifier model when conditions are met (performance mode + LLM + no model loaded) (22 ms) 848: ✓ does not preload classifier when model is already loaded (39 ms) 849: handleScroll shows scroll-to-bottom button 850: ✓ shows scroll-to-bottom button when user is far from bottom (90 ms) 851: addSystemMessage after model load with showGenerationDetails 852: ✓ does not load the model on chat open when showGenerationDetails is true (load deferred to send) (116 ms) 853: Load Anyway button in memory warning alert 854: ✓ pressing Load Anyway dismisses alert and proceeds with model load (571 ms) 855: proceedWithModelLoad with no active conversation 856: ✓ does not create a conversation when model loads and no conversation exists (561 ms) 857: handleUnloadModel while streaming 858: ✓ stops generation before unloading when streaming is active (258 ms) 859: ✓ exercises showGenerationDetails branch when unloading model (596 ms) 860: shouldRouteToImageGeneration LLM path with text result 861: ✓ clears image generation status when LLM classifies as text (261 ms) 862: handleImageGeneration shows error when no image model 863: ✓ shows error alert from handleGenerateImageFromMessage when no image model (38 ms) 864: handleSend alert when conversation exists but model missing 865: ✓ shows No Model Selected alert when conversation exists but activeModel is null (4 ms) 866: startGeneration fails when model cannot load 867: ✓ exercises startGeneration path when model reload fails (890 ms) 868: getContextDebugInfo error is silently caught 869: ✓ continues generation even when context debug info throws (581 ms) 870: generateResponse error shows alert 871: ✓ shows Generation Error alert when generateResponse throws (614 ms) 872: handleDeleteConversation while streaming 873: ✓ stops generation before deleting conversation while streaming (48 ms) 874: image generation failed alert shown 875: ✓ exercises image generation failure path (line 625-626) (82 ms) 876: clear queue button 877: ✓ calls generationService.clearQueue when clear queue button is pressed (33 ms) 878: project hint tap opens selector 879: ✓ opens project selector when tapping project hint in empty chat (27 ms) 880: image viewer backdrop tap closes viewer 881: ✓ closes image viewer when backdrop is tapped (51 ms) 882: gallery navigation from settings modal 883: ✓ navigates to Gallery when open gallery button is pressed (51 ms) 884: model loading screen vision hint 885: ✓ does not load a vision model on chat open (load deferred to send) (116 ms) 886: ensureModelLoaded already correctly loaded 887: ✓ sets vision support from current loaded model without reloading (120 ms) 888: proceedWithModelLoad error handling 889: ✓ shows error alert when proceedWithModelLoad fails (559 ms) 890: handleUnloadModel error handling 891: ✓ shows error alert when unload fails (339 ms) 892: vision support useEffect ... 895: ✓ shows model selector modal from no-model screen (8 ms) 896: proceedWithModelLoad with showGenerationDetails and existing conversation 897: ✓ adds system message after model load when showGenerationDetails is enabled (643 ms) 898: pending settings warning 899: ✓ shows warning when settings have changed but model not reloaded (78 ms) 900: ✓ does not show warning when settings match loaded settings (28 ms) 901: ✓ does not show warning when no model is loaded (9 ms) 902: PASS __tests__/integration/models/activeModelService.test.ts 903: ActiveModelService Integration 904: Text Model Loading 905: ✓ should load text model via llmService and update store (3 ms) 906: ✓ should save loadedSettings when model is loaded 907: ✓ should save loadedSettings with flash attention enabled (1 ms) 908: ✓ should skip loading if model already loaded 909: ✓ should unload previous model when loading different model (1 ms) 910: ✓ should throw error if model not found (11 ms) 911: ✓ should notify listeners during loading state changes (3 ms) ... 931: ✓ should sync internal state with native module state 932: ✓ should clear internal state if native reports no model loaded 933: Performance Stats 934: ✓ should proxy performance stats from llmService (1 ms) 935: Active Models Info 936: ✓ should return correct info about the loaded model (1 ms) 937: ✓ should report no models when none loaded (1 ms) 938: Has Any Model Loaded 939: ✓ should return true when text model loaded 940: ✓ should return true when image model loaded 941: ✓ should return false when no models loaded 942: Concurrent Load Prevention 943: ✓ should wait for pending load to complete before starting new load (3 ms) 944: unloadImageModel when no model loaded 945: ✓ should skip unload when all sources say no model 946: unloadAllModels error handling 947: ✓ should continue unloading image model when text unload fails (1 ms) 948: getResourceUsage 949: ✓ returns memory usage information (1 ms) 950: checkMemoryForModel with image type 951: ✓ checks memory for image model with correct overhead 952: checkMemoryForDualModel with null IDs 953: ✓ handles null text model ID 954: ✓ handles null image model ID 955: clearTextModelCache 956: ✓ delegates to llmService.clearKVCache (1 ms) 957: loadTextModel timeout 958: ✓ should throw timeout error when loading takes too long (65 ms) 959: loadTextModel with vision model mmproj detection 960: ✓ should detect mmproj file for vision model (2 ms) 961: loadTextModel error resets state 962: ✓ should clear loadedTextModelId on load failure (23 ms) 963: loadImageModel error resets state 964: ✓ should clear loadedImageModelId on load failure (3 ms) 965: loadImageModel not found ... 973: ✓ counts text model memory when checking image model 974: checkMemoryForModel critical with other models message 975: ✓ includes other models in critical message (1 ms) 976: checkMemoryForDualModel warning and critical paths 977: ✓ returns warning when dual model exceeds 50% RAM 978: ✓ returns critical when dual models exceed budget 979: syncWithNativeState with image model 980: ✓ syncs image model internal state from store (1 ms) 981: ✓ clears image model internal state when native reports not loaded 982: unloadTextModel with store but no native 983: ✓ clears store even when native is not loaded 984: unloadImageModel with store but no native 985: ✓ clears store even when native is not loaded (1 ms) 986: loadTextModel vision model no mmproj found 987: ✓ logs warning when no mmproj file found in directory 988: loadTextModel vision model mmproj search failure 989: ✓ catches error when readDir fails (1 ms) 990: loadTextModel mmproj found updates store with multiple models 991: ✓ only updates the matching model in store 992: unloadTextModel waits for pending load 993: ✓ waits for pending textLoadPromise before unloading (3 ms) 994: unloadImageModel waits for pending load 995: ✓ waits for pending imageLoadPromise before unloading (5 ms) 996: loadImageModel already loaded but needs thread reload 997: ✓ reloads when imageThreads changed (1 ms) 998: loadImageModel concurrent load - different model 999: ✓ loads new model after pending load for different model completes (2 ms) 1000: unloadAllModels error handling - image unload fails 1001: ✓ handles image unload error gracefully (1 ms) 1002: loadImageModel with coreml backend ... 1047: ✓ returns true for active conversation during generation (3 ms) 1048: ✓ returns false for different conversation during generation 1049: subscribe 1050: ✓ immediately calls listener with current state (1 ms) 1051: ✓ returns unsubscribe function (2 ms) 1052: ✓ unsubscribe removes listener (1 ms) 1053: ✓ multiple listeners receive updates 1054: generateResponse 1055: ✓ throws when no model loaded (60 ms) 1056: ✓ returns immediately when already generating (3 ms) 1057: ✓ sets isThinking true initially (5 ms) 1058: ✓ calls chatStore.startStreaming (1 ms) 1059: ✓ accumulates streaming tokens (5 ms) 1060: ✓ calls onFirstToken callback on first token (3 ms) 1061: ✓ finalizes message on completion (1 ms) 1062: ✓ handles generation error (24 ms) 1063: ✓ throws error on generation failure (1 ms) 1064: stopGeneration 1065: ✓ always attempts to stop native generation (1 ms) 1066: ✓ returns empty string when not generating (1 ms) 1067: ✓ saves partial content when stopped (55 ms) 1068: ✓ clears streaming message when no content (3 ms) 1069: ✓ resets state after stopping (55 ms) 1070: ✓ handles stopGeneration error gracefully 1071: queue management ... 1075: ✓ clearQueue removes all items 1076: ✓ notifies listeners on queue changes 1077: queue processor 1078: ✓ setQueueProcessor registers callback (1 ms) 1079: ✓ setQueueProcessor with null clears callback (1 ms) 1080: ✓ processNextInQueue aggregates multiple messages (13 ms) 1081: ✓ processNextInQueue passes single message directly (13 ms) 1082: ✓ processNextInQueue does nothing without processor (1 ms) 1083: abort handling 1084: ✓ ignores tokens after abort is requested (51 ms) 1085: store integration 1086: ✓ updates chatStore streaming state during generation (1 ms) 1087: ✓ includes generation metadata on finalized message (1 ms) 1088: remote provider 1089: ✓ routes to remote provider when activeServerId is set (2 ms) 1090: ✓ throws error when remote provider is not found (1 ms) 1091: ✓ throws error when remote provider is not ready (1 ms) 1092: ✓ handles remote generation error (6 ms) 1093: ✓ tracks time to first token for remote generation (13 ms) 1094: ✓ stops remote generation on abort (14 ms) 1095: ✓ handles onReasoning callback for remote generation (3 ms) 1096: ✓ uses remote metadata in generation meta (1 ms) 1097: buildGenerationMeta 1098: ✓ includes GPU info for local generation (1 ms) 1099: share prompt check 1100: ✓ does not trigger share prompt if already engaged (1 ms) 1101: reasoning content in local generateResponse 1102: ✓ accumulates reasoning content in reasoningBuffer (1 ms) 1103: error path clears flushTimer 1104: ✓ clearTimeout on flushTimer when generation throws with buffered tokens (2 ms) 1105: generateWithTools — local path via runToolLoop 1106: ✓ runs tool loop and finalizes on success (1 ms) 1107: ✓ calls onStreamReset to flush pending content (1 ms) 1108: ✓ calls onFinalResponse to set streaming content 1109: ✓ throws and clears state on runToolLoop error (2 ms) 1110: ✓ throws and clears flushTimer on error if timer was set (15 ms) 1111: resetState with queued items triggers processNextInQueue 1112: ✓ schedules processNextInQueue when queue is non-empty after reset (3 ms) 1113: checkSharePrompt — triggers share 1114: ✓ calls emitSharePrompt when shouldShowSharePrompt returns true (1 ms) 1115: stopGeneration — edge cases 1116: ✓ clears streaming when there is no content on stop 1117: ✓ aborts remote controller when not generating and controller exists (1 ms) 1118: ✓ returns streamingContent when stopping remote generation 1119: generateWithTools — remote path via generateRemoteWithTools 1120: ✓ routes generateWithTools to generateRemoteWithTools and calls runToolLoop with forceRemote (1 ms) 1121: ✓ throws when remote provider not found in generateRemoteWithTools 1122: ✓ finalizes after remote tool loop when not aborted 1123: generateRemoteResponse — error updates server health 1124: ✓ marks server offline when provider.generate throws (1 ms) ... 1136: normalizeStreamChunk 1137: ✓ wraps string data as content object (1 ms) 1138: ✓ passes through object data unchanged 1139: buildToolLoopHandlers — onStream abort guard 1140: ✓ returns early from onStream when abortRequested is true 1141: ✓ accumulates reasoning content in reasoningBuffer via onStream 1142: isUsingRemoteProvider — local model wins when loaded 1143: ✓ uses local LLM when local model is loaded even if remote server is configured (5 ms) 1144: buildToolLoopHandlers — isAborted and timer flush 1145: ✓ isAborted returns the current abortRequested value (1 ms) 1146: ✓ onStream schedules flushTokenBuffer via setTimeout and fires on advance (2 ms) 1147: generateRemoteWithTools — no provider available 1148: ✓ getCurrentProvider returns local provider fallback when no activeServerId 1149: resetState — flushTimer cleanup 1150: ✓ clears flushTimer in resetState when timer is set 1151: generateRemoteResponse — flushTimer in error paths 1152: ✓ clears flushTimer in catch block when timer was set by onToken (1 ms) 1153: ✓ clears flushTimer in onError callback when timer was set by onToken (1 ms) 1154: ✓ triggers onReasoning flush timer path (1 ms) ... 1160: ✓ returns false when already generating image (3 ms) 1161: ✓ returns forceImageMode===true when mode is manual (2 ms) 1162: ✓ returns true immediately when forceImageMode and imageModelLoaded (1 ms) 1163: ✓ returns false when imageModelLoaded is false (1 ms) 1164: ✓ with no text model, routes a chat request to text (heuristics) (1 ms) 1165: ✓ with no text model, routes an image request to image (heuristics) 1166: ✓ with no text model but a classifier configured, uses the SMOL LLM (1 ms) 1167: ✓ classifies intent via LLM when autoDetectMethod=llm (1 ms) 1168: ✓ resets image status when LLM returns non-image intent (1 ms) 1169: ✓ returns false and resets state when classification throws (2 ms) 1170: handleImageGenerationFn 1171: ✓ shows alert when no image model loaded (1 ms) 1172: ✓ adds user message when skipUserMessage is false (default) 1173: ✓ keeps attachments (e.g. a voice note) on the user message in the image route 1174: ✓ skips user message when skipUserMessage=true (1 ms) 1175: ✓ shows alert when image generation returns null and there is a non-cancel error 1176: ✓ does not show alert when error is "cancelled" 1177: executeDeleteConversationFn ... 1214: ✓ treats an unset gate as allowed (backward compatible) 1215: ✓ regenerate also honours the UI tool gate 1216: RAG context injection in startGenerationFn 1217: ✓ injects doc list and RAG context when conversation has a projectId and search returns chunks (1 ms) 1218: ✓ injects doc list even when BM25 returns no chunks 1219: ✓ does not inject RAG context when conversation has no projectId (1 ms) 1220: ✓ does not inject doc list when all docs are disabled 1221: ✓ continues generation even if RAG search throws 1222: ✓ auto-enables search_knowledge_base tool for project conversations 1223: RAG context injection in regenerateResponseFn 1224: ✓ injects RAG context for project conversations (1 ms) 1225: ✓ skips RAG for non-project conversations 1226: embedding model warmup in injectRagContext 1227: ✓ fires embeddingService.load() when project has enabled docs and model is not loaded 1228: ✓ does not call load() when embedding model is already loaded (1 ms) 1229: ✓ does not block generation if embedding load fails 1230: ✓ does not fire warmup when no enabled docs exist 1231: handleSelectProjectFn 1232: ✓ sets conversation project when activeConversationId is set 1233: ✓ clears project when project is null (1 ms) 1234: ✓ skips setConversationProject when no activeConversationId 1235: handleSendFn — additional branches 1236: ✓ appends document attachment content to message text 1237: ✓ ignores attachments without textContent (1 ms) 1238: ✓ enqueues message when generation is already in progress 1239: ✓ prefixes message when shouldGenerateImage=true but no image model loaded 1240: startGenerationFn — remote model path 1241: ✓ skips local model loading for remote models 1242: ✓ uses all tools when remote server is active (bypasses heuristic) (1 ms) 1243: regenerateResponseFn — model not loaded 1244: ✓ returns early when local model is not loaded 1245: ✓ does not return early for remote models even if local model is not loaded 1246: generateWithCompactionRetry — context full error path 1247: ✓ rethrows non-context-full errors 1248: ✓ retries with compacted messages on context full error 1249: ✓ falls back to recent messages when compact throws (1 ms) ... 1284: ✓ shows alert when toggling without image model loaded (12 ms) 1285: ✓ cycles through auto -> force -> disabled -> auto (26 ms) 1286: ✓ quick settings button is always visible regardless of props (6 ms) 1287: vision capabilities 1288: ✓ shows attach button when supportsVision is true (5 ms) 1289: ✓ shows attach button even when supportsVision is false (5 ms) 1290: ✓ shows alert when pressing photo without vision support (29 ms) 1291: ✓ opens image picker when pressing photo with vision support (16 ms) 1292: ✓ attach button is present when vision is supported (6 ms) 1293: attachments 1294: ✓ shows custom alert when photo is pressed via attach picker (18 ms) 1295: ✓ shows attachment preview after selecting image (351 ms) 1296: ✓ can send message with attachment (355 ms) 1297: ✓ renders attach button always (5 ms) 1298: ✓ opens document picker when document is pressed via attach picker (67 ms) 1299: ✓ shows error alert for unsupported file types (68 ms) 1300: ✓ does nothing when document picker is cancelled (15 ms) 1301: ✓ shows document preview with file icon after picking document (65 ms) 1302: ✓ sends message with document attachment (72 ms) 1303: ✓ shows error alert when processDocumentFromPath fails (69 ms) 1304: ✓ handles processDocumentFromPath returning null (13 ms) ... 1333: ✓ renders and handles stop button when onStop is provided (3 ms) 1334: send with attachment but no text 1335: ✓ shows send button when only attachments are present (339 ms) 1336: disabled does not send with attachment 1337: ✓ does not call onSend when disabled even with attachments (4 ms) 1338: voice recording integration 1339: ✓ starts recording and tracks conversationId (4 ms) 1340: ✓ inserts transcribed text into message when finalResult arrives (10 ms) 1341: ✓ appends transcribed text to existing message (17 ms) 1342: ✓ clears pending transcription when conversation changes (9 ms) 1343: ✓ calls stopRecording and clearResult on cancel recording (5 ms) 1344: image mode toggle alert when no model loaded 1345: ✓ shows alert when toggling image mode without loaded model (21 ms) 1346: camera capture flow 1347: ✓ picks image from camera when Camera option is pressed (45 ms) 1348: ✓ handles camera error gracefully (28 ms) 1349: ✓ handles camera returning no assets (35 ms) 1350: photo library error 1351: ✓ handles photo library error gracefully (43 ms) 1352: document picker error without message 1353: ✓ shows fallback error message when error has no message (69 ms) 1354: voice recording without conversationId ... 1398: ✓ shows import button (52 ms) 1399: ✓ triggers file picker on import press (51 ms) 1400: recommended models 1401: ✓ RECOMMENDED_MODELS has entries (1 ms) 1402: ✓ all recommended models have minRam 1403: ✓ all recommended models have type badges (text/vision/code) (1 ms) 1404: ✓ recommended models have editorial ordering with Gemma 4 first 1405: ✓ MODEL_ORGS contains expected organizations 1406: type filter 1407: ✓ filters by text models (1 ms) 1408: ✓ filters by vision models 1409: ✓ has no code models after removal 1410: multi-file download 1411: ✓ vision model files include mmProjFile 1412: ✓ calculates combined size for vision model files (1 ms) 1413: search error handling 1414: ✓ handles search network error gracefully (46 ms) 1415: text filter bar ... 1471: ✓ shows image filter toggle on image tab (84 ms) 1472: ✓ renders device recommendation banner on image tab (91 ms) 1473: import progress 1474: ✓ shows import progress card when importing (79 ms) 1475: tab switching resets state 1476: ✓ resets text filters when switching to image tab (143 ms) 1477: model type detection 1478: ✓ detects code models from tags (640 ms) 1479: ✓ detects image-gen models from diffusion tags (599 ms) 1480: file compatibility 1481: ✓ hides models with files too large for device RAM (583 ms) 1482: ✓ shows models with no file info (files not yet fetched) (575 ms) 1483: recommended models with filters 1484: ✓ filters recommended models by type filter (126 ms) 1485: ✓ hides recommended models that are already downloaded (65 ms) 1486: search error display 1487: ✓ handles API error gracefully during search (73 ms) 1488: detail view navigation ... 1507: handleSearch with active filters 1508: ✓ triggers HuggingFace search when vision type filter is set and query is empty (83 ms) 1509: ✓ does not trigger HuggingFace search when query is empty and no filters are active (68 ms) 1510: ✓ triggers HuggingFace search with "coder" keyword when code filter is set and query is empty (139 ms) 1511: formatNumber display 1512: ✓ shows formatted download count in detail view (628 ms) 1513: PASS __tests__/integration/generation/imageGenerationFlow.test.ts 1514: Image Generation Flow Integration 1515: Image Generation Lifecycle 1516: ✓ should update state during generation lifecycle (2 ms) 1517: ✓ should call localDreamGeneratorService with correct parameters 1518: ✓ should save generated image to gallery (1 ms) 1519: ✓ should add message to chat when conversationId is provided (2 ms) 1520: Progress Updates 1521: ✓ should receive and propagate progress updates (1 ms) 1522: Error Handling 1523: ✓ should handle generation errors gracefully (1 ms) 1524: ✓ should return null when no model is selected 1525: ✓ should handle model load failure (1 ms) 1526: Cancel Generation ... 1538: Prompt Enhancement with Conversation Context 1539: ✓ should pass conversation history to enhancement when conversationId provided (5 ms) 1540: ✓ should not include conversation context when no conversationId (1 ms) 1541: ✓ should truncate long messages in conversation context (1 ms) 1542: ✓ should limit conversation context to last 10 messages (1 ms) 1543: ✓ should skip system messages from conversation context (1 ms) 1544: ✓ should use original prompt when enhancement is disabled (1 ms) 1545: ✓ should handle empty conversation gracefully (1 ms) 1546: cancelGeneration when not generating 1547: ✓ should return immediately when not generating 1548: isGeneratingFor 1549: ✓ returns false when not generating 1550: ✓ returns true when generating for matching conversation (2 ms) 1551: generation returning null result (no imagePath) 1552: ✓ should return null when native generator returns null (1 ms) 1553: prompt enhancement error handling 1554: ✓ should fall back to original prompt when enhancement fails (1 ms) 1555: ✓ should skip enhancement when LLM is not loaded 1556: enhancement result update vs delete thinking message 1557: ✓ should update thinking message when enhancement produces different prompt (1 ms) 1558: ✓ should delete thinking message when enhancement returns same prompt (1 ms) 1559: generation with conversation metadata 1560: ✓ should include correct backend metadata for QNN model 1561: cancelRequested during generation 1562: ✓ should check cancelRequested after model load (1 ms) 1563: generation without conversationId 1564: ✓ should save to gallery but not add chat message 1565: enhancement with LLM currently generating 1566: ✓ should still attempt enhancement even if LLM was generating (1 ms) 1567: prompt enhancement strips thinking model tags 1568: ✓ should strip <think> tags from thinking model responses (1 ms) 1569: ✓ should handle thinking model response that is only a think block 1570: ✓ should handle response without think tags normally (3 ms) 1571: cancelled error handling 1572: ✓ should reset state when error message includes cancelled 1573: prompt enhancement stopGeneration cleanup (lines 247, 287-291) 1574: ✓ should call stopGeneration after successful enhancement (line 247) (1 ms) 1575: ✓ should call stopGeneration even when stopGeneration itself throws (lines 253-255) 1576: ✓ should delete thinking message and call stopGeneration when enhancement fails with conversationId (lines 287-298) (1 ms) 1577: ✓ should call stopGeneration in catch when stopGeneration itself throws during error cleanup (lines 290-292) (1 ms) 1578: ✓ should update thinking message in chat when enhancement succeeds with conversationId (lines 263-278) (1 ms) 1579: ✓ should delete thinking message when enhancement returns same prompt as original (lines 274-278) (2 ms) 1580: onPreview callback normal path (lines 388-389) 1581: ✓ should update previewPath state when onPreview fires without cancellation (1 ms) 1582: onPreview callback skipped when cancelRequested (lines 387-389) 1583: ✓ should skip preview update when cancelRequested is true during preview callback (4 ms) 1584: cancelRequested check after generateImage resolves (lines 397-398) 1585: ✓ should return null when cancelRequested is set before generateImage resolves (4 ms) 1586: OpenCL kernel cache branches 1587: ✓ logs warning and sets isFirstGpuRun=false when hasKernelCache throws (1 ms) 1588: ✓ uses regular progress status when kernel cache exists (isFirstGpuRun=false) (1 ms) 1589: _ensureImageModelLoaded with null activeImageModelId 1590: ✓ returns false and sets error when activeImageModelId is null but model not loaded 1591: PASS __tests__/rntl/screens/HomeScreen.test.tsx (8.378 s) ... 1655: ✓ shows "Unload current model" when image model is active (26 ms) 1656: ✓ shows model item for active text model (27 ms) 1657: ✓ closes picker when close button pressed (32 ms) 1658: ✓ shows "Browse more models" link in picker (28 ms) 1659: ✓ navigates to ModelsTab when "Browse more models" pressed (31 ms) 1660: ✓ shows memory estimate per model in picker (28 ms) 1661: ✓ shows vision indicator for vision models in picker (30 ms) 1662: model selection from picker 1663: ✓ marks text model active without loading or checking memory (31 ms) 1664: ✓ marks image model active without loading or checking memory (34 ms) 1665: ✓ does not show a memory dialog when selecting a text model (85 ms) 1666: ✓ closes the picker after selecting a text model (31 ms) 1667: model unloading from picker 1668: ✓ unloads text model when unload button pressed in picker (44 ms) 1669: ✓ unloads image model when unload button pressed in picker (59 ms) 1670: ✓ shows error alert when text model unload fails (67 ms) 1671: ✓ shows error alert when image model unload fails (63 ms) 1672: model load error handling 1673: ✓ shows error when eject all fails (424 ms) 1674: delete conversation ... 1681: ✓ shows RAM estimates in both pickers when both models loaded (44 ms) 1682: ✓ renders without crashing when both models loaded (14 ms) 1683: delete conversation full flow 1684: ✓ renders delete button in swipeable right actions (12 ms) 1685: ✓ shows delete confirmation and deletes conversation (29 ms) 1686: ✓ cancels delete conversation (26 ms) 1687: gallery navigation 1688: ✓ navigates to Gallery when gallery card is pressed (10 ms) 1689: empty picker browse navigation 1690: ✓ navigates to ModelsTab from empty text picker Browse Models button (31 ms) 1691: ✓ navigates to ModelsTab from empty image picker Browse Models button (54 ms) 1692: formatDate coverage 1693: ✓ shows "Yesterday" for conversations updated yesterday (29 ms) 1694: ✓ shows weekday name for conversations updated 2-6 days ago (26 ms) 1695: ✓ shows month and day for conversations updated more than 7 days ago (34 ms) 1696: memory info error handling 1697: ✓ handles getResourceUsage failure gracefully (72 ms) 1698: ✓ refreshes memory info when subscribe callback fires (38 ms) ... 2228: ✓ should parse multiple SSE events 2229: ✓ should handle multi-line data (1 ms) 2230: ✓ should handle events without explicit event type 2231: ✓ should throw when body is not readable (21 ms) 2232: ✓ should handle events with id field (1 ms) 2233: ✓ should handle data as object type 2234: ✓ should handle chunked data correctly 2235: ✓ should handle event with id field (1 ms) 2236: ✓ should throw when response body is not readable (4 ms) 2237: ✓ should handle events with only data field 2238: ✓ should skip events without data (1 ms) 2239: ✓ should yield remaining event at end of stream 2240: parseOpenAIMessage 2241: ✓ should parse content delta 2242: ✓ should parse [DONE] marker 2243: ✓ should parse error messages 2244: ✓ should parse tool calls ... 2250: ✓ should return null for empty data 2251: isPrivateNetworkEndpoint 2252: ✓ should detect localhost as private 2253: ✓ should detect 192.168.x.x as private (1 ms) 2254: ✓ should detect 10.x.x.x as private 2255: ✓ should detect 172.16-31.x.x as private 2256: ✓ should NOT detect 172.15.x.x as private 2257: ✓ should NOT detect 172.32.x.x as private 2258: ✓ should detect link-local 169.254.x.x as private 2259: ✓ should detect .local (mDNS) as private 2260: ✓ should detect public internet as NOT private 2261: ✓ should handle invalid URLs (1 ms) 2262: fetchWithTimeout 2263: ✓ should resolve with JSON response 2264: ✓ should resolve with text response for non-JSON 2265: ✓ should throw on HTTP error (8 ms) 2266: ✓ should timeout after specified duration (1 ms) 2267: ✓ should retry on transient errors (3 ms) 2268: ✓ should throw "Request cancelled" on AbortError 2269: ✓ should fallback to text when content-type header is missing 2270: ✓ should fallback to "Unknown error" when response.text() fails 2271: ✓ should handle non-Error thrown values 2272: testEndpoint 2273: ✓ should return success for reachable endpoint 2274: ✓ should return error for unreachable endpoint 2275: ✓ should return error on HTTP error (1 ms) 2276: ✓ should try alternate health endpoints when /v1/models fails 2277: ✓ should strip trailing slashes from endpoint 2278: imageToBase64DataUrl 2279: ✓ should return data URL as-is if already encoded 2280: ✓ should encode file:// URI to base64 2281: ✓ should throw if file does not exist (4 ms) 2282: ✓ should determine MIME type from extension 2283: ✓ should default to jpeg for unknown extensions 2284: ✓ should handle paths without file:// prefix 2285: ✓ should fetch and encode remote URLs (5 ms) 2286: ✓ should throw on fetch failure 2287: ✓ should throw on FileReader error (3 ms) 2288: detectServerType 2289: ✓ should detect Ollama from server header 2290: ✓ should detect Ollama from /api/tags endpoint 2291: ✓ should detect LM Studio from model list 2292: ✓ should detect generic OpenAI-compatible server (1 ms) 2293: ✓ should return null when server type cannot be determined 2294: ✓ should return null on network error 2295: ✓ should strip trailing slashes from endpoint 2296: ✓ should fallback to Ollama when OpenAI-compatible check fails 2297: createStreamingRequest 2298: ✓ should make POST request with correct headers (1 ms) 2299: ✓ should parse SSE events on progress (1 ms) 2300: ✓ should resolve on successful completion 2301: ✓ should reject on HTTP error (1 ms) 2302: ✓ should reject on network error 2303: ✓ should reject on timeout (4 ms) 2304: ✓ should handle events with event type 2305: ✓ should handle events with id field (1 ms) 2306: ✓ should handle multi-line data 2307: ✓ should process final chunk on completion 2308: ✓ should handle incremental progress updates (1 ms) 2309: ✓ should handle events with id in final chunk 2310: ✓ should handle multi-line data in final chunk 2311: ✓ should handle events with event type in final chunk (1 ms) 2312: ✓ should handle XHR timeout event 2313: ✓ should handle XHR timeout via ontimeout 2314: ✓ should reject on send error 2315: ✓ should abort XHR when signal fires (1 ms) 2316: ✓ should not process final data when responseText equals processed length 2317: detectServerType — additional branches 2318: ✓ returns null when JSON parse throws for /v1/models response 2319: ✓ returns null when LM Studio response has no gguf models 2320: ✓ handles generic OpenAI-compatible via Array.isArray(data.data) branch 2321: parseAnthropicMessage — non-string data 2322: ✓ returns null for non-string data (1 ms) 2323: ✓ returns null for invalid JSON 2324: createNDJSONStreamingRequest 2325: ✓ resolves and calls onLine for each complete NDJSON line 2326: ✓ flushes partial buffered line on readyState=4 2327: ✓ rejects on HTTP error status (1 ms) 2328: ✓ rejects on network error 2329: ✓ rejects on timeout ... 2331: ✓ warns and skips invalid JSON lines 2332: ✓ sets custom headers (1 ms) 2333: ✓ processes onprogress chunks and merges partial lines 2334: ✓ warns and skips invalid JSON in buffered final line 2335: ✓ rejects when xhr.send throws (1 ms) 2336: PASS __tests__/unit/services/parallelMmproj.test.ts 2337: Parallel mmproj download 2338: performBackgroundDownload 2339: ✓ starts both main and mmproj downloads in parallel (1 ms) 2340: ✓ persists mmProjDownloadId in metadata callback 2341: ✓ sets mmProjCompleted=false and mainCompleted=false in context 2342: ✓ skips mmproj download when mmproj already exists (1 ms) 2343: ✓ only starts main download for non-vision models 2344: ✓ returns immediately when both files already exist 2345: ✓ re-downloads mmproj when an existing sidecar is only partially written (1 ms) 2346: ✓ re-downloads mmproj when stat fails for an existing sidecar 2347: ✓ reuses an existing failed entry by cancelling old downloads and retrying the store entry (1 ms) 2348: combined progress 2349: ✓ reports combined progress from both downloads (1 ms) 2350: ✓ includes pre-existing mmproj size in progress when mmproj already downloaded 2351: ✓ updates the native combined-progress notification when supported (1 ms) 2352: ✓ swallows native combined-progress update failures 2353: watchBackgroundDownload — completion gating 2354: ✓ does not fire onComplete until both downloads finish (mmproj first) (1 ms) 2355: ✓ does not fire onComplete until both downloads finish (main first) 2356: ✓ fires onComplete immediately for non-vision model (no mmproj) 2357: ✓ moves mmproj file on mmproj completion (1 ms) 2358: ✓ clears metadata callback when both complete 2359: ✓ ignores duplicate main completion events after the first one 2360: ✓ drops vision when mmproj move fails and the target file is missing 2361: watchBackgroundDownload — error handling 2362: ✓ cancels mmproj when main download fails (1 ms) 2363: ✓ preserves retry context and resets main finalization flags when main download fails 2364: ✓ continues as text-only when mmproj download fails (4 ms) 2365: watchBackgroundDownload — already-downloaded recovery 2366: ✓ persists already-downloaded models before firing onComplete (1 ms) 2367: ✓ still fires onComplete when persistence fails for already-downloaded models 2368: ✓ surfaces an already-downloaded context error via onError (1 ms) 2369: syncCompletedBackgroundDownloads 2370: ✓ syncs completed model with mmproj download 2371: ✓ skips sync when mmproj download is still running 2372: ✓ cancels mmproj when main download failed 2373: restoreInProgressDownloads — mmproj recovery 2374: ✓ restores both main and mmproj progress listeners (1 ms) 2375: ✓ handles mmproj completed while app was dead (1 ms) 2376: ✓ marks mmproj as completed when it failed while app was dead 2377: ✓ defers mmproj move to watchBackgroundDownload when file not yet on disk 2378: ✓ does not create duplicate context for mmproj download ID 2379: watchBackgroundDownload — catch-up paths 2380: ✓ finalizes after mmproj was already completed before listener registration (1 ms) 2381: ✓ continues without vision when catch-up mmproj move fails and target is missing (1 ms) 2382: PASS __tests__/rntl/screens/ModelSettingsScreen.test.tsx ... 2596: ✓ "draw an elephant" should classify as image 2597: ✓ "draw the sunset" should classify as image 2598: ✓ "paint a landscape" should classify as image 2599: ✓ "paint me a portrait" should classify as image 2600: ✓ "paint an abstract piece" should classify as image 2601: ✓ "sketch a building" should classify as image (1 ms) 2602: ✓ "sketch me a character" should classify as image 2603: ✓ "sketch the mountain" should classify as image 2604: Text Intent Patterns 2605: Questions and explanations 2606: ✓ "explain how photosynthesis works" should classify as text 2607: ✓ "tell me about the French Revolution" should classify as text 2608: ✓ "describe the water cycle" should classify as text 2609: ✓ "what is machine learning" should classify as text (1 ms) 2610: ✓ "what are the benefits of exercise" should classify as text 2611: ✓ "what does this error mean" should classify as text 2612: ✓ "what's the capital of France" should classify as text 2613: ✓ "whats happening in the code" should classify as text 2614: How questions 2615: ✓ "how do I install node.js" should classify as text 2616: ✓ "how does electricity work" should classify as text 2617: ✓ "how to make pasta" should classify as text 2618: ✓ "how can I improve my writing" should classify as text 2619: ✓ "how would you solve this problem" should classify as text (1 ms) 2620: ✓ "how should I structure my code" should classify as text 2621: Why questions 2622: ✓ "why is the sky blue" should classify as text 2623: ✓ "why does water boil" should classify as text 2624: ✓ "why do birds migrate" should classify as text 2625: ✓ "why are leaves green" should classify as text 2626: ✓ "why would this fail" should classify as text 2627: When/Where/Who/Which questions ... 2674: ✓ "draft a script for a video" should classify as text 2675: ✓ "write an article about technology" should classify as text (2 ms) 2676: ✓ "compose a post for social media" should classify as text 2677: ✓ "write a message to the team" should classify as text 2678: ✓ "draft a response to this email" should classify as text (1 ms) 2679: Programming and code 2680: ✓ "write code to sort an array" should classify as text 2681: ✓ "create a function to validate email" should classify as text 2682: ✓ "write a script to automate backups" should classify as text 2683: ✓ "create a program to parse CSV" should classify as text 2684: ✓ "write a sql query to get users" should classify as text 2685: ✓ "create a regex for phone numbers" should classify as text 2686: ✓ "code a simple calculator" should classify as text 2687: ✓ "coding challenge solution" should classify as text (1 ms) 2688: ✓ "programming in python" should classify as text 2689: ✓ "debug this error" should classify as text 2690: ✓ "debugging the crash" should classify as text 2691: ✓ "fix the code that throws an error" should classify as text 2692: ✓ "debug this bug in my app" should classify as text ... 2698: ✓ "variable not defined" should classify as text 2699: ✓ "array out of bounds" should classify as text (1 ms) 2700: ✓ "object is null" should classify as text 2701: ✓ "loop through items" should classify as text 2702: ✓ "if statement not working" should classify as text 2703: ✓ "javascript async await" should classify as text 2704: ✓ "typescript interface" should classify as text 2705: ✓ "python list comprehension" should classify as text (1 ms) 2706: ✓ "java hashmap" should classify as text 2707: ✓ "kotlin coroutines" should classify as text (7 ms) 2708: ✓ "swift optionals" should classify as text (1 ms) 2709: ✓ "c++ pointers" should classify as text (1 ms) 2710: ✓ "rust ownership" should classify as text 2711: ✓ "go goroutines" should classify as text (1 ms) 2712: ✓ "ruby blocks" should classify as text 2713: ✓ "import statement error" should classify as text 2714: ✓ "export default component" should classify as text 2715: ✓ "return value is undefined" should classify as text (1 ms) 2716: ✓ "const vs let in javascript" should classify as text 2717: ✓ "def function python" should classify as text 2718: ✓ "fn main rust" should classify as text 2719: ✓ "error: cannot find module" should classify as text 2720: ✓ "TypeError: undefined is not a function" should classify as text 2721: ✓ "exception thrown at line 42" should classify as text ... 2833: ✓ picture in context of describing should be text 2834: ✓ image in technical context should be text 2835: ✓ render in code context should be text 2836: Empty and edge case inputs 2837: ✓ empty string should return text 2838: ✓ whitespace only should return text 2839: ✓ single word with no clear intent should return text 2840: Case insensitivity 2841: ✓ UPPERCASE should still match patterns 2842: ✓ MixedCase should still match patterns 2843: Whitespace handling 2844: ✓ leading/trailing whitespace should be trimmed (5 ms) 2845: Cache Behavior 2846: ✓ should return cached result on repeat query 2847: ✓ clearCache should reset the cache 2848: ✓ should handle very long messages without errors 2849: quickCheck 2850: ✓ should return image for image patterns (1 ms) 2851: ✓ should return text for text patterns 2852: ✓ should return text for uncertain messages 2853: ✓ should be synchronous 2854: LLM Fallback 2855: ✓ should not call LLM when useLLM is false 2856: ✓ should return text default when pattern is uncertain and LLM disabled 2857: ✓ should throw when LLM enabled but no model loaded (1 ms) 2858: ✓ should use LLM classification when pattern is uncertain and LLM enabled 2859: ✓ should return text when LLM responds NO (1 ms) 2860: ✓ should handle LLM errors gracefully 2861: Cache Eviction ... 2878: ✓ returns empty array for plain conversational messages (1 ms) 2879: PASS __tests__/unit/services/backgroundDownloadService.test.ts 2880: BackgroundDownloadService 2881: isAvailable 2882: ✓ returns true on Android with native module present (3 ms) 2883: ✓ returns true on iOS when native module is present (2 ms) 2884: ✓ returns false when native module is null (4 ms) 2885: startDownload 2886: ✓ calls native module with correct params (3 ms) 2887: ✓ returns pending status (2 ms) 2888: ✓ defaults modelType to text and totalBytes to 0 when not provided (2 ms) 2889: ✓ throws when not available (37 ms) 2890: cancelDownload 2891: ✓ delegates to native module (3 ms) 2892: ✓ throws when not available (4 ms) 2893: ✓ notifies error listeners with a user_cancelled event so awaiters can settle (2 ms) 2894: ✓ rejects a downloadFileTo() promise as cancelled when its download is cancelled (3 ms) 2895: ✓ synthesizes the cancellation even if the native cancel throws (5 ms) 2896: getActiveDownloads 2897: ✓ returns empty array when not available (3 ms) 2898: ✓ maps native response to BackgroundDownloadInfo (3 ms) 2899: moveCompletedDownload 2900: ✓ delegates to native module (2 ms) 2901: ✓ throws when not available (9 ms) 2902: listener registration 2903: ✓ onProgress registers and returns unsubscribe function (2 ms) 2904: ✓ onComplete registers and returns unsubscribe function (2 ms) 2905: ✓ onError registers and returns unsubscribe function (2 ms) 2906: ✓ onAnyProgress registers global listener (2 ms) 2907: ✓ onAnyComplete registers global listener (2 ms) 2908: ✓ onAnyError registers global listener (1 ms) 2909: event dispatching 2910: ✓ dispatches progress to both specific and global listeners (2 ms) 2911: ✓ dispatches progress to global listener when no per-download listene...

coderabbitai

Actionable comments posted: 7

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/components/ChatInput/index.tsx`:
- Around line 107-110: The attachment removal in onSummarizeAttachment is
unconditional, so update the summarize flow to only call removeAttachment after
a confirmed successful handleSummarize() result. Adjust
useSummarizeAttachment/handleSummarize to return an explicit success flag or
rethrow on failure, then have ChatInput decide removal based on that outcome so
failed or skipped summaries do not drop the original attachment.

In `@src/components/ChatInput/styles.ts`:
- Around line 58-69: The shared document filename style in documentName still
includes spacing from the old stacked layout, which now misaligns the text when
rendered inside documentNameRow. Split the style into separate variants in
styles.ts: keep the centered/stacked spacing only for the audio-only or vertical
layout, and remove the top margin from the inline filename style used with the
file icon. Update the relevant consumer(s) to use the appropriate style variant
so documentNameRow and the filename stay vertically aligned.

In `@src/services/generationToolLoop.ts`:
- Around line 519-537: The exact-time note is being regenerated on every
tool-loop pass in augmentSystemPromptForTools, which can change the latest user
message and move the cache boundary. Precompute the exact-time note once per
turn and thread it through the generation loop instead of calling
buildExactTimeNote() inside the prompt augmentation path; use the existing
augmentSystemPromptForTools flow and the exactTimeAppended logic to keep the
appended note stable across iterations.

In `@src/services/llm.ts`:
- Around line 296-308: `generateWithMaxTokens` only appends streamed
`data.token` values, so final completion payloads can be dropped for
tool-selection/summarization. Update the completion callback in
`LLMService.generateWithMaxTokens` to also read the final result fields already
used elsewhere, such as `completionResult.content` or `completionResult.text`,
and append that fallback when no token deltas are streamed. Keep the change
localized to the `safeCompletion`/`ctx.completion` flow so the return value
still comes from `fullResponse.trim()`.

In `@src/services/rag/database.ts`:
- Around line 22-23: `metadata` is being exposed as a serialized JSON string on
the read side, which breaks symmetry with `Chunk.metadata` and forces callers to
parse it themselves. Update the read-side types in the database service to keep
`metadata` structured like `Chunk.metadata`, and parse the stored JSON once
inside `getEmbeddingsByProject()` and `getChunksByProject()` before returning
results so both the main and fallback paths use the same object shape.
- Around line 65-70: The migration in ensureReady() is swallowing all errors
around the rag_chunks metadata ALTER TABLE step; update the try/catch so
Database.ensureReady() only ignores the expected duplicate-column case and
rethrows any other failure. Use the existing rag_chunks metadata migration block
to detect that specific error from this.db.executeSync('ALTER TABLE rag_chunks
ADD COLUMN metadata TEXT') and let unexpected migration errors propagate so
callers don’t proceed with a broken schema.

In `@src/services/transcriptSummarizer.ts`:
- Around line 114-122: The combine logic in transcriptSummarizer’s budget
calculations is reusing the chunk budget for the final pass, which can let
inputs through that still exceed the combine reservation. Add a separate
final-combine input budget in the same budgeting section near
ctxLength/inputBudgetTokens/chunkCharBudget, and use the FINAL_SUMMARY_TOKENS
reservation when deciding whether the merged partials are small enough for the
last combine. Update the reduce/combine threshold checks in the summarization
flow so the final pass uses its own budget instead of chunkCharBudget.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 3407e1dd-1d10-41a0-bc4d-7319e1642cda

📥 Commits

Reviewing files that changed from the base of the PR and between e65db82 and 291c0ce.

📒 Files selected for processing (15)

src/components/ChatInput/Attachments.tsx
src/components/ChatInput/index.tsx
src/components/ChatInput/styles.ts
src/components/ChatInput/useSummarizeAttachment.ts
src/services/chatAttachmentInbox.ts
src/services/generationToolLoop.ts
src/services/index.ts
src/services/llm.ts
src/services/rag/chunking.ts
src/services/rag/database.ts
src/services/rag/index.ts
src/services/rag/retrieval.ts
src/services/transcriptSummarizer.ts
src/stores/projectStore.ts
src/types/index.ts

coderabbitai · 2026-06-30T03:17:46Z

+  const { summarizingId, handleSummarize } = useSummarizeAttachment();
+  const onSummarizeAttachment = async (attachment: MediaAttachment) => {
+    await handleSummarize(attachment);
+    removeAttachment(attachment.id);


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Only remove the attachment after a successful summary.

handleSummarize() catches its own failures and also returns early on some no-op paths, so this unconditional removeAttachment() drops the original attachment even when summarization failed or never started.

Suggested fix

const { summarizingId, handleSummarize } = useSummarizeAttachment(); const onSummarizeAttachment = async (attachment: MediaAttachment) => { - await handleSummarize(attachment); - removeAttachment(attachment.id); + const summarized = await handleSummarize(attachment); + if (summarized) removeAttachment(attachment.id); };

handleSummarize() should return a success flag (or rethrow on failure) so the caller can make this decision correctly.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/components/ChatInput/index.tsx` around lines 107 - 110, The attachment removal in onSummarizeAttachment is unconditional, so update the summarize flow to only call removeAttachment after a confirmed successful handleSummarize() result. Adjust useSummarizeAttachment/handleSummarize to return an explicit success flag or rethrow on failure, then have ChatInput decide removal based on that outcome so failed or skipped summaries do not drop the original attachment.

coderabbitai · 2026-06-30T03:17:47Z

+  documentNameRow: {
+    flexDirection: 'row' as const,
+    alignItems: 'center' as const,
+    gap: 6,
+  },
  documentName: {
    fontSize: 10,
    fontFamily: FONTS.mono,
    color: colors.textMuted,
    textAlign: 'center' as const,
    marginTop: 4,
  },


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Split the inline filename style from the stacked one.

documentName still carries marginTop: 4 from the old centered layout, but it now also renders inside documentNameRow. That offsets the filename downward relative to the file icon. A separate inline text style, or moving the top margin to the audio-only variant, will avoid the misalignment.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/components/ChatInput/styles.ts` around lines 58 - 69, The shared document filename style in documentName still includes spacing from the old stacked layout, which now misaligns the text when rendered inside documentNameRow. Split the style into separate variants in styles.ts: keep the centered/stacked spacing only for the audio-only or vertical layout, and remove the top margin from the inline filename style used with the file icon. Update the relevant consumer(s) to use the appropriate style variant so documentNameRow and the filename stay vertically aligned.

coderabbitai · 2026-06-30T03:17:47Z

+  // System prompt gets only the STABLE date (changes once a day) + tool guidance, so the
+  // system+tools prefix stays cacheable turn-to-turn.
+  const updatedSys = { ...sys, content: existing + TOOL_BEHAVIOR_GUIDANCE + buildDateContext() + extHints };
+  const out = [...messages.slice(0, sysIdx), updatedSys, ...messages.slice(sysIdx + 1)];
+
+  // For time-sensitive (calendar) tools, append the EXACT time to the latest user
+  // message instead of the system prefix — keeps the big prefix cacheable while still
+  // giving the model sub-day precision.
  const precise = enabledToolIds.some(id => TIME_SENSITIVE_TOOL_IDS.includes(id));
-  const updated = { ...sys, content: existing + TOOL_BEHAVIOR_GUIDANCE + buildDateTimeContext(precise) + extHints };
-  return [...messages.slice(0, sysIdx), updated, ...messages.slice(sysIdx + 1)];
+  let exactTimeAppended = false;
+  if (precise) {
+    for (let i = out.length - 1; i >= 0; i--) {
+      if (out[i].role === 'user' && typeof out[i].content === 'string') {
+        out[i] = { ...out[i], content: (out[i].content as string) + buildExactTimeNote() };
+        exactTimeAppended = true;
+        break;
+      }
+    }
+  }


🚀 Performance & Scalability | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Confirm whether augmentSystemPromptForTools (via callLLMWithRetry) runs once per # tool-loop iteration, which would re-append a fresh timestamp each iteration. rg -nP -C4 '\bcallLLMWithRetry\s*\(' src/services/generationToolLoop.ts echo '--- loop structure around callLLMWithRetry ---' ast-grep run --pattern 'callLLMWithRetry($$$)' --lang ts src/services/generationToolLoop.ts

Repository: off-grid-ai/mobile

Length of output: 3076

🏁 Script executed:

#!/bin/bash set -euo pipefail # Inspect the tool-loop and prompt augmentation flow around the cited lines. sed -n '500,820p' src/services/generationToolLoop.ts echo '--- search for augmentSystemPromptForTools usage/definition ---' rg -n -C 4 'augmentSystemPromptForTools|buildExactTimeNote|buildDateContext|TIME_SENSITIVE_TOOL_IDS' src/services/generationToolLoop.ts

Repository: off-grid-ai/mobile

Length of output: 19829

Cache the exact-time note once per turn. buildExactTimeNote() runs on every tool-loop pass, so the latest user message can change between iterations and shift the cache boundary into the user/tool-result tail. Thread one precomputed note through the loop instead of regenerating it per augmentSystemPromptForTools() call.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/generationToolLoop.ts` around lines 519 - 537, The exact-time note is being regenerated on every tool-loop pass in augmentSystemPromptForTools, which can change the latest user message and move the cache boundary. Precompute the exact-time note once per turn and thread it through the generation loop instead of calling buildExactTimeNote() inside the prompt augmentation path; use the existing augmentSystemPromptForTools flow and the exactTimeAppended logic to keep the appended note stable across iterations.

coderabbitai · 2026-06-30T03:17:47Z

    let fullResponse = '';
    const ctx = this.context;
+    // These internal generations (summarize, tool-selection) never want the
+    // model to "think" - reasoning wastes the token budget, is slow + hot, and
+    // leaks into the output. Force thinking OFF (for models that gate it via the
+    // thinking channel; prose chain-of-thought is additionally curbed by prompts).
+    const params = { messages: oaiMessages, ...buildCompletionParams(settings, { disableCtxShift: this.shouldDisableCtxShift() }), ...buildThinkingCompletionParams(false, this.isGemma4Model()), n_predict: maxTokens };
    const completionWork = safeCompletion(ctx, () => ctx.completion(
-      { messages: oaiMessages, ...buildCompletionParams(settings, { disableCtxShift: this.shouldDisableCtxShift() }), n_predict: maxTokens },
-      (data) => { if (this.isGenerating && data.token) fullResponse += data.token; },
+      params,
+      (data) => { if (this.isGenerating && data.token) { fullResponse += data.token; onToken?.(data.token); } },
    ), 'generateWithMaxTokens');
    this.activeCompletionPromise = completionWork.then(() => { }, () => { });
    try { await completionWork; return fullResponse.trim(); } finally { this.isGenerating = false; this.activeCompletionPromise = null; }


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail git ls-files src/services/llm.ts wc -l src/services/llm.ts sed -n '1,220p' src/services/llm.ts printf '\n--- MIDDLE ---\n' sed -n '220,380p' src/services/llm.ts printf '\n--- END ---\n' sed -n '380,520p' src/services/llm.ts

Repository: off-grid-ai/mobile

Length of output: 27561

🏁 Script executed:

#!/bin/bash set -euo pipefail rg -n "generateResponse|generateWithMaxTokens|completion\\(" src/services/llm.ts

Repository: off-grid-ai/mobile

Length of output: 934

🏁 Script executed:

#!/bin/bash set -euo pipefail wc -l src/services/llmHelpers.ts rg -n "safeCompletion|getStreamingDelta|completion\\(|data\\.content|data\\.text|reasoning_content|reasoningContent" src/services/llmHelpers.ts src/services -g '!**/*.map'

Repository: off-grid-ai/mobile

Length of output: 7939

🏁 Script executed:

#!/bin/bash set -euo pipefail sed -n '1,260p' src/services/llmHelpers.ts printf '\n--- TYPES/USAGES ---\n' rg -n "type StreamToken|interface .*Completion|completion\\(" src -g '!**/*.map'

Repository: off-grid-ai/mobile

Length of output: 13571

🏁 Script executed:

#!/bin/bash set -euo pipefail wc -l src/services/providers/types.ts src/services/llmToolGeneration.ts sed -n '1,200p' src/services/providers/types.ts printf '\n--- TOOL GENERATION ---\n' sed -n '1,240p' src/services/llmToolGeneration.ts

Repository: off-grid-ai/mobile

Length of output: 12691

🏁 Script executed:

#!/bin/bash set -euo pipefail rg -n "content:|text:|reasoningContent|reasoning_content|CompletionResult|onToken" src/services/providers/types.ts src/services/llmToolGeneration.ts src/services/llmSafetyChecks.ts

Repository: off-grid-ai/mobile

Length of output: 984

Handle final completion payloads in generateWithMaxTokens

generateWithMaxTokens() only accumulates data.token, but this codebase already treats completionResult.content / completionResult.text as valid final output in similar paths. Add the same fallback here, or tool-selection/summarization can return an empty string when the backend doesn't stream token deltas.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/llm.ts` around lines 296 - 308, `generateWithMaxTokens` only appends streamed `data.token` values, so final completion payloads can be dropped for tool-selection/summarization. Update the completion callback in `LLMService.generateWithMaxTokens` to also read the final result fields already used elsewhere, such as `completionResult.content` or `completionResult.text`, and append that fallback when no token deltas are streamed. Keep the change localized to the `safeCompletion`/`ctx.completion` flow so the return value still comes from `fullResponse.trim()`.

coderabbitai · 2026-06-30T03:17:47Z

+  // JSON string of per-chunk metadata (recordingId, startMs, eventTitle, ...) or null.
+  metadata?: string | null;


🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Keep metadata structured across the DB boundary.

Chunk.metadata is written as an object, but these read-side types now expose the serialized JSON string. That pushes JSON.parse onto every caller and makes the new recording metadata easy to misuse on both the semantic path and the fallback path. Parsing once in getEmbeddingsByProject() / getChunksByProject() and typing these fields like Chunk.metadata will keep the contract symmetric.

Also applies to: 33-33

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/rag/database.ts` around lines 22 - 23, `metadata` is being exposed as a serialized JSON string on the read side, which breaks symmetry with `Chunk.metadata` and forces callers to parse it themselves. Update the read-side types in the database service to keep `metadata` structured like `Chunk.metadata`, and parse the stored JSON once inside `getEmbeddingsByProject()` and `getChunksByProject()` before returning results so both the main and fallback paths use the same object shape.

coderabbitai · 2026-06-30T03:17:47Z

+      // Older installs created rag_chunks without the metadata column; add it.
+      // Throws "duplicate column" on DBs that already have it - safe to ignore.
+      try {
+        this.db.executeSync('ALTER TABLE rag_chunks ADD COLUMN metadata TEXT');
+      } catch {
+        // column already exists


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Only suppress the duplicate-column migration error.

The bare catch {} also hides real migration failures, so ensureReady() can succeed even though later reads/writes against rag_chunks.metadata will fail. Please rethrow anything except the expected duplicate-column case.

Suggested fix

// Older installs created rag_chunks without the metadata column; add it. // Throws "duplicate column" on DBs that already have it - safe to ignore. try { this.db.executeSync('ALTER TABLE rag_chunks ADD COLUMN metadata TEXT'); - } catch { - // column already exists + } catch (error) { + const message = error instanceof Error ? error.message : String(error); + if (!message.toLowerCase().includes('duplicate column')) { + throw error; + } }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// Older installs created rag_chunks without the metadata column; add it.

// Throws "duplicate column" on DBs that already have it - safe to ignore.

try {

this.db.executeSync('ALTER TABLE rag_chunks ADD COLUMN metadata TEXT');

} catch {

// column already exists

// Older installs created rag_chunks without the metadata column; add it.

// Throws "duplicate column" on DBs that already have it - safe to ignore.

try {

this.db.executeSync('ALTER TABLE rag_chunks ADD COLUMN metadata TEXT');

} catch (error) {

const message = error instanceof Error ? error.message : String(error);

if (!message.toLowerCase().includes('duplicate column')) {

throw error;

}

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/rag/database.ts` around lines 65 - 70, The migration in ensureReady() is swallowing all errors around the rag_chunks metadata ALTER TABLE step; update the try/catch so Database.ensureReady() only ignores the expected duplicate-column case and rethrows any other failure. Use the existing rag_chunks metadata migration block to detect that specific error from this.db.executeSync('ALTER TABLE rag_chunks ADD COLUMN metadata TEXT') and let unexpected migration errors propagate so callers don’t proceed with a broken schema.

coderabbitai · 2026-06-30T03:17:47Z

+      const ctxLength = llmService.getPerformanceSettings().contextLength || 2048;
+      const inputBudgetTokens = Math.max(
+        256,
+        ctxLength - CHUNK_SUMMARY_TOKENS - INSTRUCTION_OVERHEAD_TOKENS - SAFETY_MARGIN_TOKENS,
+      );
+      const chunkCharBudget = inputBudgetTokens * CHARS_PER_TOKEN;
+      // Map split is capped smaller than the full budget so each part prefills
+      // fast and streams sooner; reduce/combine still use the full chunkCharBudget.
+      const mapCharBudget = Math.min(chunkCharBudget, MAP_INPUT_TOKEN_TARGET * CHARS_PER_TOKEN);


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Use a separate input budget for the final combine pass.

chunkCharBudget is computed with CHUNK_SUMMARY_TOKENS and then reused to decide when the merged partials are “small enough” for the final combine, but the final combine reserves FINAL_SUMMARY_TOKENS. Inputs between those two budgets skip another reduce round and can still overflow the last pass.

Suggested fix

- const inputBudgetTokens = Math.max( + const reduceInputBudgetTokens = Math.max( 256, ctxLength - CHUNK_SUMMARY_TOKENS - INSTRUCTION_OVERHEAD_TOKENS - SAFETY_MARGIN_TOKENS, ); - const chunkCharBudget = inputBudgetTokens * CHARS_PER_TOKEN; + const finalInputBudgetTokens = Math.max( + 256, + ctxLength - FINAL_SUMMARY_TOKENS - INSTRUCTION_OVERHEAD_TOKENS - SAFETY_MARGIN_TOKENS, + ); + const reduceCharBudget = reduceInputBudgetTokens * CHARS_PER_TOKEN; + const finalCharBudget = finalInputBudgetTokens * CHARS_PER_TOKEN; @@ - const mapCharBudget = Math.min(chunkCharBudget, MAP_INPUT_TOKEN_TARGET * CHARS_PER_TOKEN); + const mapCharBudget = Math.min(reduceCharBudget, MAP_INPUT_TOKEN_TARGET * CHARS_PER_TOKEN); @@ - while (combined.length > chunkCharBudget && round < MAX_REDUCE_ROUNDS) { + while (combined.length > finalCharBudget && round < MAX_REDUCE_ROUNDS) { round += 1; this.emit({ phase: 'reducing', round }, onProgress); - const reChunks = splitIntoChunks(combined, chunkCharBudget); + const reChunks = splitIntoChunks(combined, reduceCharBudget);

Also applies to: 150-165

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/transcriptSummarizer.ts` around lines 114 - 122, The combine logic in transcriptSummarizer’s budget calculations is reusing the chunk budget for the final pass, which can let inputs through that still exceed the combine reservation. Add a separate final-combine input budget in the same budgeting section near ctxLength/inputBudgetTokens/chunkCharBudget, and use the FINAL_SUMMARY_TOKENS reservation when deciding whether the merged partials are small enough for the last combine. Update the reduce/combine threshold checks in the summarization flow so the final pass uses its own budget instead of chunkCharBudget.

dishit-wednesday and others added 5 commits June 30, 2026 03:31

qodo-code-review Bot reviewed Jun 30, 2026

View reviewed changes

dishit-wednesday changed the title ~~feat(chat): summarize attachments, recordings knowledge base, TTFT fix~~ feat(chat): summarize attachments + recordings knowledge base Jun 30, 2026

coderabbitai Bot reviewed Jun 30, 2026

View reviewed changes

		// JSON string of per-chunk metadata (recordingId, startMs, eventTitle, ...) or null.
		metadata?: string \| null;

Uh oh!

Uh oh!

Conversation

dishit-wednesday commented Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this adds

How the summarizer fits together

RAG recordings knowledge base

CI

Uh oh!

coderabbitai Bot commented Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

qodo-code-review Bot commented Jun 30, 2026

PR Summary by Qodo

Uh oh!

qodo-code-review Bot commented Jun 30, 2026

Code Review by Qodo

Uh oh!

qodo-code-review Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

qodo-code-review Bot commented Jun 30, 2026

CI Feedback 🧐

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

dishit-wednesday commented Jun 30, 2026 •

edited

Loading

coderabbitai Bot commented Jun 30, 2026 •

edited

Loading