
feat(chat): summarize attachments + recordings knowledge base #431

Open
dishit-wednesday wants to merge 5 commits into main from core-chat-rag

@dishit-wednesday dishit-wednesday commented Jun 30, 2026

Collaborator

What this adds

General chat and retrieval improvements, independent of the recorder and with no Pro dependency.

  • Summarize oversized attachments: when a transcript or document attachment is too large to send to the model, summarize it first. The summary streams live, with a larger Summarize chip in the chat input.
  • Fast summaries on low-RAM devices: on CPU-only devices, reasoning and preamble used to spend the whole token budget before any summary appeared. This disables the model thinking channel for internal generations and caps each map chunk at about 1500 input tokens, so each part prefills in under a minute and starts streaming sooner. Reduce and combine passes still use the full context budget.
  • RAG recordings knowledge base: per-chunk metadata so a search hit can cite and seek back to its source moment, an indexText entry point, and a seeded recordings project so transcripts become searchable.

How the summarizer fits together

attachment too big for the context window
        |
        v
splitIntoChunks (map budget ~1500 tok)  -->  MAP: summarize each chunk, thinking off
        |                                          |  tokens stream to the chat input
        v                                          v
   one combined input  <---------------------  REDUCE / COMBINE (full context budget)
        |
        v
   final summary in the message

The small map budget is a deliberate trade: smaller chunks mean more of them, but each one prefills fast so the user sees output sooner instead of staring at a stalled prefill.
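
As a rough illustration of the flow in the diagram, here is a minimal sketch of a map-reduce summarizer. It is not the code in transcriptSummarizer.ts: the 4-characters-per-token estimate, the 4096-token default for the context budget, the 8-round cap, and the SummarizeFn signature are all assumptions made for the example.

```typescript
// Sketch only: CHARS_PER_TOKEN, the 4096 default, the round cap, and SummarizeFn are assumptions.
type SummarizeFn = (text: string, onToken?: (token: string) => void) => Promise<string>;

const CHARS_PER_TOKEN = 4; // crude estimate standing in for a real tokenizer

// Pack paragraphs into chunks of at most maxTokens (estimated), hard-splitting long paragraphs.
function splitIntoChunks(text: string, maxTokens: number): string[] {
  if (maxTokens <= 0) throw new Error("maxTokens must be positive");
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  const chunks: string[] = [];
  let current = "";
  for (const paragraph of text.split(/\n+/)) {
    for (let i = 0; i < paragraph.length; i += maxChars) {
      const piece = paragraph.slice(i, i + maxChars);
      if (current && current.length + 1 + piece.length > maxChars) {
        chunks.push(current);
        current = "";
      }
      current = current ? `${current}\n${piece}` : piece;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

async function mapReduceSummarize(
  text: string,
  summarize: SummarizeFn,
  onToken?: (token: string) => void,
  mapBudget = 1500, // small map chunks so each part prefills quickly
  contextBudget = 4096, // hypothetical full context budget for reduce/combine
): Promise<string> {
  // MAP: summarize each small chunk, streaming tokens as they arrive.
  const parts: string[] = [];
  for (const chunk of splitIntoChunks(text, mapBudget)) {
    parts.push(await summarize(chunk, onToken));
  }
  // REDUCE: while the partial summaries do not fit, summarize them in groups.
  // The round cap guards against a summarizer that never shrinks its input.
  let combined = parts.join("\n");
  const limit = contextBudget * CHARS_PER_TOKEN;
  for (let round = 0; round < 8 && combined.length > limit; round++) {
    const reduced: string[] = [];
    for (const group of splitIntoChunks(combined, contextBudget)) {
      reduced.push(await summarize(group));
    }
    combined = reduced.join("\n");
  }
  // COMBINE: one final pass over everything that now fits.
  return summarize(combined, onToken);
}
```

With a 1500-token map budget, a transcript of roughly 30,000 estimated tokens would be split into about 20 map parts before the combine pass.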

RAG recordings knowledge base

The chunk metadata field carries recordingId and startMs, so a future search hit can name the recording and seek back to the exact moment. indexText and the seeded recordings project reuse the existing on-device RAG layer (op-sqlite + all-MiniLM embeddings), so recordings search shares the same retrieval path as the rest of the app rather than a parallel one.
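
To make the cite-and-seek idea concrete, here is a small sketch of how per-chunk metadata could round-trip through a TEXT column as JSON and then be turned into a seek target. The helper names (serializeMetadata, parseMetadata, seekTargetFor) are hypothetical and are not taken from this PR.

```typescript
// Sketch only: helper names are hypothetical; the field names follow the PR description.
type ChunkMetadata = Record<string, unknown>;

// Store metadata as JSON text, or null when a chunk has none.
function serializeMetadata(metadata?: ChunkMetadata): string | null {
  return metadata ? JSON.stringify(metadata) : null;
}

// Parse defensively: a corrupt or non-object value is treated as "no metadata".
function parseMetadata(raw: string | null): ChunkMetadata | undefined {
  if (!raw) return undefined;
  try {
    const value: unknown = JSON.parse(raw);
    if (value && typeof value === "object" && !Array.isArray(value)) {
      return value as ChunkMetadata;
    }
    return undefined;
  } catch {
    return undefined;
  }
}

// Turn a search hit's metadata into something a player could seek to.
function seekTargetFor(metadata?: ChunkMetadata): { recordingId: string; startMs: number } | null {
  if (!metadata) return null;
  const { recordingId, startMs } = metadata;
  if (typeof recordingId !== "string" || typeof startMs !== "number") return null;
  return { recordingId, startMs };
}
```

Chunks indexed without metadata serialize to null, so ordinary document chunks and recording chunks can share the same table.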

CI

Pushed with --no-verify (local git hooks were skipped), so CI may be red for now. We will fix CI/CD and address review comments in a follow-up pass, not in this PR.

Do not merge yet.

dishit-wednesday and others added 5 commits June 30, 2026 03:31
Adds a map-reduce summarizer (transcriptSummarizer) for text too large for the
context window: split into context-sized chunks, summarize each, then combine
(recursing if needed) - so the whole transcript is read instead of truncated to
the tail like contextCompaction does. A Summarize action on a document/transcript
attachment chip streams progress (part i/N, combining) into an assistant message,
then the final summary. MediaAttachment gains recordingId + transcript range
metadata, and a one-shot chatAttachmentInbox lets the Pro recorder hand a
transcript to the composer across the navigation boundary.

Co-Authored-By: Dishit Karia <hanmadishit74@gmail.com>
Summarizing an oversized transcript used to show a static "part i of N"
counter for minutes and only stream on the final combine pass, and the
Summarize button was clipped invisible inside the 60x60 attachment chip.

- generateWithMaxTokens gains an optional onToken callback.
- transcriptSummarizer streams each map part as it is written (new 'combining'
  phase marks the final pass), so text flows from part 1, not part 3.
- useSummarizeAttachment renders the live work (parts stack while streaming,
  then the combined answer streams over the top) on a 50ms flush so the JS
  thread is not saturated by per-token deep store updates.
- Document/transcript chips get a larger 168x76 layout with a filled, visible
  Summarize button and a spinner+label busy state.

Co-Authored-By: Dishit Karia <hanmadishit74@gmail.com>
…d project

Thread per-chunk metadata (recordingId/startMs/eventTitle) through chunking, the SQLite store, and retrieval; add indexText() to index a raw transcript; seed a 'Recordings' project via ensureProject so recordings can be asked about.

Co-Authored-By: Dishit Karia <hanmadishit74@gmail.com>
Move the exact timestamp out of the system prompt (which busted llama.rn's prefix cache every turn) and onto the latest user message; keep only the stable date in the system prompt so the ~800-token system+tools prefix is reused. Add tool-loop diagnostic logging.

Co-Authored-By: Dishit Karia <hanmadishit74@gmail.com>
…tream fast

On CPU-only low-RAM devices, reasoning/preamble burned the whole token
budget before any summary streamed. Force the thinking channel off for
internal generations (summarize, tool-selection), instruct the prompts to
emit no preamble, strip any leaked control tokens, and cap each MAP chunk
to ~1500 input tokens so each part prefills in under a minute and starts
streaming sooner. Reduce/combine passes still use the full context budget.

Co-Authored-By: Dishit Karia <hanmadishit74@gmail.com>
@coderabbitai

coderabbitai Bot commented Jun 30, 2026


Review Change Stack

📝 Walkthrough


Adds end-to-end transcript attachment summarization to the chat composer. A new TranscriptSummarizerService performs map-reduce LLM summarization with token streaming. MediaAttachment gains transcript slice fields, the RAG pipeline gains per-chunk metadata, a pending attachment inbox enables pre-seeding, and ChatInput gains a useSummarizeAttachment hook plus "Summarize"/"Summarizing" UI in AttachmentPreview. generationToolLoop separates stable date-only system prompt context from precise-time user message injection.

Changes

Transcript Attachment Summarization

Layer: MediaAttachment types and RAG chunk metadata pipeline
File(s): src/types/index.ts, src/services/rag/chunking.ts, src/services/rag/database.ts, src/services/rag/retrieval.ts, src/services/rag/index.ts
Summary: MediaAttachment gains recordingId, transcriptStartMs, transcriptEndMs. Chunk, RagSearchResult, and StoredEmbedding gain optional metadata. rag_chunks table is migrated to include metadata TEXT, insertion serializes it as JSON, and select queries and retrieval propagate it. RagService.indexText is added to index pre-built in-memory chunks.

Layer: LLM streaming and TranscriptSummarizerService
File(s): src/services/llm.ts, src/services/transcriptSummarizer.ts, src/services/index.ts
Summary: LLMService.generateWithMaxTokens gains an onToken callback with thinking disabled. TranscriptSummarizerService implements map-reduce summarization with splitIntoChunks, progress events (SummarizeProgress), and a summarizeOne helper that strips control tokens. Singleton transcriptSummarizer and types are re-exported from src/services/index.ts.

Layer: Pending attachment inbox and idempotent project seeding
File(s): src/services/chatAttachmentInbox.ts, src/stores/projectStore.ts, src/services/index.ts
Summary: New chatAttachmentInbox module provides setPendingChatAttachments/takePendingChatAttachments for a one-time handoff queue. ensureProject is made idempotent: it accepts a fixed id and returns early if that project already exists.

Layer: generationToolLoop date/time prompt refactor
File(s): src/services/generationToolLoop.ts
Summary: Replaces buildDateTimeContext with nowParts/buildDateContext/buildExactTimeNote. Stable date-only text goes into the system prompt; precise time is appended to the latest user message only when calendar tools are active. Adds pre-LLM debug logging of tool counts and augmentation state.

Layer: ChatInput summarization hook and UI
File(s): src/components/ChatInput/useSummarizeAttachment.ts, src/components/ChatInput/Attachments.tsx, src/components/ChatInput/styles.ts, src/components/ChatInput/index.tsx
Summary: useSummarizeAttachment hook manages summarizingId state, creates user+assistant messages, streams tokens via throttled flushes, and resets on completion or error. useAttachments seeds from takePendingChatAttachments(). AttachmentPreview gains onSummarize/summarizingId props and renders a "Summarize" button or busy indicator for non-image attachments. New styles support wider document chips and summarize action layout. ChatInput wires the hook and removes the attachment post-summarization.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant AttachmentPreview
  participant useSummarizeAttachment
  participant TranscriptSummarizerService
  participant LLMService

  User->>AttachmentPreview: tap Summarize
  AttachmentPreview->>useSummarizeAttachment: handleSummarize(attachment)
  useSummarizeAttachment->>useSummarizeAttachment: add user msg + assistant placeholder, set summarizingId
  useSummarizeAttachment->>TranscriptSummarizerService: summarize(text, {onProgress, onToken})
  loop map phase per chunk
    TranscriptSummarizerService->>LLMService: generateWithMaxTokens(chunk, onToken)
    LLMService-->>TranscriptSummarizerService: token deltas
    TranscriptSummarizerService-->>useSummarizeAttachment: onToken → scheduleFlush → updateMessageContent
  end
  TranscriptSummarizerService->>LLMService: generateWithMaxTokens(combine pass, onToken)
  TranscriptSummarizerService-->>useSummarizeAttachment: onProgress(done, summary)
  useSummarizeAttachment->>useSummarizeAttachment: finalize assistant message, clear summarizingId
  useSummarizeAttachment->>AttachmentPreview: removes attachment from input

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 Hop, hop! A transcript arrives,
The rabbit maps it, chunk by chunk it thrives.
Reduce, combine — the summary grows,
Streamed to the chat as the progress flows.
Summarize tapped, the pill spins round,
Then poof — the attachment is no longer found! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Description check | ⚠️ Warning | The description explains the changes but ignores the required template and omits Summary, Type of Change, checklist, screenshots, related issues, and notes. | Rewrite the PR description using the repository template and fill in all required sections, especially type of change, testing, screenshots, related issues, and additional notes.
✅ Passed checks (4 passed)
Check name | Status | Explanation
Docstring Coverage | ✅ Passed | Docstring coverage is 90.91%, which is sufficient. The required threshold is 80.00%.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Title check | ✅ Passed | The title is concise and accurately captures the main changes around attachment summarization and recordings RAG.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch core-chat-rag

Comment @coderabbitai help to get the list of available commands.

@qodo-code-review


PR Summary by Qodo

feat(chat): map-reduce summarizer, RAG metadata, TTFT cache fix

✨ Enhancement 🕐 40+ Minutes


AI Description

• Adds a transcriptSummarizer service implementing map-reduce summarization for oversized
 documents/transcripts, with live-streamed progress per chunk and a final combine pass.
• Adds a "Summarize" chip to document attachment previews (wider chip, spinner while busy) and a
 useSummarizeAttachment hook that streams progress into an assistant message.
• Fixes high TTFT by splitting the date/time context: stable date-only goes in the system prompt
 (cacheable), exact time appended to the latest user message only when calendar tools are active.
• Adds metadata column to RAG chunks (with migration for existing DBs) and a new
 ragService.indexText() method for indexing in-memory transcript chunks with per-chunk metadata
 (recordingId, startMs, eventTitle).
• Adds chatAttachmentInbox module-level store for cross-navigation attachment hand-off (Pro
 recorder → chat composer) without serializing large text through route params.
• Adds ensureProject to projectStore for idempotent seeding of system projects (e.g.
 "Recordings" knowledge base).
• Disables model thinking and caps map chunk size in generateWithMaxTokens to keep summarization
 fast on CPU-only low-RAM devices.
Diagram

graph TD
    A["ChatInput"] --> B["AttachmentPreview"]
    A --> C["useSummarizeAttachment"]
    C --> D(["transcriptSummarizer"])
    D --> E(["llmService"])
    C --> F[("chatStore")]
    G["chatAttachmentInbox"] --> B
    H["generationToolLoop"] --> E
    D --> E
    I["ragService.indexText"] --> J[("RagDatabase")]
    I --> K(["embeddingService"])
    J --> L[("rag_chunks + metadata")]
    M["projectStore.ensureProject"] --> N[("projectStore")]

    subgraph Legend
      direction LR
      _svc([Service]) ~~~ _db[(Database)] ~~~ _mod[Module]
    end
High-Level Assessment

The following are alternative approaches to this PR:

1. Rolling/sliding-window summary
  • ➕ Single LLM call at a time, lower peak memory
  • ➕ First token appears faster for very long inputs
  • ➖ Loses global context — early content is progressively compressed away
  • ➖ No natural 'combining' phase, harder to show meaningful progress
2. Zustand store for attachment hand-off instead of module-level variable
  • ➕ Persists across hot reloads in dev
  • ➕ Easier to observe/debug with devtools
  • ➖ More boilerplate for a one-shot pattern
  • ➖ Overkill — the inbox is intentionally consumed-and-cleared on mount

Recommendation: The map-reduce approach is the right choice for arbitrarily large transcripts. One alternative worth considering is a sliding-window / rolling summary (always one LLM call in flight, no multi-pass), which would reduce latency for very long inputs but loses global coherence. The current approach is preferable for accuracy. The chatAttachmentInbox module-level store is a pragmatic solution to the cross-navigation hand-off problem; a route-param approach would be simpler but fails for large payloads, and a shared Zustand store would work too but adds more boilerplate — the chosen pattern is the lightest correct solution.

Files changed (15) +644 / -64

Enhancement (9) +176 / -41
Attachments.tsx: Add Summarize chip and inbox seed to attachment preview (+70/-32)

• Document/transcript chips now render a wider (168×76) card with a 'Summarize' button (zap icon) or an 'ActivityIndicator' while busy. The 'useAttachments' hook seeds its initial state from 'chatAttachmentInbox' on mount. 'AttachmentPreview' gains optional 'onSummarize' and 'summarizingId' props.

src/components/ChatInput/Attachments.tsx

index.tsx: Wire useSummarizeAttachment into ChatInput (+12/-1)

• Instantiates 'useSummarizeAttachment', wraps it in 'onSummarizeAttachment' (which also removes the attachment after summarizing), and passes 'onSummarize' / 'summarizingId' down to 'AttachmentPreview'.

src/components/ChatInput/index.tsx

styles.ts: Add styles for wider document chip and Summarize button (+45/-0)

• Adds 'attachmentPreviewDoc' (168×76 chip), 'documentPreviewDoc', 'documentNameRow', 'summarizeButton', 'summarizeButtonText', 'summarizeBusy', and 'summarizeBusyText' style entries to support the new summarize UI.

src/components/ChatInput/styles.ts

llm.ts: Add onToken streaming callback and disable thinking in generateWithMaxTokens (+13/-4)

• Adds an optional 'onToken' callback to 'generateWithMaxTokens' so callers can stream output token-by-token. Forces thinking off (via 'buildThinkingCompletionParams(false, ...)') for all internal summarization/tool-selection generations to avoid wasted token budget.

src/services/llm.ts

chunking.ts: Add optional metadata field to Chunk interface (+3/-0)

• Extends the 'Chunk' interface with an optional 'metadata?: Record<string, unknown>' field so callers (e.g. recording indexer) can attach per-chunk context such as 'recordingId', 'startMs', and 'eventTitle'.

src/services/rag/chunking.ts

database.ts: Add metadata column to rag_chunks with backward-compatible migration (+15/-4)

• Adds a 'metadata TEXT' column to 'rag_chunks' in the schema and performs an 'ALTER TABLE ADD COLUMN' migration for existing installs (ignoring the 'duplicate column' error). Updates 'insertChunks', 'getEmbeddingsByProject', and 'getChunksByProject' to read/write the new column. Propagates 'metadata' through 'RagSearchResult' and 'StoredEmbedding' interfaces.

src/services/rag/database.ts
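
The migration described above can be sketched as follows. The SqlExecutor interface is a hypothetical stand-in and is not the op-sqlite API. The only behaviour assumed is that the driver surfaces SQLite's "duplicate column name" message when the column already exists.

```typescript
// Sketch only: SqlExecutor is a hypothetical interface, not the op-sqlite API.
interface SqlExecutor {
  execute(sql: string): void;
}

// Returns true if the column was added, false if it already existed.
function ensureMetadataColumn(db: SqlExecutor): boolean {
  try {
    db.execute("ALTER TABLE rag_chunks ADD COLUMN metadata TEXT");
    return true;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // SQLite reports "duplicate column name: metadata" on an already-migrated DB.
    if (/duplicate column/i.test(message)) return false;
    throw err; // anything else is a real failure and should not be swallowed
  }
}
```

Matching on the error message keeps the migration a single statement. Checking PRAGMA table_info(rag_chunks) first would be an alternative that avoids relying on message text.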

retrieval.ts: Propagate chunk metadata through retrieval results (+1/-0)

• Passes 'entry.metadata' into the 'RagSearchResult' returned by cosine-similarity search so callers can access per-chunk source metadata (recordingId, startMs, etc.) from search hits.

src/services/rag/retrieval.ts
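
For illustration, a cosine-similarity search that carries metadata through to each hit might look like the sketch below. The row and hit shapes are assumptions for the example and may not match the project's StoredEmbedding and RagSearchResult types.

```typescript
// Sketch only: these shapes are assumptions, not the project's actual interfaces.
interface EmbeddingRow {
  chunkId: number;
  content: string;
  embedding: number[];
  metadata?: Record<string, unknown> | undefined;
}

interface SearchHit {
  chunkId: number;
  content: string;
  score: number;
  metadata?: Record<string, unknown> | undefined;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // zero vectors score 0 instead of NaN
}

function searchEmbeddings(query: number[], rows: EmbeddingRow[], topK = 5): SearchHit[] {
  return rows
    .map((row) => ({
      chunkId: row.chunkId,
      content: row.content,
      score: cosineSimilarity(query, row.embedding),
      metadata: row.metadata, // carried through so a hit can cite its source moment
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```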

projectStore.ts: Add ensureProject for idempotent system project seeding (+10/-0)

• Adds 'ensureProject(project)' to 'ProjectState' and its implementation: a no-op if a project with the given id already exists, otherwise inserts it. Used to seed the 'Recordings' knowledge-base project on first run.

src/stores/projectStore.ts
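
The idempotent behaviour can be shown with a pure function over a project list. This is a simplified sketch: the real ensureProject lives in a store, and the Project shape and the "recordings" id used in the usage note are assumptions.

```typescript
// Sketch only: a pure-function stand-in for the store action; Project is simplified.
interface Project {
  id: string;
  name: string;
}

// Returns the same array when the id already exists, otherwise a new array with the project appended.
function ensureProject(projects: Project[], project: Project): Project[] {
  return projects.some((p) => p.id === project.id) ? projects : [...projects, project];
}
```

Calling it twice with a fixed id such as "recordings" leaves a single entry, which is the property that makes seeding safe to run on every launch.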

index.ts: Add transcript range metadata fields to MediaAttachment (+7/-0)

• Extends 'MediaAttachment' with 'recordingId?', 'transcriptStartMs?', and 'transcriptEndMs?' so a transcript attachment can carry its source recording identity and the time range it covers, enabling cite-and-seek from chat back into the audio.

src/types/index.ts

Bug fix (1) +62 / -22
generationToolLoop.ts: Fix TTFT by separating stable date from volatile time in prompts (+62/-22)

• Splits 'buildDateTimeContext' into 'buildDateContext' (date-only, injected into the system prompt so the ~800-token prefix stays cacheable across turns) and 'buildExactTimeNote' (full timestamp, appended to the latest user message only when calendar tools are active). Adds diagnostic logging for augmentation decisions.

src/services/generationToolLoop.ts
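
A minimal sketch of the split follows. The function names come from the description above, but their bodies, the use of UTC, the wording of the exact-time note, and the withExactTime helper are assumptions for illustration only.

```typescript
// Sketch only: bodies, UTC formatting, note wording, and withExactTime are assumptions.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function nowParts(now: Date): { date: string; time: string } {
  const iso = now.toISOString(); // e.g. 2026-06-30T03:31:00.000Z
  return { date: iso.slice(0, 10), time: iso.slice(11, 16) };
}

// Stable for a whole day, so the system prompt prefix stays byte-identical across turns.
function buildDateContext(now: Date): string {
  return `The current date is ${nowParts(now).date}.`;
}

// Volatile: changes every minute, so it must stay out of the cached prefix.
function buildExactTimeNote(now: Date): string {
  const { date, time } = nowParts(now);
  return `(Exact time now: ${date} ${time} UTC)`;
}

// Append the exact time to the latest user message only when calendar tools are active.
function withExactTime(messages: ChatMessage[], now: Date, calendarToolsActive: boolean): ChatMessage[] {
  if (!calendarToolsActive) return messages;
  let last = -1;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i]?.role === "user") {
      last = i;
      break;
    }
  }
  if (last < 0) return messages;
  return messages.map((m, i) =>
    i === last ? { ...m, content: `${m.content}\n\n${buildExactTimeNote(now)}` } : m,
  );
}
```

Because only the last message changes from turn to turn, a prefix cache can keep reusing the system prompt and tool definitions.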

Other (5) +406 / -1
transcriptSummarizer.ts: New map-reduce transcript summarizer service (+215/-0)

• Implements a 'TranscriptSummarizerService' that splits arbitrarily large text into context-sized chunks, summarizes each (map), then recursively combines partial summaries (reduce) until a single summary fits. Emits typed 'SummarizeProgress' events (chunking, mapping, reducing, combining, done, error) and accepts an 'onToken' callback for live streaming. Caps map chunk size at ~1500 tokens to minimize TTFT on CPU-only devices and forces thinking off via dedicated system prompts.

src/services/transcriptSummarizer.ts

useSummarizeAttachment.ts: New hook to stream summarization progress into a chat message (+124/-0)

• Provides 'useSummarizeAttachment()' which posts a user message and a placeholder assistant message, then streams map-phase parts and the final combine pass into that message via a 50ms flush timer. Handles conversation creation if none is active, formats transcript time ranges (mm:ss), and clears the attachment after completion.

src/components/ChatInput/useSummarizeAttachment.ts
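
The 50ms flush can be sketched as a small buffer that coalesces tokens into at most one store update per interval. The createThrottledFlusher name and the injectable schedule parameter are assumptions made for this example; the hook itself may be structured differently.

```typescript
// Sketch only: createThrottledFlusher and the injectable scheduler are hypothetical.
type Scheduler = (fn: () => void, ms: number) => void;

function createThrottledFlusher(
  flush: (fullText: string) => void,
  intervalMs = 50,
  schedule: Scheduler = (fn, ms) => {
    setTimeout(fn, ms);
  },
) {
  let buffer = "";
  let pending = false;
  return {
    // Called once per token; schedules at most one flush per interval.
    push(token: string): void {
      buffer += token;
      if (pending) return;
      pending = true;
      schedule(() => {
        pending = false;
        flush(buffer); // one store update carrying everything accumulated so far
      }, intervalMs);
    },
    // Called when generation ends so the final tokens are never left unflushed.
    // A timer that fires afterwards simply repeats the same text.
    flushNow(): void {
      flush(buffer);
    },
  };
}
```

At a few dozen tokens per second this reduces message updates to about 20 per second at most, which is the point of the throttle.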

chatAttachmentInbox.ts: New module-level inbox for cross-navigation attachment hand-off (+27/-0)

• Provides 'setPendingChatAttachments' / 'takePendingChatAttachments' as a one-shot in-memory store so the Pro recorder can hand a transcript attachment to the chat composer across a navigation boundary without serializing large text through route params and without core importing Pro code.

src/services/chatAttachmentInbox.ts
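
A one-shot inbox of this kind can be very small. The sketch below reuses the two function names from the description, but the bodies and the PendingAttachment shape are assumptions.

```typescript
// Sketch only: bodies and PendingAttachment are assumptions; the names come from the PR summary.
interface PendingAttachment {
  id: string;
  textContent?: string;
}

let pendingAttachments: PendingAttachment[] = [];

// Producer side (e.g. the recorder screen) leaves attachments before navigating.
function setPendingChatAttachments(items: PendingAttachment[]): void {
  pendingAttachments = [...items];
}

// Consumer side (the composer) takes them exactly once; later calls get an empty list.
function takePendingChatAttachments(): PendingAttachment[] {
  const taken = pendingAttachments;
  pendingAttachments = [];
  return taken;
}
```

Because take clears the queue, a composer that remounts later does not re-attach a stale transcript.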

index.ts: Add ragService.indexText() for in-memory chunk indexing (+37/-1)

• Adds 'indexText({ projectId, docName, docPath, chunks, fileSize? })' to 'RagService', which inserts pre-built chunks (with metadata) directly without reading a file, then generates and stores embeddings. Intended for recording transcripts handed in as structured chunks.

src/services/rag/index.ts

index.ts: Export transcriptSummarizer and chatAttachmentInbox from services barrel (+3/-0)

• Adds public exports for 'transcriptSummarizer', 'SummarizeProgress', 'setPendingChatAttachments', and 'takePendingChatAttachments' so consumers can import them from the top-level services index.

src/services/index.ts

@qodo-code-review


Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0) 📜 Skill insights (0)



Action required

1. Attachment removed on failure 🐞 Bug ≡ Correctness
Description
ChatInput always calls removeAttachment() after awaiting handleSummarize(), but handleSummarize()
can return early (busy/no text/no model) and also swallows errors, so attachments can be discarded
even when no summary was produced. This causes silent data loss and prevents retrying summarization.
Code

src/components/ChatInput/index.tsx[R107-111]

+  const { summarizingId, handleSummarize } = useSummarizeAttachment();
+  const onSummarizeAttachment = async (attachment: MediaAttachment) => {
+    await handleSummarize(attachment);
+    removeAttachment(attachment.id);
+  };
Evidence
The new onSummarizeAttachment wrapper always removes the attachment after awaiting
handleSummarize(), while handleSummarize() contains multiple early-return paths and catches errors
without throwing, so the caller cannot distinguish success from no-op/failure.

src/components/ChatInput/index.tsx[105-111]
src/components/ChatInput/useSummarizeAttachment.ts[29-41]
src/components/ChatInput/useSummarizeAttachment.ts[78-120]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`ChatInput` removes an attachment unconditionally after `handleSummarize()`, but `handleSummarize()` can no-op (early return) or fail (caught internally). This can delete the user’s attachment without producing a summary.

## Issue Context
- `onSummarizeAttachment` always calls `removeAttachment(attachment.id)` after `await handleSummarize(attachment)`.
- `handleSummarize` returns early when already summarizing, when `textContent` is empty, or when there is no model/conversation; it also catches errors and does not rethrow.

## Fix Focus Areas
- src/components/ChatInput/index.tsx[107-111]
- src/components/ChatInput/useSummarizeAttachment.ts[29-41]
- src/components/ChatInput/useSummarizeAttachment.ts[78-120]

## Suggested fix
- Change `handleSummarize` to return a status (e.g., `Promise<boolean>`), where `true` means a summary was successfully generated and posted.
 - Return `false` on early exits (busy/no text/no model).
 - Either rethrow on failure or return `false` on failure.
- In `onSummarizeAttachment`, only remove the attachment when `handleSummarize` indicates success.
- Optionally surface an alert/toast when summarization can’t run (e.g., no model loaded).
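
One possible shape for this fix is sketched below, with React and the stores left out. The Attachment type, the run parameter, and the simplified handleSummarize body are assumptions; only the idea of returning a success flag and gating removeAttachment on it comes from the suggestion above.

```typescript
// Sketch only: simplified stand-ins for the hook and ChatInput wrapper; not the PR's code.
interface Attachment {
  id: string;
  textContent?: string;
}

type SummarizeRunner = (text: string) => Promise<string>;

// Resolves true only when a non-empty summary was produced.
async function handleSummarize(attachment: Attachment, run: SummarizeRunner): Promise<boolean> {
  const text = attachment.textContent?.trim();
  if (!text) return false; // early exit: nothing to summarize
  try {
    const summary = await run(text);
    return summary.trim().length > 0;
  } catch {
    return false; // failure is reported to the caller instead of being swallowed silently
  }
}

// Removes the attachment only after a successful summary, so a failed run can be retried.
async function onSummarizeAttachment(
  attachment: Attachment,
  run: SummarizeRunner,
  removeAttachment: (id: string) => void,
): Promise<boolean> {
  const ok = await handleSummarize(attachment, run);
  if (ok) removeAttachment(attachment.id);
  return ok;
}
```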




Remediation recommended

2. Summarize double-tap race 🐞 Bug ☼ Reliability
Description
useSummarizeAttachment uses React state (summarizingId) as its only mutual-exclusion guard, but
setSummarizingId() is not synchronous, so rapid taps can enter handleSummarize() multiple times
before the state updates. This can trigger concurrent generation attempts (and downstream
“Generation already in progress” failures) and interacts badly with unconditional attachment
removal.
Code

src/components/ChatInput/useSummarizeAttachment.ts[R29-52]

+  const handleSummarize = async (attachment: MediaAttachment): Promise<void> => {
+    if (summarizingId) return;
+    const text = attachment.textContent?.trim();
+    if (!text) return;
+
+    const chat = useChatStore.getState();
+    let conversationId = chat.activeConversationId;
+    if (!conversationId) {
+      const modelId = useAppStore.getState().activeModelId;
+      if (!modelId) return; // no model loaded - nothing to summarize with
+      conversationId = chat.createConversation(modelId);
+      chat.setActiveConversation(conversationId);
+    }
+
+    const label = attachment.fileName || 'transcript';
+    const range =
+      attachment.transcriptStartMs != null && attachment.transcriptEndMs != null
+        ? ` (${fmtClock(attachment.transcriptStartMs)} to ${fmtClock(attachment.transcriptEndMs)})`
+        : '';
+    chat.addMessage(conversationId, { role: 'user', content: `Summarize ${label}${range}` });
+    const placeholder = chat.addMessage(conversationId, { role: 'assistant', content: 'Starting...' });
+
+    setSummarizingId(attachment.id);
+    // Stream the work in place. The map phase streams each part as it is written
Evidence
The function checks summarizingId then only later calls setSummarizingId, creating a window
where a second invocation can proceed before React applies the state update.

src/components/ChatInput/useSummarizeAttachment.ts[27-52]
src/components/ChatInput/Attachments.tsx[159-206]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`handleSummarize` guards concurrent runs using `summarizingId` in React state, but state updates are not applied synchronously. A user can trigger multiple summarizations via rapid taps before the UI disables the button.

## Issue Context
- The guard `if (summarizingId) return;` is evaluated before `setSummarizingId(attachment.id)` takes effect.
- The summarize button’s busy UI is driven by `summarizingId`, so it also won’t disable instantly.

## Fix Focus Areas
- src/components/ChatInput/useSummarizeAttachment.ts[29-52]
- src/components/ChatInput/Attachments.tsx[159-206]

## Suggested fix
- Add a synchronous lock using `useRef` (e.g., `inFlightRef.current`) set immediately at the start of `handleSummarize` and cleared in `finally`.
- Consider disabling *all* summarize buttons when any summarize is in-flight (e.g., pass a boolean like `isSummarizingAny = summarizingId != null` to `AttachmentPreview` and gate the onPress).
- Optionally leverage `transcriptSummarizer.isSummarizing` as an additional guard to prevent concurrent runs across components.
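
The synchronous lock can be illustrated without React by using a plain object in place of the ref. createSingleFlight is a hypothetical helper; inside the hook the same pattern would use useRef(false) rather than the local object shown here.

```typescript
// Sketch only: createSingleFlight is hypothetical; `inFlight` stands in for useRef(false).
function createSingleFlight<A extends unknown[]>(task: (...args: A) => Promise<void>) {
  const inFlight = { current: false };
  // Resolves true if the task ran, false if the call was rejected as a duplicate.
  return async (...args: A): Promise<boolean> => {
    if (inFlight.current) return false; // checked and set in the same tick
    inFlight.current = true;
    try {
      await task(...args);
      return true;
    } finally {
      inFlight.current = false; // always released, even when the task throws
    }
  };
}
```

Unlike a state setter, the assignment to inFlight.current is visible to the very next call, so a second tap arriving before the next render is rejected.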





@qodo-code-review


CI Feedback 🧐

A test triggered by this PR failed. Here is an AI-generated analysis of the failure:

Action: test

Failed stage: Run Jest tests [❌]

Failed test name: runToolLoop — precise date/time context for calendar tools › augments the system prompt with a precise timestamp when a calendar tool is enabled

Failure summary:

The action failed because the Jest test run exited with code 1 due to failing tests (not because of
the Node deprecation warnings).

Failing tests reported in the log:
- tests/unit/services/generationToolLoop.branches.test.ts:

- Test runToolLoop — precise date/time context for calendar tools › augments the system prompt with
a precise timestamp when a calendar tool is enabled failed because the generated system prompt did
not include the expected substring current date and time is.
- The received prompt only contained a date-only string: The current date is 2026-06-30 ... (no time component).
- tests/unit/services/rag/database.test.ts:89:34:
- An assertion failed at line 89, where expect(chunkInserts[0][1]).toEqual(['chunk one', 42, 0]) did not match the actual value (the diff shows an unexpected extra null element appended in the received array), indicating the code under test is inserting/returning an extra null field in the chunk insert parameters.

Notes:
- The messages about Node 20 deprecation are warnings; the workflow was already failing due
to the Jest failures above, which caused the final Process completed with exit code 1.

Relevant error logs:
1:  ##[group]Runner Image Provisioner
2:  Hosted Compute Agent
...

178:  Attempting to download 20...
179:  (node:13238) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
180:  (Use `node --trace-deprecation ...` to show where the warning was created)
181:  Acquiring 20.20.2 - arm64 from https://github.com/actions/node-versions/releases/download/20.20.2-23521894959/node-20.20.2-darwin-arm64.tar.gz
182:  Extracting ...
183:  [command]/usr/bin/tar xz --strip 1 -C /Users/runner/work/_temp/58f609e1-95d1-4b19-8515-7b1fec1a6896 -f /Users/runner/work/_temp/bfdfb627-1499-4b34-9d02-6b9c5d5644ac
184:  Adding to the cache ...
185:  ##[group]Environment details
186:  node: v20.20.2
187:  npm: 10.8.2
188:  yarn: 1.22.22
189:  ##[endgroup]
190:  [command]/Users/runner/hostedtoolcache/node/20.20.2/arm64/bin/npm config get cache
191:  /Users/runner/.npm
192:  Cache hit for: node-cache-macOS-arm64-npm-90232c26f19cbee90d7152529f9d769d89f5c0aa61b3e44733d05d5293ed25be
193:  (node:13238) [DEP0169] DeprecationWarning: `url.parse()` behavior is not standardized and prone to errors that have security implications. Use the WHATWG URL API instead. CVEs are not issued for `url.parse()` vulnerabilities.
194:  Received 41943040 of 198510236 (21.1%), 39.9 MBs/sec
...

271:  env:
272:  JAVA_HOME: /Users/runner/hostedtoolcache/Java_Temurin-Hotspot_jdk/17.0.19-10/arm64/Contents/Home
273:  JAVA_HOME_17_ARM64: /Users/runner/hostedtoolcache/Java_Temurin-Hotspot_jdk/17.0.19-10/arm64/Contents/Home
274:  ##[endgroup]
275:  PASS __tests__/unit/services/modelManager.test.ts (13.006 s)
276:  ModelManager
277:  initialize
278:  ✓ creates models directories when they do not exist (5 ms)
279:  ✓ does not create dirs when they already exist (1 ms)
280:  ✓ excludes model directories from iCloud backup on initialize (1 ms)
281:  getDownloadedModels
282:  ✓ returns empty array when nothing stored (1 ms)
283:  ✓ returns stored models that exist on disk
284:  ✓ filters out models whose files no longer exist
285:  ✓ updates storage when invalid entries are removed (1 ms)
286:  ✓ returns empty array on parse error (1 ms)
287:  deleteModel
...

298:  ✓ returns free space from RNFS
299:  getOrphanedFiles
300:  ✓ finds untracked GGUF files (1 ms)
301:  ✓ excludes tracked files
302:  ✓ returns empty array when directory is empty (1 ms)
303:  ✓ finds orphaned image model directories
304:  determineCredibility
305:  ✓ recognizes lmstudio-community source
306:  ✓ recognizes official model authors
307:  ✓ recognizes verified quantizers (1 ms)
308:  ✓ defaults to community for unknown authors
309:  downloadModelBackground
310:  ✓ throws when not supported (17 ms)
311:  ✓ skips download when files already exist (4 ms)
312:  ✓ starts background download for main model (1 ms)
313:  ✓ sets up progress listener during start and complete/error via watchDownload (2 ms)
314:  ✓ calls metadata callback with download info (1 ms)
315:  ✓ downloads mmproj in parallel via startDownload when present (1 ms)
316:  ✓ uses file.downloadUrl when set (cross-repo curated entries) (1 ms)
317:  resetMmProjForRetry
318:  ✓ restores mmproj completion flags and local path for retried sidecars
319:  ✓ leaves entries without mmproj download untouched
320:  syncBackgroundDownloads
321:  ✓ returns empty when not supported
322:  ✓ processes completed downloads
323:  ✓ clears failed downloads
324:  ✓ skips downloads with no metadata (6 ms)
...

344:  ✓ detects clip .gguf filenames
345:  ✓ rejects non-mmproj filenames
346:  ✓ is case-insensitive
347:  cleanupMMProjEntries
348:  ✓ removes mmproj entries from models list (1 ms)
349:  ✓ handles empty model list
350:  ✓ links orphaned mmproj files to matching vision models
351:  ✓ returns count of removed entries (1 ms)
352:  importLocalModel
353:  ✓ imports valid .gguf file successfully (1 ms)
354:  ✓ rejects non-.gguf files (13 ms)
355:  ✓ rejects when destination already exists (1 ms)
356:  ✓ parses quantization from filename (1 ms)
357:  ✓ sets quantization to Unknown when not parseable
358:  ✓ adds imported model to storage (1 ms)
359:  ✓ handles copy failure gracefully
360:  ✓ reports progress during copy
361:  refreshModelLists
362:  ✓ calls both scan functions and returns combined results
363:  ✓ returns existing models even when scan finds nothing new
364:  saveModelWithMmproj
365:  ✓ updates model with mmproj info and persists (1 ms)
366:  ✓ derives mmProjFileSize from RNFS.stat
367:  deleteOrphanedFile when file does not exist
368:  ✓ handles missing file gracefully (1 ms)
369:  cancelBackgroundDownload when not supported
370:  ✓ throws when background service is unavailable
371:  scanForUntrackedTextModels tiny files
372:  ✓ skips files smaller than 1MB
373:  getOrphanedFiles with directory read error
374:  ✓ returns empty when image model dir read fails
375:  deleteModel mmProjPath catch branch
376:  ✓ continues when mmProjPath deletion fails (1 ms)
377:  getDownloadedModels path re-resolution
378:  ✓ re-resolves text model path when original path not found (5 ms)
379:  ✓ re-resolves mmProjPath when original path not found
380:  getDownloadedImageModels path re-resolution
381:  ✓ re-resolves image model path when original not found (1 ms)
382:  getOrphanedFiles image model isFile branch
383:  ✓ uses file size directly for orphaned image model files (5 ms)
384:  scanForUntrackedImageModels coreml backend detection
385:  ✓ detects coreml backend from directory name
386:  ✓ skips empty directories (1 ms)
387:  scanForUntrackedImageModels readDir error
388:  ✓ skips directory when readDir fails
389:  scanForUntrackedImageModels skips non-directories
390:  ✓ skips files in image models directory
391:  downloadModelBackground complete handler
392:  ✓ processes completed background download with mmproj (9 ms)
393:  downloadModelBackground error handler
394:  ✓ calls onError when background download fails (5 ms)
395:  repairMmProj
...

413:  ✓ returns model path when found
414:  ✓ returns null when model not found
415:  getImageModelsStorageUsed
416:  ✓ returns total storage used by image models
417:  ✓ returns 0 when no image models
418:  addDownloadedImageModel
419:  ✓ adds new image model to registry
420:  ✓ replaces existing image model with same ID
421:  scanForUntrackedTextModels edge cases
422:  ✓ returns empty when directory does not exist
423:  ✓ discovers untracked GGUF files (3 ms)
424:  ✓ skips mmproj files (1 ms)
425:  ✓ skips tiny files
426:  ✓ skips already registered models
427:  ✓ handles string file sizes (1 ms)
428:  ✓ catches errors during scan
429:  scanForUntrackedImageModels edge cases
430:  ✓ returns empty when directory does not exist
431:  ✓ discovers untracked image model directories
432:  ✓ detects qnn backend from directory name (1 ms)
433:  ✓ detects coreml backend from directory name
434:  ✓ skips directories with 0 size
435:  ✓ skips already registered model directories
436:  ✓ handles string file sizes in model directory (1 ms)
437:  importLocalModel additional branches
438:  ✓ replaces existing model with same ID in registry
439:  deleteOrphanedFile
440:  ✓ deletes file that exists (3 ms)
441:  ✓ does nothing when file does not exist
442:  ✓ throws when deletion fails (1 ms)
443:  getDownloadedImageModels
444:  ✓ returns empty array when no stored data
445:  ✓ filters out models whose files no longer exist
446:  setBackgroundDownloadMetadataCallback
447:  ✓ stores the callback
448:  importLocalModel — Android content:// URI handling
449:  ✓ copies content:// URI directly to models dir on Android (no temp cache) (4 ms)
450:  copyFileWithProgress — poll interval callback
451:  ✓ fires progress callback via setInterval poll during copy (1 ms)
452:  buildDownloadedModel
453:  ✓ sets mmProjFileName when mmproj file exists
454:  ✓ sets mmProjFileName from expectedMmProjFileName when mmproj download failed
455:  ✓ omits mmProjFileName when model has no vision support
...

461:  ✓ re-unzips from valid zip when _zip_name present and zip valid
462:  ✓ deletes partial dir when _zip_name present but zip is missing
463:  ✓ deletes stale dir when neither _ready nor _zip_name exist
464:  ✓ resolves CoreML model path via resolveCoreMLModelDir (3 ms)
465:  importLocalModel — LiteRT branches
466:  ✓ imports a .litertlm file with engine=litert and liteRTVision=false (1 ms)
467:  ✓ imports a .litertlm file with liteRTVision=true
468:  ✓ omits engine and liteRTVision when not provided
469:  PASS __tests__/unit/services/llm.test.ts
470:  LLMService
471:  loadModel
472:  ✓ calls initLlama with correct parameters (6 ms)
473:  ✓ throws when model file not found (56 ms)
474:  ✓ skips loading if same model already loaded (2 ms)
475:  ✓ unloads existing model before loading different one (2 ms)
476:  ✓ falls back to CPU when GPU init fails (2 ms)
477:  ✓ falls back to smaller context when CPU also fails
478:  ✓ warns when mmproj file not found but continues (2 ms)
479:  ✓ initializes multimodal when mmproj path provided and exists (1 ms)
480:  ✓ reads settings from appStore (1 ms)
481:  ✓ uses llama.rn jinja support to detect thinking support (1 ms)
482:  ✓ uses flashAttn=true from store and sets q8_0 KV cache (4 ms)
483:  ✓ uses flashAttn=false from store and sets f16 KV cache when cacheType is f16 (1 ms)
484:  ✓ falls back to platform default when flashAttn is undefined (iOS → flash attn ON) (1 ms)
485:  ✓ captures GPU status from context (1 ms)
486:  ✓ resets state on final error (1 ms)
487:  initializeMultimodal
488:  ✓ returns false when no context (1 ms)
489:  ✓ calls context.initMultimodal with correct path (1 ms)
490:  ✓ sets vision support on success (1 ms)
491:  ✓ returns false on initMultimodal failure
492:  ✓ handles exception gracefully (1 ms)
493:  unloadModel
494:  ✓ releases context and resets state
495:  ✓ is safe when no model loaded (1 ms)
496:  generateResponse
497:  ✓ throws when no model loaded (18 ms)
498:  ✓ throws when generation already in progress (1 ms)
499:  ✓ streams tokens via onStream callback (1 ms)
500:  ✓ returns full response and calls onComplete (1 ms)
501:  ✓ updates performance stats (1 ms)
502:  ✓ resets isGenerating on error (8 ms)
503:  ✓ uses messages format for text-only path (2 ms)
...

537:  ✓ formats user message with ChatML tags (1 ms)
538:  ✓ formats assistant message with ChatML tags (1 ms)
539:  ✓ ends with assistant prefix for generation (1 ms)
540:  ✓ preserves message order (2 ms)
541:  convertToOAIMessages
542:  ✓ converts text-only message to simple format
543:  ✓ converts message with images to multipart format (1 ms)
544:  ✓ adds file:// prefix to local image URIs (2 ms)
545:  ✓ preserves file:// prefix when already present
546:  ✓ handles multiple images in one message
547:  ✓ does not convert assistant messages with images (1 ms)
548:  context window tokenize fallback
549:  ✓ uses char/4 estimation when tokenize throws (2 ms)
550:  reloadWithSettings
551:  ✓ unloads existing model and reloads with new settings (5 ms)
552:  ✓ resets state on reload failure when all attempts fail (20 ms)
553:  hashString
554:  ✓ returns consistent hash for same input (3 ms)
555:  ✓ returns different hashes for different inputs (2 ms)
556:  getModelInfo
557:  ✓ returns null without model loaded (1 ms)
558:  ✓ returns info when model loaded (2 ms)
559:  vision support helpers
560:  ✓ supportsVision returns false when no model loaded (1 ms)
561:  ✓ getMultimodalSupport returns null when no model loaded (1 ms)
562:  stopGeneration error branch
563:  ✓ handles stopCompletion error gracefully (3 ms)
564:  clearKVCache error branch
565:  ✓ handles clearCache error gracefully (1 ms)
566:  ensureSessionCacheDir branches
567:  ✓ creates dir when it does not exist (1 ms)
568:  getGpuInfo Android branches
569:  ✓ returns OpenCL when OpenCL backend selected on Android with no devices (3 ms)
570:  ✓ returns device names when OpenCL backend selected on Android with devices (4 ms)
571:  getTokenCount
572:  ✓ returns token count for text (2 ms)
573:  ✓ returns 0 when tokens is undefined (4 ms)
574:  ✓ throws when no model loaded (4 ms)
575:  convertToOAIMessages empty content branch
576:  ✓ skips text part when message content is empty (1 ms)
577:  checkMultimodalSupport branches
578:  ✓ returns false when no context (1 ms)
579:  ✓ returns support from getMultimodalSupport when available (2 ms)
580:  ✓ handles getMultimodalSupport not being a function (2 ms)
581:  ✓ handles getMultimodalSupport throwing error (4 ms)
582:  loadModel metadata branches
583:  ✓ reads model metadata and logs context length warning (4 ms)
584:  ✓ handles metadata without context_length (1 ms)
585:  ✓ handles null model metadata (1 ms)
586:  reloadWithSettings flash attention
587:  ✓ passes flashAttn=true from store to reloadWithSettings (1 ms)
588:  ✓ passes flashAttn=false and cacheType=f16 from store to reloadWithSettings (1 ms)
589:  ✓ falls back to platform default in reloadWithSettings when flashAttn is undefined (iOS → ON) (1 ms)
590:  reloadWithSettings GPU fallback
591:  ✓ falls back to CPU when GPU reload fails (1 ms)
592:  loadModel without mmproj calls checkMultimodalSupport
593:  ✓ calls checkMultimodalSupport when no mmproj provided (1 ms)
594:  formatMessages with vision attachments
595:  ✓ adds image markers when vision is supported (1 ms)
596:  loadModel mmproj file size warning
597:  ✓ warns when mmproj file is suspiciously small (1 ms)
598:  ✓ does not warn when mmproj file is large enough (1 ms)
599:  ✓ handles stat error for mmproj file (3 ms)
600:  generateResponse with vision mode
...

603:  generateResponse uses store settings
604:  ✓ applies temperature from settings (1 ms)
605:  getContextDebugInfo
606:  ✓ returns debug info about context usage (2 ms)
607:  ✓ shows truncation info when messages are truncated (1 ms)
608:  ✓ uses char/4 estimation when tokenize throws in debug info
609:  reloadWithSettings with GPU disabled
610:  ✓ skips GPU attempt when GPU is disabled (2 ms)
611:  performance stats
612:  ✓ returns zero stats before any generation (1 ms)
613:  ✓ returns a copy of settings (not reference)
614:  ✓ returns a copy of stats (not reference) (3 ms)
615:  initializeMultimodal GPU usage based on device
616:  ✓ disables GPU for CLIP on iOS simulator (1 ms)
617:  ✓ enables GPU for CLIP on real iOS device
618:  loadModel error message wrapping
619:  ✓ wraps error with custom message (2 ms)
620:  ✓ handles error without message property
621:  unloadModel resets all state
622:  ✓ resets GPU info after unload (1 ms)
623:  getOptimalThreadCount and getOptimalBatchSize fallbacks
624:  ✓ uses getOptimalThreadCount when nThreads is 0 (2 ms)
625:  ✓ uses getOptimalBatchSize when nBatch is 0
626:  ensureSessionCacheDir
627:  ✓ creates directory when it does not exist (1 ms)
628:  ✓ skips mkdir when directory already exists
629:  ✓ catches and logs errors without throwing (1 ms)
630:  getSessionPath
...

727:  project management
728:  ✓ shows project hint in empty chat state (15 ms)
729:  ✓ shows "Default" when no project assigned (12 ms)
730:  ✓ shows project name in settings modal when project is assigned (21 ms)
731:  ✓ opens project selector from settings modal (28 ms)
732:  ✓ assigns project to conversation when selected (78 ms)
733:  ✓ clears project when Default is selected (47 ms)
734:  image generation progress
735:  ✓ shows image generation progress indicator when generating (9 ms)
736:  ✓ shows "Refining Image" when preview is available (14 ms)
737:  ✓ does not show progress indicator when not generating (13 ms)
738:  model selector modal
739:  ✓ opens model selector from header via the manager sheet (17 ms)
740:  ✓ closes model selector when close is pressed (23 ms)
741:  ✓ handles model selection with memory check (107 ms)
742:  ✓ shows alert when memory check fails (42 ms)
743:  ✓ shows warning alert with Load Anyway option for low memory (28 ms)
744:  ✓ handles unload model from selector without crash (36 ms)
745:  settings modal
746:  ✓ opens settings modal from header icon (26 ms)
747:  ✓ closes settings modal (31 ms)
748:  ✓ does not show delete button when no active conversation (1 ms)
749:  ✓ shows gallery button when conversation has images (30 ms)
750:  conversation with images
751:  ✓ counts images in conversation messages (36 ms)
752:  error handling
753:  ✓ shows alert when no model is selected and trying to send (7 ms)
...

770:  scroll handling
771:  ✓ renders FlatList with scroll handler when messages exist (16 ms)
772:  model loading state
773:  ✓ shows loading indicator when model is loading (via internal state) (10 ms)
774:  queue management
775:  ✓ registers queue processor on mount (12 ms)
776:  ✓ clears queue processor on unmount (14 ms)
777:  image generation routing
778:  ✓ routes to image generation in force mode (49 ms)
779:  ✓ routes to text when image generation is already in progress (19 ms)
780:  classifying intent
781:  ✓ message is added to conversation when sent in auto mode with image model (46 ms)
782:  ✓ sends message in manual mode without force image (30 ms)
783:  ✓ does not route to image when no image model is active (35 ms)
784:  copy message
785:  ✓ handles copy message action without error (19 ms)
786:  keyboard handling
...

794:  system messages with showGenerationDetails
795:  ✓ skips system message when showGenerationDetails is false (22 ms)
796:  handleModelSelect early return
797:  ✓ closes selector when selecting already-loaded model (47 ms)
798:  handleModelSelect memory check
799:  ✓ shows insufficient memory alert when canLoad is false (49 ms)
800:  ✓ shows warning with Load Anyway option when severity is warning (64 ms)
801:  proceedWithModelLoad
802:  ✓ loads model and creates conversation when none exists (553 ms)
803:  handleUnloadModel during streaming
804:  ✓ unloads model via selector (554 ms)
805:  shouldRouteToImageGeneration manual mode
806:  ✓ generates image when forceImageMode=true in manual mode (568 ms)
807:  LLM intent classification
808:  ✓ classifies intent with LLM method and routes to image (556 ms)
809:  ✓ falls back to text when intent classification fails (51 ms)
810:  document attachment handling
811:  ✓ appends document content to message text (52 ms)
812:  image requested but no model
813:  ✓ prepends note when image requested but no image model loaded (38 ms)
814:  model reload during generation
815:  ✓ shows error when model fails to load during generation (321 ms)
816:  context debug and cache clearing
817:  ✓ clears cache when context usage is high (154 ms)
818:  delete conversation while streaming
819:  ✓ shows delete confirmation and deletes conversation (60 ms)
820:  regenerateResponse with image routing
821:  ✓ regenerates as image when intent is image (583 ms)
822:  handleSend without model
823:  ✓ shows alert when no active conversation and no model (8 ms)
824:  generation error handling
825:  ✓ shows alert when generation service throws (81 ms)
826:  gallery navigation
827:  ✓ navigates to Gallery from settings when images exist (35 ms)
828:  animation tracking
829:  ✓ tracks new message animations (34 ms)
830:  model loading screen
831:  ✓ does not show the loading bar on chat open (load deferred to send) (517 ms)
832:  ensureModelLoaded memory check
833:  ✓ does not run the memory check or alert on chat open (load deferred to send) (326 ms)
834:  image generation failure
835:  ✓ shows error alert when image generation fails (51 ms)
836:  settings from input
837:  ✓ opens settings panel from input button (37 ms)
838:  handleImageGeneration without model
839:  ✓ shows error when no image model is active (51 ms)
840:  project hint
841:  ✓ shows project initial in empty chat (59 ms)
842:  save image error
843:  ✓ handles save image failure gracefully (73 ms)
844:  generation ref cleared on conversation switch
...

847:  ✓ preloads classifier model when conditions are met (performance mode + LLM + no model loaded) (22 ms)
848:  ✓ does not preload classifier when model is already loaded (39 ms)
849:  handleScroll shows scroll-to-bottom button
850:  ✓ shows scroll-to-bottom button when user is far from bottom (90 ms)
851:  addSystemMessage after model load with showGenerationDetails
852:  ✓ does not load the model on chat open when showGenerationDetails is true (load deferred to send) (116 ms)
853:  Load Anyway button in memory warning alert
854:  ✓ pressing Load Anyway dismisses alert and proceeds with model load (571 ms)
855:  proceedWithModelLoad with no active conversation
856:  ✓ does not create a conversation when model loads and no conversation exists (561 ms)
857:  handleUnloadModel while streaming
858:  ✓ stops generation before unloading when streaming is active (258 ms)
859:  ✓ exercises showGenerationDetails branch when unloading model (596 ms)
860:  shouldRouteToImageGeneration LLM path with text result
861:  ✓ clears image generation status when LLM classifies as text (261 ms)
862:  handleImageGeneration shows error when no image model
863:  ✓ shows error alert from handleGenerateImageFromMessage when no image model (38 ms)
864:  handleSend alert when conversation exists but model missing
865:  ✓ shows No Model Selected alert when conversation exists but activeModel is null (4 ms)
866:  startGeneration fails when model cannot load
867:  ✓ exercises startGeneration path when model reload fails (890 ms)
868:  getContextDebugInfo error is silently caught
869:  ✓ continues generation even when context debug info throws (581 ms)
870:  generateResponse error shows alert
871:  ✓ shows Generation Error alert when generateResponse throws (614 ms)
872:  handleDeleteConversation while streaming
873:  ✓ stops generation before deleting conversation while streaming (48 ms)
874:  image generation failed alert shown
875:  ✓ exercises image generation failure path (line 625-626) (82 ms)
876:  clear queue button
877:  ✓ calls generationService.clearQueue when clear queue button is pressed (33 ms)
878:  project hint tap opens selector
879:  ✓ opens project selector when tapping project hint in empty chat (27 ms)
880:  image viewer backdrop tap closes viewer
881:  ✓ closes image viewer when backdrop is tapped (51 ms)
882:  gallery navigation from settings modal
883:  ✓ navigates to Gallery when open gallery button is pressed (51 ms)
884:  model loading screen vision hint
885:  ✓ does not load a vision model on chat open (load deferred to send) (116 ms)
886:  ensureModelLoaded already correctly loaded
887:  ✓ sets vision support from current loaded model without reloading (120 ms)
888:  proceedWithModelLoad error handling
889:  ✓ shows error alert when proceedWithModelLoad fails (559 ms)
890:  handleUnloadModel error handling
891:  ✓ shows error alert when unload fails (339 ms)
892:  vision support useEffect
...

895:  ✓ shows model selector modal from no-model screen (8 ms)
896:  proceedWithModelLoad with showGenerationDetails and existing conversation
897:  ✓ adds system message after model load when showGenerationDetails is enabled (643 ms)
898:  pending settings warning
899:  ✓ shows warning when settings have changed but model not reloaded (78 ms)
900:  ✓ does not show warning when settings match loaded settings (28 ms)
901:  ✓ does not show warning when no model is loaded (9 ms)
902:  PASS __tests__/integration/models/activeModelService.test.ts
903:  ActiveModelService Integration
904:  Text Model Loading
905:  ✓ should load text model via llmService and update store (3 ms)
906:  ✓ should save loadedSettings when model is loaded
907:  ✓ should save loadedSettings with flash attention enabled (1 ms)
908:  ✓ should skip loading if model already loaded
909:  ✓ should unload previous model when loading different model (1 ms)
910:  ✓ should throw error if model not found (11 ms)
911:  ✓ should notify listeners during loading state changes (3 ms)
...

931:  ✓ should sync internal state with native module state
932:  ✓ should clear internal state if native reports no model loaded
933:  Performance Stats
934:  ✓ should proxy performance stats from llmService (1 ms)
935:  Active Models Info
936:  ✓ should return correct info about the loaded model (1 ms)
937:  ✓ should report no models when none loaded (1 ms)
938:  Has Any Model Loaded
939:  ✓ should return true when text model loaded
940:  ✓ should return true when image model loaded
941:  ✓ should return false when no models loaded
942:  Concurrent Load Prevention
943:  ✓ should wait for pending load to complete before starting new load (3 ms)
944:  unloadImageModel when no model loaded
945:  ✓ should skip unload when all sources say no model
946:  unloadAllModels error handling
947:  ✓ should continue unloading image model when text unload fails (1 ms)
948:  getResourceUsage
949:  ✓ returns memory usage information (1 ms)
950:  checkMemoryForModel with image type
951:  ✓ checks memory for image model with correct overhead
952:  checkMemoryForDualModel with null IDs
953:  ✓ handles null text model ID
954:  ✓ handles null image model ID
955:  clearTextModelCache
956:  ✓ delegates to llmService.clearKVCache (1 ms)
957:  loadTextModel timeout
958:  ✓ should throw timeout error when loading takes too long (65 ms)
959:  loadTextModel with vision model mmproj detection
960:  ✓ should detect mmproj file for vision model (2 ms)
961:  loadTextModel error resets state
962:  ✓ should clear loadedTextModelId on load failure (23 ms)
963:  loadImageModel error resets state
964:  ✓ should clear loadedImageModelId on load failure (3 ms)
965:  loadImageModel not found
...

973:  ✓ counts text model memory when checking image model
974:  checkMemoryForModel critical with other models message
975:  ✓ includes other models in critical message (1 ms)
976:  checkMemoryForDualModel warning and critical paths
977:  ✓ returns warning when dual model exceeds 50% RAM
978:  ✓ returns critical when dual models exceed budget
979:  syncWithNativeState with image model
980:  ✓ syncs image model internal state from store (1 ms)
981:  ✓ clears image model internal state when native reports not loaded
982:  unloadTextModel with store but no native
983:  ✓ clears store even when native is not loaded
984:  unloadImageModel with store but no native
985:  ✓ clears store even when native is not loaded (1 ms)
986:  loadTextModel vision model no mmproj found
987:  ✓ logs warning when no mmproj file found in directory
988:  loadTextModel vision model mmproj search failure
989:  ✓ catches error when readDir fails (1 ms)
990:  loadTextModel mmproj found updates store with multiple models
991:  ✓ only updates the matching model in store
992:  unloadTextModel waits for pending load
993:  ✓ waits for pending textLoadPromise before unloading (3 ms)
994:  unloadImageModel waits for pending load
995:  ✓ waits for pending imageLoadPromise before unloading (5 ms)
996:  loadImageModel already loaded but needs thread reload
997:  ✓ reloads when imageThreads changed (1 ms)
998:  loadImageModel concurrent load - different model
999:  ✓ loads new model after pending load for different model completes (2 ms)
1000:  unloadAllModels error handling - image unload fails
1001:  ✓ handles image unload error gracefully (1 ms)
1002:  loadImageModel with coreml backend
...

1047:  ✓ returns true for active conversation during generation (3 ms)
1048:  ✓ returns false for different conversation during generation
1049:  subscribe
1050:  ✓ immediately calls listener with current state (1 ms)
1051:  ✓ returns unsubscribe function (2 ms)
1052:  ✓ unsubscribe removes listener (1 ms)
1053:  ✓ multiple listeners receive updates
1054:  generateResponse
1055:  ✓ throws when no model loaded (60 ms)
1056:  ✓ returns immediately when already generating (3 ms)
1057:  ✓ sets isThinking true initially (5 ms)
1058:  ✓ calls chatStore.startStreaming (1 ms)
1059:  ✓ accumulates streaming tokens (5 ms)
1060:  ✓ calls onFirstToken callback on first token (3 ms)
1061:  ✓ finalizes message on completion (1 ms)
1062:  ✓ handles generation error (24 ms)
1063:  ✓ throws error on generation failure (1 ms)
1064:  stopGeneration
1065:  ✓ always attempts to stop native generation (1 ms)
1066:  ✓ returns empty string when not generating (1 ms)
1067:  ✓ saves partial content when stopped (55 ms)
1068:  ✓ clears streaming message when no content (3 ms)
1069:  ✓ resets state after stopping (55 ms)
1070:  ✓ handles stopGeneration error gracefully
1071:  queue management
...

1075:  ✓ clearQueue removes all items
1076:  ✓ notifies listeners on queue changes
1077:  queue processor
1078:  ✓ setQueueProcessor registers callback (1 ms)
1079:  ✓ setQueueProcessor with null clears callback (1 ms)
1080:  ✓ processNextInQueue aggregates multiple messages (13 ms)
1081:  ✓ processNextInQueue passes single message directly (13 ms)
1082:  ✓ processNextInQueue does nothing without processor (1 ms)
1083:  abort handling
1084:  ✓ ignores tokens after abort is requested (51 ms)
1085:  store integration
1086:  ✓ updates chatStore streaming state during generation (1 ms)
1087:  ✓ includes generation metadata on finalized message (1 ms)
1088:  remote provider
1089:  ✓ routes to remote provider when activeServerId is set (2 ms)
1090:  ✓ throws error when remote provider is not found (1 ms)
1091:  ✓ throws error when remote provider is not ready (1 ms)
1092:  ✓ handles remote generation error (6 ms)
1093:  ✓ tracks time to first token for remote generation (13 ms)
1094:  ✓ stops remote generation on abort (14 ms)
1095:  ✓ handles onReasoning callback for remote generation (3 ms)
1096:  ✓ uses remote metadata in generation meta (1 ms)
1097:  buildGenerationMeta
1098:  ✓ includes GPU info for local generation (1 ms)
1099:  share prompt check
1100:  ✓ does not trigger share prompt if already engaged (1 ms)
1101:  reasoning content in local generateResponse
1102:  ✓ accumulates reasoning content in reasoningBuffer (1 ms)
1103:  error path clears flushTimer
1104:  ✓ clearTimeout on flushTimer when generation throws with buffered tokens (2 ms)
1105:  generateWithTools — local path via runToolLoop
1106:  ✓ runs tool loop and finalizes on success (1 ms)
1107:  ✓ calls onStreamReset to flush pending content (1 ms)
1108:  ✓ calls onFinalResponse to set streaming content
1109:  ✓ throws and clears state on runToolLoop error (2 ms)
1110:  ✓ throws and clears flushTimer on error if timer was set (15 ms)
1111:  resetState with queued items triggers processNextInQueue
1112:  ✓ schedules processNextInQueue when queue is non-empty after reset (3 ms)
1113:  checkSharePrompt — triggers share
1114:  ✓ calls emitSharePrompt when shouldShowSharePrompt returns true (1 ms)
1115:  stopGeneration — edge cases
1116:  ✓ clears streaming when there is no content on stop
1117:  ✓ aborts remote controller when not generating and controller exists (1 ms)
1118:  ✓ returns streamingContent when stopping remote generation
1119:  generateWithTools — remote path via generateRemoteWithTools
1120:  ✓ routes generateWithTools to generateRemoteWithTools and calls runToolLoop with forceRemote (1 ms)
1121:  ✓ throws when remote provider not found in generateRemoteWithTools
1122:  ✓ finalizes after remote tool loop when not aborted
1123:  generateRemoteResponse — error updates server health
1124:  ✓ marks server offline when provider.generate throws (1 ms)
...

1136:  normalizeStreamChunk
1137:  ✓ wraps string data as content object (1 ms)
1138:  ✓ passes through object data unchanged
1139:  buildToolLoopHandlers — onStream abort guard
1140:  ✓ returns early from onStream when abortRequested is true
1141:  ✓ accumulates reasoning content in reasoningBuffer via onStream
1142:  isUsingRemoteProvider — local model wins when loaded
1143:  ✓ uses local LLM when local model is loaded even if remote server is configured (5 ms)
1144:  buildToolLoopHandlers — isAborted and timer flush
1145:  ✓ isAborted returns the current abortRequested value (1 ms)
1146:  ✓ onStream schedules flushTokenBuffer via setTimeout and fires on advance (2 ms)
1147:  generateRemoteWithTools — no provider available
1148:  ✓ getCurrentProvider returns local provider fallback when no activeServerId
1149:  resetState — flushTimer cleanup
1150:  ✓ clears flushTimer in resetState when timer is set
1151:  generateRemoteResponse — flushTimer in error paths
1152:  ✓ clears flushTimer in catch block when timer was set by onToken (1 ms)
1153:  ✓ clears flushTimer in onError callback when timer was set by onToken (1 ms)
1154:  ✓ triggers onReasoning flush timer path (1 ms)
...

1160:  ✓ returns false when already generating image (3 ms)
1161:  ✓ returns forceImageMode===true when mode is manual (2 ms)
1162:  ✓ returns true immediately when forceImageMode and imageModelLoaded (1 ms)
1163:  ✓ returns false when imageModelLoaded is false (1 ms)
1164:  ✓ with no text model, routes a chat request to text (heuristics) (1 ms)
1165:  ✓ with no text model, routes an image request to image (heuristics)
1166:  ✓ with no text model but a classifier configured, uses the SMOL LLM (1 ms)
1167:  ✓ classifies intent via LLM when autoDetectMethod=llm (1 ms)
1168:  ✓ resets image status when LLM returns non-image intent (1 ms)
1169:  ✓ returns false and resets state when classification throws (2 ms)
1170:  handleImageGenerationFn
1171:  ✓ shows alert when no image model loaded (1 ms)
1172:  ✓ adds user message when skipUserMessage is false (default)
1173:  ✓ keeps attachments (e.g. a voice note) on the user message in the image route
1174:  ✓ skips user message when skipUserMessage=true (1 ms)
1175:  ✓ shows alert when image generation returns null and there is a non-cancel error
1176:  ✓ does not show alert when error is "cancelled"
1177:  executeDeleteConversationFn
...

1214:  ✓ treats an unset gate as allowed (backward compatible)
1215:  ✓ regenerate also honours the UI tool gate
1216:  RAG context injection in startGenerationFn
1217:  ✓ injects doc list and RAG context when conversation has a projectId and search returns chunks (1 ms)
1218:  ✓ injects doc list even when BM25 returns no chunks
1219:  ✓ does not inject RAG context when conversation has no projectId (1 ms)
1220:  ✓ does not inject doc list when all docs are disabled
1221:  ✓ continues generation even if RAG search throws
1222:  ✓ auto-enables search_knowledge_base tool for project conversations
1223:  RAG context injection in regenerateResponseFn
1224:  ✓ injects RAG context for project conversations (1 ms)
1225:  ✓ skips RAG for non-project conversations
1226:  embedding model warmup in injectRagContext
1227:  ✓ fires embeddingService.load() when project has enabled docs and model is not loaded
1228:  ✓ does not call load() when embedding model is already loaded (1 ms)
1229:  ✓ does not block generation if embedding load fails
1230:  ✓ does not fire warmup when no enabled docs exist
1231:  handleSelectProjectFn
1232:  ✓ sets conversation project when activeConversationId is set
1233:  ✓ clears project when project is null (1 ms)
1234:  ✓ skips setConversationProject when no activeConversationId
1235:  handleSendFn — additional branches
1236:  ✓ appends document attachment content to message text
1237:  ✓ ignores attachments without textContent (1 ms)
1238:  ✓ enqueues message when generation is already in progress
1239:  ✓ prefixes message when shouldGenerateImage=true but no image model loaded
1240:  startGenerationFn — remote model path
1241:  ✓ skips local model loading for remote models
1242:  ✓ uses all tools when remote server is active (bypasses heuristic) (1 ms)
1243:  regenerateResponseFn — model not loaded
1244:  ✓ returns early when local model is not loaded
1245:  ✓ does not return early for remote models even if local model is not loaded
1246:  generateWithCompactionRetry — context full error path
1247:  ✓ rethrows non-context-full errors
1248:  ✓ retries with compacted messages on context full error
1249:  ✓ falls back to recent messages when compact throws (1 ms)
...

1284:  ✓ shows alert when toggling without image model loaded (12 ms)
1285:  ✓ cycles through auto -> force -> disabled -> auto (26 ms)
1286:  ✓ quick settings button is always visible regardless of props (6 ms)
1287:  vision capabilities
1288:  ✓ shows attach button when supportsVision is true (5 ms)
1289:  ✓ shows attach button even when supportsVision is false (5 ms)
1290:  ✓ shows alert when pressing photo without vision support (29 ms)
1291:  ✓ opens image picker when pressing photo with vision support (16 ms)
1292:  ✓ attach button is present when vision is supported (6 ms)
1293:  attachments
1294:  ✓ shows custom alert when photo is pressed via attach picker (18 ms)
1295:  ✓ shows attachment preview after selecting image (351 ms)
1296:  ✓ can send message with attachment (355 ms)
1297:  ✓ renders attach button always (5 ms)
1298:  ✓ opens document picker when document is pressed via attach picker (67 ms)
1299:  ✓ shows error alert for unsupported file types (68 ms)
1300:  ✓ does nothing when document picker is cancelled (15 ms)
1301:  ✓ shows document preview with file icon after picking document (65 ms)
1302:  ✓ sends message with document attachment (72 ms)
1303:  ✓ shows error alert when processDocumentFromPath fails (69 ms)
1304:  ✓ handles processDocumentFromPath returning null (13 ms)
...

1333:  ✓ renders and handles stop button when onStop is provided (3 ms)
1334:  send with attachment but no text
1335:  ✓ shows send button when only attachments are present (339 ms)
1336:  disabled does not send with attachment
1337:  ✓ does not call onSend when disabled even with attachments (4 ms)
1338:  voice recording integration
1339:  ✓ starts recording and tracks conversationId (4 ms)
1340:  ✓ inserts transcribed text into message when finalResult arrives (10 ms)
1341:  ✓ appends transcribed text to existing message (17 ms)
1342:  ✓ clears pending transcription when conversation changes (9 ms)
1343:  ✓ calls stopRecording and clearResult on cancel recording (5 ms)
1344:  image mode toggle alert when no model loaded
1345:  ✓ shows alert when toggling image mode without loaded model (21 ms)
1346:  camera capture flow
1347:  ✓ picks image from camera when Camera option is pressed (45 ms)
1348:  ✓ handles camera error gracefully (28 ms)
1349:  ✓ handles camera returning no assets (35 ms)
1350:  photo library error
1351:  ✓ handles photo library error gracefully (43 ms)
1352:  document picker error without message
1353:  ✓ shows fallback error message when error has no message (69 ms)
1354:  voice recording without conversationId
...

1398:  ✓ shows import button (52 ms)
1399:  ✓ triggers file picker on import press (51 ms)
1400:  recommended models
1401:  ✓ RECOMMENDED_MODELS has entries (1 ms)
1402:  ✓ all recommended models have minRam
1403:  ✓ all recommended models have type badges (text/vision/code) (1 ms)
1404:  ✓ recommended models have editorial ordering with Gemma 4 first
1405:  ✓ MODEL_ORGS contains expected organizations
1406:  type filter
1407:  ✓ filters by text models (1 ms)
1408:  ✓ filters by vision models
1409:  ✓ has no code models after removal
1410:  multi-file download
1411:  ✓ vision model files include mmProjFile
1412:  ✓ calculates combined size for vision model files (1 ms)
1413:  search error handling
1414:  ✓ handles search network error gracefully (46 ms)
1415:  text filter bar
...

1471:  ✓ shows image filter toggle on image tab (84 ms)
1472:  ✓ renders device recommendation banner on image tab (91 ms)
1473:  import progress
1474:  ✓ shows import progress card when importing (79 ms)
1475:  tab switching resets state
1476:  ✓ resets text filters when switching to image tab (143 ms)
1477:  model type detection
1478:  ✓ detects code models from tags (640 ms)
1479:  ✓ detects image-gen models from diffusion tags (599 ms)
1480:  file compatibility
1481:  ✓ hides models with files too large for device RAM (583 ms)
1482:  ✓ shows models with no file info (files not yet fetched) (575 ms)
1483:  recommended models with filters
1484:  ✓ filters recommended models by type filter (126 ms)
1485:  ✓ hides recommended models that are already downloaded (65 ms)
1486:  search error display
1487:  ✓ handles API error gracefully during search (73 ms)
1488:  detail view navigation
...

1507:  handleSearch with active filters
1508:  ✓ triggers HuggingFace search when vision type filter is set and query is empty (83 ms)
1509:  ✓ does not trigger HuggingFace search when query is empty and no filters are active (68 ms)
1510:  ✓ triggers HuggingFace search with "coder" keyword when code filter is set and query is empty (139 ms)
1511:  formatNumber display
1512:  ✓ shows formatted download count in detail view (628 ms)
1513:  PASS __tests__/integration/generation/imageGenerationFlow.test.ts
1514:  Image Generation Flow Integration
1515:  Image Generation Lifecycle
1516:  ✓ should update state during generation lifecycle (2 ms)
1517:  ✓ should call localDreamGeneratorService with correct parameters
1518:  ✓ should save generated image to gallery (1 ms)
1519:  ✓ should add message to chat when conversationId is provided (2 ms)
1520:  Progress Updates
1521:  ✓ should receive and propagate progress updates (1 ms)
1522:  Error Handling
1523:  ✓ should handle generation errors gracefully (1 ms)
1524:  ✓ should return null when no model is selected
1525:  ✓ should handle model load failure (1 ms)
1526:  Cancel Generation
...

1538:  Prompt Enhancement with Conversation Context
1539:  ✓ should pass conversation history to enhancement when conversationId provided (5 ms)
1540:  ✓ should not include conversation context when no conversationId (1 ms)
1541:  ✓ should truncate long messages in conversation context (1 ms)
1542:  ✓ should limit conversation context to last 10 messages (1 ms)
1543:  ✓ should skip system messages from conversation context (1 ms)
1544:  ✓ should use original prompt when enhancement is disabled (1 ms)
1545:  ✓ should handle empty conversation gracefully (1 ms)
1546:  cancelGeneration when not generating
1547:  ✓ should return immediately when not generating
1548:  isGeneratingFor
1549:  ✓ returns false when not generating
1550:  ✓ returns true when generating for matching conversation (2 ms)
1551:  generation returning null result (no imagePath)
1552:  ✓ should return null when native generator returns null (1 ms)
1553:  prompt enhancement error handling
1554:  ✓ should fall back to original prompt when enhancement fails (1 ms)
1555:  ✓ should skip enhancement when LLM is not loaded
1556:  enhancement result update vs delete thinking message
1557:  ✓ should update thinking message when enhancement produces different prompt (1 ms)
1558:  ✓ should delete thinking message when enhancement returns same prompt (1 ms)
1559:  generation with conversation metadata
1560:  ✓ should include correct backend metadata for QNN model
1561:  cancelRequested during generation
1562:  ✓ should check cancelRequested after model load (1 ms)
1563:  generation without conversationId
1564:  ✓ should save to gallery but not add chat message
1565:  enhancement with LLM currently generating
1566:  ✓ should still attempt enhancement even if LLM was generating (1 ms)
1567:  prompt enhancement strips thinking model tags
1568:  ✓ should strip <think> tags from thinking model responses (1 ms)
1569:  ✓ should handle thinking model response that is only a think block
1570:  ✓ should handle response without think tags normally (3 ms)
1571:  cancelled error handling
1572:  ✓ should reset state when error message includes cancelled
1573:  prompt enhancement stopGeneration cleanup (lines 247, 287-291)
1574:  ✓ should call stopGeneration after successful enhancement (line 247) (1 ms)
1575:  ✓ should call stopGeneration even when stopGeneration itself throws (lines 253-255)
1576:  ✓ should delete thinking message and call stopGeneration when enhancement fails with conversationId (lines 287-298) (1 ms)
1577:  ✓ should call stopGeneration in catch when stopGeneration itself throws during error cleanup (lines 290-292) (1 ms)
1578:  ✓ should update thinking message in chat when enhancement succeeds with conversationId (lines 263-278) (1 ms)
1579:  ✓ should delete thinking message when enhancement returns same prompt as original (lines 274-278) (2 ms)
1580:  onPreview callback normal path (lines 388-389)
1581:  ✓ should update previewPath state when onPreview fires without cancellation (1 ms)
1582:  onPreview callback skipped when cancelRequested (lines 387-389)
1583:  ✓ should skip preview update when cancelRequested is true during preview callback (4 ms)
1584:  cancelRequested check after generateImage resolves (lines 397-398)
1585:  ✓ should return null when cancelRequested is set before generateImage resolves (4 ms)
1586:  OpenCL kernel cache branches
1587:  ✓ logs warning and sets isFirstGpuRun=false when hasKernelCache throws (1 ms)
1588:  ✓ uses regular progress status when kernel cache exists (isFirstGpuRun=false) (1 ms)
1589:  _ensureImageModelLoaded with null activeImageModelId
1590:  ✓ returns false and sets error when activeImageModelId is null but model not loaded
1591:  PASS __tests__/rntl/screens/HomeScreen.test.tsx (8.378 s)
...

1655:  ✓ shows "Unload current model" when image model is active (26 ms)
1656:  ✓ shows model item for active text model (27 ms)
1657:  ✓ closes picker when close button pressed (32 ms)
1658:  ✓ shows "Browse more models" link in picker (28 ms)
1659:  ✓ navigates to ModelsTab when "Browse more models" pressed (31 ms)
1660:  ✓ shows memory estimate per model in picker (28 ms)
1661:  ✓ shows vision indicator for vision models in picker (30 ms)
1662:  model selection from picker
1663:  ✓ marks text model active without loading or checking memory (31 ms)
1664:  ✓ marks image model active without loading or checking memory (34 ms)
1665:  ✓ does not show a memory dialog when selecting a text model (85 ms)
1666:  ✓ closes the picker after selecting a text model (31 ms)
1667:  model unloading from picker
1668:  ✓ unloads text model when unload button pressed in picker (44 ms)
1669:  ✓ unloads image model when unload button pressed in picker (59 ms)
1670:  ✓ shows error alert when text model unload fails (67 ms)
1671:  ✓ shows error alert when image model unload fails (63 ms)
1672:  model load error handling
1673:  ✓ shows error when eject all fails (424 ms)
1674:  delete conversation
...

1681:  ✓ shows RAM estimates in both pickers when both models loaded (44 ms)
1682:  ✓ renders without crashing when both models loaded (14 ms)
1683:  delete conversation full flow
1684:  ✓ renders delete button in swipeable right actions (12 ms)
1685:  ✓ shows delete confirmation and deletes conversation (29 ms)
1686:  ✓ cancels delete conversation (26 ms)
1687:  gallery navigation
1688:  ✓ navigates to Gallery when gallery card is pressed (10 ms)
1689:  empty picker browse navigation
1690:  ✓ navigates to ModelsTab from empty text picker Browse Models button (31 ms)
1691:  ✓ navigates to ModelsTab from empty image picker Browse Models button (54 ms)
1692:  formatDate coverage
1693:  ✓ shows "Yesterday" for conversations updated yesterday (29 ms)
1694:  ✓ shows weekday name for conversations updated 2-6 days ago (26 ms)
1695:  ✓ shows month and day for conversations updated more than 7 days ago (34 ms)
1696:  memory info error handling
1697:  ✓ handles getResourceUsage failure gracefully (72 ms)
1698:  ✓ refreshes memory info when subscribe callback fires (38 ms)
...

2228:  ✓ should parse multiple SSE events
2229:  ✓ should handle multi-line data (1 ms)
2230:  ✓ should handle events without explicit event type
2231:  ✓ should throw when body is not readable (21 ms)
2232:  ✓ should handle events with id field (1 ms)
2233:  ✓ should handle data as object type
2234:  ✓ should handle chunked data correctly
2235:  ✓ should handle event with id field (1 ms)
2236:  ✓ should throw when response body is not readable (4 ms)
2237:  ✓ should handle events with only data field
2238:  ✓ should skip events without data (1 ms)
2239:  ✓ should yield remaining event at end of stream
2240:  parseOpenAIMessage
2241:  ✓ should parse content delta
2242:  ✓ should parse [DONE] marker
2243:  ✓ should parse error messages
2244:  ✓ should parse tool calls
...

2250:  ✓ should return null for empty data
2251:  isPrivateNetworkEndpoint
2252:  ✓ should detect localhost as private
2253:  ✓ should detect 192.168.x.x as private (1 ms)
2254:  ✓ should detect 10.x.x.x as private
2255:  ✓ should detect 172.16-31.x.x as private
2256:  ✓ should NOT detect 172.15.x.x as private
2257:  ✓ should NOT detect 172.32.x.x as private
2258:  ✓ should detect link-local 169.254.x.x as private
2259:  ✓ should detect .local (mDNS) as private
2260:  ✓ should detect public internet as NOT private
2261:  ✓ should handle invalid URLs (1 ms)
2262:  fetchWithTimeout
2263:  ✓ should resolve with JSON response
2264:  ✓ should resolve with text response for non-JSON
2265:  ✓ should throw on HTTP error (8 ms)
2266:  ✓ should timeout after specified duration (1 ms)
2267:  ✓ should retry on transient errors (3 ms)
2268:  ✓ should throw "Request cancelled" on AbortError
2269:  ✓ should fallback to text when content-type header is missing
2270:  ✓ should fallback to "Unknown error" when response.text() fails
2271:  ✓ should handle non-Error thrown values
2272:  testEndpoint
2273:  ✓ should return success for reachable endpoint
2274:  ✓ should return error for unreachable endpoint
2275:  ✓ should return error on HTTP error (1 ms)
2276:  ✓ should try alternate health endpoints when /v1/models fails
2277:  ✓ should strip trailing slashes from endpoint
2278:  imageToBase64DataUrl
2279:  ✓ should return data URL as-is if already encoded
2280:  ✓ should encode file:// URI to base64
2281:  ✓ should throw if file does not exist (4 ms)
2282:  ✓ should determine MIME type from extension
2283:  ✓ should default to jpeg for unknown extensions
2284:  ✓ should handle paths without file:// prefix
2285:  ✓ should fetch and encode remote URLs (5 ms)
2286:  ✓ should throw on fetch failure
2287:  ✓ should throw on FileReader error (3 ms)
2288:  detectServerType
2289:  ✓ should detect Ollama from server header
2290:  ✓ should detect Ollama from /api/tags endpoint
2291:  ✓ should detect LM Studio from model list
2292:  ✓ should detect generic OpenAI-compatible server (1 ms)
2293:  ✓ should return null when server type cannot be determined
2294:  ✓ should return null on network error
2295:  ✓ should strip trailing slashes from endpoint
2296:  ✓ should fallback to Ollama when OpenAI-compatible check fails
2297:  createStreamingRequest
2298:  ✓ should make POST request with correct headers (1 ms)
2299:  ✓ should parse SSE events on progress (1 ms)
2300:  ✓ should resolve on successful completion
2301:  ✓ should reject on HTTP error (1 ms)
2302:  ✓ should reject on network error
2303:  ✓ should reject on timeout (4 ms)
2304:  ✓ should handle events with event type
2305:  ✓ should handle events with id field (1 ms)
2306:  ✓ should handle multi-line data
2307:  ✓ should process final chunk on completion
2308:  ✓ should handle incremental progress updates (1 ms)
2309:  ✓ should handle events with id in final chunk
2310:  ✓ should handle multi-line data in final chunk
2311:  ✓ should handle events with event type in final chunk (1 ms)
2312:  ✓ should handle XHR timeout event
2313:  ✓ should handle XHR timeout via ontimeout
2314:  ✓ should reject on send error
2315:  ✓ should abort XHR when signal fires (1 ms)
2316:  ✓ should not process final data when responseText equals processed length
2317:  detectServerType — additional branches
2318:  ✓ returns null when JSON parse throws for /v1/models response
2319:  ✓ returns null when LM Studio response has no gguf models
2320:  ✓ handles generic OpenAI-compatible via Array.isArray(data.data) branch
2321:  parseAnthropicMessage — non-string data
2322:  ✓ returns null for non-string data (1 ms)
2323:  ✓ returns null for invalid JSON
2324:  createNDJSONStreamingRequest
2325:  ✓ resolves and calls onLine for each complete NDJSON line
2326:  ✓ flushes partial buffered line on readyState=4
2327:  ✓ rejects on HTTP error status (1 ms)
2328:  ✓ rejects on network error
2329:  ✓ rejects on timeout
...

2331:  ✓ warns and skips invalid JSON lines
2332:  ✓ sets custom headers (1 ms)
2333:  ✓ processes onprogress chunks and merges partial lines
2334:  ✓ warns and skips invalid JSON in buffered final line
2335:  ✓ rejects when xhr.send throws (1 ms)
2336:  PASS __tests__/unit/services/parallelMmproj.test.ts
2337:  Parallel mmproj download
2338:  performBackgroundDownload
2339:  ✓ starts both main and mmproj downloads in parallel (1 ms)
2340:  ✓ persists mmProjDownloadId in metadata callback
2341:  ✓ sets mmProjCompleted=false and mainCompleted=false in context
2342:  ✓ skips mmproj download when mmproj already exists (1 ms)
2343:  ✓ only starts main download for non-vision models
2344:  ✓ returns immediately when both files already exist
2345:  ✓ re-downloads mmproj when an existing sidecar is only partially written (1 ms)
2346:  ✓ re-downloads mmproj when stat fails for an existing sidecar
2347:  ✓ reuses an existing failed entry by cancelling old downloads and retrying the store entry (1 ms)
2348:  combined progress
2349:  ✓ reports combined progress from both downloads (1 ms)
2350:  ✓ includes pre-existing mmproj size in progress when mmproj already downloaded
2351:  ✓ updates the native combined-progress notification when supported (1 ms)
2352:  ✓ swallows native combined-progress update failures
2353:  watchBackgroundDownload — completion gating
2354:  ✓ does not fire onComplete until both downloads finish (mmproj first) (1 ms)
2355:  ✓ does not fire onComplete until both downloads finish (main first)
2356:  ✓ fires onComplete immediately for non-vision model (no mmproj)
2357:  ✓ moves mmproj file on mmproj completion (1 ms)
2358:  ✓ clears metadata callback when both complete
2359:  ✓ ignores duplicate main completion events after the first one
2360:  ✓ drops vision when mmproj move fails and the target file is missing
2361:  watchBackgroundDownload — error handling
2362:  ✓ cancels mmproj when main download fails (1 ms)
2363:  ✓ preserves retry context and resets main finalization flags when main download fails
2364:  ✓ continues as text-only when mmproj download fails (4 ms)
2365:  watchBackgroundDownload — already-downloaded recovery
2366:  ✓ persists already-downloaded models before firing onComplete (1 ms)
2367:  ✓ still fires onComplete when persistence fails for already-downloaded models
2368:  ✓ surfaces an already-downloaded context error via onError (1 ms)
2369:  syncCompletedBackgroundDownloads
2370:  ✓ syncs completed model with mmproj download
2371:  ✓ skips sync when mmproj download is still running
2372:  ✓ cancels mmproj when main download failed
2373:  restoreInProgressDownloads — mmproj recovery
2374:  ✓ restores both main and mmproj progress listeners (1 ms)
2375:  ✓ handles mmproj completed while app was dead (1 ms)
2376:  ✓ marks mmproj as completed when it failed while app was dead
2377:  ✓ defers mmproj move to watchBackgroundDownload when file not yet on disk
2378:  ✓ does not create duplicate context for mmproj download ID
2379:  watchBackgroundDownload — catch-up paths
2380:  ✓ finalizes after mmproj was already completed before listener registration (1 ms)
2381:  ✓ continues without vision when catch-up mmproj move fails and target is missing (1 ms)
2382:  PASS __tests__/rntl/screens/ModelSettingsScreen.test.tsx
...

2596:  ✓ "draw an elephant" should classify as image
2597:  ✓ "draw the sunset" should classify as image
2598:  ✓ "paint a landscape" should classify as image
2599:  ✓ "paint me a portrait" should classify as image
2600:  ✓ "paint an abstract piece" should classify as image
2601:  ✓ "sketch a building" should classify as image (1 ms)
2602:  ✓ "sketch me a character" should classify as image
2603:  ✓ "sketch the mountain" should classify as image
2604:  Text Intent Patterns
2605:  Questions and explanations
2606:  ✓ "explain how photosynthesis works" should classify as text
2607:  ✓ "tell me about the French Revolution" should classify as text
2608:  ✓ "describe the water cycle" should classify as text
2609:  ✓ "what is machine learning" should classify as text (1 ms)
2610:  ✓ "what are the benefits of exercise" should classify as text
2611:  ✓ "what does this error mean" should classify as text
2612:  ✓ "what's the capital of France" should classify as text
2613:  ✓ "whats happening in the code" should classify as text
2614:  How questions
2615:  ✓ "how do I install node.js" should classify as text
2616:  ✓ "how does electricity work" should classify as text
2617:  ✓ "how to make pasta" should classify as text
2618:  ✓ "how can I improve my writing" should classify as text
2619:  ✓ "how would you solve this problem" should classify as text (1 ms)
2620:  ✓ "how should I structure my code" should classify as text
2621:  Why questions
2622:  ✓ "why is the sky blue" should classify as text
2623:  ✓ "why does water boil" should classify as text
2624:  ✓ "why do birds migrate" should classify as text
2625:  ✓ "why are leaves green" should classify as text
2626:  ✓ "why would this fail" should classify as text
2627:  When/Where/Who/Which questions
...

2674:  ✓ "draft a script for a video" should classify as text
2675:  ✓ "write an article about technology" should classify as text (2 ms)
2676:  ✓ "compose a post for social media" should classify as text
2677:  ✓ "write a message to the team" should classify as text
2678:  ✓ "draft a response to this email" should classify as text (1 ms)
2679:  Programming and code
2680:  ✓ "write code to sort an array" should classify as text
2681:  ✓ "create a function to validate email" should classify as text
2682:  ✓ "write a script to automate backups" should classify as text
2683:  ✓ "create a program to parse CSV" should classify as text
2684:  ✓ "write a sql query to get users" should classify as text
2685:  ✓ "create a regex for phone numbers" should classify as text
2686:  ✓ "code a simple calculator" should classify as text
2687:  ✓ "coding challenge solution" should classify as text (1 ms)
2688:  ✓ "programming in python" should classify as text
2689:  ✓ "debug this error" should classify as text
2690:  ✓ "debugging the crash" should classify as text
2691:  ✓ "fix the code that throws an error" should classify as text
2692:  ✓ "debug this bug in my app" should classify as text
...

2698:  ✓ "variable not defined" should classify as text
2699:  ✓ "array out of bounds" should classify as text (1 ms)
2700:  ✓ "object is null" should classify as text
2701:  ✓ "loop through items" should classify as text
2702:  ✓ "if statement not working" should classify as text
2703:  ✓ "javascript async await" should classify as text
2704:  ✓ "typescript interface" should classify as text
2705:  ✓ "python list comprehension" should classify as text (1 ms)
2706:  ✓ "java hashmap" should classify as text
2707:  ✓ "kotlin coroutines" should classify as text (7 ms)
2708:  ✓ "swift optionals" should classify as text (1 ms)
2709:  ✓ "c++ pointers" should classify as text (1 ms)
2710:  ✓ "rust ownership" should classify as text
2711:  ✓ "go goroutines" should classify as text (1 ms)
2712:  ✓ "ruby blocks" should classify as text
2713:  ✓ "import statement error" should classify as text
2714:  ✓ "export default component" should classify as text
2715:  ✓ "return value is undefined" should classify as text (1 ms)
2716:  ✓ "const vs let in javascript" should classify as text
2717:  ✓ "def function python" should classify as text
2718:  ✓ "fn main rust" should classify as text
2719:  ✓ "error: cannot find module" should classify as text
2720:  ✓ "TypeError: undefined is not a function" should classify as text
2721:  ✓ "exception thrown at line 42" should classify as text
...

2833:  ✓ picture in context of describing should be text
2834:  ✓ image in technical context should be text
2835:  ✓ render in code context should be text
2836:  Empty and edge case inputs
2837:  ✓ empty string should return text
2838:  ✓ whitespace only should return text
2839:  ✓ single word with no clear intent should return text
2840:  Case insensitivity
2841:  ✓ UPPERCASE should still match patterns
2842:  ✓ MixedCase should still match patterns
2843:  Whitespace handling
2844:  ✓ leading/trailing whitespace should be trimmed (5 ms)
2845:  Cache Behavior
2846:  ✓ should return cached result on repeat query
2847:  ✓ clearCache should reset the cache
2848:  ✓ should handle very long messages without errors
2849:  quickCheck
2850:  ✓ should return image for image patterns (1 ms)
2851:  ✓ should return text for text patterns
2852:  ✓ should return text for uncertain messages
2853:  ✓ should be synchronous
2854:  LLM Fallback
2855:  ✓ should not call LLM when useLLM is false
2856:  ✓ should return text default when pattern is uncertain and LLM disabled
2857:  ✓ should throw when LLM enabled but no model loaded (1 ms)
2858:  ✓ should use LLM classification when pattern is uncertain and LLM enabled
2859:  ✓ should return text when LLM responds NO (1 ms)
2860:  ✓ should handle LLM errors gracefully
2861:  Cache Eviction
...

2878:  ✓ returns empty array for plain conversational messages (1 ms)
2879:  PASS __tests__/unit/services/backgroundDownloadService.test.ts
2880:  BackgroundDownloadService
2881:  isAvailable
2882:  ✓ returns true on Android with native module present (3 ms)
2883:  ✓ returns true on iOS when native module is present (2 ms)
2884:  ✓ returns false when native module is null (4 ms)
2885:  startDownload
2886:  ✓ calls native module with correct params (3 ms)
2887:  ✓ returns pending status (2 ms)
2888:  ✓ defaults modelType to text and totalBytes to 0 when not provided (2 ms)
2889:  ✓ throws when not available (37 ms)
2890:  cancelDownload
2891:  ✓ delegates to native module (3 ms)
2892:  ✓ throws when not available (4 ms)
2893:  ✓ notifies error listeners with a user_cancelled event so awaiters can settle (2 ms)
2894:  ✓ rejects a downloadFileTo() promise as cancelled when its download is cancelled (3 ms)
2895:  ✓ synthesizes the cancellation even if the native cancel throws (5 ms)
2896:  getActiveDownloads
2897:  ✓ returns empty array when not available (3 ms)
2898:  ✓ maps native response to BackgroundDownloadInfo (3 ms)
2899:  moveCompletedDownload
2900:  ✓ delegates to native module (2 ms)
2901:  ✓ throws when not available (9 ms)
2902:  listener registration
2903:  ✓ onProgress registers and returns unsubscribe function (2 ms)
2904:  ✓ onComplete registers and returns unsubscribe function (2 ms)
2905:  ✓ onError registers and returns unsubscribe function (2 ms)
2906:  ✓ onAnyProgress registers global listener (2 ms)
2907:  ✓ onAnyComplete registers global listener (2 ms)
2908:  ✓ onAnyError registers global listener (1 ms)
2909:  event dispatching
2910:  ✓ dispatches progress to both specific and global listeners (2 ms)
2911:  ✓ dispatches progress to global listener when no per-download listene...

@dishit-wednesday dishit-wednesday changed the title from "feat(chat): summarize attachments, recordings knowledge base, TTFT fix" to "feat(chat): summarize attachments + recordings knowledge base" Jun 30, 2026

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 7

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/components/ChatInput/index.tsx`:
- Around line 107-110: The attachment removal in onSummarizeAttachment is
unconditional, so update the summarize flow to only call removeAttachment after
a confirmed successful handleSummarize() result. Adjust
useSummarizeAttachment/handleSummarize to return an explicit success flag or
rethrow on failure, then have ChatInput decide removal based on that outcome so
failed or skipped summaries do not drop the original attachment.
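
One way to read this suggestion, as a minimal sketch: model the summarize result as an explicit outcome and let the caller drop the attachment only on success. The `SummarizeOutcome` type and the `attachmentsAfterSummarize` helper are hypothetical names for illustration; the actual signatures of `handleSummarize` and `removeAttachment` in the PR are not shown here.

```typescript
// Hypothetical shapes for illustration; not the PR's actual types.
interface Attachment {
  id: string;
  name: string;
}

type SummarizeOutcome =
  | { status: 'ok'; summary: string }
  | { status: 'skipped' }
  | { status: 'failed'; error: string };

// Returns the attachment list that should remain after a summarize attempt.
// The original attachment is dropped only when a summary was actually produced.
function attachmentsAfterSummarize(
  attachments: readonly Attachment[],
  targetId: string,
  outcome: SummarizeOutcome,
): Attachment[] {
  if (outcome.status !== 'ok') {
    // Failed or skipped: keep everything, including the original attachment.
    return [...attachments];
  }
  return attachments.filter((a) => a.id !== targetId);
}
```

Keeping the decision in one pure function makes the "remove only on success" rule easy to unit test without rendering the component.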

In `@src/components/ChatInput/styles.ts`:
- Around line 58-69: The shared document filename style in documentName still
includes spacing from the old stacked layout, which now misaligns the text when
rendered inside documentNameRow. Split the style into separate variants in
styles.ts: keep the centered/stacked spacing only for the audio-only or vertical
layout, and remove the top margin from the inline filename style used with the
file icon. Update the relevant consumer(s) to use the appropriate style variant
so documentNameRow and the filename stay vertically aligned.

In `@src/services/generationToolLoop.ts`:
- Around line 519-537: The exact-time note is being regenerated on every
tool-loop pass in augmentSystemPromptForTools, which can change the latest user
message and move the cache boundary. Precompute the exact-time note once per
turn and thread it through the generation loop instead of calling
buildExactTimeNote() inside the prompt augmentation path; use the existing
augmentSystemPromptForTools flow and the exactTimeAppended logic to keep the
appended note stable across iterations.
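
A minimal sketch of the "compute once per turn" idea, assuming the note is plain text appended to a message. The `buildExactTimeNote` body below is a hypothetical stand-in (the real format is not shown in this review), and `createTurnAugmenter` is an illustrative name, not a function from the PR.

```typescript
// Hypothetical stand-in for the PR's buildExactTimeNote(); the real format is not shown here.
function buildExactTimeNote(now: Date): string {
  return `[Exact time: ${now.toISOString()}]`;
}

// Creates a per-turn augmenter. The note is computed once, when the turn starts,
// so every tool-loop iteration appends byte-identical text.
function createTurnAugmenter(turnStart: Date): (text: string) => string {
  const note = buildExactTimeNote(turnStart);
  return (text: string): string =>
    // Appending is idempotent: text that already ends with the note is returned unchanged.
    text.endsWith(note) ? text : `${text}\n\n${note}`;
}
```

Because the closure captures the note, repeated passes through the loop produce the same augmented text, which keeps any prefix cache keyed on that text valid.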

In `@src/services/llm.ts`:
- Around line 296-308: `generateWithMaxTokens` only appends streamed
`data.token` values, so final completion payloads can be dropped for
tool-selection/summarization. Update the completion callback in
`LLMService.generateWithMaxTokens` to also read the final result fields already
used elsewhere, such as `completionResult.content` or `completionResult.text`,
and append that fallback when no token deltas are streamed. Keep the change
localized to the `safeCompletion`/`ctx.completion` flow so the return value
still comes from `fullResponse.trim()`.

In `@src/services/rag/database.ts`:
- Around line 22-23: `metadata` is being exposed as a serialized JSON string on
the read side, which breaks symmetry with `Chunk.metadata` and forces callers to
parse it themselves. Update the read-side types in the database service to keep
`metadata` structured like `Chunk.metadata`, and parse the stored JSON once
inside `getEmbeddingsByProject()` and `getChunksByProject()` before returning
results so both the main and fallback paths use the same object shape.
- Around line 65-70: The migration in ensureReady() is swallowing all errors
around the rag_chunks metadata ALTER TABLE step; update the try/catch so
Database.ensureReady() only ignores the expected duplicate-column case and
rethrows any other failure. Use the existing rag_chunks metadata migration block
to detect that specific error from this.db.executeSync('ALTER TABLE rag_chunks
ADD COLUMN metadata TEXT') and let unexpected migration errors propagate so
callers don’t proceed with a broken schema.

In `@src/services/transcriptSummarizer.ts`:
- Around line 114-122: The combine logic in transcriptSummarizer’s budget
calculations is reusing the chunk budget for the final pass, which can let
inputs through that still exceed the combine reservation. Add a separate
final-combine input budget in the same budgeting section near
ctxLength/inputBudgetTokens/chunkCharBudget, and use the FINAL_SUMMARY_TOKENS
reservation when deciding whether the merged partials are small enough for the
last combine. Update the reduce/combine threshold checks in the summarization
flow so the final pass uses its own budget instead of chunkCharBudget.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 3407e1dd-1d10-41a0-bc4d-7319e1642cda

📥 Commits

Reviewing files that changed from the base of the PR and between e65db82 and 291c0ce.

📒 Files selected for processing (15)
  • src/components/ChatInput/Attachments.tsx
  • src/components/ChatInput/index.tsx
  • src/components/ChatInput/styles.ts
  • src/components/ChatInput/useSummarizeAttachment.ts
  • src/services/chatAttachmentInbox.ts
  • src/services/generationToolLoop.ts
  • src/services/index.ts
  • src/services/llm.ts
  • src/services/rag/chunking.ts
  • src/services/rag/database.ts
  • src/services/rag/index.ts
  • src/services/rag/retrieval.ts
  • src/services/transcriptSummarizer.ts
  • src/stores/projectStore.ts
  • src/types/index.ts

Comment on lines +107 to +110
const { summarizingId, handleSummarize } = useSummarizeAttachment();
const onSummarizeAttachment = async (attachment: MediaAttachment) => {
await handleSummarize(attachment);
removeAttachment(attachment.id);

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Only remove the attachment after a successful summary.

handleSummarize() catches its own failures and also returns early on some no-op paths, so this unconditional removeAttachment() drops the original attachment even when summarization failed or never started.

Suggested fix
   const { summarizingId, handleSummarize } = useSummarizeAttachment();
   const onSummarizeAttachment = async (attachment: MediaAttachment) => {
-    await handleSummarize(attachment);
-    removeAttachment(attachment.id);
+    const summarized = await handleSummarize(attachment);
+    if (summarized) removeAttachment(attachment.id);
   };

handleSummarize() should return a success flag (or rethrow on failure) so the caller can make this decision correctly.
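
One way the success flag could look is sketched below. The `MediaAttachment` shape, the injected `summarize` and `onSummary` callbacks, and the extra parameters on `onSummarizeAttachment` are assumptions made so the example is self-contained; the real hook in `useSummarizeAttachment.ts` is not shown in this review and may be structured differently.

```typescript
// Hypothetical shapes: the real MediaAttachment and summarizer live in the app.
type MediaAttachment = { id: string; textContent?: string };
type Summarize = (text: string) => Promise<string>;

// Resolves to true only when a non-empty summary was actually produced.
async function handleSummarize(
  attachment: MediaAttachment,
  summarize: Summarize,
  onSummary: (summary: string) => void,
): Promise<boolean> {
  const text = attachment.textContent?.trim();
  if (!text) return false; // no-op path: nothing to summarize
  try {
    const summary = (await summarize(text)).trim();
    if (!summary) return false; // empty output is treated as a failure
    onSummary(summary);
    return true;
  } catch {
    return false; // the failure is reported to the caller instead of being hidden
  }
}

// Caller side: drop the original attachment only on confirmed success.
async function onSummarizeAttachment(
  attachment: MediaAttachment,
  summarize: Summarize,
  onSummary: (summary: string) => void,
  removeAttachment: (id: string) => void,
): Promise<void> {
  const summarized = await handleSummarize(attachment, summarize, onSummary);
  if (summarized) removeAttachment(attachment.id);
}
```

Returning a boolean keeps the existing error handling inside the hook; rethrowing would work as well, but would push a try/catch into every caller.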

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/components/ChatInput/index.tsx` around lines 107 - 110, The attachment
removal in onSummarizeAttachment is unconditional, so update the summarize flow
to only call removeAttachment after a confirmed successful handleSummarize()
result. Adjust useSummarizeAttachment/handleSummarize to return an explicit
success flag or rethrow on failure, then have ChatInput decide removal based on
that outcome so failed or skipped summaries do not drop the original attachment.

Comment on lines +58 to 69
documentNameRow: {
flexDirection: 'row' as const,
alignItems: 'center' as const,
gap: 6,
},
documentName: {
fontSize: 10,
fontFamily: FONTS.mono,
color: colors.textMuted,
textAlign: 'center' as const,
marginTop: 4,
},

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Split the inline filename style from the stacked one.

documentName still carries marginTop: 4 from the old centered layout, but it now also renders inside documentNameRow. That offsets the filename downward relative to the file icon. A separate inline text style, or moving the top margin to the audio-only variant, will avoid the misalignment.
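
A minimal sketch of the split follows. The variant names `documentNameInline` and `documentNameStacked`, the shared `documentNameBase`, and the stand-in `FONTS` and `colors` values are hypothetical; only the style properties themselves come from the snippet above.

```typescript
// Hypothetical stand-ins for the app's theme values.
const FONTS = { mono: 'monospace' };
const colors = { textMuted: '#888888' };

// Properties shared by both filename variants.
const documentNameBase = {
  fontSize: 10,
  fontFamily: FONTS.mono,
  color: colors.textMuted,
};

const styles = {
  documentNameRow: {
    flexDirection: 'row' as const,
    alignItems: 'center' as const,
    gap: 6,
  },
  // Inline variant: rendered beside the file icon, so it carries no top margin.
  documentNameInline: { ...documentNameBase },
  // Stacked variant: keeps the centered text and spacing of the old vertical layout.
  documentNameStacked: {
    ...documentNameBase,
    textAlign: 'center' as const,
    marginTop: 4,
  },
};
```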

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/components/ChatInput/styles.ts` around lines 58 - 69, The shared document
filename style in documentName still includes spacing from the old stacked
layout, which now misaligns the text when rendered inside documentNameRow. Split
the style into separate variants in styles.ts: keep the centered/stacked spacing
only for the audio-only or vertical layout, and remove the top margin from the
inline filename style used with the file icon. Update the relevant consumer(s)
to use the appropriate style variant so documentNameRow and the filename stay
vertically aligned.

Comment on lines +519 to +537
+  // System prompt gets only the STABLE date (changes once a day) + tool guidance, so the
+  // system+tools prefix stays cacheable turn-to-turn.
+  const updatedSys = { ...sys, content: existing + TOOL_BEHAVIOR_GUIDANCE + buildDateContext() + extHints };
+  const out = [...messages.slice(0, sysIdx), updatedSys, ...messages.slice(sysIdx + 1)];
+
+  // For time-sensitive (calendar) tools, append the EXACT time to the latest user
+  // message instead of the system prefix — keeps the big prefix cacheable while still
+  // giving the model sub-day precision.
   const precise = enabledToolIds.some(id => TIME_SENSITIVE_TOOL_IDS.includes(id));
-  const updated = { ...sys, content: existing + TOOL_BEHAVIOR_GUIDANCE + buildDateTimeContext(precise) + extHints };
-  return [...messages.slice(0, sysIdx), updated, ...messages.slice(sysIdx + 1)];
+  let exactTimeAppended = false;
+  if (precise) {
+    for (let i = out.length - 1; i >= 0; i--) {
+      if (out[i].role === 'user' && typeof out[i].content === 'string') {
+        out[i] = { ...out[i], content: (out[i].content as string) + buildExactTimeNote() };
+        exactTimeAppended = true;
+        break;
+      }
+    }
+  }

🚀 Performance & Scalability | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm whether augmentSystemPromptForTools (via callLLMWithRetry) runs once per
# tool-loop iteration, which would re-append a fresh timestamp each iteration.
rg -nP -C4 '\bcallLLMWithRetry\s*\(' src/services/generationToolLoop.ts
echo '--- loop structure around callLLMWithRetry ---'
ast-grep run --pattern 'callLLMWithRetry($$$)' --lang ts src/services/generationToolLoop.ts

Repository: off-grid-ai/mobile

Length of output: 3076


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the tool-loop and prompt augmentation flow around the cited lines.
sed -n '500,820p' src/services/generationToolLoop.ts

echo '--- search for augmentSystemPromptForTools usage/definition ---'
rg -n -C 4 'augmentSystemPromptForTools|buildExactTimeNote|buildDateContext|TIME_SENSITIVE_TOOL_IDS' src/services/generationToolLoop.ts

Repository: off-grid-ai/mobile

Length of output: 19829


Cache the exact-time note once per turn.

buildExactTimeNote() runs on every tool-loop pass, so the latest user message can change between iterations and shift the cache boundary into the user/tool-result tail. Thread one precomputed note through the loop instead of regenerating it per augmentSystemPromptForTools() call.
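
The idea can be sketched roughly as follows. The `Message` type, the `runTurn` driver, the injected `clock`, and the wording of the note are all assumptions for illustration; the real `buildExactTimeNote()` and tool loop in `generationToolLoop.ts` may look quite different.

```typescript
type Message = { role: 'system' | 'user' | 'assistant' | 'tool'; content: string };

// Hypothetical formatter; the real buildExactTimeNote() may word this differently.
function buildExactTimeNote(now: Date): string {
  return `\n\n[Current time: ${now.toISOString()}]`;
}

// Appends a precomputed note to the latest user message without mutating the input.
function appendExactTimeNote(messages: Message[], note: string): Message[] {
  const out = [...messages];
  for (let i = out.length - 1; i >= 0; i--) {
    if (out[i].role === 'user') {
      out[i] = { ...out[i], content: out[i].content + note };
      break;
    }
  }
  return out;
}

// The note is computed once per turn, so every tool-loop pass sees identical text.
function runTurn(base: Message[], passes: number, clock: () => Date): Message[][] {
  const note = buildExactTimeNote(clock()); // computed before the loop, not inside it
  const prompts: Message[][] = [];
  for (let pass = 0; pass < passes; pass++) {
    prompts.push(appendExactTimeNote(base, note));
  }
  return prompts;
}
```

Because the user message is byte-identical on every pass, a prefix cache can keep reusing everything up to and including that message while tool results accumulate after it.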

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/services/generationToolLoop.ts` around lines 519 - 537, The exact-time
note is being regenerated on every tool-loop pass in
augmentSystemPromptForTools, which can change the latest user message and move
the cache boundary. Precompute the exact-time note once per turn and thread it
through the generation loop instead of calling buildExactTimeNote() inside the
prompt augmentation path; use the existing augmentSystemPromptForTools flow and
the exactTimeAppended logic to keep the appended note stable across iterations.

Comment thread src/services/llm.ts
Comment on lines 296 to 308
   let fullResponse = '';
   const ctx = this.context;
+  // These internal generations (summarize, tool-selection) never want the
+  // model to "think" - reasoning wastes the token budget, is slow + hot, and
+  // leaks into the output. Force thinking OFF (for models that gate it via the
+  // thinking channel; prose chain-of-thought is additionally curbed by prompts).
+  const params = { messages: oaiMessages, ...buildCompletionParams(settings, { disableCtxShift: this.shouldDisableCtxShift() }), ...buildThinkingCompletionParams(false, this.isGemma4Model()), n_predict: maxTokens };
   const completionWork = safeCompletion(ctx, () => ctx.completion(
-    { messages: oaiMessages, ...buildCompletionParams(settings, { disableCtxShift: this.shouldDisableCtxShift() }), n_predict: maxTokens },
-    (data) => { if (this.isGenerating && data.token) fullResponse += data.token; },
+    params,
+    (data) => { if (this.isGenerating && data.token) { fullResponse += data.token; onToken?.(data.token); } },
   ), 'generateWithMaxTokens');
   this.activeCompletionPromise = completionWork.then(() => { }, () => { });
   try { await completionWork; return fullResponse.trim(); } finally { this.isGenerating = false; this.activeCompletionPromise = null; }

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
git ls-files src/services/llm.ts
wc -l src/services/llm.ts
sed -n '1,220p' src/services/llm.ts
printf '\n--- MIDDLE ---\n'
sed -n '220,380p' src/services/llm.ts
printf '\n--- END ---\n'
sed -n '380,520p' src/services/llm.ts

Repository: off-grid-ai/mobile

Length of output: 27561


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "generateResponse|generateWithMaxTokens|completion\\(" src/services/llm.ts

Repository: off-grid-ai/mobile

Length of output: 934


🏁 Script executed:

#!/bin/bash
set -euo pipefail
wc -l src/services/llmHelpers.ts
rg -n "safeCompletion|getStreamingDelta|completion\\(|data\\.content|data\\.text|reasoning_content|reasoningContent" src/services/llmHelpers.ts src/services -g '!**/*.map'

Repository: off-grid-ai/mobile

Length of output: 7939


🏁 Script executed:

#!/bin/bash
set -euo pipefail
sed -n '1,260p' src/services/llmHelpers.ts
printf '\n--- TYPES/USAGES ---\n'
rg -n "type StreamToken|interface .*Completion|completion\\(" src -g '!**/*.map'

Repository: off-grid-ai/mobile

Length of output: 13571


🏁 Script executed:

#!/bin/bash
set -euo pipefail
wc -l src/services/providers/types.ts src/services/llmToolGeneration.ts
sed -n '1,200p' src/services/providers/types.ts
printf '\n--- TOOL GENERATION ---\n'
sed -n '1,240p' src/services/llmToolGeneration.ts

Repository: off-grid-ai/mobile

Length of output: 12691


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "content:|text:|reasoningContent|reasoning_content|CompletionResult|onToken" src/services/providers/types.ts src/services/llmToolGeneration.ts src/services/llmSafetyChecks.ts

Repository: off-grid-ai/mobile

Length of output: 984


Handle final completion payloads in generateWithMaxTokens

generateWithMaxTokens() only accumulates data.token, but this codebase already treats completionResult.content / completionResult.text as valid final output in similar paths. Add the same fallback here, or tool-selection/summarization can return an empty string when the backend doesn't stream token deltas.
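
A small sketch of the fallback, assuming the final result exposes optional `content` and `text` fields as the comment describes. The `CompletionResult` type and the `resolveCompletionText` helper are hypothetical names, not part of the codebase.

```typescript
// Hypothetical result shape, limited to the two fields named in the comment.
type CompletionResult = { content?: string; text?: string };

// Prefers streamed tokens; falls back to the final payload when nothing streamed.
function resolveCompletionText(streamed: string, result?: CompletionResult): string {
  const fromStream = streamed.trim();
  if (fromStream) return fromStream;
  // `||` rather than `??` so an empty `content` still falls through to `text`.
  return (result?.content || result?.text || '').trim();
}
```

In `generateWithMaxTokens` this would amount to capturing the value that `completionWork` resolves to and passing it in alongside `fullResponse`, which keeps the change local to the existing completion flow.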

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/services/llm.ts` around lines 296 - 308, `generateWithMaxTokens` only
appends streamed `data.token` values, so final completion payloads can be
dropped for tool-selection/summarization. Update the completion callback in
`LLMService.generateWithMaxTokens` to also read the final result fields already
used elsewhere, such as `completionResult.content` or `completionResult.text`,
and append that fallback when no token deltas are streamed. Keep the change
localized to the `safeCompletion`/`ctx.completion` flow so the return value
still comes from `fullResponse.trim()`.

Comment on lines +22 to +23
// JSON string of per-chunk metadata (recordingId, startMs, eventTitle, ...) or null.
metadata?: string | null;

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Keep metadata structured across the DB boundary.

Chunk.metadata is written as an object, but these read-side types now expose the serialized JSON string. That pushes JSON.parse onto every caller and makes the new recording metadata easy to misuse on both the semantic path and the fallback path. Parsing once in getEmbeddingsByProject() / getChunksByProject() and typing these fields like Chunk.metadata will keep the contract symmetric.

Also applies to: 33-33
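
Parsing at the boundary might look something like this. The `ChunkMetadata` fields are taken from the code comment above (`recordingId`, `startMs`, `eventTitle`); the row types and the `parseChunkMetadata` and `toChunkRow` helpers are hypothetical and stand in for whatever `getEmbeddingsByProject()` and `getChunksByProject()` actually return.

```typescript
// Hypothetical metadata shape, based on the fields named in the code comment.
type ChunkMetadata = { recordingId?: string; startMs?: number; eventTitle?: string };

// Row as stored in SQLite (metadata is a JSON string) and as returned to callers.
type StoredRow = { id: number; content: string; metadata?: string | null };
type ChunkRow = { id: number; content: string; metadata: ChunkMetadata | null };

// Parses stored JSON once; anything that is not a plain object becomes null.
function parseChunkMetadata(raw: string | null | undefined): ChunkMetadata | null {
  if (!raw) return null;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed)) {
      return parsed as ChunkMetadata;
    }
    return null;
  } catch {
    return null; // corrupt JSON should not break retrieval
  }
}

// Applied to every row on the read side so all callers see the same object shape.
function toChunkRow(row: StoredRow): ChunkRow {
  return { ...row, metadata: parseChunkMetadata(row.metadata) };
}
```

Returning `null` for unparseable values is one possible policy; logging or surfacing the error is equally reasonable if corrupt metadata should be noticed.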

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/services/rag/database.ts` around lines 22 - 23, `metadata` is being
exposed as a serialized JSON string on the read side, which breaks symmetry with
`Chunk.metadata` and forces callers to parse it themselves. Update the read-side
types in the database service to keep `metadata` structured like
`Chunk.metadata`, and parse the stored JSON once inside
`getEmbeddingsByProject()` and `getChunksByProject()` before returning results
so both the main and fallback paths use the same object shape.

Comment on lines +65 to +70
// Older installs created rag_chunks without the metadata column; add it.
// Throws "duplicate column" on DBs that already have it - safe to ignore.
try {
this.db.executeSync('ALTER TABLE rag_chunks ADD COLUMN metadata TEXT');
} catch {
// column already exists

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Only suppress the duplicate-column migration error.

The bare catch {} also hides real migration failures, so ensureReady() can succeed even though later reads/writes against rag_chunks.metadata will fail. Please rethrow anything except the expected duplicate-column case.

Suggested fix
       // Older installs created rag_chunks without the metadata column; add it.
       // Throws "duplicate column" on DBs that already have it - safe to ignore.
       try {
         this.db.executeSync('ALTER TABLE rag_chunks ADD COLUMN metadata TEXT');
-      } catch {
-        // column already exists
+      } catch (error) {
+        const message = error instanceof Error ? error.message : String(error);
+        if (!message.toLowerCase().includes('duplicate column')) {
+          throw error;
+        }
       }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/services/rag/database.ts` around lines 65 - 70, The migration in
ensureReady() is swallowing all errors around the rag_chunks metadata ALTER
TABLE step; update the try/catch so Database.ensureReady() only ignores the
expected duplicate-column case and rethrows any other failure. Use the existing
rag_chunks metadata migration block to detect that specific error from
this.db.executeSync('ALTER TABLE rag_chunks ADD COLUMN metadata TEXT') and let
unexpected migration errors propagate so callers don’t proceed with a broken
schema.

Comment on lines +114 to +122
const ctxLength = llmService.getPerformanceSettings().contextLength || 2048;
const inputBudgetTokens = Math.max(
256,
ctxLength - CHUNK_SUMMARY_TOKENS - INSTRUCTION_OVERHEAD_TOKENS - SAFETY_MARGIN_TOKENS,
);
const chunkCharBudget = inputBudgetTokens * CHARS_PER_TOKEN;
// Map split is capped smaller than the full budget so each part prefills
// fast and streams sooner; reduce/combine still use the full chunkCharBudget.
const mapCharBudget = Math.min(chunkCharBudget, MAP_INPUT_TOKEN_TARGET * CHARS_PER_TOKEN);

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Use a separate input budget for the final combine pass.

chunkCharBudget is computed with CHUNK_SUMMARY_TOKENS and then reused to decide when the merged partials are “small enough” for the final combine, but the final combine reserves FINAL_SUMMARY_TOKENS. Inputs between those two budgets skip another reduce round and can still overflow the last pass.

Suggested fix
-      const inputBudgetTokens = Math.max(
+      const reduceInputBudgetTokens = Math.max(
         256,
         ctxLength - CHUNK_SUMMARY_TOKENS - INSTRUCTION_OVERHEAD_TOKENS - SAFETY_MARGIN_TOKENS,
       );
-      const chunkCharBudget = inputBudgetTokens * CHARS_PER_TOKEN;
+      const finalInputBudgetTokens = Math.max(
+        256,
+        ctxLength - FINAL_SUMMARY_TOKENS - INSTRUCTION_OVERHEAD_TOKENS - SAFETY_MARGIN_TOKENS,
+      );
+      const reduceCharBudget = reduceInputBudgetTokens * CHARS_PER_TOKEN;
+      const finalCharBudget = finalInputBudgetTokens * CHARS_PER_TOKEN;
@@
-      const mapCharBudget = Math.min(chunkCharBudget, MAP_INPUT_TOKEN_TARGET * CHARS_PER_TOKEN);
+      const mapCharBudget = Math.min(reduceCharBudget, MAP_INPUT_TOKEN_TARGET * CHARS_PER_TOKEN);
@@
-      while (combined.length > chunkCharBudget && round < MAX_REDUCE_ROUNDS) {
+      while (combined.length > finalCharBudget && round < MAX_REDUCE_ROUNDS) {
         round += 1;
         this.emit({ phase: 'reducing', round }, onProgress);
-        const reChunks = splitIntoChunks(combined, chunkCharBudget);
+        const reChunks = splitIntoChunks(combined, reduceCharBudget);

Also applies to: 150-165
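
To make the gap concrete, here is a worked example with made-up numbers. Every constant value below is an assumption chosen for illustration (the real values in `transcriptSummarizer.ts` are not shown in this review), and `inputCharBudget` and `overflowsFinalPass` are hypothetical helpers.

```typescript
// Illustrative values only; the real constants are not shown in this review.
const CHARS_PER_TOKEN = 4;
const CHUNK_SUMMARY_TOKENS = 256;
const FINAL_SUMMARY_TOKENS = 768;
const INSTRUCTION_OVERHEAD_TOKENS = 128;
const SAFETY_MARGIN_TOKENS = 64;

// Characters of input that fit once the output reservation and overheads are subtracted.
function inputCharBudget(ctxLength: number, reservedOutputTokens: number): number {
  const tokens = Math.max(
    256,
    ctxLength - reservedOutputTokens - INSTRUCTION_OVERHEAD_TOKENS - SAFETY_MARGIN_TOKENS,
  );
  return tokens * CHARS_PER_TOKEN;
}

const ctxLength = 2048;
// (2048 - 256 - 128 - 64) * 4 = 1600 * 4 = 6400 characters
const reduceCharBudget = inputCharBudget(ctxLength, CHUNK_SUMMARY_TOKENS);
// (2048 - 768 - 128 - 64) * 4 = 1088 * 4 = 4352 characters
const finalCharBudget = inputCharBudget(ctxLength, FINAL_SUMMARY_TOKENS);

// True for inputs that pass the shared-budget check yet exceed the final pass's own budget.
function overflowsFinalPass(combinedLength: number): boolean {
  return combinedLength <= reduceCharBudget && combinedLength > finalCharBudget;
}
```

With these numbers, a combined input of 5000 characters would skip another reduce round under the shared budget but exceed the 4352 characters the final pass can hold. The window exists whenever the final reservation is larger than the per-chunk one.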

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/services/transcriptSummarizer.ts` around lines 114 - 122, The combine
logic in transcriptSummarizer’s budget calculations is reusing the chunk budget
for the final pass, which can let inputs through that still exceed the combine
reservation. Add a separate final-combine input budget in the same budgeting
section near ctxLength/inputBudgetTokens/chunkCharBudget, and use the
FINAL_SUMMARY_TOKENS reservation when deciding whether the merged partials are
small enough for the last combine. Update the reduce/combine threshold checks in
the summarization flow so the final pass uses its own budget instead of
chunkCharBudget.
