Add correct timestamp for segment based on session timer #1359

yujonglee · 2025-08-17T17:16:41Z

Summary by cubic

Adds session-based start/end timestamps to audio chunks and uses them to set segment start/end during transcription. This fixes inaccurate segment timing and falls back when timestamps aren’t provided.

New Features
- AudioChunk now includes start_timestamp_ms and end_timestamp_ms.
- SimpleAudioChunk stores these fields; streaming fills them from the session timer.
- Transcription sets segment.start/end from these timestamps (ms to seconds).
- VAD test updated to ignore the new fields.
Migration
- If you implement AudioChunk, add start_timestamp_ms and end_timestamp_ms. Return None if not available.

coderabbitai · 2025-08-17T17:16:46Z

📝 Walkthrough

Walkthrough

Adds optional per-chunk start/end millisecond timestamps through the streaming and transcription pipeline: extends the AudioChunk trait and SimpleAudioChunk, threads timestamps into process_transcription, annotates produced segments when present, and updates the VAD producer to populate chunk timestamps.

Changes

Cohort / File(s)	Summary
Transcription streaming core `crates/whisper-local/src/stream.rs`	Added `start_timestamp_ms`/`end_timestamp_ms` to `AudioChunk` trait and `SimpleAudioChunk` (fields + accessors). Updated `process_transcription` signature to accept `timestamps: (Option<usize>, Option<usize>)`, propagate timestamps to produced `Segment`s (ms→s) when present, and change transcription error handling to log and terminate the stream (return `Poll::Ready(None)`). Updated call sites to pass timestamps (rodio: `(None,None)`, metadata path: chunk timestamps).
Producer (VAD streaming) `crates/transcribe-whisper-local/src/service/streaming.rs`	`process_vad_stream` now sets `start_timestamp_ms` and `end_timestamp_ms` on constructed `SimpleAudioChunk` from the input chunk. No control-flow changes.

Sequence Diagram(s)

sequenceDiagram
  participant Producer as Producer (VAD / Rodio)
  participant Stream as Stream Pipeline
  participant Proc as process_transcription
  participant Whisper as Whisper Engine
  participant Down as Downstream Consumer

  Producer->>Stream: AudioChunk(samples, meta, start/end_ts?)
  Stream->>Proc: samples, meta, timestamps (Option, Option)
  Proc->>Whisper: transcribe(samples, meta)
  alt success
    Whisper-->>Proc: segments
    Proc->>Proc: attach start/end (if provided, ms → s)
    Proc-->>Stream: Segment stream (with timestamps when present)
    Stream-->>Down: Segments
  else error
    Whisper-->>Proc: error
    Proc-->>Stream: terminate (Ready(None))
    Stream-->>Down: end of stream
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Filter silence #1200 — Modifies SimpleAudioChunk/AudioChunk and streaming logic; likely touches the same timestamp propagation and producer construction code.

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f19f856 and 1c6aa58.

📒 Files selected for processing (2)

crates/transcribe-whisper-local/src/service/streaming.rs (1 hunks)
crates/whisper-local/src/stream.rs (6 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

crates/transcribe-whisper-local/src/service/streaming.rs

🧰 Additional context used

📓 Path-based instructions (1)

**/*.{js,ts,tsx,rs}

⚙️ CodeRabbit Configuration File

**/*.{js,ts,tsx,rs}: 1. No error handling.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".

Files:

crates/whisper-local/src/stream.rs

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: build (macos, x86_64-apple-darwin, macos-latest)
GitHub Check: build (macos, aarch64-apple-darwin, macos-latest)

🔇 Additional comments (6)

crates/whisper-local/src/stream.rs (6)

23-25: AudioChunk: addition of start/end timestamp accessors looks good

The API matches the PR intent and keeps timestamps optional for fallback behavior.

31-33: SimpleAudioChunk: timestamp fields align with trait; Default remains valid

Option defaults to None, so the derive(Default) behavior is preserved.

44-50: Accessor implementations are correct and efficient

Returning Option by value leverages Copy semantics; no cloning overhead.

132-132: Rodio path: explicit (None, None) preserves current behavior

Good call to keep timestamps absent for this path.

172-173: Metadata path: correctly plumbs per-chunk timestamps into transcription

Reads from the chunk and forwards as intended.

Also applies to: 179-179

197-197: process_transcription: signature update is coherent with call sites

The tuple parameter keeps the function signature concise without over-structuring.

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch timestamp-wip

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

crates/whisper-local/src/stream.rs (3)

31-33: Consider using u64 for timestamps (ms) to avoid arch-dependent limits

Using usize ties max timestamp to pointer width. If sessions can exceed ~49 days (on 32-bit) or you want platform consistency, prefer u64.

Example:

-    pub start_timestamp_ms: Option<usize>,
-    pub end_timestamp_ms: Option<usize>,
+    pub start_timestamp_ms: Option<u64>,
+    pub end_timestamp_ms: Option<u64>,

This change would cascade to the trait and call sites.

192-199: Unnecessary lifetime parameter on process_transcription

The explicit 'a isn’t required since no borrows escape the function. Removing it simplifies the signature.

Apply:

-fn process_transcription<'a>(
-    whisper: &'a mut Whisper,
-    samples: &'a [f32],
-    current_segment_task: &'a mut Option<Pin<Box<dyn Stream<Item = Segment> + Send>>>,
+fn process_transcription(
+    whisper: &mut Whisper,
+    samples: &[f32],
+    current_segment_task: &mut Option<Pin<Box<dyn Stream<Item = Segment> + Send>>>,
     meta: Option<serde_json::Value>,
     timestamps: (Option<usize>, Option<usize>),
 ) -> Poll<Option<Segment>> {

211-214: Guard against invalid timestamps (end < start)

If out-of-order timestamps slip in, you’ll create negative durations downstream. Add a simple guard before setting segment times.

-                    if let (Some(start_ms), Some(end_ms)) = timestamps {
-                        segment.start = start_ms as f64 / 1000.0;
-                        segment.end = end_ms as f64 / 1000.0;
-                    }
+                    if let (Some(start_ms), Some(end_ms)) = timestamps {
+                        if end_ms >= start_ms {
+                            segment.start = start_ms as f64 / 1000.0;
+                            segment.end = end_ms as f64 / 1000.0;
+                        }
+                    }

crates/transcribe-whisper-local/src/service/streaming.rs (1)

300-302: Treat zero timestamps as “unknown” to avoid overriding model times with 0.0s

If VAD emits 0 for unknown times, always wrapping with Some(...) forces segments to [0.0, 0.0]. Prefer mapping zeros to None and let downstream defaults apply.

Apply:
-                    start_timestamp_ms: Some(chunk.start_timestamp_ms),
-                    end_timestamp_ms: Some(chunk.end_timestamp_ms),
+                    start_timestamp_ms: (chunk.start_timestamp_ms != 0).then_some(chunk.start_timestamp_ms),
+                    end_timestamp_ms: (chunk.end_timestamp_ms != 0).then_some(chunk.end_timestamp_ms),

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2ec859c and 59f6926.

📒 Files selected for processing (3)

crates/transcribe-whisper-local/src/service/streaming.rs (1 hunks)
crates/vad/src/lib.rs (1 hunks)
crates/whisper-local/src/stream.rs (6 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.{js,ts,tsx,rs}

⚙️ CodeRabbit Configuration File

**/*.{js,ts,tsx,rs}: 1. No error handling.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".

Files:

crates/vad/src/lib.rs
crates/transcribe-whisper-local/src/service/streaming.rs
crates/whisper-local/src/stream.rs

🔇 Additional comments (4)

crates/whisper-local/src/stream.rs (3)

44-50: LGTM on accessors

Returning Option<usize> by value is fine here; Option is Copy. No extra clones or lifetimes involved.

131-132: Explicitly passing (None, None) for Rodio path is appropriate

Acknowledged that Rodio-based streams don’t have chunk-timestamps; forwarding (None, None) keeps behavior consistent.

172-173: Timestamp propagation from metadata path: LGTM

Forwarding (chunk.start_timestamp_ms(), chunk.end_timestamp_ms()) to process_transcription is the right spot to inject timing metadata.

Also applies to: 179-180

crates/vad/src/lib.rs (1)

57-57: Pattern with .. keeps tests resilient to struct evolution

Good change. It future-proofs the test against added fields (e.g., timestamps) while still extracting samples.

crates/whisper-local/src/stream.rs

cubic-dev-ai

cubic analysis

No issues found across 3 files. Review in cubic

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

crates/whisper-local/src/stream.rs (2)
23-25: Trait extension looks good; consider default implementations to avoid semver break

Adding required methods forces all downstream implementers to update. If you want to soften the break, provide default impls that return None.

Apply this diff to add defaults:
 pub trait AudioChunk: Send + 'static {
     fn samples(&self) -> &[f32];
     fn meta(&self) -> Option<serde_json::Value>;
-    fn start_timestamp_ms(&self) -> Option<usize>;
-    fn end_timestamp_ms(&self) -> Option<usize>;
+    fn start_timestamp_ms(&self) -> Option<usize> { None }
+    fn end_timestamp_ms(&self) -> Option<usize> { None }
 }
31-33: Prefer u64 for millisecond timestamps

On 32-bit targets, usize may cap at ~49.7 days of ms. Using u64 avoids platform-dependent limits and makes intent explicit.

If you decide to switch, update the trait, struct fields, and call sites accordingly in the same PR to keep types consistent.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 59f6926 and f19f856.

📒 Files selected for processing (2)

crates/transcribe-whisper-local/src/service/streaming.rs (1 hunks)
crates/whisper-local/src/stream.rs (6 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

crates/transcribe-whisper-local/src/service/streaming.rs

🧰 Additional context used

📓 Path-based instructions (1)

**/*.{js,ts,tsx,rs}

⚙️ CodeRabbit Configuration File

**/*.{js,ts,tsx,rs}: 1. No error handling.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".

Files:

crates/whisper-local/src/stream.rs

🔇 Additional comments (5)

crates/whisper-local/src/stream.rs (5)

44-50: AudioChunk impl for SimpleAudioChunk — LGTM

The new accessors correctly forward the optional timestamps.

131-132: Rodio path fallback — LGTM

Passing (None, None) preserves existing behavior when timestamps aren’t available.

172-173: Timestamp threading from chunk — LGTM

Forwarding start/end from the chunk via the trait accessors is correct and keeps the stream generic over AudioChunk.

179-180: Passing timestamps through to processing — LGTM

The API threading is clear and localized; good separation of concerns.

197-198: process_transcription signature change — all call sites updated

Both invocations in crates/whisper-local/src/stream.rs now include the new timestamps parameter, and no other calls to process_transcription were found in the codebase. LGTM.

coderabbitai · 2025-08-18T07:13:45Z

crates/whisper-local/src/stream.rs

+                    if let (Some(start_ms), Some(end_ms)) = timestamps {
+                        segment.start = start_ms as f64 / 1000.0;
+                        segment.end = end_ms as f64 / 1000.0;
+                    }


💡 Verification agent

🧩 Analysis chain

Bug: Overwrites per-segment timing with chunk bounds; offset instead of replacing

This sets every segment’s start/end to the same chunk-level [start, end], discarding per-segment timing from Whisper and making all segments identical in time. Instead, add the chunk start offset to each segment’s existing times and optionally clamp to the chunk end.

Apply this diff to offset/clamp rather than overwrite:

- if let (Some(start_ms), Some(end_ms)) = timestamps { - segment.start = start_ms as f64 / 1000.0; - segment.end = end_ms as f64 / 1000.0; - } + // Add chunk start offset; keep model-provided intra-chunk timing. + if let (Some(start_ms), _) = timestamps { + let offset_s = start_ms as f64 / 1000.0; + segment.start += offset_s; + segment.end += offset_s; + } + // Optionally clamp segment end to the chunk end if provided. + if let (_, Some(end_ms)) = timestamps { + let end_limit_s = end_ms as f64 / 1000.0; + if segment.end > end_limit_s { + segment.end = end_limit_s; + } + }

To double-check expectations, confirm how Segment.start/end are used downstream and whether they currently assume absolute or relative times:

🏁 Script executed:

#!/bin/bash # Locate Segment definition and usages of start/end to infer expected semantics. rg -n -C3 -P 'struct\s+Segment\b' --type=rust rg -n -C3 -P '\bSegment\b' --type=rust | rg -n -C1 -P '\bstart\b|\bend\b'

Length of output: 3201

Bug: Preserve per-segment timing by offsetting, not overwriting

The current code at crates/whisper-local/src/stream.rs (lines 211–214) replaces every segment’s times with the enclosing chunk’s bounds, discarding the model’s intra-chunk timestamps. Instead, you should add the chunk start offset to each segment’s existing start/end and clamp the end to the chunk boundary if provided.

• Location: crates/whisper-local/src/stream.rs:211–214

Apply this patch:

- if let (Some(start_ms), Some(end_ms)) = timestamps { - segment.start = start_ms as f64 / 1000.0; - segment.end = end_ms as f64 / 1000.0; - } + // Add chunk start offset; keep model-provided intra-chunk timing. + if let (Some(start_ms), _) = timestamps { + let offset_s = start_ms as f64 / 1000.0; + segment.start += offset_s; + segment.end += offset_s; + } + // Clamp segment end to the chunk end if provided. + if let (_, Some(end_ms)) = timestamps { + let end_limit_s = end_ms as f64 / 1000.0; + if segment.end > end_limit_s { + segment.end = end_limit_s; + } + }

This ensures each segment retains its relative timing from Whisper while mapping to absolute positions in the full audio.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if let (Some(start_ms), Some(end_ms)) = timestamps {

segment.start = start_ms as f64 / 1000.0;

segment.end = end_ms as f64 / 1000.0;

}

// Add chunk start offset; keep model-provided intra-chunk timing.

if let (Some(start_ms), _) = timestamps {

let offset_s = start_ms as f64 / 1000.0;

segment.start += offset_s;

segment.end += offset_s;

}

// Clamp segment end to the chunk end if provided.

if let (_, Some(end_ms)) = timestamps {

let end_limit_s = end_ms as f64 / 1000.0;

if segment.end > end_limit_s {

segment.end = end_limit_s;

}

}

🤖 Prompt for AI Agents

In crates/whisper-local/src/stream.rs around lines 211 to 214, the code currently overwrites each segment’s start/end with the chunk bounds; instead, add the chunk start offset to the segment’s existing start/end (preserving the model’s intra-chunk timing), convert ms->seconds consistently, and if an enclosing chunk end is provided clamp the adjusted segment.end to that chunk end rather than overwriting it; ensure you only apply the offset when the segment has timestamps (Some) and leave segments without timestamps unchanged.

coderabbitai bot reviewed Aug 17, 2025

View reviewed changes

crates/whisper-local/src/stream.rs Show resolved Hide resolved

cubic-dev-ai bot reviewed Aug 17, 2025

View reviewed changes

yujonglee force-pushed the timestamp-wip branch from 59f6926 to f19f856 Compare August 18, 2025 07:08

coderabbitai bot reviewed Aug 18, 2025

View reviewed changes

wip

1c6aa58

yujonglee force-pushed the timestamp-wip branch from f19f856 to 1c6aa58 Compare August 18, 2025 07:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add correct timestamp for segment based on session timer #1359

Add correct timestamp for segment based on session timer #1359

Uh oh!

yujonglee commented Aug 17, 2025 •

edited by cubic-dev-ai bot

Loading

Uh oh!

coderabbitai bot commented Aug 17, 2025 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

Status, Documentation and Community

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

cubic-dev-ai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Aug 18, 2025

Uh oh!

Uh oh!

-                    if let (Some(start_ms), Some(end_ms)) = timestamps {
-                        segment.start = start_ms as f64 / 1000.0;
-                        segment.end = end_ms as f64 / 1000.0;
-                    }
+                    // Add chunk start offset; keep model-provided intra-chunk timing.
+                    if let (Some(start_ms), _) = timestamps {
+                        let offset_s = start_ms as f64 / 1000.0;
+                        segment.start += offset_s;
+                        segment.end += offset_s;
+                    }
+                    // Clamp segment end to the chunk end if provided.
+                    if let (_, Some(end_ms)) = timestamps {
+                        let end_limit_s = end_ms as f64 / 1000.0;
+                        if segment.end > end_limit_s {
+                            segment.end = end_limit_s;
+                        }
+                    }

Add correct timestamp for segment based on session timer #1359

Are you sure you want to change the base?

Add correct timestamp for segment based on session timer #1359

Uh oh!

Conversation

yujonglee commented Aug 17, 2025 • edited by cubic-dev-ai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by cubic

Uh oh!

coderabbitai bot commented Aug 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

Status, Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

cubic analysis

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 18, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

yujonglee commented Aug 17, 2025 •

edited by cubic-dev-ai bot

Loading

coderabbitai bot commented Aug 17, 2025 •

edited

Loading