Fix TDT beam search timestamp alignment #14912
Open
+130
−38
What does this PR do ?
Fixes two TDT beam search timestamp issues:
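- Crash when beam search is used with `timestamps=True` (NoneType iteration error)
- ~160ms timestamp offset in beam search output compared to greedy decoding
Collection: ASR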
Changelog
- Store `token_durations` in `BatchedBeamHyps` for TDT models during beam search
- Add `_timestamp_semantics` flag on `Hypothesis` objects
- Update `_compute_offsets_tdt` to handle both START and END timestamp semantics
- Fix `"char"` field documentation from `List[str]` to `List[int]` (pre-existing bug: `y_sequence` contains integer token IDs, not strings)
Problem
Issue 1: Crash with beam search + timestamps
When using TDT beam search with `timestamps=True`, the code crashes with a NoneType iteration error.
Root cause: Beam search (`BatchedBeamHyps`) doesn't populate the `token_duration` field that `_compute_offsets_tdt` requires.
Issue 2: ~160ms timestamp offset
After fixing the crash (by computing durations from timestamp diffs), beam search timestamps are still ~160ms late compared to greedy. This occurs because:
- Beam search stores `timestamp = timesteps + duration` (END time)
- Greedy decoding stores `timestamp = timesteps` (START time)
- `_compute_offsets_tdt` assumed all timestamps were START times
Solution
Three-part approach:
1. Store token durations in `BatchedBeamHyps` (already receiving them during beam search, now stored)
2. Add a `_timestamp_semantics` attribute on `Hypothesis` objects
   - Beam search: `"end"` (timestamps are END times)
   - Greedy: `"start"` (timestamps are START times)
3. Compute offsets in `_compute_offsets_tdt` based on semantics:
   - END: `start_offset = timestamp - duration, end_offset = timestamp`
   - START: `start_offset = timestamp, end_offset = timestamp + duration`
Usage
No API changes. The fix is transparent to users.
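To illustrate why the change is transparent, here is a minimal sketch of the semantics-aware offset rule from the Solution section. `ToyHypothesis`, `compute_offsets`, and the `token_id` field are hypothetical stand-ins written for this example. They are not NeMo's `Hypothesis` class or the actual `_compute_offsets_tdt` implementation, and the frame values are made up.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToyHypothesis:
    """Hypothetical stand-in for a decoding hypothesis (not NeMo's Hypothesis)."""

    y_sequence: List[int]      # integer token IDs
    timestamp: List[int]       # one frame index per token
    token_duration: List[int]  # predicted duration per token, in frames
    timestamp_semantics: str = "start"  # "start" or "end"


def compute_offsets(hyp: ToyHypothesis) -> List[Dict[str, int]]:
    """Convert per-token timestamps to start/end offsets, honoring the semantics flag."""
    if hyp.timestamp_semantics not in ("start", "end"):
        raise ValueError(f"Unknown timestamp semantics: {hyp.timestamp_semantics!r}")

    offsets = []
    for token_id, ts, dur in zip(hyp.y_sequence, hyp.timestamp, hyp.token_duration):
        if hyp.timestamp_semantics == "end":
            # Timestamp already includes the duration, so step back to find the start.
            start, end = ts - dur, ts
        else:
            # Timestamp is the emission frame, so step forward to find the end.
            start, end = ts, ts + dur
        offsets.append({"token_id": token_id, "start_offset": start, "end_offset": end})
    return offsets


# Same two tokens recorded under each convention (illustrative frame values).
greedy_like = ToyHypothesis([5, 9], [10, 12], [2, 3], timestamp_semantics="start")
beam_like = ToyHypothesis([5, 9], [12, 15], [2, 3], timestamp_semantics="end")

# Both conventions resolve to identical offsets once the flag is respected.
assert compute_offsets(greedy_like) == compute_offsets(beam_like)
```

In this sketch, treating the END-style timestamps as START times would shift every offset later by the token's duration, which is the kind of lag described under Issue 2.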
Impact
GitHub Actions CI
Ready for CI. Please add "Run CICD" label.
Before your PR is "Ready for review"
Pre checks:
PR Type:
Who can review?
@andrusenkoau
Per contributor guidelines, requesting review from ASR team: