Allow protocol defined types for model inputs and outputs #281

ZachNagengast · 2024-12-19T05:37:00Z

This change will allow arbitrary input and output types as part of the model protocols, supporting full MLX or MLTensor pipelines without the need to convert between types during inference.
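As a rough illustration of the idea, the sketch below shows how a marker protocol can let each pipeline stage accept and return its own concrete type. All names here (`FeatureOutput`, `ArrayFeatures`, `EncoderStage`, `ScalingEncoder`, `TypeMismatch`) are hypothetical and are not taken from WhisperKit. A plain `[Float]` wrapper stands in for an MLMultiArray, MLTensor, or MLX array.

```swift
// Hypothetical sketch; these are not WhisperKit's actual protocols.

/// Marker protocol: any backend-specific container can conform.
protocol FeatureOutput {}

/// Stand-in for a backend type such as MLMultiArray, MLTensor, or an MLX array.
struct ArrayFeatures: FeatureOutput {
    var values: [Float]
}

/// A stage works against the protocol, so callers never convert between types.
protocol EncoderStage {
    func encode(_ features: any FeatureOutput) throws -> any FeatureOutput
}

struct TypeMismatch: Error {}

/// Example stage that only understands its own backend's container.
struct ScalingEncoder: EncoderStage {
    let scale: Float

    func encode(_ features: any FeatureOutput) throws -> any FeatureOutput {
        // Each implementation downcasts to the concrete type it supports.
        guard let input = features as? ArrayFeatures else {
            throw TypeMismatch()
        }
        return ArrayFeatures(values: input.values.map { $0 * scale })
    }
}
```

With this shape, a pipeline built entirely from stages of one backend can pass its native type end to end, and a mismatched stage fails with an explicit error instead of a silent conversion.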

This PR also contains some general fixes and cleanup:

- Uses the MelSpectrogram model input shapes for audio input length
  - Breaking change: WhisperKit.windowSamples is now Constants.defaultWindowSamples
- Fixed the timestamp token filter rules
  - Transcripts will now have more timestamp tokens (segments) within each 30s window
- Uses MLTensor operations for sampling on iOS 18+ and macOS 15+ for a 2x speedup vs BNNS
- CI and QoL upgrades
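A minimal sketch of taking the audio input length from a model's input shape rather than a hard-coded constant might look like the following. The function name `windowSamples(fromInputShape:fallback:)` is hypothetical. The sketch assumes the sample count is the largest dimension of the input shape, and it assumes a fallback of 480,000 samples (30 s at 16 kHz). How WhisperKit actually reads the shape is not shown in this PR description.

```swift
/// Hypothetical helper: pick the audio window length from a model input shape.
/// Assumes the sample count is the largest dimension (e.g. [1, 480_000]).
/// The fallback assumes a 30 s window at 16 kHz.
func windowSamples(fromInputShape shape: [Int], fallback: Int = 480_000) -> Int {
    // Use the fallback when the shape is missing or has no positive dimension.
    guard let largest = shape.max(), largest > 0 else {
        return fallback
    }
    return largest
}
```

A model exported with a shorter or longer audio window would then be picked up automatically, with the default used only when no usable shape is available.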

Add arbitrary length audio

* Support generic io for model inputs and outputs
* Add speed factor to timing report
* Use actor for early stop checks for better concurrency safety
* Add io type protocol handling and tests
* Formatting
* Fix timestamp token filter logic and tests
* Run unit tests on any branch in PR
* Upload test failure results
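To illustrate why an actor helps with early stop checks, here is a small sketch. The type `EarlyStopFlags` and its methods are hypothetical and are not WhisperKit's API. The sketch assumes each decoding task is identified by an integer ID. The actor serializes access to the flag dictionary, so concurrent tasks can set and read flags without data races.

```swift
/// Hypothetical sketch of actor-isolated early stop flags.
actor EarlyStopFlags {
    // Keyed by a caller-chosen task ID (an assumption for this example).
    private var stopRequested: [Int: Bool] = [:]

    /// Mark a task as one that should stop at its next check.
    func requestStop(for id: Int) {
        stopRequested[id] = true
    }

    /// Read the flag; tasks that were never flagged report false.
    func isStopRequested(for id: Int) -> Bool {
        stopRequested[id] ?? false
    }

    /// Drop the flag once the task has finished.
    func clear(for id: Int) {
        stopRequested[id] = nil
    }
}
```

A decoding loop would `await flags.isStopRequested(for: id)` between steps, while a callback on another task calls `await flags.requestStop(for: id)`.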

a2they and others added 3 commits December 18, 2024 16:23

Freeze more enums

2ed122e

Audio input length from CoreML metadata

f63313f

Add arbitrary length audio

ZachNagengast requested review from a2they, atiorh and EduardoPach December 19, 2024 05:37

a2they approved these changes Dec 19, 2024

EduardoPach approved these changes Dec 19, 2024

atiorh approved these changes Dec 19, 2024

ZachNagengast merged commit d191654 into main Dec 19, 2024
33 checks passed
