Releases · cactus-compute/cactus

18 Apr 17:49

github-actions

v1.14

40a7123

v1.14 Latest


What's Changed

Enable SDL2 live mic recording in chat build by @rshemet in #579
fix gemma4 conversion and cli by @jakmro in #581
Made non-thinking default for gemma4 & Introduce simultaneous multimodality by @ParkiratS in #582
gemma4 vision memory optimizations by @jakmro in #590
added a prompt fix for better vision+audio prompt, removed code that … by @kar-m in #589
added default confidence and wired that for vlm by @kar-m in #591
Gemma4 fixes by @ParkiratS in #588
Gemma4 tool calling and RoPE fixes by @ncylich in #594
increased rolling window size for gemma4 by @kar-m in #593

Full Changelog: v1.13...v1.14

Contributors

rshemet, kar-m, and 3 other contributors

Assets 2

14 Apr 01:54

github-actions

v1.13

4cc8202

v1.13

What's Changed

Improve model publishing error handling in workflow by @jakmro in #546
Fix VLM prefill cache reuse image path slicing by @KayaanT in #545
Add cache cleanup for Hugging Face models in export_and_publish_model by @jakmro in #547
Add branch override option to publish workflow by @ncylich in #548
Fix Gemma 3n conversion memory usage by @ncylich in #549
Update blog URL in README.md by @amerkld in #544
Parakeet streaming fix by @ParkiratS in #551
Tool call prompt formatting by @jakmro in #558
Feature Cleanup by @ParkiratS in #559
Add Whisper v3 (large-v3) support by @ncylich in #557
Add LFM2.5 VL 450M by @yujonglee in #567
LFM2-VL-450M: fix garbled output (vision tower, template, kernel) by @ncylich in #565
Change default transcribe model to parakeet-tdt-0.6b-v3 by @rshemet in #566
Expose min_p and repetition_penalty in completion options by @DuFanYin in #560
Graph save load by @cattermelon1234 in #556
Pyannote features and optimizations by @jakmro in #571
Karen/needle by @kar-m in #574
fix apple i8mm detection to use runtime sysctl check by @DuFanYin in #562
Fix streaming transcribe by @jakmro in #576

New Contributors

@amerkld made their first contribution in #544
@DuFanYin made their first contribution in #560

Full Changelog: v1.12...v1.13

Contributors

KayaanT, DuFanYin, and 8 other contributors

Assets 2

02 Apr 16:26

github-actions

v1.12

636e09f

v1.12

What's Changed

Gemma 4
Versioned docs with quickstart and SDK chooser by @rshemet in #522
add CACTUS_CLOUD_API_BASE by @yujonglee in #521
Model weights discoverability by @jakmro in #520
fix: set telemetry framework to "rust" for Rust bindings by @rshemet in #519
fix: prefer local libcactus over system-installed in test build by @rshemet in #516
Engine updates: tool calling, compute_entropy, converter improvements by @ncylich in #517
cactus_prefill by @mhayes853 in #512
Missing torch ops by @cattermelon1234 in #518
Docs fixes by @jakmro in #524
docs: document custom vocabulary support for transcription by @ayushmk7 in #525
Youtu by @jakmro in #530
Clean up Hugging Face cache after model export by @jakmro in #533
Added parakeet optimizations for apple by @ParkiratS in #534
Cactus torch api clean by @cattermelon1234 in #529
add CACTUS_CLOUD_HEADERS support by @yujonglee in #531
Parakeet encoder optimization by @ParkiratS in #535
Update docs site_url to docs.cactuscompute.com by @rshemet in #527
Custom vocabulary support for Parakeet TDT by @rshemet in #532
Fix VLM crash when adding multiple images in multi-turn conversation by @FarooqMulla in #539
New LayerNorm Kernel by @nshejwalkar in #540
Add pyannote/segmentation-3.0 speaker diarization (10ms, 976× realtime) by @rshemet in #538
Tinyllama by @ParkiratS in #536

New Contributors

@cattermelon1234 made their first contribution in #518
@ayushmk7 made their first contribution in #525
@FarooqMulla made their first contribution in #539

Full Changelog: v1.11...v1.12

Contributors

FarooqMulla, rshemet, and 8 other contributors

Assets 2

11 Mar 17:36

github-actions

v1.11

a5acad3

v1.11

What's Changed

Fix/issue#490 by @lennartvoelz in #491
simplify and align sdks by @jakmro in #489
remove models by @jakmro in #492
Update model configurations and enhance workflow settings in publish_… by @jakmro in #495
Update workflow to use macos-latest instead of macos-latest-xlarge by @jakmro in #496
Add dynamic max_tokens estimation based on audio length in cactus_tra… by @jakmro in #499
macOS: link clang_rt.osx to fix SME2 (_arm_tpidr2*) link failures under rustc by @yujonglee in #498
Add FFI log control: cactus_log_set_level and cactus_log_set_callback by @yujonglee in #497
Karen/qwen3p5 by @kar-m in #481
CLI upgrades by @rshemet in #504
feat(stt): custom vocabulary biasing for all speech models by @vyomshah05 in #451
Add Gemma 3N (text-only) model support by @ncylich in #493
fix: make FunctionGemma prompt formatting strict by @lennartvoelz in #502
fix: apply logit bias before greedy sampling by @ncylich in #507
remove redundant file linking for tie_word_embeddings by @jakmro in #506
Port general engine improvements for TinyLlama by @ncylich in #513
Speech-to-Text Timestamps by @jakmro in #515

New Contributors

@lennartvoelz made their first contribution in #491

Full Changelog: v1.10...v1.11

Contributors

rshemet, kar-m, and 5 other contributors

Assets 2

04 Mar 07:25

github-actions

v1.10

da4c917

v1.10

What's Changed

Enhance model publishing workflow with detailed metadata and licenses by @jakmro in #459
Added parakeet to publish to hf yaml by @ParkiratS in #464
Update telemetry for supported platforms by @justinl66 in #465
added back moe weight conversion by @kar-m in #468
adjust manual workflow for model publish by @jakmro in #470
Parakeet blog by @ammesatyajit in #467
perf: add FP16 fast path for LayerNorm by @yujonglee in #433
Issue #406: Bilinear + Depthwise Optimizations by @PiyawanChaiprasit2006 in #466
ARM SME2: Accelerate MatMul FP16 by @aarav18 in #457
build: add Objective-C ARC support for NPU sources by @jakmro in #475
long transcription by @jakmro in #482
Language detection by @ParkiratS in #471
Parakeet tdt by @ParkiratS in #476
kotlin: expose forceTools in CompletionOptions by @rshemet in #484
Update model list in README and publish_to_hf.yml with new LiquidAI m… by @jakmro in #487
test: updated rag test conditions by @nshejwalkar in #488
optimize scale correction in cactus_attention_f16_h64 by @jakmro in #485
fix greedy sampler ignoring logit suppression by @jakmro in #486

New Contributors

@PiyawanChaiprasit2006 made their first contribution in #466
@aarav18 made their first contribution in #457

Full Changelog: v1.9...v1.10

Contributors

ammesatyajit, rshemet, and 8 other contributors

Assets 2

26 Feb 06:09

github-actions

v1.9

0f1afe2

v1.9

What's New

50% faster int4
Parakeet models
LFM2-MOE models
Bug fixes
Hybrid Inference

PRs

fix stt test and add cpp ci by @yujonglee in #413
add IRFFT by @yujonglee in #425
fixed lfm2 vlm lmhead issue that came in with hf 5.0.0 by @kar-m in #426
raspberry pi numbers and linux fixes by @kar-m in #437
Added parakeet model by @ParkiratS in #443
Adding parakeet graph by @ParkiratS in #446
Parakeet kernel by @ParkiratS in #445
added cloud fallback and documentation+tests by @kar-m in #369
Parakeet FFI by @ParkiratS in #447
Parakeet convert and tests by @ParkiratS in #444
Hybrid transcription blog post by @rshemet in #449
Fixed missing engine changes by @ParkiratS in #453
feat(python): add context manager support for safe resource cleanup by @yogyam in #412
Completed ubuntu CICD pipeline by @ncylich in #455
Tie-embed-conversion-fix by @ncylich in #454
tiny graph fix and added benchmark by @kar-m in #456

Full Changelog: v1.8...v1.9

Breaking changes

Weights unfortunately need to be refreshed for this :(

Contributors

rshemet, kar-m, and 4 other contributors

Assets 2

24 Feb 02:06

github-actions

v1.8

b1c378b

v1.8

What's Changed

Kernel optimisations by @HenryNdubuaku in #397
Improve INT4 by @ncylich and @jrajala6 in #343
add einops dependency to requirements by @jakmro in #371
Add language parameter support for Whisper transcription by @rshemet in #384
added moe support for lfm by @kar-m in #374
Add raw FFI binding for Rust by @yujonglee in #382
fix: handle spaces in paths when running shell commands by @adithya-n05 in #377
fixing sentencepiece detection for transformers 5.0+ (still backwards compatible) by @ncylich in #373
Improve Telemetry by @mhayes853 in #372
proprietary commit by @HenryNdubuaku
Update performance metrics for iPhone 13 Mini and Galaxy A56 by @jakmro in #386
fix: improve version sorting and enhance model export tagging by @jakmro in #387
Add Rust SDK and language parameter documentation by @rshemet in #389
Basic addition of int4 functionality by @jrajala6 in #343
add scalar log by @yujonglee in #390
fix assertion and linux build in rust test by @yujonglee in #392
Justin/api fixes by @justinl66 in #380
Update telemetry by @justinl66 in #394
docs: add compatibility guidelines for runtime and weights by @jakmro in #398
add STFT_COMPLEX, derive stft_magnitude via graph composition by @yujonglee in #395

New Contributors

@yujonglee made their first contribution in #382
@adithya-n05 made their first contribution in #377

Full Changelog: v1.7...v1.8

Note:
This breaks the weights.

Contributors

HenryNdubuaku, rshemet, and 8 other contributors

Assets 2

18 Feb 06:22

HenryNdubuaku

v1.7

89cd747

v1.7

What's Changed

Brew setup by @HenryNdubuaku
Cactus auth by @HenryNdubuaku
Hybrid inference by the cactus team
Karen/vlm fix by @kar-m in #311
fixed moonshine state resetting and gemma3 4b layernorm loading by @kar-m in #317
fix: LFM2 multiple tool calls by @mhayes853 in #316
fix hf publish by @jakmro in #323
update models list by @jakmro in #324
Fixing pip command errors by @rshemet in #322
Add instructions for installing Ruby version for xcodeproj gem by @jakmro in #327
tests: remove duplicate vlm_multiturn test in runner by @AI-I224 in #332
fix: replace NSLog with CACTUS_LOG for iOS NPU debuggability by @KayaanT in #328
Kernel_attention optimization by @Ayan9074 in #319
M4airbenchmarks by @Ayan9074 in #336
docs: update cactus test command description for transcribe models (#297) by @AI-I224 in #339
Accelerate FP16 matmul via cblas_sgemm for Apple AMX by @KayaanT in #340
Fix hybrid attention sliding window for Gemma (#320) by @jrajala6 in #338
bench: update README benchmark with M2 MacBook Air results by @vyomshah05 in #335
docs: add iPad Pro (12.9") (6th Gen) benchmarks (#296) by @AI-I224 in #333
removed unused graph i/o methods by @ncylich in #345
feat: cpp-native telemetry by @justinl66 in #326
Update CPP Telemetry to point to main DB by @justinl66 in #350
update python bindings for stream transcribe by @jakmro in #351
Update CPP Telemetry by @justinl66 in #352
added only flag by @nshejwalkar in #347
Added warmups and increased iterations for performance testing by @nshejwalkar in #355
CMF Phone 2 Pro benchmarks by @jakmro in #356
Vad by @jakmro in #353
Cli reconvert by @jakmro in #357
Asr cloud merging by @kar-m in #348
Add optional cloud key prompt for transcribe by @rshemet in #359
HF support multiple precision options by @jakmro in #361
Add precision parameter to download_from_hf by @jakmro in #362
revert silero download logic by @jakmro in #365
Cactus clean now clears cache, Session metrics initialized properly for telemetry by @justinl66 in #363
Curl prepack by @kar-m in #358
Fix/f16 reduction accum by @vyomshah05 in #344
Update telemetry by @justinl66 in #366
Accelerate FP16 attention via cblas_sgemm for Apple AMX by @KayaanT in #346

New Contributors

@AI-I224 made their first contribution in #332
@jrajala6 made their first contribution in #338
@vyomshah05 made their first contribution in #335
@nshejwalkar made their first contribution in #347

Full Changelog: v1.6.0...v1.7

@mhayes853 API has breaking changes

Contributors

HenryNdubuaku, KayaanT, and 11 other contributors

Assets 2

01 Feb 01:06

HenryNdubuaku

v1.6.0

a195956

v1.6

What's Changed

Kernel Optimisations & advanced quantisation by @HenryNdubuaku
Moonshine by @kar-m
HF publish by @jakmro
Streaming API by @jakmro
Linux ARM support by @ncylich
Stop generation on model end token by @Ayan9074
i8MM runtime detection by @mhayes853

FFI Note: This breaks the API

Contributors

HenryNdubuaku, kar-m, and 4 other contributors

Assets 2

09 Jan 23:27

HenryNdubuaku

v1.5

776e240

v1.5

What's Changed

Groupwise quantisation by @HenryNdubuaku
Speech-To-Text streaming by @jakmro
KV Quantisation by @HenryNdubuaku
Evals by @justinl66 and @ParkiratS
INT4 support by @HenryNdubuaku
Rust bindings by @mrsarac

Bindings: Please check Cactus FFIs again @jakmro @mrsarac @mhayes853

Contributors

mrsarac, HenryNdubuaku, and 4 other contributors

Assets 2
