feat: wandler (onnx) engine with cuda acceleration by TimPietruskyRunPod · Pull Request #156 · runpod-labs/a2go

TimPietruskyRunPod · 2026-04-13T07:54:55Z

Summary

Adds Wandler as a third inference engine alongside llama.cpp and MLX
Full end-to-end support: CLI --engine flag with auto-detection, Docker image install, entrypoint with CUDA acceleration, a2go doctor on Mac, deploy output in site
Separates LFM 2 and LFM 2.5 into distinct catalog entries, adds LFM 2.5 GGUF variant

Changes

CLI (a2go/cmd/):

--engine flag on a2go run with auto-detection from catalog
execRunWandler() run path with health checks and gateway setup
StartWandler() service function
Wandler engine recognized in model validation and listing
a2go doctor installs wandler via npm on Mac
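
The auto-detection described above could look roughly like the sketch below. It is a minimal illustration, not the code from this PR: `pickEngine` and `knownEngines` are hypothetical names, and the engine identifiers are assumed from the llamacpp/mlx/wandler names used elsewhere in this pull request. An explicit `--engine` value wins; otherwise the engine recorded for the model in the catalog is used.

```go
package main

import "fmt"

// knownEngines lists the three engines named in this PR.
// The exact identifiers are an assumption.
var knownEngines = map[string]bool{
	"llamacpp": true,
	"mlx":      true,
	"wandler":  true,
}

// pickEngine is a hypothetical helper: it prefers the value of the
// --engine flag and falls back to the engine stored in the catalog
// entry for the selected model.
func pickEngine(flagValue, catalogEngine string) (string, error) {
	engine := flagValue
	if engine == "" {
		engine = catalogEngine // auto-detection from catalog
	}
	if !knownEngines[engine] {
		return "", fmt.Errorf("unknown engine %q", engine)
	}
	return engine, nil
}

func main() {
	// No flag given: the catalog entry decides.
	engine, err := pickEngine("", "wandler")
	fmt.Println(engine, err)
}
```

Validating after the fallback means a bad catalog entry is rejected the same way as a bad flag value.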

Docker (Dockerfile.unified, scripts/entrypoint-unified.sh):

Wandler installed via npm install -g wandler@latest
Entrypoint detects NVIDIA GPU and passes --device cuda
LD_LIBRARY_PATH set for cuDNN so onnxruntime-node can use CUDA execution provider
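
The real entrypoint is a shell script; the Go sketch below only illustrates the decision it makes, under stated assumptions. Checking for `nvidia-smi` on `PATH` is an assumed detection heuristic, the function names are hypothetical, and the cuDNN directory is a placeholder rather than the path used in the image. The `--llm` and `--device cuda` flags are the ones named in this PR.

```go
package main

import (
	"fmt"
	"os/exec"
)

// hasNvidiaGPU reports whether nvidia-smi is on PATH.
// This is an assumed heuristic, not necessarily the entrypoint's check.
func hasNvidiaGPU() bool {
	_, err := exec.LookPath("nvidia-smi")
	return err == nil
}

// wandlerArgs builds the argument list for the wandler process and
// appends "--device cuda" only when a GPU was detected.
func wandlerArgs(llmModel string, gpu bool) []string {
	args := []string{"--llm", llmModel}
	if gpu {
		args = append(args, "--device", "cuda")
	}
	return args
}

// prependLibraryPath puts the cuDNN directory in front of an existing
// LD_LIBRARY_PATH value so the CUDA execution provider can find it.
func prependLibraryPath(current, cudnnDir string) string {
	if current == "" {
		return cudnnDir
	}
	return cudnnDir + ":" + current
}

func main() {
	// "example/model-onnx" and "/opt/cudnn/lib" are placeholders.
	fmt.Println(wandlerArgs("example/model-onnx", hasNvidiaGPU()))
	fmt.Println(prependLibraryPath("/usr/local/lib", "/opt/cudnn/lib"))
}
```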

Registry:

Wandler entry in engines.json
LFM 2.5 1.2B GGUF variant added (LiquidAI/LFM2.5-1.2B-Instruct-GGUF)
LFM 2 / LFM 2.5 separated into distinct families
Display names fixed: "LFM 2", "LFM 2.5" (proper spacing)

Site:

Deploy output generates --engine wandler when Wandler models are selected
Wandler variant preserved in deploy commands (not overridden by platform resolution)

Test plan

Go CLI builds and vets clean
TypeScript compiles clean
Shell entrypoint syntax valid
Site: engine filter shows wandler models, engine pills switch variants, deploy output shows correct commands
Runpod RTX 5090: wandler --device cuda serves LFM 2.5 ONNX with CUDA acceleration
Runpod RTX 5090: Hermes gateway proxies to Wandler successfully

vercel · 2026-04-13T07:55:00Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
a2go	Ready	Preview, Comment	Apr 15, 2026 9:29am

- Add engine filter UI (platform, engine, type filter groups in popover)
- Add 9 ONNX model registry configs (wandler engine)
- Add engine-resolver.ts for centralized platform+engine resolution
- Add platform-state.ts for per-platform draft persistence
- Redesign platform selector as deselectable pills (no web platform)
- Add EngineRow to SelectedModels for switching between llamacpp/mlx/wandler
- Add engineCategory field to CatalogModel via resolveEngine()
- Add wandler to model.schema.json engine enum
- Refactor ModelCatalog to use search+filter popovers
- Update url-state.ts to use platform param (backward compat with os param)

- add wandler to engines.json, Dockerfile.unified, and docker entrypoint
- add --engine flag to cli with auto-detection from catalog
- add StartWandler() service and execRunWandler() run path
- install wandler in a2go doctor on mac
- entrypoint detects nvidia gpu and passes --device cuda
- set LD_LIBRARY_PATH for cudnn so onnxruntime-node can use cuda
- deploy output generates --engine wandler when wandler models selected
- separate lfm 2 and lfm 2.5 into distinct catalog entries
- add lfm 2.5 1.2b gguf variant (LiquidAI/LFM2.5-1.2B-Instruct-GGUF)
- fix model display names: "LFM2" -> "LFM 2", "LFM2.5" -> "LFM 2.5"

- wandler runs llm + stt in one process: entrypoint scans for wandler audio model and passes --stt alongside --llm
- audio service case skips when wandler llm already handles stt
- StartWandler() and execRunWandler() now accept stt model
- remove kokoro-82m-onnx (wandler doesn't support tts)
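
A rough Go sketch of the combined LLM + STT selection described in this commit, offered as an illustration only. The `Model` struct, its field values, and both function names are hypothetical; the actual scan lives in the shell entrypoint. The idea is to find the wandler LLM and the wandler audio model, pass both to one process, and skip the standalone audio service when that process already covers STT.

```go
package main

import "fmt"

// Model is a hypothetical, simplified view of a selected catalog entry.
type Model struct {
	ID     string
	Engine string // assumed values: "wandler", "llamacpp", "mlx"
	Type   string // assumed values: "llm", "audio"
}

// pickWandlerModels returns the first wandler LLM and the first
// wandler audio model found; either may be empty.
func pickWandlerModels(models []Model) (llm, stt string) {
	for _, m := range models {
		if m.Engine != "wandler" {
			continue
		}
		switch {
		case m.Type == "llm" && llm == "":
			llm = m.ID
		case m.Type == "audio" && stt == "":
			stt = m.ID
		}
	}
	return llm, stt
}

// wandlerModelArgs passes --stt alongside --llm when both are present.
// Without a wandler LLM there is no wandler process to start.
func wandlerModelArgs(models []Model) []string {
	llm, stt := pickWandlerModels(models)
	if llm == "" {
		return nil
	}
	args := []string{"--llm", llm}
	if stt != "" {
		args = append(args, "--stt", stt)
	}
	return args
}

// audioHandledByWandler reports whether the separate audio service
// can be skipped because the wandler LLM process already serves STT.
func audioHandledByWandler(models []Model) bool {
	llm, stt := pickWandlerModels(models)
	return llm != "" && stt != ""
}

func main() {
	selected := []Model{
		{ID: "example-llm-onnx", Engine: "wandler", Type: "llm"},
		{ID: "example-stt-onnx", Engine: "wandler", Type: "audio"},
	}
	fmt.Println(wandlerModelArgs(selected), audioHandledByWandler(selected))
}
```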

vercel Bot deployed to Preview April 13, 2026 18:17 View deployment

TimPietruskyRunPod mentioned this pull request Apr 13, 2026

feat(site): platform-first redesign with web platform support #154

Closed

6 tasks

TimPietruskyRunPod force-pushed the feat/engine-filter-onnx branch from 3d5fdbc to ccca2f5 Compare April 13, 2026 18:24

vercel Bot deployed to Preview April 13, 2026 18:24 View deployment

vercel Bot deployed to Preview April 15, 2026 08:26 View deployment

TimPietruskyRunPod changed the title ~~feat(site): engine filter system with onnx/wandler model support~~ feat: wandler (onnx) engine with cuda acceleration Apr 15, 2026

vercel Bot deployed to Preview April 15, 2026 09:12 View deployment

fix: include wandler engine in agent deploy prompt

99058ad

the agent tab prompt now says "with the wandler engine" when wandler models are selected, so the a2go skill knows which engine to use.

vercel Bot deployed to Preview April 15, 2026 09:19 View deployment

fix: always include engine in agent deploy prompt, update a2go skill

a2de9eb

- agent prompt now shows engine for all engines (llama.cpp, mlx, wandler), not just wandler; future-proofs for additional engines
- update a2go skill in repo with --engine flag docs and wandler section

vercel Bot deployed to Preview April 15, 2026 09:29 View deployment

TimPietruskyRunPod merged commit 17c05f2 into main Apr 15, 2026
11 of 12 checks passed

TimPietruskyRunPod merged 5 commits into main from feat/engine-filter-onnx
