
Added API support for local Zonos. #73

Open
wants to merge 9 commits into base: main

Conversation

PhialsBasement

Add REST API Endpoints

This PR adds FastAPI endpoints to Zonos, allowing programmatic access to the model's functionality alongside the existing Gradio interface.

Added Features

  • /models endpoint to list available models
  • /generate endpoint for text-to-speech generation
  • /speaker_embedding endpoint for creating speaker embeddings

Changes

  • Added FastAPI integration
  • Model responses are streamed as WAV files
  • Added Pydantic models for request validation (a rough sketch of the endpoint surface follows this list)
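A minimal sketch of what that surface might look like; the request fields and the `generate_audio` stand-in are assumptions, not the PR's actual code.

```python
# Minimal sketch of the described endpoints; request fields and the
# generate_audio() stand-in are assumptions, not the PR's actual code.
import io

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI(title="Zonos API")

def generate_audio(model: str, text: str, language: str) -> bytes:
    """Stand-in for the reused Zonos model-management code; returns WAV bytes."""
    return b""

class GenerateRequest(BaseModel):
    model: str = "Zyphra/Zonos-v0.1-transformer"
    text: str
    language: str = "en-us"

@app.get("/models")
def list_models() -> list[str]:
    return ["Zyphra/Zonos-v0.1-transformer", "Zyphra/Zonos-v0.1-hybrid"]

@app.post("/generate")
def generate(req: GenerateRequest):
    wav_bytes = generate_audio(req.model, req.text, req.language)
    return StreamingResponse(io.BytesIO(wav_bytes), media_type="audio/wav")

# /speaker_embedding would follow the same pattern, taking an uploaded
# reference clip and returning the computed embedding.
```

Since the API runs alongside the Gradio interface on a different port, it would be served with something like `uvicorn api:app --port 8000` (the module name is assumed from the api.py mentioned later in this thread).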

Testing

Tested with curl commands (equivalent calls are sketched below):

  • GET /models works as expected
  • POST /generate successfully generates audio
  • POST /speaker_embedding successfully creates embeddings
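The exact curl invocations aren't included in the PR text; a rough equivalent using Python's requests, with request-body fields assumed to match the sketch above:

```python
# Rough smoke test of the three endpoints; the JSON fields and the port
# are assumptions, not the PR's exact calls.
import requests

BASE = "http://localhost:8000"  # assumed port for the API process

# GET /models
print(requests.get(f"{BASE}/models").json())

# POST /generate -> WAV bytes
resp = requests.post(
    f"{BASE}/generate",
    json={"model": "Zyphra/Zonos-v0.1-transformer",
          "text": "Hello from the Zonos API.",
          "language": "en-us"},
)
with open("out.wav", "wb") as f:
    f.write(resp.content)

# POST /speaker_embedding from a reference clip (endpoint shape is assumed)
with open("reference.wav", "rb") as f:
    emb = requests.post(f"{BASE}/speaker_embedding", files={"file": f})
print(emb.status_code)
```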

The implementation reuses existing model management code and runs alongside the Gradio interface on a different port.

@PhialsBasement mentioned this pull request Feb 14, 2025
@darkacorn
Contributor

darkacorn commented Feb 14, 2025

I would maybe separate that into a different API file without Gradio,
as you'd most likely use one or the other, not both at the same time,

and have a Gradio UI that consumes the API, as a refactor, if that is the goal.

Also, as a request:

maybe try to keep it in alignment with OpenAI's TTS API,
which is very widely integrated and supported, with optional features as separate parameters.

This would allow easy integration for 3rd-party systems without much hassle and with sane defaults.
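For concreteness, OpenAI's TTS endpoint is `POST /v1/audio/speech` with a small JSON body; a compatible route might look like the sketch below. Only the request fields mirror OpenAI's API; the `synth` stand-in and everything around it are assumptions.

```python
# Sketch of an OpenAI-style route: POST /v1/audio/speech with the same
# body fields OpenAI's TTS API uses; synth() is a stand-in, not Zonos code.
from fastapi import FastAPI, Response
from pydantic import BaseModel

app = FastAPI()

def synth(model: str, text: str, voice: str) -> bytes:
    """Stand-in for Zonos inference; returns WAV bytes."""
    return b""

class SpeechRequest(BaseModel):
    model: str = "Zyphra/Zonos-v0.1-transformer"
    input: str
    voice: str = "default"
    response_format: str = "wav"  # Zonos emits WAV; mp3 would need transcoding
    speed: float = 1.0            # accepted for compatibility; may be ignored

@app.post("/v1/audio/speech")
def create_speech(req: SpeechRequest):
    audio = synth(req.model, req.input, req.voice)
    return Response(content=audio, media_type="audio/wav")
```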

@Steveboy123

Thank you @PhialsBasement, you are a lifesaver.

@darkacorn
Contributor

darkacorn commented Feb 15, 2025

[screenshot]

That's more akin to what I'm proposing (mind you, uploading a voice file for every request to a remote machine may be suboptimal).

We may even want to load the transformer and the hybrid model at the same time so there is no need to swap over; the models are small enough to fit even on tiny cards, and model loading time would hurt throughput. (Optional pinning, or fully overridable, but I would make that the default behaviour for any load-bearing API.)

In an API scenario, a batch processor with a queue could be prefixed with just which model to use, as both are present in VRAM (I'll work on that once we get a go-ahead, or at least an LGTM, from the team).

Voices could be embedded as tensors on voice upload, and on usage we just pull in the tensor to save computation.

At the moment I support mp3/wav, always converting to wav as a baseline.

Happy to help out, but I think the API and Gradio should be clearly separated. Can someone from Zyphra chime in here?
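A sketch of the "pin both models, cache voice embeddings as tensors" idea; `from_pretrained` and `make_speaker_embedding` follow the repo README, but the surrounding structure is illustrative only.

```python
# Illustrative: keep both models resident and cache speaker embeddings as
# tensors at upload time, so later requests pay no load/embedding cost.
import torch
import torchaudio
from zonos.model import Zonos

device = "cuda"

# Load once at startup so requests never pay model-load latency.
MODELS = {
    "transformer": Zonos.from_pretrained("Zyphra/Zonos-v0.1-transformer", device=device),
    "hybrid": Zonos.from_pretrained("Zyphra/Zonos-v0.1-hybrid", device=device),
}

SPEAKERS: dict[str, torch.Tensor] = {}  # voice name -> cached embedding tensor

def register_voice(name: str, wav_path: str) -> None:
    """Compute the speaker embedding once on upload and keep the tensor."""
    wav, sr = torchaudio.load(wav_path)
    SPEAKERS[name] = MODELS["transformer"].make_speaker_embedding(wav, sr)

def get_voice(name: str) -> torch.Tensor:
    # Later requests just reuse the cached tensor instead of re-embedding.
    return SPEAKERS[name]
```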

@zaydek

zaydek commented Feb 15, 2025

Just want to mention this thread as relevant for when a teammate comes around to see this PR: #37.

@darkacorn
Contributor

darkacorn commented Feb 15, 2025

Agreed, but that is different, as their API has different sampling; that should be compensatable once we know what they use.
The model conditioning has params for min_p, top_k, top_p, temperature and rep_pen, which are not exposed or used in the OSS release at the moment; only min_p for the time being.
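If those knobs were exposed, the request model could simply carry them as optional fields; the defaults below are placeholders, not the model's tuned values.

```python
# Placeholder request model exposing the sampling parameters mentioned above;
# the default values are illustrative, not the conditioning code's defaults.
from pydantic import BaseModel, Field

class SamplingParams(BaseModel):
    min_p: float = Field(0.1, ge=0.0, le=1.0)
    top_p: float = Field(1.0, ge=0.0, le=1.0)
    top_k: int = Field(0, ge=0)               # 0 = disabled
    temperature: float = Field(1.0, gt=0.0)
    repetition_penalty: float = Field(1.0, ge=1.0)
```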

@Ph0rk0z

Ph0rk0z commented Feb 15, 2025

With an OAI endpoint and speakers from a folder returned as voices, it would work straight away in SillyTavern. Add unconditional emotions and it would be good as-is.
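A sketch of the "speakers from a folder" idea: files under a speakers/ directory become the voice names the API reports and accepts. The directory layout and naming are assumptions.

```python
# Illustrative: treat every audio file under SPEAKER_DIR as a selectable voice.
from pathlib import Path

SPEAKER_DIR = Path("speakers")  # assumed location, e.g. speakers/alice.wav

def list_voices() -> list[str]:
    return sorted(p.stem for p in SPEAKER_DIR.glob("*") if p.suffix in {".wav", ".mp3"})

def resolve_voice(name: str) -> Path:
    # Map a requested voice name ("alice") back to its reference clip.
    matches = [p for p in SPEAKER_DIR.glob(f"{name}.*") if p.suffix in {".wav", ".mp3"}]
    if not matches:
        raise FileNotFoundError(f"unknown voice: {name}")
    return matches[0]
```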

@darkacorn
Contributor

With an OAI endpoint and speakers from a folder returned as voices, it would work straight away in SillyTavern. Add unconditional emotions and it would be good as-is.

Pretty much why I proposed it that way; integration in hundreds of systems would work without any extra work.

@PhialsBasement
Author

@darkacorn just threw in some of your suggestions, check it out and tell me if it's what you were thinking.

@darkacorn
Contributor

Amazing, thanks for pulling that in; good baseline.

@ther3zz

ther3zz commented Feb 16, 2025

I'm currently testing the OpenAI endpoint, will report back if I run into any issues!
That being said, it makes sense to include a Swagger docs endpoint as well (or at least some variable to enable/disable the docs page).
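FastAPI already serves Swagger UI at /docs by default; gating it behind an environment variable could be as small as the sketch below (the variable name ENABLE_DOCS is made up).

```python
# Toggle the auto-generated Swagger/ReDoc pages with an env var;
# ENABLE_DOCS is a hypothetical variable name, not something in the PR.
import os
from fastapi import FastAPI

docs_enabled = os.getenv("ENABLE_DOCS", "1") == "1"

app = FastAPI(
    docs_url="/docs" if docs_enabled else None,
    redoc_url="/redoc" if docs_enabled else None,
    openapi_url="/openapi.json" if docs_enabled else None,
)
```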

@ther3zz

ther3zz commented Feb 16, 2025

Has anyone been able to create embeddings? I'm running into this error:

{
    "detail": "'int' object has no attribute 'query'"
}

@PhialsBasement
Author

@ther3zz Fixed. The issue was in api.py; I was trying to use .query() on a CUDA stream handle, now it's just a normal UNIX timestamp instead.
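Roughly what the described fix amounts to; this is a reconstruction, not the PR's exact diff.

```python
# Reconstruction of the described change, not the actual diff.
import time

# Before: a CUDA stream handle (which can surface as a plain int) was being
# asked for .query(), hence "'int' object has no attribute 'query'".
# embedding_id = stream.query()   # AttributeError when `stream` is an int handle

# After: a plain UNIX timestamp is enough to tag the embedding artifact.
embedding_id = int(time.time())
```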

@ther3zz

ther3zz commented Feb 17, 2025

@ther3zz Fixed. The issue was in api.py; I was trying to use .query() on a CUDA stream handle, now it's just a normal UNIX timestamp instead.

Looks like it's working!

@ther3zz

ther3zz commented Feb 17, 2025

Another issue I noticed is that MODEL_CACHE_DIR=/app/models doesn't seem to work. I'm not seeing the models cached there; I see them going here: /root/.cache/huggingface/hub/
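For context, the Hugging Face hub client decides the cache location from its own settings (HF_HOME / HF_HUB_CACHE, or a cache_dir argument), so a custom MODEL_CACHE_DIR only takes effect if it is wired through. A sketch of one way to forward it; the wiring itself is an assumption about how the PR might do it.

```python
# The hub client only honours its own settings; a custom MODEL_CACHE_DIR
# has to be forwarded explicitly. The wiring shown here is an assumption.
import os

cache_dir = os.getenv("MODEL_CACHE_DIR")
if cache_dir:
    # Either set the standard env var before anything imports huggingface_hub...
    os.environ["HF_HUB_CACHE"] = cache_dir
    # ...or pass it per download, e.g.:
    # snapshot_download("Zyphra/Zonos-v0.1-transformer", cache_dir=cache_dir)
```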

@PhialsBasement
Author

Whack, I'll look into it and see what's going on there.

@Ph0rk0z

Ph0rk0z commented Feb 17, 2025

Why can't we just load models from a folder we manually saved? I get that the Hugging Face Hub is used for Docker, but not all of us are doing that.

@darkacorn
Contributor

I don't think there is anything that prevents it; you can even use it offline with the HF client.
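e.g. the generic offline pattern with the hub client: populate the cache once while online, then resolve from the local cache only. The repo id is the real one; the rest is standard huggingface_hub usage, not Zonos-specific code.

```python
# Generic offline pattern with the HF hub client: fetch once while online,
# then run with HF_HUB_OFFLINE=1 set in the environment (before the process
# starts) or pass local_files_only=True per call.
from huggingface_hub import snapshot_download

# One-time, while online (or copy the cache directory over manually):
snapshot_download("Zyphra/Zonos-v0.1-transformer")

# Later, fully offline:
path = snapshot_download("Zyphra/Zonos-v0.1-transformer", local_files_only=True)
```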

@Ph0rk0z

Ph0rk0z commented Feb 17, 2025

I've had to change the loading to from_local in Gradio and elsewhere; from_pretrained is hijacked away from torch.
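If from_local takes a config path and a checkpoint path, pointing it at a manually saved folder would look roughly like this; treat the exact signature and the paths as assumptions and check zonos/model.py.

```python
# Assumed usage of the local loader mentioned above; verify the actual
# signature in zonos/model.py before relying on this. Paths are hypothetical.
from zonos.model import Zonos

model = Zonos.from_local(
    "models/zonos-transformer/config.json",
    "models/zonos-transformer/model.safetensors",
    device="cuda",
)
```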

@PhialsBasement
Author

@ther3zz can you move this to the Issues tab over on my fork?

@ther3zz

ther3zz commented Feb 18, 2025

Another issue I noticed is that MODEL_CACHE_DIR=/app/models doesn't seem to work. I'm not seeing the models cached there; I see them going here: /root/.cache/huggingface/hub/

I don't actually see an Issues tab when on your fork.
