Concatenation of partial reference audio in front of generated audio.

I have installed voicebox via Pinokio and it is mostly working. I say mostly because I have found one combination of MP3 and reference text that causes the app to take what seems to be the last sentence of the reference and put it in front of the generated speech. The cadence is different from the original recording, so it appears to be concatenating the reference text, not the audio, and regenerating that text with the voice model. I think this happens because the reference voice import clips the audio at 15 seconds (in my case the cut landed in a convenient pause in the reference audio, which had me scratching my head a little bit, LOL), leaving the reference text longer than the clipped audio. I fixed it by deleting the voice and re-creating it with only the text that covers the clipped 15 seconds. Not sure why the implementation does that; it seems odd. In any event, it is easy to work around.
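For anyone who wants to script the workaround, here is a rough sketch of how the reference text could be trimmed to match the clipped audio. This is not how voicebox does anything internally. The 15-second limit is just what I observed, and the `trim_transcript` helper and the word timestamps are hypothetical; you would have to get the timestamps yourself (by ear or from an alignment tool).

```python
from typing import List, Tuple

# Assumed clip length, based only on the behaviour observed in this report.
CLIP_SECONDS = 15.0


def trim_transcript(words: List[Tuple[str, float]], limit: float = CLIP_SECONDS) -> str:
    """Keep only the words whose end time falls inside the clipped audio.

    `words` is a list of (word, end_time_in_seconds) pairs. Producing these
    timestamps is outside the scope of this sketch.
    """
    kept = [word for word, end in words if end <= limit]
    return " ".join(kept)


if __name__ == "__main__":
    # Made-up timestamps purely for illustration.
    example = [
        ("This", 14.2),
        ("fits.", 14.8),
        ("This", 15.6),
        ("does", 15.9),
        ("not.", 16.3),
    ]
    print(trim_transcript(example))
```

The trimmed string is what I would paste in as the reference text when re-creating the voice, so the text and the 15 seconds of audio cover the same words.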

Concatenation of partial reference audio in front of generated audio. #383
