Refine VAD segmentation in short silences #20

ISzoke · 2024-01-29T14:32:01Z

Currently, the dataset splitter segments data according to the VAD settings, which can produce long segments (longer than 30 s, for example).
The postprocessing then cuts these at exactly 30 s, which results in cuts in the middle of speech.

We need to update this so that the split falls in a short silence close to the 30 s mark.

This can be done either at the data builder level (GPU accelerated) or at the trainer transformation level.
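
As a rough illustration of the idea, here is a minimal sketch of picking cut points in short silences near the 30 s limit. It is not the project's implementation: the function name `split_at_silences`, the `min_len` search window, and the assumption that the VAD step can provide silence intervals as `(start, end)` pairs in seconds are all hypothetical. When no silence falls inside the window, the sketch falls back to the current hard cut at `max_len`.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def split_at_silences(
    seg_start: float,
    seg_end: float,
    silences: List[Interval],
    max_len: float = 30.0,
    min_len: float = 20.0,
) -> List[Interval]:
    """Split [seg_start, seg_end] into chunks no longer than max_len seconds.

    Hypothetical helper: `silences` is assumed to be a list of (start, end)
    silence intervals in seconds, e.g. derived from VAD output.
    """
    assert 0.0 <= min_len < max_len

    # Candidate cut points: the midpoint of each silence inside the segment.
    cuts = sorted(
        (s + e) / 2.0 for s, e in silences if seg_start < (s + e) / 2.0 < seg_end
    )

    chunks: List[Interval] = []
    cur = seg_start
    while seg_end - cur > max_len:
        lo, hi = cur + min_len, cur + max_len
        # Prefer the latest silence within the window (closest to max_len).
        candidates = [c for c in cuts if lo < c <= hi]
        # Fall back to a hard cut at max_len if no silence is available.
        cut = max(candidates) if candidates else hi
        chunks.append((cur, cut))
        cur = cut
    chunks.append((cur, seg_end))
    return chunks


if __name__ == "__main__":
    # A 70 s segment with three short silences.
    print(split_at_silences(0.0, 70.0, [(10.0, 10.5), (27.0, 27.5), (55.0, 55.5)]))
```

Choosing the latest silence inside the window keeps chunks as close to 30 s as possible. A variant could instead prefer the longest silence in the window, trading chunk length for a safer cut.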
