
Integrate faster models #36

Closed
jsboige opened this issue Sep 9, 2023 · 8 comments
Labels
enhancement New feature or request

Comments

@jsboige
Contributor

jsboige commented Sep 9, 2023

The large model currently brings good results, but it seems faster versions have emerged here and there, typically through quantization, with similar results, faster inference and a much lower memory footprint.
See for instance:

What would it take to support some of those?

@jhj0517 jhj0517 added the enhancement New feature or request label Sep 9, 2023
@jhj0517
Owner

jhj0517 commented Sep 9, 2023

Hi @jsboige.
Thanks for the suggestion.
I'm currently thinking about integrating the first of the options you listed:

https://github.com/guillaumekln/faster-whisper

According to the repo, it reduces VRAM usage from 12 GB to 4 GB on the large-v2 model, as well as reducing inference time.
This is probably the easiest way to implement "faster whisper".

I guess I should add a command-line argument for faster-whisper and implement it.
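A minimal sketch of such a flag with `argparse` (the flag name `--use_faster_whisper` is hypothetical; the actual argument added to the project may differ):

```python
import argparse

# Hypothetical flag name; the project's real argument may differ.
parser = argparse.ArgumentParser(description="Whisper-WebUI launcher")
parser.add_argument(
    "--use_faster_whisper",
    action="store_true",
    help="Use the faster-whisper backend instead of openai-whisper",
)

args = parser.parse_args(["--use_faster_whisper"])
backend = "faster-whisper" if args.use_faster_whisper else "openai-whisper"
print(backend)  # -> faster-whisper
```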

@jsboige
Contributor Author

jsboige commented Sep 9, 2023

That would be great!

@jhj0517
Owner

jhj0517 commented Sep 10, 2023

faster-whisper is implemented in #37.
It reduced the processing time from 4 minutes 8 seconds to 3 minutes 53 seconds for a 30-minute audio file.
I'm not sure whether this speedup is representative, and I only compared one file (a Korean audio file), so it could be wrong, but it should at least be useful for reducing VRAM usage.

@jsboige
Contributor Author

jsboige commented Sep 10, 2023

> faster-whisper implemented in #37

Thanks, I tested it successfully on French songs, but language selection does not seem to work (I had to use auto-detection). I got the following error ("fr" is expected instead of "french"):

Error transcribing file on line french is not a valid language code

> It reduced the time consumption from 4 minutes 8 seconds -> 3 minutes 53 seconds for a 30 minute audio file. I'm not sure if this efficiency is right or not, and I only compared one file ( Korean audio file ), so it could be wrong, but it would probably be useful for reducing VRAM usage.

That does look underwhelming indeed. In any case, it definitely uses less VRAM, and I believe I noticed a perceptible speed gain on short songs' MP3s.
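One way to address the error above is to normalize the UI's full language names into the ISO 639-1 codes that faster-whisper expects. A sketch (the mapping below is a small illustrative subset, not the project's actual table):

```python
# Illustrative subset of a language-name -> ISO 639-1 code table;
# a real mapping would cover all Whisper-supported languages.
LANGUAGE_CODES = {
    "english": "en",
    "french": "fr",
    "german": "de",
    "korean": "ko",
    "auto": None,  # None lets faster-whisper auto-detect the language
}

def to_language_code(name: str):
    """Normalize a UI language name like 'French' to the code faster-whisper expects."""
    key = name.strip().lower()
    try:
        return LANGUAGE_CODES[key]
    except KeyError:
        raise ValueError(f"{name!r} is not a valid language") from None

print(to_language_code("French"))  # -> fr
```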

@jhj0517
Owner

jhj0517 commented Sep 10, 2023

> language selection does not seem to work (I had to use auto-detection)
> I got the following error ("fr" is expected instead of "french")

Thanks for pointing this out. Fixed in 6726c6a.

@guillaumekln

You set a default beam size of 5 for faster-whisper:

self.default_beam_size = 5

but you don't set the same beam size in openai-whisper (whose default is 1). You should set the same beam size in both when comparing transcription times.
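A fair benchmark passes the same decode options to both backends. A minimal sketch of sharing the beam size (the helper itself is illustrative; the commented-out `transcribe` calls assume models already loaded from each library):

```python
def decode_options(beam_size: int = 1) -> dict:
    """Options to pass identically to openai-whisper and faster-whisper.

    openai-whisper decodes greedily by default (effectively beam_size 1),
    while faster-whisper defaults to beam_size=5, so passing the value
    explicitly keeps the timing comparison apples-to-apples.
    """
    return {"beam_size": beam_size}

# Both backends would then receive the same value, e.g.:
#   whisper_model.transcribe(audio, **decode_options(1))   # openai-whisper
#   faster_model.transcribe(audio, **decode_options(1))    # faster-whisper
opts = decode_options(1)
print(opts)
```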

@jhj0517
Owner

jhj0517 commented Sep 11, 2023

@guillaumekln Thanks! You're right. I didn't read it correctly.
I compared again with the same file, and the time reduction efficiency is much better.
The processing time was reduced from 4 minutes 8 seconds -> 2 minutes 33 seconds for a 30-minute audio file, when both beam sizes are set to 1.
Thanks for informing me. Also, thanks for creating faster-whisper; it has helped my project a lot.

By the way, beam_size should be a tunable parameter. I'm thinking of creating a collapsible "Advanced Parameters" tab to include it.
There could also be logprob_threshold, no_speech_threshold, etc.
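One way to keep such tunables together before wiring them into a collapsible UI tab is a small parameter container that can be forwarded as keyword arguments. A sketch (field names follow faster-whisper's `transcribe()` keywords; the container class itself is hypothetical, not from the project):

```python
from dataclasses import dataclass, asdict

@dataclass
class AdvancedParams:
    # Defaults mirror faster-whisper's documented transcribe() defaults.
    beam_size: int = 5
    log_prob_threshold: float = -1.0
    no_speech_threshold: float = 0.6

# The UI tab would surface these fields and forward them as kwargs:
params = AdvancedParams(beam_size=1)
kwargs = asdict(params)
print(kwargs)
```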

@jhj0517
Owner

jhj0517 commented Apr 7, 2024

Resolved with faster-whisper.

@jhj0517 jhj0517 closed this as completed Apr 7, 2024