CUDA out of memory. Tried to allocate 91.71 GiB. #4

Aniruddh-J · 2024-11-17T23:27:05Z

Is it possible to transcribe long audio files of around 3 hours? I am trying to transcribe a 3-hour audio file to Hindi text, but it uses a huge amount of memory.

import torch
import nemo.collections.asr as nemo_asr

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
nm_model = nemo_asr.models.EncDecCTCModel.from_pretrained('ai4bharat/indicconformer_stt_hi_hybrid_rnnt_large')
nm_model.freeze() # inference mode
nm_model = nm_model.to(device)

nm_model.cur_decoder = 'rnnt'
text = nm_model.transcribe(audio=str(processed_audio_file), batch_size=1, language_id='hi')[0]

The memory usage is huuuuge.


ryback123 · 2024-11-18T06:18:36Z

You can try two things:

1. In the model config, set the self_attention_model in NeMo to local, and set a smaller local_attention_window.
2. Chunk the audio into multiple segments using Silero VAD, run inference on the individual chunks, and finally merge the output transcripts.
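To see roughly why the first option helps, here is a back-of-the-envelope sketch that counts attention scores for full versus local self-attention. The 80 ms frame step and the window of 128 frames are illustrative assumptions, not values read from this model's config, and `attention_scores` is a hypothetical helper, not a NeMo function.

```python
from typing import Optional


def attention_scores(num_frames: int, window: Optional[int] = None) -> int:
    """Count the attention scores one head computes in one layer.

    Full self-attention compares every frame with every other frame (T * T).
    Local attention compares each frame only with neighbours within `window`
    frames on each side, so at most T * (2 * window + 1) scores.
    """
    if window is None:
        return num_frames * num_frames
    return num_frames * min(num_frames, 2 * window + 1)


# Illustrative assumptions only: 3 hours of audio, one encoder frame per 80 ms.
frames = 3 * 3600 * 1000 // 80
full = attention_scores(frames)
local = attention_scores(frames, window=128)

print(f"frames: {frames}")
print(f"full attention scores:  {full:,}")
print(f"local attention scores: {local:,}")
print(f"reduction factor: {full / local:.0f}x")
```

Full attention grows with the square of the audio length, while local attention grows linearly, which is why a single 3-hour input can ask for far more memory than a GPU has.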

The second option might give a better result since the model has been trained on 5-25 second audio chunks, so it would be most accurate on audio files with a duration within that range.
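A minimal sketch of the grouping step for the chunking route, assuming you already have speech timestamps in the shape Silero VAD's `get_speech_timestamps` returns (a list of dicts with `start` and `end` in samples). `group_segments` and `merge_transcripts` are hypothetical helper names, the 16 kHz sample rate is an assumption, and the actual VAD and model calls are left out.

```python
from typing import Dict, List, Tuple


def group_segments(
    timestamps: List[Dict[str, int]],
    sample_rate: int = 16000,
    max_seconds: float = 25.0,
) -> List[Tuple[int, int]]:
    """Greedily merge consecutive speech segments into chunks of at most max_seconds.

    Each input dict holds 'start' and 'end' in samples. Returns (start, end)
    sample ranges. A single segment longer than the limit is kept as its own
    chunk rather than being split.
    """
    max_samples = int(max_seconds * sample_rate)
    chunks: List[Tuple[int, int]] = []
    for seg in timestamps:
        start, end = seg["start"], seg["end"]
        # Extend the current chunk if the result still fits within the limit.
        if chunks and end - chunks[-1][0] <= max_samples:
            chunks[-1] = (chunks[-1][0], end)
        else:
            chunks.append((start, end))
    return chunks


def merge_transcripts(texts: List[str]) -> str:
    """Join per-chunk transcripts in order, skipping empty ones."""
    return " ".join(t.strip() for t in texts if t.strip())


# Three speech segments: 0-10 s, 11-20 s and 21-30 s at 16 kHz.
speech = [
    {"start": 0, "end": 160000},
    {"start": 176000, "end": 320000},
    {"start": 336000, "end": 480000},
]
chunks = group_segments(speech)
print(chunks)
```

Each `(start, end)` range can then be sliced out of the waveform, transcribed on its own with `batch_size=1`, and the per-chunk texts passed to `merge_transcripts` in order.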

Aniruddh-J · 2024-11-18T21:32:33Z

Fantastic. I am going ahead with the audio chunk route. Besides, is it possible to turn off tqdm logging while transcribing? I have my own progress for chunks using tqdm.

For now, I am using a suppressor class to silence NeMo's output:

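import os
import sys

class SuppressNeMo:
    """Context manager that silences stderr, where tqdm writes its bars by default."""
    def __enter__(self):
        self._original_stderr = sys.stderr
        sys.stderr = open(os.devnull, "w")
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        sys.stderr.close()
        sys.stderr = self._original_stderr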
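The same idea can be sketched with the standard library's `contextlib.redirect_stderr`, which restores `sys.stderr` even if the body raises. `quiet_stderr` is a hypothetical name. This only hides output written through `sys.stderr`; whether it catches everything NeMo prints is an assumption.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def quiet_stderr():
    """Temporarily send everything written to sys.stderr to os.devnull."""
    with open(os.devnull, "w") as devnull:
        with contextlib.redirect_stderr(devnull):
            yield


with quiet_stderr():
    print("hidden", file=sys.stderr)      # swallowed
print("visible again", file=sys.stderr)   # written to the real stderr
```

As far as I know, tqdm picks its output stream when a bar is constructed, so a progress bar created before entering the block should keep drawing while bars created inside it stay hidden.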

Aniruddh-J changed the title ~~CUDA out of memory. Tried to allocate 91.71 GiB. GPU~~ CUDA out of memory. Tried to allocate 91.71 GiB. Nov 18, 2024
