@rachana192837

Summary
Fixes #157 (CUDA OOM on long scripts).
Long-form scripts (over 10 minutes of audio) caused the 1.5B multi-speaker model to crash with CUDA out-of-memory errors.

Changes

  • Added long_script_inference.py to split long scripts into smaller chunks.
  • Generates audio sequentially and concatenates outputs.
  • Supports the 1.5B multi-speaker model without crashing.
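The splitting step can be sketched roughly like this. Note that `split_script`, `CHUNK_SIZE`, and the line-based splitting strategy are illustrative stand-ins, not necessarily the exact names or logic in `long_script_inference.py`:

```python
# Illustrative sketch of script chunking; the actual long_script_inference.py
# may use different names, chunk units, and sizes.
CHUNK_SIZE = 20  # max speaker lines per chunk; lower this if GPU memory is tight


def split_script(script: str, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split a multi-speaker script into chunks of at most chunk_size lines,
    keeping each speaker turn intact (no turn is split across chunks)."""
    lines = [ln for ln in script.splitlines() if ln.strip()]
    return [
        "\n".join(lines[i:i + chunk_size])
        for i in range(0, len(lines), chunk_size)
    ]
```

Keeping each speaker turn whole matters for a multi-speaker model, since splitting mid-turn could change voice assignment at chunk boundaries.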

How to Test

  1. Run the 1.5B model with a script longer than 10 minutes using long_script_inference.py.
  2. Confirm audio is generated without CUDA OOM errors.
  3. Adjust CHUNK_SIZE if GPU memory is low.

Notes

  • This is a workaround to reduce memory usage.
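The sequential generate-and-concatenate loop can be sketched as below. The `synthesize` function here is a placeholder for the real model call; in the actual script each chunk's output would be an audio tensor, and something like `torch.cuda.empty_cache()` between chunks helps keep peak GPU memory bounded:

```python
# Sketch of sequential per-chunk generation with concatenation.
# synthesize() is a placeholder: in the real script it would run the 1.5B
# multi-speaker model on one chunk and return audio samples.


def synthesize(chunk: str) -> list[float]:
    # Placeholder: pretend each character of script yields one audio sample.
    return [0.0] * len(chunk)


def generate_long_audio(chunks: list[str]) -> list[float]:
    audio: list[float] = []
    for chunk in chunks:
        # Sequential: only one chunk's activations are resident at a time,
        # which is what avoids the OOM on long scripts.
        audio.extend(synthesize(chunk))
        # torch.cuda.empty_cache()  # optionally free cached GPU memory here
    return audio
```

The trade-off of this workaround is that prosody is conditioned only within each chunk, so very small CHUNK_SIZE values may produce audible seams at chunk boundaries.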

@rachana192837
Author

@microsoft-github-policy-service agree
