SpecAugment? #100

Open
picheny-nyu opened this issue Aug 9, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@picheny-nyu

🚀 Feature

Add SpecAugment as a form of audio augmentation.

Motivation

SpecAugment (https://arxiv.org/abs/1904.08779) has led to large improvements in speech recognition performance over the last few years.

Pitch

Any serious audio augmentation toolkit should include SpecAugment. It has become so popular in speech recognition that one wonders about the quality of a research paper that does not use it as standard processing. Combined with speed and frequency perturbation, it has become de rigueur in the speech recognition field. It should be offered as an additional form of processing and accompanied by best practices for applying the technique, since there are many variations.

Alternatives

People use time and frequency perturbations by themselves, but when you have a lot of training data, the gains from those methods tend to wash out. SpecAugment improves results even with a lot of training data (at the expense of bigger models).

Additional context

You might also wish to include suggestions for how to integrate AugLy into popular speech recognition toolkits like Kaldi.

@mthrok

mthrok commented Aug 9, 2021

Note: if you are using PyTorch, torchaudio has an implementation of SpecAugment as TimeStretch, TimeMasking and FrequencyMasking.
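
For readers unfamiliar with what the masking transforms do, here is a minimal, dependency-free sketch of the frequency and time masking steps. It is not torchaudio's or AugLy's implementation: the function names, the parameter names `freq_mask_param` and `time_mask_param`, and the list-of-lists spectrogram layout are all assumptions made for illustration, and the time-warping step from the paper is left out.

```python
import random


def mask_axis(spec, max_width, axis, rng, fill=0.0):
    """Return a copy of `spec` with one random contiguous band set to `fill`.

    spec is a list of rows indexed as spec[freq_bin][time_frame] (assumed layout).
    axis=0 masks frequency bins (rows); axis=1 masks time frames (columns).
    """
    n_freq, n_time = len(spec), len(spec[0])
    size = n_freq if axis == 0 else n_time
    # Band width is drawn uniformly from 0..max_width (capped at the axis size).
    width = rng.randint(0, min(max_width, size))
    # Start position is chosen so the band fits inside the spectrogram.
    start = rng.randint(0, size - width)
    out = [row[:] for row in spec]  # copy so the input is left untouched
    for f in range(n_freq):
        for t in range(n_time):
            idx = f if axis == 0 else t
            if start <= idx < start + width:
                out[f][t] = fill
    return out


def spec_augment(spec, freq_mask_param=2, time_mask_param=3, seed=None):
    """Apply one frequency mask, then one time mask (hypothetical helper)."""
    rng = random.Random(seed)
    masked = mask_axis(spec, freq_mask_param, axis=0, rng=rng)
    return mask_axis(masked, time_mask_param, axis=1, rng=rng)


if __name__ == "__main__":
    # Toy "spectrogram": 8 frequency bins by 10 time frames, all ones.
    toy = [[1.0] * 10 for _ in range(8)]
    for row in spec_augment(toy, seed=0):
        print(row)
```

In a real pipeline the spectrogram would be an array or tensor and the masks would typically be applied per training example, often with several masks per axis; those choices are among the "many variations" the issue mentions.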

@picheny-nyu
Author

For sure. But torchaudio also comes with other standard augmentation processes, in which case people may not wish to switch between torchaudio and AugLy.

@zpapakipos zpapakipos added the enhancement New feature or request label Sep 23, 2021

3 participants