Hi,
As mentioned in the README, GigaSpeech contains "33,000+ hours for unsupervised/semi-supervised learning". I am trying to use this unlabeled data and have already downloaded the XL subset. However, when I summed the `duration` of every audio entry in `GigaSpeech.json`, the total was only around 25,000 hours.
So my question is: does the entire XL subset constitute the 33,000 hours of data, or are additional steps needed to retrieve the full 33,000 hours?
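For reference, the duration sum described above can be sketched roughly as follows. This is only an illustration, not the official tooling: it assumes `GigaSpeech.json` has a top-level `audios` list whose entries carry a `duration` field in seconds and a `segments` list with `begin_time` / `end_time` fields. The helper names (`load_metadata`, `audio_hours`, `segment_hours`) are hypothetical.

```python
import json


def load_metadata(path):
    """Load the metadata file into a dict (path is supplied by the caller)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def audio_hours(meta):
    """Total hours over all audio entries.

    Assumes each entry in meta["audios"] has a "duration" in seconds.
    """
    return sum(audio["duration"] for audio in meta["audios"]) / 3600.0


def segment_hours(meta):
    """Total hours covered by segments only.

    Assumes each segment has "begin_time" and "end_time" in seconds.
    Audio entries without a "segments" key contribute nothing.
    """
    total_seconds = 0.0
    for audio in meta["audios"]:
        for seg in audio.get("segments", []):
            total_seconds += seg["end_time"] - seg["begin_time"]
    return total_seconds / 3600.0


# Example usage (not run here, since it needs the downloaded file):
#   meta = load_metadata("GigaSpeech.json")
#   print(f"audio total:   {audio_hours(meta):.1f} h")
#   print(f"segment total: {segment_hours(meta):.1f} h")
```

Comparing the audio-level total with the segment-level total may help show whether a gap comes from the metadata itself or from how the sum was taken.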
Many thanks!