
Training set size and duration for videos on UCF-101 #52

Closed
flying0712 opened this issue Mar 11, 2025 · 1 comment

Comments

@flying0712

Hi, when training on the UCF-101 trainlist01 (about 9,000 videos, taking 17 frames per video), one epoch takes 1 hour, so 270 epochs would take 11 days on a single machine with 8x V100 GPUs (batch size = 1; anything larger causes a GPU out-of-memory error), which is quite time-consuming. Could you please share how you set up the training set size and how long your training took (I noticed the paper used 32 GPUs with a batch size of 256), and whether you used all video frames or just the first 17? Thank you!

@RobertLuo1
Collaborator

RobertLuo1 commented Mar 15, 2025

Hi, thanks for your interest in our work. As indicated in ucf101_lfqgan_128_L.yaml, we use 64 NPUs for training with a global batch size of 128. The total number of epochs is 2000, adopted from the original magvit-v1 paper, and training takes about 3 days. During training, the 17 video frames are randomly sampled from each video rather than always taken from the start.
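The sampling described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual data-loading code: it assumes videos are decoded into `(T, H, W, C)` arrays and that "randomly sampled" means a random contiguous 17-frame clip (a common choice for video tokenizer training); short videos are padded by repeating the last frame.

```python
import numpy as np

def sample_clip(video: np.ndarray, clip_len: int = 17, rng=None) -> np.ndarray:
    """Sample a random contiguous clip of `clip_len` frames from a video.

    `video` is assumed to be shaped (T, H, W, C). Videos shorter than
    `clip_len` are padded by repeating the final frame.
    """
    rng = rng or np.random.default_rng()
    num_frames = video.shape[0]
    if num_frames >= clip_len:
        # Pick a random valid start offset, then slice a contiguous clip.
        start = rng.integers(0, num_frames - clip_len + 1)
        return video[start:start + clip_len]
    # Pad short videos by repeating the last frame up to clip_len.
    pad = np.repeat(video[-1:], clip_len - num_frames, axis=0)
    return np.concatenate([video, pad], axis=0)
```

Because each epoch draws a different random clip from every video, the model effectively sees more of each video over the 2000 epochs than training on a fixed 17-frame prefix would allow.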
