
Experiments with HunyuanVideo model? #5

Open
anm-ol opened this issue Feb 19, 2025 · 2 comments


anm-ol commented Feb 19, 2025

Hi, this is a very interesting paper and thank you for providing the code!

I wanted to ask whether you have tested your method with HunyuanVideo, as it is one of the state-of-the-art open-source video generation models.

  • If you have used it, could you share any insights or results from your experiments?
  • If not, was there a specific reason for not including it (e.g., performance issues, compatibility constraints, or other limitations)?
  • What kind of difference do you see between U-Net- and DiT-based models (say, AnimateDiff vs. CogVideoX-2B)? Is the increased VRAM usage justified by the better results?
@YujieOuO
Collaborator

Thanks for your attention.

  1. We have now released Light-A-Video with the Wan2.1 backbone. Wan2.1 is one of the best DiT-based video foundation models.
  2. For a resolution of 512×512, AnimateDiff requires approximately 23 GB of GPU memory to generate a 16-frame video. It is more suitable for consumer-grade GPUs, such as the RTX 3090. In contrast, Wan2.1 consumes around 36 GB of GPU memory to generate a 49-frame video at the same resolution.
  3. Actually, for relighting quality, AnimateDiff may be the more suitable VDM backbone. This is because the current IC-Light model we are using is based on a U-Net architecture, which is aligned with the Stable Diffusion model underlying AnimateDiff. This alignment ensures greater consistency and coherence in the relighting process.
  4. A DiT-based VDM preserves finer details and supports longer videos and a wider range of resolutions.
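As a rough back-of-envelope on point 2: fp16 weights alone scale linearly with parameter count, so a larger DiT backbone starts from a much higher memory floor before activations, the VAE, and attention buffers are counted. The sketch below is only an illustration; the parameter counts in the comments are approximate public figures, not numbers from this repo, and real VRAM usage depends heavily on resolution and frame count.

```python
def fp16_weight_gb(num_params: float) -> float:
    """Rough fp16 weight footprint in GiB (2 bytes per parameter).
    Activations, the VAE, and attention buffers add substantially on top."""
    return num_params * 2 / 1024**3

# Approximate, assumed parameter counts for illustration only:
#   AnimateDiff (SD1.5 U-Net + motion module): roughly 1.3B params
#   Wan2.1 (larger DiT variant): roughly 14B params
for name, params in [("AnimateDiff-scale (~1.3B)", 1.3e9),
                     ("Wan2.1-scale (~14B)", 14e9)]:
    print(f"{name}: ~{fp16_weight_gb(params):.1f} GiB of fp16 weights")
```

This is why the 23 GB vs. 36 GB gap above is plausible even before accounting for Wan2.1 generating 49 frames instead of 16.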

@YujieOuO
Collaborator

bear.mp4

relight_prompt: "a bear walking on the rock, nature lighting, soft light"
bg_source: "TOP"
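For readers unfamiliar with the `bg_source` setting: in IC-Light's demo convention, `TOP`/`BOTTOM`/`LEFT`/`RIGHT` seed the lighting direction with a greyscale gradient background that is bright on the lit side. A minimal sketch of that idea is below; the exact gradient this repo builds internally is an assumption, and `gradient_background` is a hypothetical helper, not a function from the codebase.

```python
import numpy as np

def gradient_background(bg_source: str, h: int = 512, w: int = 512) -> np.ndarray:
    """Return an (h, w, 3) uint8 gradient image: bright where the light
    comes from, dark on the opposite side (TOP/BOTTOM/LEFT/RIGHT)."""
    vertical = bg_source in ("TOP", "BOTTOM")
    ramp = np.linspace(255.0, 0.0, h if vertical else w)  # bright -> dark
    if bg_source in ("BOTTOM", "RIGHT"):
        ramp = ramp[::-1]  # flip so the bright side matches the source
    if vertical:
        img = np.tile(ramp[:, None], (1, w))   # vary along rows
    else:
        img = np.tile(ramp[None, :], (h, 1))   # vary along columns
    return np.repeat(img[:, :, None], 3, axis=2).astype(np.uint8)

bg = gradient_background("TOP")
print(bg.shape, bg[0, 0, 0], bg[-1, 0, 0])  # (512, 512, 3) 255 0
```

With `bg_source: "TOP"`, the top row is brightest, which biases the relighting toward overhead "soft light" as in the bear example above.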
