LightX2V is a lightweight video generation inference framework that brings together multiple advanced video generation inference techniques in a single tool. As a unified inference platform, it supports various generation tasks, such as text-to-video (T2V) and image-to-video (I2V), across different models. The name "X2V" reflects the transformation of different input modalities (X, such as text or images) into video (V) output.
Please refer to our documentation: English Docs | 中文文档.
- ✅ HunyuanVideo-T2V
- ✅ HunyuanVideo-I2V
- ✅ Wan2.1-T2V
- ✅ Wan2.1-I2V
- ✅ Wan2.1-T2V-StepDistill-CfgDistill (recommended 🚀🚀🚀)
- ✅ Wan2.1-T2V-CausVid
- ✅ SkyReels-V2-DF
- ✅ CogVideoX1.5-5B-T2V
We provide a pre-commit hook that enforces consistent code formatting across the project.

- Install the required dependencies:

```shell
pip install ruff pre-commit
```

- Then, run the following command before committing:

```shell
pre-commit run --all-files
```
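For reference, a minimal `.pre-commit-config.yaml` that wires ruff into pre-commit might look like the sketch below. The hook repository, revision, and hook selection here are assumptions for illustration; the project's actual configuration may differ.

```yaml
# Sketch of a pre-commit config using ruff (assumed setup, not the project's exact file).
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4  # assumed revision; pin to whatever version the project uses
    hooks:
      - id: ruff           # runs the ruff linter
        args: [--fix]      # auto-fix fixable issues
      - id: ruff-format    # runs ruff's code formatter
```

With this file in the repository root, `pre-commit run --all-files` applies both hooks to every tracked file.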
Thank you for your contributions!
The code in this repository was developed with reference to the official repositories of the models listed above.
If you find our framework useful for your research, please cite our work:
```bibtex
@misc{lightx2v,
    author       = {lightx2v contributors},
    title        = {LightX2V: Light Video Generation Inference Framework},
    year         = {2025},
    publisher    = {GitHub},
    journal      = {GitHub repository},
    howpublished = {\url{https://github.com/ModelTC/lightx2v}},
}
```