Conversation

@aartilalwani commented Sep 5, 2025

DiT and VAE changes for the LTX inference pipeline; more optimizations, including the full pipeline, will follow in upcoming PRs.

@SolitaryThinker added the go (Trigger Buildkite CI) label Sep 5, 2025
@aartilalwani marked this pull request as ready for review September 30, 2025 02:56
@aartilalwani marked this pull request as draft September 30, 2025 02:58
@aartilalwani marked this pull request as ready for review October 31, 2025 02:55
@SolitaryThinker (Collaborator)

Could you run pre-commit on this PR?

pre-commit install --hook-type pre-commit --hook-type commit-msg

# You can manually run pre-commit with
pre-commit run --all-files

@SolitaryThinker self-requested a review November 14, 2025 20:12
generator = VideoGenerator.from_pretrained(
    model_path="data/Lightricks/LTX-Video",
Collaborator:

VideoGenerator.from_pretrained() should download the model for you, meaning that passing Lightricks/LTX-Video directly as the model_path should work. Could you test and simplify this example accordingly? Thanks!
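If the download works, the example presumably reduces to something like this (a sketch; the surrounding example code is assumed):

# Pass the Hugging Face repo id directly and let from_pretrained()
# download the weights, instead of pointing at a local "data/" checkout.
generator = VideoGenerator.from_pretrained(
    model_path="Lightricks/LTX-Video",
)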

Comment on lines +30 to +44
# TODO: fix all of the configs so it's exact match

# # Text encoder configuration
# text_encoder_configs: tuple[EncoderConfig, ...] = field(
# # todo: set max length later
# #def ltx_t5_config():
# # config = T5Config()
# # config.tokenizer_kwargs["max_length"] = 128
# # return config

# # @dataclass
# # class LTXConfig(PipelineConfig):
# # text_encoder_configs: tuple[EncoderConfig, ...] = field(
# # default_factory=lambda: (ltx_t5_config(), ))
# default_factory=lambda: (T5Config(), ))
Collaborator:

Are these not needed? If so, please remove them.
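For reference, if the 128-token override is still wanted, a cleaned-up version of what these comments sketch might look like the following (a sketch only; the import paths are assumptions, not necessarily the repo's actual layout):

from dataclasses import dataclass, field

# Assumed import paths, for illustration only.
from fastvideo.configs.models.encoders import EncoderConfig, T5Config
from fastvideo.configs.pipelines import PipelineConfig

def ltx_t5_config() -> T5Config:
    # Cap prompts at 128 tokens, per the commented-out TODO above.
    config = T5Config()
    config.tokenizer_kwargs["max_length"] = 128
    return config

@dataclass
class LTXConfig(PipelineConfig):
    text_encoder_configs: tuple[EncoderConfig, ...] = field(
        default_factory=lambda: (ltx_t5_config(),))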

Comment on lines +104 to +120
# TODO: load differently for each config
# Text-to-Video: Only needs the decoder (to decode latents to video)
# Image-to-Video: Needs both encoder (to encode input image) and decoder
# @dataclass
# class LTXT2VConfig(LTXConfig):
# def __post_init__(self):
# super().__post_init__()
# self.vae_config.load_encoder = False
# self.vae_config.load_decoder = True

# @dataclass
# class LTXI2VConfig(LTXConfig):
# def __post_init__(self):
# super().__post_init__()
# self.vae_config.load_encoder = True
# self.vae_config.load_decoder = True

Collaborator:

Also here, please clean up the comments.
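If the per-mode split is kept, the commented-out code already describes the intent: text-to-video only decodes latents, so the VAE encoder can be skipped, while image-to-video must also encode the conditioning image. A minimal sketch, assuming LTXConfig is the pipeline config defined earlier in this file:

from dataclasses import dataclass

@dataclass
class LTXT2VConfig(LTXConfig):
    def __post_init__(self):
        super().__post_init__()
        self.vae_config.load_encoder = False  # decoder-only for T2V
        self.vae_config.load_decoder = True

@dataclass
class LTXI2VConfig(LTXConfig):
    def __post_init__(self):
        super().__post_init__()
        self.vae_config.load_encoder = True  # encode the input image for I2V
        self.vae_config.load_decoder = True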

Comment on lines +8 to +14
class LTXSamplingParam(SamplingParam):
# Video parameters
height: int = 512
width: int = 704

# Most defaults set in pipeline config
num_inference_steps: int = 50
Collaborator:

What's the default number of frames that the official repo generates? Could you add it here as well?
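For comparison, the diffusers LTXPipeline defaults to num_frames=161 (alongside height=512, width=704, and 50 inference steps); if the official repo matches that, the addition might look like:

    # Assumed to match the official default of 161 frames (the diffusers
    # LTXPipeline default); worth verifying against Lightricks/LTX-Video.
    num_frames: int = 161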

from diffusers.utils.torch_utils import maybe_allow_in_graph
# from ..attention import FeedForward
from fastvideo.attention import DistributedAttention, LocalAttention
#from diffusers.attention_processor import Attention
Collaborator:

Please clean up these comments as well.

Comment on lines +46 to +51
# Add ImageVAEEncodingStage for I2V (conditional based on input)
# Before LatentPreparation for I2V
# if fastvideo_args.pipeline_config.ltx_i2v_mode:
# self.add_stage(
# stage_name="image_vae_encoding_stage",
# stage=LTXImageVAEEncodingStage(vae=self.get_module("vae")))
Collaborator:

Please remove this if it's not needed.
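If the I2V path is still planned, the registration could be restored along the lines the comments already sketch (ltx_i2v_mode and LTXImageVAEEncodingStage are taken from the commented-out code, not verified against the repo):

# Only image-to-video runs need to VAE-encode the input image, and the
# stage must be registered before latent preparation.
if fastvideo_args.pipeline_config.ltx_i2v_mode:
    self.add_stage(
        stage_name="image_vae_encoding_stage",
        stage=LTXImageVAEEncodingStage(vae=self.get_module("vae")))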
