Description
We just finished implementing a more modular pipeline architecture based on modular diffusers, so now is a good time to try implementing a new pipeline. RollingForcing is built on Wan2.1 1.3B, like StreamDiffusionV2 and LongLive, which makes it a good candidate.
According to the paper, the main contributions that seem relevant to inference are:
- Joint Denoising where multiple frames are denoised simultaneously with progressive noise levels
- Attention Sink where key/value pairs of initial frames are used as a global context anchor to improve long-term consistency
I believe Attention Sink is also used in LongLive, so Joint Denoising seems to be the main new capability that Scope does not currently support at all. There might be additional things in the reference implementation to consider.
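To make the Joint Denoising idea concrete, here is a toy simulation of a rolling window in which every frame sits at a different noise level and all of them advance together in one step. It is only a sketch based on the description above, not the reference implementation: the function name, the integer noise levels, and the one-level-per-step schedule are assumptions made for illustration, and the model call is replaced by a counter decrement.

```python
from collections import deque


def rolling_joint_denoise(num_frames: int, window_size: int):
    """Toy simulation of a rolling denoising window (hypothetical helper).

    Noise levels are integers: window_size means pure noise, 0 means clean.
    Returns the order in which frames finish and the number of joint steps.
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")

    window = deque()  # entries are [frame_index, noise_level]
    finished = []
    next_frame = 0
    steps = 0

    while window or next_frame < num_frames:
        # A new pure-noise frame enters at the back of the window.
        if next_frame < num_frames:
            window.append([next_frame, window_size])
            next_frame += 1

        # Joint step: every frame in the window is denoised by one level
        # in the same pass. A real pipeline would run the model once here
        # on all window frames, each at its own timestep.
        for entry in window:
            entry[1] -= 1
        steps += 1

        # The frame at the front is the cleanest; emit it once it is clean.
        while window and window[0][1] == 0:
            finished.append(window.popleft()[0])

    return finished, steps


order, steps = rolling_joint_denoise(num_frames=4, window_size=3)
print(order, steps)
```

In this toy version each frame still receives `window_size` denoising steps, but once the window is full one frame is emitted per joint step, so 4 frames with a window of 3 finish in 6 steps rather than 12.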
Folder Structure
/pipelines
    /rollingforcing
        /blocks            # block definitions specific to this pipeline
        /components
        /modules
        modular_blocks.py  # piece together blocks
        pipeline.py        # pipeline wrapper around blocks
        test.py            # test script
        model.yaml         # config file for model params that don't change at runtime
Blocks
The LongLive blocks can be used as a reference.
The current architecture involves defining the blocks like in the file above and then creating a pipeline class that wraps execution of the blocks like this.
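As a rough picture of that shape, the sketch below treats a block as a function that reads and updates a shared state, and the pipeline class as a thin wrapper that runs the blocks in order. Everything here is hypothetical: `State`, `Block`, `ToyPipeline`, and the three placeholder blocks are illustrative names, not the actual Scope or modular diffusers API.

```python
from typing import Any, Callable, Dict, List

# Hypothetical types: a block takes the shared state and returns it updated.
State = Dict[str, Any]
Block = Callable[[State], State]


def encode_prompt(state: State) -> State:
    # Placeholder: a real block would run the text encoder.
    state["prompt_embeds"] = f"embeds({state['prompt']})"
    return state


def denoise(state: State) -> State:
    # Placeholder: a real block would run the diffusion model.
    state["latents"] = f"denoised({state['prompt_embeds']})"
    return state


def decode(state: State) -> State:
    # Placeholder: a real block would run the VAE decoder.
    state["frames"] = f"decoded({state['latents']})"
    return state


class ToyPipeline:
    """Hypothetical wrapper that executes blocks in sequence."""

    def __init__(self, blocks: List[Block]):
        self.blocks = list(blocks)

    def __call__(self, **inputs: Any) -> State:
        state: State = dict(inputs)
        for block in self.blocks:
            state = block(state)
        return state


pipeline = ToyPipeline([encode_prompt, denoise, decode])
result = pipeline(prompt="a cat")
print(result["frames"])
```

With this kind of layout, a pipeline-specific block replaces a shared one by changing a single entry in the block list, and the wrapper stays the same.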
My guess is that RollingForcing will at the very least need a new JointDenoiseBlock that substitutes for the current DenoiseBlock from the pipelines/wan2_1 directory. This might also affect how the KV cache is cleaned, depending on what is done in their reference implementation. If so, we can either try to tweak CleanKVCacheBlock so it can be re-used in this context, or create a new block for cleaning the KV cache. The former is preferred if possible!
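For orientation, one way cache cleaning could interact with an attention sink is sketched below: when the cache grows past a limit, the first few entries are kept as the sink and only the most recent entries are kept after them. This is an assumption about the general technique, not a description of what CleanKVCacheBlock or the RollingForcing reference implementation does. The function name and parameters are hypothetical, and a plain list stands in for the per-frame key/value tensors.

```python
from typing import List, TypeVar

T = TypeVar("T")


def evict_with_sink(cache: List[T], sink_size: int, max_size: int) -> List[T]:
    """Keep the first sink_size entries plus the most recent ones.

    Hypothetical helper: each list entry stands in for the cached
    key/value pairs of one frame.
    """
    if not 0 <= sink_size <= max_size:
        raise ValueError("require 0 <= sink_size <= max_size")
    if len(cache) <= max_size:
        return list(cache)

    num_recent = max_size - sink_size
    sink = cache[:sink_size]
    # Slice from an explicit start index so num_recent == 0 yields [].
    recent = cache[len(cache) - num_recent:]
    return list(sink) + list(recent)


frames = list(range(10))  # stand-ins for cached frames 0..9
print(evict_with_sink(frames, sink_size=2, max_size=5))
```

Here frames 0 and 1 survive as the global context anchor while frames 2 through 6 are evicted, leaving `[0, 1, 7, 8, 9]`.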
Components and Modules
Modules are typically re-used from the reference implementation. Across the current pipelines, the main modules that vary are the causal model, the base model, and the VAE. We can get started by re-using these from the reference implementation and then refine from there once we confirm things are working.
Components are wrappers around modules. At the moment, we usually have a VAE component for each pipeline, because VAE behavior differs across the current pipelines' wrappers and modules and we have not yet worked out what is shared and what is different. So, for RollingForcing we can just do the same for now. We use the same generator wrapper for all Wan2.1-based models and just inject the module when instantiating it.
Downloading Model Weights
The download models logic needs to be updated to grab the model weights from HF.
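A minimal sketch of what that could look like with `huggingface_hub.snapshot_download` follows. The helper names and the directory layout are assumptions for illustration, and the repo id is deliberately left as a parameter because the RollingForcing weights location is not stated here. The existing download logic in the repo may be structured differently.

```python
from pathlib import Path


def weights_dir(models_root: str, pipeline_name: str) -> Path:
    """Hypothetical layout: one sub-directory per pipeline under models_root."""
    return Path(models_root) / pipeline_name


def download_weights(repo_id: str, models_root: str, pipeline_name: str) -> Path:
    """Download a model repo from the Hugging Face Hub into the pipeline's directory.

    repo_id must be supplied by the caller; no real repo id is assumed here.
    """
    # Imported lazily so the module can be loaded without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = weights_dir(models_root, pipeline_name)
    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target


print(weights_dir("models", "rollingforcing"))
```

`snapshot_download` also accepts `allow_patterns` if only some files in the repo are needed.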
Testing
The main focus should be getting a working pipeline first; integrating it into the server and UI can be followed up on separately. To confirm it is working, there should be a test.py similar to this that can be run with:
uv run -m pipelines.rollingforcing.test