Flow matching with Lipman et al. formulation #16

janfb · 2024-04-15T13:46:55Z


Hi there,

Thanks for the amazing work on this package 🔥

I am working on an implementation of (unconditional) flow matching using zuko and I have a problem.

I had a look at the implementation of (conditional) flow matching posterior estimation (FMPE) in Lampe:

Lines 141 to 143 in 3e88c8e

    eps = torch.randn_like(theta)
    theta_prime = (1 - t_) * theta + (t_ + self.eta) * eps
    v = eps - theta

and noticed that it is using a different notation and a different flow matching objective than in the original Lipman et al. paper:

In the original paper by Lipman et al., the optimal transport flow matching formulation implicitly assumes a Gaussian source x_0 and takes x_1 as the target distribution. The resulting formulas for sampling the conditional path and for the target vector field are given as formulas (12) and (13) (as presented in Tong et al.).

In the Lampe implementation, two things are different:

- The objective is actually the independent source conditional flow matching objective formulated in Tong et al., formulas (14) and (15).
- The formulation sets x_1 as the source (eps in the code snippet above) and x_0 (theta above) as the target samples.
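For reference, the two parameterizations being compared can be sketched as below. This is a minimal sketch that assumes the usual forms of the two objectives; the function names and the `sigma_min`, `sigma`, and `noise` arguments are illustrative and not part of Lampe or zuko. The functions only use `+` and `*`, so they accept plain floats as well as tensors.

```python
def lipman_ot_pair(x0, x1, t, sigma_min=1e-4):
    """OT conditional path in the Lipman et al. convention.

    x0 is a Gaussian noise sample (source), x1 is a data sample (target).
    Returns the point on the path at time t and the regression target.
    """
    x_t = (1 - (1 - sigma_min) * t) * x0 + t * x1
    u_t = x1 - (1 - sigma_min) * x0
    return x_t, u_t


def icfm_pair(x0, x1, t, sigma=0.0, noise=0.0):
    """Independent-coupling CFM in the Tong et al. convention.

    x0 may come from any source distribution; `noise` is an optional
    standard normal sample that is scaled by `sigma`.
    """
    x_t = (1 - t) * x0 + t * x1 + sigma * noise
    u_t = x1 - x0
    return x_t, u_t
```

With `sigma_min=0` and `sigma=0` the two functions return the same pair, which is why the two objectives look so similar when the source is Gaussian.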

And this is where my problem comes in: The setup with swapped x_0 and x_1 in Lampe makes sense, because the FreeFormJacobianTransform is set up to match the swapped notation of x_0 being the target:

lampe/lampe/inference/fmpe.py

Lines 95 to 103 in 3e88c8e

    
    return NormalizingFlow(
        transform=FreeFormJacobianTransform(
            f=lambda t, theta: self(theta, x, t),
            t0=x.new_tensor(0.0),
            t1=x.new_tensor(1.0),
            phi=(x, *self.parameters()),
        ),
        base=DiagNormal(self.zeros, self.ones).expand(x.shape[:-1]),
    )

However, if I want to implement the original Lipman et al. formulation, this setup does not work because x_0 is bound to be Gaussian and only x_1 shows up in the formulas (i.e., I cannot swap x_0 and x_1).

It could well be that I am missing something and this is trivial, but: how can I set up the FreeFormJacobianTransform to have x_0 as the source and x_1 as the target?


francois-rozet · 2024-04-15T15:48:30Z

Maintainer

Hello @janfb, I have split my answer into two parts. Tell me if anything is not clear!

Also, you might want to check this 100-line implementation of flow matching.

Flow matching objective

I am not sure I understand the notation in the screenshots. $p(x \mid z)$ not depending on $z$ seems odd. What LAMPE implements is the conditional flow matching (CFM) loss from Lipman et al. (2023).

However, as you note, we reverse the time axis ($t \to 1 - t$) such that $x_0$ is $\theta$ and $x_1$ is some noise $\epsilon \sim \mathcal{N}(0, I)$.

Doing so, we get

$$ \psi_t = (\sigma_\min + t (1 - \sigma_\min)) \epsilon + (1 - t) \theta $$

and the objective

$$ || v_t(\psi_t) - (\epsilon - (1 - \sigma_\min) \theta) ||^2 $$

In addition, we assume that $\sigma_\min$ (eta in our code) is negligible with respect to $1$, and we replace $1 - \sigma_\min$ by $1$. I would argue that we should not have made this assumption as it makes our implementation different from the original paper, but we found that it had little to no impact in practice.

This justification is partially incorrect. See this comment for a better answer.

Free-form Jacobian transform

Actually, FreeFormJacobianTransform does not assume that $x_0$ is a target or a source.

It defines a transformation that is the integration of a vector field $f(t, x(t))$ from an initial time $t_0$ to a final time $t_1$. $t_0$ and $t_1$ can be anything, notably $t_0 = 1$ and $t_1 = 0$ are valid!

Hence, if you have a network that approximates a flow-matching vector field $v_t(t, x(t))$, where $x(1)$ is the transformation's input and $x(0)$ is its output (note that in the documentation $x_0 = x(t_0) \neq x(0)$), then you want

    from zuko.transforms import FreeFormJacobianTransform

    transform = FreeFormJacobianTransform(
        f=lambda t, x: network(t, x),
        t0=x.new_tensor(1.0),  # t_0 = 1 /!\ integration starts at t = 1
        t1=x.new_tensor(0.0),  # and ends at t = 0
        phi=network.parameters(),
    )

By the way, to invert this transformation you can simply swap $t_0$ and $t_1$!

Note that using a proper ODE integrator is usually not the best (most efficient) way to implement the sampling of diffusion/score-based/flow-matching models. Fixed-step Euler is often enough.
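The fixed-step Euler remark can be sketched as follows. This is a minimal, framework-agnostic sketch and not zuko's integrator: `euler_integrate` and its `steps` argument are hypothetical names, and `f` stands for any callable `f(t, x)`, such as a trained network. It works on plain floats and on tensors that support `+` and `*`.

```python
def euler_integrate(f, x, t0=1.0, t1=0.0, steps=64):
    """Integrate dx/dt = f(t, x) from t0 to t1 with explicit Euler steps."""
    dt = (t1 - t0) / steps  # negative when going from t = 1 down to t = 0
    t = t0
    for _ in range(steps):
        x = x + dt * f(t, x)
        t = t + dt
    return x


# Toy check with the constant field of a single straight-line path.
theta, eps = 3.0, -1.0
field = lambda t, x: eps - theta

sample = euler_integrate(field, eps, t0=1.0, t1=0.0, steps=10)  # close to theta
back = euler_integrate(field, sample, t0=0.0, t1=1.0, steps=10)  # close to eps
```

As with the transform, swapping `t0` and `t1` runs the flow in the other direction. For a general vector field the Euler round trip is only approximate, and the error shrinks as `steps` grows.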


janfb · 2024-04-16T14:40:59Z

Author

Hello @francois-rozet

thanks a lot for the fast response and for sharing your code snippet!

Flow matching objective

Oh wow, I see it now. Interesting that your assumptions then result in a formulation that closely resembles the independent source flow matching objective from Tong et al. (Regarding z, there was some context missing, sorry: here z = (x_0, x_1).)

It's likely that I am again missing something, but in your code snippet linked above, it seems to me that in line 88 z and x are exchanged? (In the notation of your screenshots, I assume eps=z and theta=x.) Shouldn't it be

    u = z - (1 - 1e-4) * x

?

Free-form Jacobian transform

Yes, that makes sense - thanks!


francois-rozet Apr 16, 2024
Maintainer

> it seems to me that in line 88 z and x are exchanged?

There is a mistake in my justification above! Sorry for the confusion; this "time inversion" is really messing with my head 😅

The CFM loss is actually (see Eq. 14)

$$ || v_t(\psi_t(x_0 \mid x_1)) - \frac{d}{dt} \psi_t(x_0 \mid x_1) ||^2 $$

where

$$ \psi_t(x_0 \mid x_1) = (1 - (1 - \sigma_\min) t) x_0 + t x_1 $$

When we reverse the time ($t \to 1 - t$ and $x_0 \leftrightarrow x_1$), we get

$$ \psi_t(x_1 \mid x_0) = (\sigma_\min + (1 - \sigma_\min) t) x_1 + (1 - t) x_0 $$

and

$$ \frac{d}{dt} \psi_t(x_1 \mid x_0) = (1 - \sigma_\min) x_1 - x_0 $$

So in my previous comment, the objective should be

$$ || v_t(\psi_t(\epsilon \mid \theta)) - ((1 - \sigma_\min) \epsilon - \theta) ||^2 $$

which is consistent with the code snippet. The error in my previous comment was to replace $x_0$ by $\theta$ and $x_1$ by $\epsilon$ in the objective without taking into account the sign change of $\frac{d}{dt} \psi_t$ when the time is reversed.
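The corrected time-reversed path and target can be checked numerically. The following is a minimal sketch with hypothetical helper names (`reversed_path`, `reversed_target`, `cfm_loss`), not LAMPE's code; it assumes $\theta$ sits at $t = 0$ and $\epsilon$ at $t = 1$, as in the equations above, and works on floats as well as tensors.

```python
def reversed_path(theta, eps, t, sigma_min=1e-4):
    """psi_t(eps | theta): close to theta at t = 0, equal to eps at t = 1."""
    return (sigma_min + (1 - sigma_min) * t) * eps + (1 - t) * theta


def reversed_target(theta, eps, sigma_min=1e-4):
    """Time derivative of reversed_path with respect to t."""
    return (1 - sigma_min) * eps - theta


def cfm_loss(v, theta, eps, t, sigma_min=1e-4):
    """Element-wise squared error; v is any callable v(t, x)."""
    x_t = reversed_path(theta, eps, t, sigma_min)
    return (v(t, x_t) - reversed_target(theta, eps, sigma_min)) ** 2


# Central finite difference of the path, to compare against the target.
theta, eps, t, h = 2.0, 0.5, 0.3, 1e-3
slope = (reversed_path(theta, eps, t + h) - reversed_path(theta, eps, t - h)) / (2 * h)
gap = abs(slope - reversed_target(theta, eps))  # tiny, up to rounding error
```

For a batch of tensors, the value returned by `cfm_loss` would still need to be reduced, for example with a mean over the batch.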

Answer selected by francois-rozet

janfb Apr 16, 2024
Author

Yes, same here. Now it makes sense and also my re-implementation works like a charm. Thanks a lot for clarifying! 🙏
