Road to Flow Matching
Introduction
Generative models aim to transform a simple base distribution, such as a Gaussian, into a complex target data distribution. The core idea is to construct a generator that maps samples from the base distribution into data space. To compute likelihoods, this generator must be invertible. If a generator $f$ transforms $z \sim p_Z$ into $x = f(z)$, then $f^{-1}$ must exist, and the density is computed using the change of variables:

$$p_X(x) = p_Z\big(f^{-1}(x)\big)\,\left|\det \frac{\partial f^{-1}(x)}{\partial x}\right|$$

or equivalently:

$$\log p_X(x) = \log p_Z(z) - \log \left|\det \frac{\partial f(z)}{\partial z}\right|$$

Stacking invertible generators $f = f_K \circ \cdots \circ f_1$ gives:

$$\log p_X(x) = \log p_Z(z) - \sum_{k=1}^{K} \log \left|\det \frac{\partial f_k(z_{k-1})}{\partial z_{k-1}}\right|$$

where $z_0 = z$ and $z_k = f_k(z_{k-1})$.
The main difficulty is designing expressive yet tractable invertible transformations with efficiently computable Jacobian determinants.
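As a sanity check, the change-of-variables formula can be verified numerically. The 1D affine generator below is an illustrative assumption (not from the text), chosen because its pushforward density is known in closed form:

```python
import numpy as np

# Hypothetical 1D affine generator f(z) = a*z + b with base z ~ N(0, 1).
a, b = 2.0, 1.0
f = lambda z: a * z + b
f_inv = lambda x: (x - b) / a

def base_logpdf(z):
    # log density of the standard Gaussian base distribution
    return -0.5 * z**2 - 0.5 * np.log(2 * np.pi)

def model_logpdf(x):
    # Change of variables: log p_X(x) = log p_Z(f^{-1}(x)) - log|df/dz|.
    return base_logpdf(f_inv(x)) - np.log(abs(a))

# The pushforward of N(0, 1) through f is N(b, a^2); compare analytically.
x = 3.0
analytic = -0.5 * ((x - b) / a) ** 2 - 0.5 * np.log(2 * np.pi * a**2)
print(np.isclose(model_logpdf(x), analytic))  # True
```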
Residual Flows: Discrete-Time Transformations
A residual flow layer has the simple form:

$$y = x + g_\theta(x)$$

To make this invertible, $g_\theta$ must be a contractive mapping (Lipschitz constant strictly less than one) so that the fixed point equation:

$$x = y - g_\theta(x)$$
has a unique solution. This allows the layer to be inverted by iterative fixed-point updates.
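The fixed-point inversion can be sketched in a few lines of NumPy; the contraction $g(x) = 0.5\tanh(x)$ below is a hypothetical choice with Lipschitz constant 0.5:

```python
import numpy as np

# Residual layer y = x + g(x) with a contractive g (Lipschitz constant 0.5).
g = lambda x: 0.5 * np.tanh(x)

def forward(x):
    return x + g(x)

def invert(y, n_iters=50):
    # Banach fixed-point iteration x_{k+1} = y - g(x_k);
    # it converges because g is a contraction.
    x = y.copy()
    for _ in range(n_iters):
        x = y - g(x)
    return x

x = np.array([0.3, -1.2, 2.0])
y = forward(x)
x_rec = invert(y)
print(np.allclose(x_rec, x))  # True
```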
The Jacobian determinant is:

$$\det \frac{\partial y}{\partial x} = \det\big(I + J_{g_\theta}(x)\big)$$

and, since $g_\theta$ is contractive, the log-determinant expands into a power series of traces:

$$\log \det\big(I + J_{g_\theta}(x)\big) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}\,\operatorname{tr}\big(J_{g_\theta}(x)^k\big)$$
Direct determinant computation is expensive. A common trick is estimating the traces with the Hutchinson Monte Carlo estimator:

$$\operatorname{tr}(A) \approx \frac{1}{M} \sum_{m=1}^{M} v_m^\top A\, v_m$$

with $v_m$ sampled from a standard Gaussian.
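The Monte Carlo trace trick can be checked numerically against the exact trace of an arbitrary matrix (a minimal sketch, with an illustrative random matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(10, 10))

# Hutchinson estimator: tr(A) ≈ (1/M) Σ_m v_m^T A v_m with v_m ~ N(0, I).
M = 200_000
V = rng.normal(size=(M, 10))
est = np.mean(np.einsum('mi,ij,mj->m', V, A, V))
print(est, np.trace(A))  # the two agree up to Monte Carlo error
```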
Stacking residual layers produces:

$$x_{k+1} = x_k + g_k(x_k), \qquad k = 0, 1, \dots, K-1,$$
which gradually warps the simple base distribution into a complex target distribution.
Continuous Normalizing Flows (CNF)
Taking the limit of infinitely many residual layers transforms discrete updates into an ODE. From:

$$x_{t+\epsilon} = x_t + \epsilon\, v_\theta(x_t, t)$$

letting the step size $\epsilon$ go to zero yields:

$$\frac{dx(t)}{dt} = v_\theta\big(x(t), t\big)$$

This defines a Neural ODE where $v_\theta(x, t)$ is a time-varying vector field.
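To make the continuous-time view concrete, here is a minimal forward-Euler integrator for $dx/dt = v(x, t)$, applied to the illustrative field $v(x, t) = -x$ whose exact flow $x(t) = x_0 e^{-t}$ is known (the field and step counts are assumptions for the sketch):

```python
import numpy as np

def integrate(v, x0, t0=0.0, t1=1.0, n_steps=1000):
    # Forward-Euler integration of dx/dt = v(x, t) from t0 to t1.
    x, t = np.asarray(x0, dtype=float), t0
    dt = (t1 - t0) / n_steps
    for _ in range(n_steps):
        x = x + dt * v(x, t)  # one Euler step: x_{k+1} = x_k + dt * v(x_k, t_k)
        t += dt
    return x

x0 = np.array([1.0, -2.0])
x1 = integrate(lambda x, t: -x, x0)
print(np.allclose(x1, x0 * np.exp(-1.0), atol=1e-3))  # True
```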
Density Evolution via the Continuity Equation
The probability density $p_t(x)$ evolves according to the continuity equation:

$$\frac{\partial p_t(x)}{\partial t} = -\nabla \cdot \big(p_t(x)\, v_\theta(x, t)\big)$$
This describes conservation of probability mass as samples move under the vector field.
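The continuity equation can be verified numerically with finite differences for an illustrative time-independent field $v(x) = -x$ (an assumption for the check): starting from $p_0 = \mathcal{N}(0, 1)$, the flow gives $p_t = \mathcal{N}(0, e^{-2t})$ in 1D.

```python
import numpy as np

def p(x, t):
    # Density of N(0, e^{-2t}), the pushforward of N(0, 1) under dx/dt = -x.
    var = np.exp(-2 * t)
    return np.exp(-x**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

x, t, h = 0.7, 0.3, 1e-5
dp_dt = (p(x, t + h) - p(x, t - h)) / (2 * h)   # ∂p_t/∂t via central difference
flux = lambda x_: p(x_, t) * (-x_)               # p_t(x) * v(x)
div = (flux(x + h) - flux(x - h)) / (2 * h)      # ∂(p_t v)/∂x
print(np.isclose(dp_dt, -div, atol=1e-4))        # continuity equation holds
```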
Training Difficulty
Training CNFs requires solving ODEs inside the likelihood objective for every data point, making CNFs computationally expensive and slow to scale.
Flow Matching: Scalable CNF Training
Flow Matching avoids ODE integration during training by directly learning the vector field $v_\theta(x, t)$.
The Flow Matching Objective
Define the ideal objective:

$$\mathcal{L}_{\mathrm{FM}}(\theta) = \mathbb{E}_{t \sim \mathcal{U}[0,1],\; x \sim p_t}\, \big\| v_\theta(x, t) - u_t(x) \big\|^2$$

This trains $v_\theta$ to match the true vector field $u_t(x)$ that generates the marginal probability path $p_t$. However, both $p_t$ and $u_t$ are unknown.
Conditional Flow Matching
The key idea is to design conditional probability paths that are analytically tractable. Sampling over these conditional paths recovers the marginal objective in expectation.
A common conditional path is Gaussian:

$$p_t(x \mid x_1) = \mathcal{N}\big(x;\ \mu_t(x_1),\ \sigma_t(x_1)^2 I\big)$$

with $\mu_0(x_1) = 0$, $\sigma_0(x_1) = 1$, $\mu_1(x_1) = x_1$, $\sigma_1(x_1) = \sigma_{\min}$, smoothly interpolating from noise to the data point $x_1$.

For this path, the conditional vector field is:

$$u_t(x \mid x_1) = \frac{\sigma_t'(x_1)}{\sigma_t(x_1)}\big(x - \mu_t(x_1)\big) + \mu_t'(x_1)$$
A simpler special case is the rectified flow path:

$$x_t = (1 - t)\, x_0 + t\, x_1$$

where $x_0$ is sampled from the base distribution. Then the target vector field is constant along the path:

$$u_t(x_t \mid x_0, x_1) = x_1 - x_0$$
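A quick numerical check (with an arbitrary base sample and data point chosen for illustration) confirms that the rectified path's velocity is constant in $t$ and equals $x_1 - x_0$:

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.normal(size=3)          # base sample
x1 = np.array([1.0, 2.0, 3.0])   # data point (illustrative)

path = lambda t: (1 - t) * x0 + t * x1  # rectified flow interpolation

# The path's velocity is constant and equals x1 - x0 at every t.
h = 1e-6
for t in (0.1, 0.5, 0.9):
    vel = (path(t + h) - path(t - h)) / (2 * h)  # central-difference velocity
    assert np.allclose(vel, x1 - x0, atol=1e-6)
print("velocity is x1 - x0 at every t")
```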
Conditional Flow Matching Objective
A theorem states:

$$\mathcal{L}_{\mathrm{FM}}(\theta) = \mathcal{L}_{\mathrm{CFM}}(\theta) + C, \qquad \mathcal{L}_{\mathrm{CFM}}(\theta) = \mathbb{E}_{t,\; x_1 \sim q,\; x \sim p_t(\cdot \mid x_1)}\, \big\| v_\theta(x, t) - u_t(x \mid x_1) \big\|^2$$

where $C$ is independent of $\theta$. The two objectives therefore have identical gradients and identical minimizers, so training with the conditional loss recovers the true flow.
Training Loop
- Sample a data point $x_1 \sim q$.
- Sample a time $t$ uniformly in $[0, 1]$.
- Sample $x_t$ from $p_t(x \mid x_1)$; for the rectified path, draw $x_0$ from the base distribution and set $x_t = (1 - t)\,x_0 + t\,x_1$.
- Compute the target $u_t(x_t \mid x_1)$; for the rectified path this is $x_1 - x_0$.
- Train $v_\theta(x_t, t)$ using the L2 loss $\big\| v_\theta(x_t, t) - u_t(x_t \mid x_1) \big\|^2$.
- Update parameters with gradient descent.
- Update parameters with gradient descent.
No ODE solving is required during training.
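The loop above can be sketched end-to-end in NumPy. This is a minimal illustration only: the 1D data distribution $q = \mathcal{N}(2, 0.5^2)$, the linear model $v_\theta(x, t) = a x + b t + c$, and the hand-coded SGD updates are all assumptions made for the sketch, not the method's prescribed architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy conditional flow matching: rectified path x_t = (1-t) x0 + t x1,
# linear model v(x, t) = a*x + b*t + c, plain SGD on the per-sample L2 loss.
a = b = c = 0.0
lr = 0.01
losses = []
for step in range(20_000):
    x1 = 2.0 + 0.5 * rng.normal()   # sample data point x1 ~ q
    x0 = rng.normal()                # sample x0 from the base N(0, 1)
    t = rng.uniform()                # sample t ~ U[0, 1]
    xt = (1 - t) * x0 + t * x1       # point on the conditional path
    u = x1 - x0                      # target vector field (rectified path)
    v = a * xt + b * t + c           # model prediction
    err = v - u
    losses.append(err**2)
    a -= lr * 2 * err * xt           # gradient descent on ||v - u||^2
    b -= lr * 2 * err * t
    c -= lr * 2 * err

print(np.mean(losses[:100]), np.mean(losses[-1000:]))  # loss decreases
```

No ODE solver appears anywhere in the loop; each step is an ordinary regression update.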
Comparison with Diffusion Models
Flow Matching generalizes diffusion models:
- It does not require the forward process to be a fixed Gaussian diffusion.
- It directly designs a path from noise to data.
- It avoids computing or simulating a forward diffusion chain.
- The training objective is a simple regression problem.
Diffusion models produce samples by reversing a stochastic diffusion process, while Flow Matching learns a continuous vector field mapping noise to data directly.
Conclusion
Residual flows introduced discrete invertible transformations. CNFs extended these to continuous time, governed by ODEs and the continuity equation. However, CNFs are computationally expensive to train because they require ODE integration inside the loss.
Flow Matching resolves this by replacing likelihood-based training with a tractable conditional regression objective that teaches the model to approximate the true vector field. This preserves the expressive power of continuous flows while greatly improving scalability and efficiency. As a result, Flow Matching has become a foundational framework for high-resolution generative modeling.
