Multimodal Learning relevance: 6/10

Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling

Aram Davtyan, Leello Tadesse Dadi, Volkan Cevher, Paolo Favaro
arXiv: 2603.15279v1 Published: 2026-03-16 Updated: 2026-03-16

AI Summary

LOOM-CFM accelerates inference in flow-based generative models by optimizing the data-noise coupling across minibatches.

Key Contributions

  • Proposes LOOM-CFM, a method that extends the scope of minibatch OT
  • Improves the sampling speed-quality trade-off of flow-based generative models
  • Enhances distillation initialization and supports high-resolution synthesis in latent space

Methodology

LOOM-CFM extends the scope of minibatch optimal transport by preserving and optimizing noise-data assignments across minibatches over the course of training, which straightens sampling trajectories and thus accelerates inference.
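The core idea can be illustrated with a toy sketch (not the paper's implementation): each data point keeps a persistent noise sample, and every minibatch step re-solves the OT assignment within the batch and writes the improved coupling back, so later minibatches build on earlier ones. The 1-D Gaussian setup, brute-force assignment (feasible only for tiny batches), and all names here are illustrative assumptions.

```python
import random
from itertools import permutations

random.seed(0)

# Toy 1-D "dataset": n samples, each paired with one persistent noise draw.
n, batch = 16, 4
data = [random.gauss(3.0, 1.0) for _ in range(n)]
noise = [random.gauss(0.0, 1.0) for _ in range(n)]  # persistent assignment table

def pair_cost(idx, assignment):
    """Squared-distance transport cost of pairing data[idx] with noise[assignment]."""
    return sum((data[i] - noise[j]) ** 2 for i, j in zip(idx, assignment))

def reassign_minibatch(idx):
    """One minibatch OT step: find the min-cost permutation of this batch's
    stored noises (exact, brute force) and write it back, so the improved
    coupling persists into later minibatches and epochs."""
    best = min(permutations(idx), key=lambda a: pair_cost(idx, a))
    reordered = [noise[j] for j in best]
    for i, v in zip(idx, reordered):
        noise[i] = v

def total_cost():
    return sum((data[i] - noise[i]) ** 2 for i in range(n))

totals = []
for epoch in range(3):
    order = random.sample(range(n), n)  # shuffle so batch compositions vary
    for s in range(0, n, batch):
        reassign_minibatch(order[s:s + batch])
    totals.append(total_cost())
```

Because the identity permutation is always a candidate, each step can only decrease the global coupling cost, so `totals` is non-increasing: persisting assignments lets the coupling keep improving even though each step only sees one minibatch.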

Original Abstract

Conditional Flow Matching (CFM), a simulation-free method for training continuous normalizing flows, provides an efficient alternative to diffusion models for key tasks like image and video generation. The performance of CFM in solving these tasks depends on the way data is coupled with noise. A recent approach uses minibatch optimal transport (OT) to reassign noise-data pairs in each training step to streamline sampling trajectories and thus accelerate inference. However, its optimization is restricted to individual minibatches, limiting its effectiveness on large datasets. To address this shortcoming, we introduce LOOM-CFM (Looking Out Of Minibatch-CFM), a novel method to extend the scope of minibatch OT by preserving and optimizing these assignments across minibatches over training time. Our approach demonstrates consistent improvements in the sampling speed-quality trade-off across multiple datasets. LOOM-CFM also enhances distillation initialization and supports high-resolution synthesis in latent space training.

Tags

Flow-Based Generative Models · Conditional Flow Matching · Optimal Transport · Inference Speed

arXiv Categories

cs.LG cs.CV