Multimodal Learning Relevance: 8/10

Flow caching for autoregressive video generation

Yuexiao Ma, Xuzhe Zheng, Jing Xu, Xiwei Xu, Feng Ling, Xiawu Zheng, Huafeng Kuang, Huixia Li, Xing Wang, Xuefeng Xiao, Fei Chao, Rongrong Ji
arXiv: 2602.10825v1 Published: 2026-02-11 Updated: 2026-02-11

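AI Summary

FlowCache is a caching framework designed for autoregressive video generation. It reports speedups of 2.38x on MAGI-1 and 6.7x on SkyReels-V2 with negligible quality degradation.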

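Key Contributions

  • Proposes FlowCache, a caching framework designed for autoregressive video generation
  • Introduces a chunkwise caching strategy that adapts dynamically to each chunk's denoising characteristics
  • Proposes a KV cache compression mechanism, jointly optimized for importance and redundancy, that preserves generation quality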
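
The chunkwise caching idea can be illustrated with a toy sketch. This is not the authors' implementation: the reuse rule (accumulated relative L1 change of a chunk's input compared against a threshold), the residual caching, and every name here (`ChunkCache`, `relative_l1`, `run_chunks`, `threshold`) are assumptions chosen for illustration. The only point it tries to show is that each chunk keeps its own cache state, so one chunk can skip recomputation at a timestep while another cannot.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Vector = List[float]


def relative_l1(current: Vector, previous: Vector) -> float:
    """Relative L1 change of `current` with respect to `previous`."""
    num = sum(abs(c - p) for c, p in zip(current, previous))
    den = sum(abs(p) for p in previous) or 1e-12  # avoid division by zero
    return num / den


@dataclass
class ChunkCache:
    """Independent cache state for one video chunk (hypothetical structure)."""
    threshold: float
    prev_input: Optional[Vector] = None
    cached_residual: Optional[Vector] = None
    accumulated_change: float = 0.0
    recompute_count: int = 0

    def step(self, x: Vector, model: Callable[[Vector], Vector]) -> Vector:
        reuse = False
        if self.prev_input is not None:
            # Accumulate how far this chunk's input has drifted since the
            # last full computation (assumed criterion, not the paper's).
            self.accumulated_change += relative_l1(x, self.prev_input)
            reuse = self.accumulated_change < self.threshold
        self.prev_input = list(x)
        if reuse:
            # Skip the model and apply the cached residual instead.
            return [xi + ri for xi, ri in zip(x, self.cached_residual)]
        out = model(x)
        self.cached_residual = [o - xi for o, xi in zip(out, x)]
        self.accumulated_change = 0.0
        self.recompute_count += 1
        return out


def run_chunks(inputs_per_chunk: List[List[Vector]], threshold: float,
               model: Callable[[Vector], Vector]) -> List[int]:
    """Feed each chunk its own input sequence; return recomputations per chunk."""
    caches = [ChunkCache(threshold) for _ in inputs_per_chunk]
    num_steps = len(inputs_per_chunk[0])
    for t in range(num_steps):
        for cache, chunk_inputs in zip(caches, inputs_per_chunk):
            cache.step(chunk_inputs[t], model)
    return [cache.recompute_count for cache in caches]
```

With a slowly changing chunk and a quickly changing chunk run side by side at the same timesteps, the slow chunk recomputes only once while the fast chunk recomputes at every step. A single shared policy for all chunks could not express that difference.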

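Methodology

FlowCache combines a chunkwise caching strategy with a KV cache compression mechanism, both tailored to the characteristics of autoregressive models, to speed up video generation.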
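
One way to picture a KV cache compression step that weighs importance against redundancy is a greedy selection under a fixed budget. The sketch below uses a maximal-marginal-relevance style score (importance minus a weighted cosine similarity to entries already kept). That scoring rule is a stand-in chosen for illustration; the paper's actual joint objective may differ. The names `compress_kv`, `importance`, `budget`, and `redundancy_weight` are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def compress_kv(keys: Sequence[Sequence[float]],
                importance: Sequence[float],
                budget: int,
                redundancy_weight: float = 0.5) -> List[int]:
    """Return indices of at most `budget` cache entries to keep.

    Greedy rule (an assumption, not the paper's): at each step keep the
    entry with the highest importance minus `redundancy_weight` times its
    maximum cosine similarity to any entry already kept.
    """
    if budget >= len(keys):
        return list(range(len(keys)))
    selected: List[int] = []
    remaining = set(range(len(keys)))
    while len(selected) < budget:
        best_idx, best_score = -1, -math.inf
        for i in sorted(remaining):  # sorted for deterministic tie-breaking
            redundancy = max(
                (cosine(keys[i], keys[j]) for j in selected), default=0.0
            )
            score = importance[i] - redundancy_weight * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    # Sorted so the kept entries stay in their original temporal order.
    return sorted(selected)
```

Because the number of kept entries never exceeds `budget`, memory stays bounded however long the generated video grows. The same indices would be used to slice the value tensors alongside the keys.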

Original Abstract

Autoregressive models, often built on Transformer architectures, represent a powerful paradigm for generating ultra-long videos by synthesizing content in sequential chunks. However, this sequential generation process is notoriously slow. While caching strategies have proven effective for accelerating traditional video diffusion models, existing methods assume uniform denoising across all frames, an assumption that breaks down in autoregressive models where different video chunks exhibit varying similarity patterns at identical timesteps. In this paper, we present FlowCache, the first caching framework specifically designed for autoregressive video generation. Our key insight is that each video chunk should maintain independent caching policies, allowing fine-grained control over which chunks require recomputation at each timestep. We introduce a chunkwise caching strategy that dynamically adapts to the unique denoising characteristics of each chunk, complemented by a joint importance-redundancy optimized KV cache compression mechanism that maintains fixed memory bounds while preserving generation quality. Our method achieves remarkable speedups of 2.38 times on MAGI-1 and 6.7 times on SkyReels-V2, with negligible quality degradation (VBench: 0.87 increase and 0.79 decrease respectively). These results demonstrate that FlowCache successfully unlocks the potential of autoregressive models for real-time, ultra-long video generation, establishing a new benchmark for efficient video synthesis at scale. The code is available at https://github.com/mikeallen39/FlowCache.

Tags

Video generation, autoregressive models, caching strategy, acceleration

arXiv Categories

cs.CV cs.AI