Coupled Inference in Diffusion Models for Semantic Decomposition
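AI Summary
Proposes a coupled inference framework based on diffusion models for semantic decomposition; it outperforms resonator networks on synthetic semantic decomposition tasks.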
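Main Contributions
- Proposes a diffusion-model-based framework for semantic decomposition
- Introduces a reconstruction-driven guidance term that couples the diffusion processes
- Proposes a new iterative sampling scheme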
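Methodology
Semantic decomposition is treated as an inverse problem. The diffusion processes for the individual factors are coupled through a guidance term driven by the reconstruction error, and an iterative sampling scheme further improves performance.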
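To make the reconstruction-driven coupling concrete, the following is a minimal sketch rather than the paper's sampler. It assumes binding is the element-wise (Hadamard) product and that the guidance signal is the gradient of a squared reconstruction error; neither choice is stated in this summary. The names `bind`, `reconstruction_loss`, `guidance_gradients`, and `guided_step` are hypothetical. In an actual diffusion sampler, a gradient like this would be combined with each factor's learned denoising update; the sketch leaves the denoiser out.

```python
import numpy as np

def bind(factors):
    """Compose factor estimates with the Hadamard product (an assumption)."""
    return np.prod(np.stack(factors), axis=0)

def reconstruction_loss(s, factors):
    """Half the squared error between the bound vector s and the composition."""
    residual = s - bind(factors)
    return 0.5 * float(residual @ residual)

def guidance_gradients(s, factors):
    """Gradient of the reconstruction loss with respect to each factor estimate.

    Each gradient depends on all the other estimates, which is what
    couples the otherwise independent per-factor processes.
    """
    residual = s - bind(factors)
    grads = []
    for f in range(len(factors)):
        others = [x for g, x in enumerate(factors) if g != f]
        partner = bind(others) if others else np.ones_like(s)
        grads.append(-residual * partner)
    return grads

def guided_step(s, factors, step_size=0.1):
    """One coupled update: each estimate moves against its own gradient."""
    grads = guidance_gradients(s, factors)
    return [x - step_size * g for x, g in zip(factors, grads)]
```

When the estimates already compose to the bound vector, the residual is zero and every gradient vanishes, so the guidance term stops pushing the estimates.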
Original Abstract
Many visual scenes can be described as compositions of latent factors. Effective recognition, reasoning, and editing often require not only forming such compositional representations, but also solving the decomposition problem. One popular choice for constructing these representations is through the binding operation. Resonator networks, which can be understood as coupled Hopfield networks, were proposed as a way to perform decomposition on such bound representations. Recent works have shown notable similarities between Hopfield networks and diffusion models. Motivated by these observations, we introduce a framework for semantic decomposition using coupled inference in diffusion models. Our method frames semantic decomposition as an inverse problem and couples the diffusion processes using a reconstruction-driven guidance term that encourages the composition of factor estimates to match the bound vector. We also introduce a novel iterative sampling scheme that improves the performance of our model. Finally, we show that attention-based resonator networks are a special case of our framework. Empirically, we demonstrate that our coupled inference framework outperforms resonator networks across a range of synthetic semantic decomposition tasks.
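For readers unfamiliar with the baseline, here is a small sketch of binding and of a classic resonator network that decomposes a bound vector. It assumes random bipolar (±1) codebooks and Hadamard-product binding, a common setup in the resonator network literature that the abstract does not spell out. It implements the standard outer-product cleanup, not the attention-based variant or the diffusion-based method of the paper. The names `sign_bipolar` and `resonator_decompose` are hypothetical.

```python
import numpy as np

def sign_bipolar(v):
    """Map a vector to ±1 entries, sending zeros to +1."""
    return np.where(v >= 0, 1, -1)

def resonator_decompose(s, codebooks, n_iters=20):
    """Recover one codebook index per factor from the bound vector s.

    codebooks: list of (D, K) arrays with ±1 entries, one per factor.
    """
    # Start each estimate as the superposition of all of its candidates.
    estimates = [sign_bipolar(cb.sum(axis=1)) for cb in codebooks]
    for _ in range(n_iters):
        for f, cb in enumerate(codebooks):
            # Unbind the other factors' current estimates from s.
            # For ±1 vectors, multiplying by a vector undoes binding with it.
            others = np.ones_like(s)
            for g, est in enumerate(estimates):
                if g != f:
                    others = others * est
            query = s * others
            # Clean up: project onto the codebook, then threshold to ±1.
            estimates[f] = sign_bipolar(cb @ (cb.T @ query))
    # Read out the best-matching codebook entry for each factor.
    return [int(np.argmax(cb.T @ est)) for cb, est in zip(codebooks, estimates)]
```

A typical use is to draw the codebooks with `rng.choice([-1, 1], size=(D, K))`, bind one column from each codebook by element-wise multiplication, and pass the result to `resonator_decompose`. Each factor's update depends on the current estimates of all the others, which is the sense in which the abstract describes resonator networks as coupled Hopfield networks.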