Multimodal Learning Relevance: 9/10

DreamPartGen: Semantically Grounded Part-Level 3D Generation via Collaborative Latent Denoising

Tianjiao Yu, Xinzhuo Li, Muntasir Wahed, Jerry Xiong, Yifan Shen, Ying Shen, Ismini Lourentzou
arXiv: 2603.19216v1 Published: 2026-03-19 Updated: 2026-03-19

AI Summary

DreamPartGen proposes a semantics-driven, part-aware text-to-3D generation framework that produces high-quality 3D objects.

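Key Contributions

  • Introduces Duplex Part Latents (DPLs), which jointly model each part's geometry and appearance
  • Introduces Relational Semantic Latents (RSLs), which capture inter-part dependencies
  • Proposes a synchronized co-denoising process that enforces geometric and semantic consistency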
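
As a rough illustration of how the two latent types listed above might be organized, here is a minimal sketch. The class names, fields, and the chair example are all hypothetical; the abstract does not describe the actual tensor layout, so plain lists of floats stand in for learned latents.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DuplexPartLatent:
    """Hypothetical container: one part's geometry and appearance latents."""
    part_name: str
    geometry: List[float]
    appearance: List[float]


@dataclass
class RelationalSemanticLatent:
    """Hypothetical container: a language-derived relation between two parts."""
    source: str
    target: str
    relation: str
    embedding: List[float]


@dataclass
class PartScene:
    """Hypothetical grouping of all part latents and relation latents for one object."""
    parts: List[DuplexPartLatent] = field(default_factory=list)
    relations: List[RelationalSemanticLatent] = field(default_factory=list)

    def neighbors(self, part_name: str) -> List[str]:
        """Return the names of parts linked to `part_name` by any relation."""
        linked = []
        for rel in self.relations:
            if rel.source == part_name:
                linked.append(rel.target)
            elif rel.target == part_name:
                linked.append(rel.source)
        return linked


# Illustrative chair with three parts and two relations (placeholder values).
scene = PartScene(
    parts=[
        DuplexPartLatent("seat", [0.0] * 4, [0.0] * 4),
        DuplexPartLatent("leg", [0.0] * 4, [0.0] * 4),
        DuplexPartLatent("backrest", [0.0] * 4, [0.0] * 4),
    ],
    relations=[
        RelationalSemanticLatent("leg", "seat", "supports", [0.0] * 4),
        RelationalSemanticLatent("backrest", "seat", "attached_to", [0.0] * 4),
    ],
)
seat_neighbors = scene.neighbors("seat")
```

Keeping relations as separate records rather than folding them into each part mirrors the split between DPLs and RSLs in the abstract. Whether the paper stores them this way is an assumption.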

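Methodology

DPLs and RSLs model part-level information, and synchronized co-denoising keeps geometry and semantics mutually consistent, yielding interpretable 3D models.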
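
The idea of denoising two latent streams in lockstep can be sketched with a toy loop. This is not the paper's algorithm: `denoise_step` is a hand-written stand-in for a learned network, the step schedule is invented for illustration, and the latents are flat lists of floats. It only shows the structural point that both streams share one timestep and each update reads the other stream's current state.

```python
import random
from typing import List, Tuple


def denoise_step(x: List[float], context: List[float], t: float) -> List[float]:
    """Toy stand-in for a learned denoiser (assumption, not the paper's model).

    Pulls each value of `x` toward the mean of `context`. The step size
    grows as the noise level `t` falls from 1 toward 0.
    """
    target = sum(context) / len(context)
    rate = 0.5 * (1.0 - t)
    return [v + rate * (target - v) for v in x]


def co_denoise(
    part_latents: List[float],
    relation_latents: List[float],
    num_steps: int = 10,
) -> Tuple[List[float], List[float]]:
    """Update both latent sets at every timestep, each conditioned on the other."""
    for step in range(num_steps):
        t = 1.0 - (step + 1) / num_steps  # shared noise level, ending at 0
        # Both updates read the latents as they stood at the start of the
        # step, so neither stream runs ahead of the other.
        new_parts = denoise_step(part_latents, relation_latents, t)
        new_relations = denoise_step(relation_latents, part_latents, t)
        part_latents, relation_latents = new_parts, new_relations
    return part_latents, relation_latents


rng = random.Random(0)
parts = [rng.gauss(0.0, 1.0) for _ in range(8)]
relations = [rng.gauss(0.0, 1.0) for _ in range(4)]
parts_out, relations_out = co_denoise(parts, relations)
```

In this toy, the means of the two streams agree after the final step, which is a crude analogue of the mutual consistency the abstract attributes to co-denoising. A real system would replace `denoise_step` with trained networks and condition on the text prompt.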

Original Abstract

Understanding and generating 3D objects as compositions of meaningful parts is fundamental to human perception and reasoning. However, most text-to-3D methods overlook the semantic and functional structure of parts. While recent part-aware approaches introduce decomposition, they remain largely geometry-focused, lacking semantic grounding and failing to model how parts align with textual descriptions or their inter-part relations. We propose DreamPartGen, a framework for semantically grounded, part-aware text-to-3D generation. DreamPartGen introduces Duplex Part Latents (DPLs) that jointly model each part's geometry and appearance, and Relational Semantic Latents (RSLs) that capture inter-part dependencies derived from language. A synchronized co-denoising process enforces mutual geometric and semantic consistency, enabling coherent, interpretable, and text-aligned 3D synthesis. Across multiple benchmarks, DreamPartGen delivers state-of-the-art performance in geometric fidelity and text-shape alignment.

Tags

Text-to-3D, 3D Generation, Part-Aware, Semantic Understanding

arXiv Categories

cs.CV cs.AI cs.LG