Multimodal Learning Relevance: 8/10

DressWild: Feed-Forward Pose-Agnostic Garment Sewing Pattern Generation from In-the-Wild Images

Zeng Tao, Ying Jiang, Yunuo Chen, Tianyi Xie, Huamin Wang, Yingnian Wu, Yin Yang, Abishek Sampath Kumar, Kenji Tashiro, Chenfanfu Jiang
arXiv: 2602.16502v1 Published: 2026-02-18 Updated: 2026-02-18

AI Summary

DressWild proposes a feed-forward method that generates garment sewing patterns and 3D models from a single in-the-wild image.

Key Contributions

  • Proposes DressWild, an efficient pipeline for garment pattern generation
  • Leverages vision-language models (VLMs) to handle pose variation
  • Generates garment patterns ready for physical simulation from a single image

Methodology

A VLM normalizes the pose at the image level; pose-aware, 3D-informed garment features are then extracted and fused by a transformer-based encoder, which predicts the sewing pattern parameters.
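The pipeline above can be sketched as a toy numpy prototype. Every name, dimension, and stage implementation below is an illustrative assumption (the paper does not specify its architecture at this level); the pose-normalized and 3D-informed features are stand-in random tokens, and the "transformer encoder" is reduced to a single self-attention layer feeding a linear regression head:

```python
import numpy as np

D = 16        # feature dimension (assumed)
N_PARAMS = 8  # number of sewing-pattern parameters (assumed)

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, wq, wk, wv):
    """Single-head self-attention over a (n_tokens, D) array."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return attn @ v

# Stage 1 (stand-in): pose-aware image features after VLM pose normalization.
pose_tokens = rng.normal(size=(4, D))
# Stage 2 (stand-in): 3D-informed garment features.
geo_tokens = rng.normal(size=(4, D))

# Stage 3: fuse both token sets with a transformer-style encoder layer.
tokens = np.concatenate([pose_tokens, geo_tokens], axis=0)
wq, wk, wv = (rng.normal(size=(D, D)) * 0.1 for _ in range(3))
fused = self_attention(tokens, wq, wk, wv)

# Stage 4: pool and regress sewing-pattern parameters with a linear head.
w_head = rng.normal(size=(D, N_PARAMS)) * 0.1
pattern_params = fused.mean(axis=0) @ w_head
print(pattern_params.shape)  # (8,)
```

In a real system the token sets would come from pretrained image/3D encoders and the attention layer would be a trained multi-layer transformer; the sketch only shows how heterogeneous feature tokens can be fused before a single regression head.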

Original Abstract

Recent advances in garment pattern generation have shown promising progress. However, existing feed-forward methods struggle with diverse poses and viewpoints, while optimization-based approaches are computationally expensive and difficult to scale. This paper focuses on sewing pattern generation for garment modeling and fabrication applications that demand editable, separable, and simulation-ready garments. We propose DressWild, a novel feed-forward pipeline that reconstructs physics-consistent 2D sewing patterns and the corresponding 3D garments from a single in-the-wild image. Given an input image, our method leverages vision-language models (VLMs) to normalize pose variations at the image level, then extracts pose-aware, 3D-informed garment features. These features are fused through a transformer-based encoder and subsequently used to predict sewing pattern parameters, which can be directly applied to physical simulation, texture synthesis, and multi-layer virtual try-on. Extensive experiments demonstrate that our approach robustly recovers diverse sewing patterns and the corresponding 3D garments from in-the-wild images without requiring multi-view inputs or iterative optimization, offering an efficient and scalable solution for realistic garment simulation and animation.

Tags

Garment Modeling · Sewing Pattern Generation · Vision-Language Models · 3D Reconstruction

arXiv Category

cs.CV