Multimodal Learning · Relevance: 9/10

Towards Generalized Multimodal Homography Estimation

Jinkun You, Jiaxin Cheng, Jie Zhang, Yicong Zhou
arXiv: 2603.03956v1 · Published: 2026-03-04 · Updated: 2026-03-04

AI Summary

Proposes a new multimodal homography estimation method that strengthens generalization through synthetic training data and a dedicated network design.

Key Contributions

  • A new method for synthesizing training data
  • A new network architecture that exploits cross-scale information and decouples color information
  • Improved generalization of homography estimation across modalities

Methodology

The model is trained on synthesized unaligned image pairs; the network improves accuracy by exploiting cross-scale information and decoupling color information from the feature representations.
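The core of such a synthesis step is generating an unaligned pair with known ground-truth offsets from a single image, typically via a random four-corner perturbation. A minimal sketch follows; the function names, patch size, and nearest-neighbour warp are illustrative assumptions, not the paper's actual pipeline (which additionally randomizes texture and color while preserving structure).

```python
import numpy as np

def four_point_homography(src, dst):
    # Solve for the 3x3 homography mapping the 4 src corners to the 4 dst
    # corners via direct linear transform (DLT) with h33 fixed to 1.
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.asarray(A, float), np.asarray(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def synthesize_pair(image, patch=64, max_offset=8, rng=None):
    # Crop a patch from a single grayscale image, perturb its four corners by
    # random offsets, and resample the image under the induced homography to
    # obtain a second, unaligned view. The corner offsets are the ground truth.
    rng = np.random.default_rng(rng)
    H_img, W_img = image.shape[:2]
    x0 = rng.integers(max_offset, W_img - patch - max_offset)
    y0 = rng.integers(max_offset, H_img - patch - max_offset)
    corners = np.array([[x0, y0], [x0 + patch, y0],
                        [x0 + patch, y0 + patch], [x0, y0 + patch]], float)
    offsets = rng.integers(-max_offset, max_offset + 1, size=(4, 2)).astype(float)
    H = four_point_homography(corners, corners + offsets)
    # Resample the image at the homography-mapped pixel positions
    # (nearest-neighbour, sketch quality only).
    ys, xs = np.mgrid[y0:y0 + patch, x0:x0 + patch]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    warped = H @ pts
    wx = np.clip(np.round(warped[0] / warped[2]).astype(int), 0, W_img - 1)
    wy = np.clip(np.round(warped[1] / warped[2]).astype(int), 0, H_img - 1)
    patch_a = image[y0:y0 + patch, x0:x0 + patch]
    patch_b = image[wy, wx].reshape(patch, patch)
    return patch_a, patch_b, offsets
```

Training then regresses `offsets` from the pair `(patch_a, patch_b)`; since both views come from one source image, arbitrarily many supervised pairs can be generated without any real aligned multimodal data.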

Original Abstract

Supervised and unsupervised homography estimation methods depend on image pairs tailored to specific modalities to achieve high accuracy. However, their performance deteriorates substantially when applied to unseen modalities. To address this issue, we propose a training data synthesis method that generates unaligned image pairs with ground-truth offsets from a single input image. Our approach renders the image pairs with diverse textures and colors while preserving their structural information. These synthetic data empower the trained model to achieve greater robustness and improved generalization across various domains. Additionally, we design a network to fully leverage cross-scale information and decouple color information from feature representations, thus improving estimation accuracy. Extensive experiments show that our training data synthesis method improves generalization performance. The results also confirm the effectiveness of the proposed network.

Tags

Multimodal Learning · Homography Estimation · Data Synthesis · Generalization

arXiv Categories

cs.CV cs.AI