Multimodal Learning Relevance: 7/10

UniStitch: Unifying Semantic and Geometric Features for Image Stitching

Yuan Mei, Lang Nie, Kang Liao, Yunqiu Xu, Chunyu Lin, Bin Xiao
arXiv: 2603.10568v1 Published: 2026-03-11 Updated: 2026-03-11

AI Summary

UniStitch unifies geometric and semantic features to improve image stitching performance.

Key Contributions

  • Proposes a Neural Point Transformer (NPT) module
  • Designs an Adaptive Mixture of Experts (AMoE) module
  • Builds UniStitch, a unified image stitching framework

Methodology

NPT transforms geometric features into semantic-style maps, and AMoE fuses the two kinds of features for use in a deep stitching pipeline.
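The paper does not detail NPT's internals; as a minimal sketch of the core idea — turning unordered, sparse keypoints into an ordered, dense 2D map — the snippet below scatters keypoint descriptors onto a pixel grid and averages collisions. The function name and the averaging scheme are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def keypoints_to_dense_map(coords, descriptors, height, width):
    """Scatter sparse keypoint descriptors onto a dense 2D grid.

    coords:      (N, 2) array of (x, y) keypoint positions in pixels.
    descriptors: (N, C) array of per-keypoint feature vectors.
    Returns an (H, W, C) dense map; cells with no keypoint stay zero.
    """
    c = descriptors.shape[1]
    dense = np.zeros((height, width, c), dtype=np.float32)
    count = np.zeros((height, width, 1), dtype=np.float32)
    xs = np.clip(coords[:, 0].astype(int), 0, width - 1)
    ys = np.clip(coords[:, 1].astype(int), 0, height - 1)
    for x, y, d in zip(xs, ys, descriptors):
        dense[y, x] += d      # accumulate descriptors landing in the same cell
        count[y, x] += 1.0
    dense /= np.maximum(count, 1.0)  # average where keypoints collide
    return dense
```

A learned module would replace the hard scatter with trainable interpolation, but the input/output shapes match the sparse-to-dense alignment the method describes.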

Original Abstract

Traditional image stitching methods estimate warps from hand-crafted geometric features, whereas recent learning-based solutions instead leverage semantic features from neural networks. These two lines of research have largely evolved along separate paths, with virtually no meaningful convergence to date. In this paper, we take a pioneering step to bridge this gap by unifying semantic and geometric features with UniStitch, a unified image stitching framework built on multimodal features. To align discrete geometric features (i.e., keypoints) with continuous semantic feature maps, we present a Neural Point Transformer (NPT) module, which transforms unordered, sparse 1D geometric keypoints into ordered, dense 2D semantic maps. Then, to integrate the advantages of both representations, an Adaptive Mixture of Experts (AMoE) module is designed to fuse geometric and semantic representations. It dynamically shifts focus toward the more reliable features during fusion, allowing the model to handle complex scenes, especially when either modality is compromised. The fused representation can be adopted in common deep stitching pipelines, delivering significant performance gains over any single feature. Experiments show that UniStitch outperforms existing state-of-the-art methods by a large margin, paving the way for a unified paradigm spanning traditional and learning-based image stitching.
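The abstract describes AMoE as dynamically shifting weight toward the more reliable modality during fusion. A minimal per-pixel gating sketch of that idea, assuming aligned feature maps of equal channel width (the names `amoe_fuse`, `w_gate`, `b_gate` are illustrative, not from the paper):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def amoe_fuse(geo_feat, sem_feat, w_gate, b_gate):
    """Gated fusion of geometric and semantic feature maps.

    geo_feat, sem_feat: (H, W, C) spatially aligned feature maps.
    w_gate:             (2C, 2) learned projection scoring each modality.
    b_gate:             (2,) bias.
    Returns an (H, W, C) map in which each pixel is a convex combination
    of the two modalities, weighted by the gate's softmax scores.
    """
    stacked = np.concatenate([geo_feat, sem_feat], axis=-1)  # (H, W, 2C)
    gate = softmax(stacked @ w_gate + b_gate, axis=-1)       # (H, W, 2)
    return gate[..., :1] * geo_feat + gate[..., 1:] * sem_feat
```

With the gate conditioned on both inputs, a region where one modality is degraded (e.g. textureless areas for keypoints) can be down-weighted per pixel, which matches the behavior the abstract attributes to AMoE.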

Tags

Image Stitching · Semantic Features · Geometric Features · Multimodal Learning

arXiv Category

cs.CV