Multimodal Learning (relevance: 8/10)

Geometry of Uncertainty: Learning Metric Spaces for Multimodal State Estimation in RL

Alfredo Reichlin, Adriano Pacciarelli, Danica Kragic, Miguel Vasco
arXiv: 2602.12087v1 Published: 2026-02-12 Updated: 2026-02-12

AI Summary

Proposes a novel state-estimation method for reinforcement learning that learns a metric space over states to improve the robustness of multimodal information fusion.

Key Contributions

  • A metric-space-based state-estimation method that requires no explicit probabilistic modeling
  • A multimodal latent transition model
  • A sensor fusion mechanism based on inverse distance weighting
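The third contribution, inverse-distance-weighted fusion, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the power parameter `p`, and the use of distance-to-prediction as an uncertainty proxy are assumptions for clarity.

```python
import numpy as np

def idw_fuse(latents, distances, p=2, eps=1e-8):
    """Fuse per-modality latent estimates with inverse-distance weights.

    latents:   (M, D) array, one latent state estimate per sensor modality
    distances: (M,) array, each modality's distance to the predicted latent
               (serving as a geometric proxy for that modality's uncertainty)
    """
    latents = np.asarray(latents, dtype=float)
    distances = np.asarray(distances, dtype=float)
    weights = 1.0 / (distances ** p + eps)  # closer to prediction => more trusted
    weights /= weights.sum()                # normalize to a convex combination
    return weights @ latents                # weighted average of modality latents
```

A modality whose estimate lies far from the predicted latent receives a vanishing weight, so noisy sensors are down-weighted adaptively, without any assumed noise distribution.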

Methodology

By learning a metric space over states, the method recasts state estimation as a geometric problem, then combines a multimodal latent transition model with inverse-distance-weighted sensor fusion to improve robustness to noise.
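The core geometric idea is that latent distances should match the minimum number of actions needed to move between states. A minimal sketch of such a transition-aware metric objective, assuming a simple squared-error form (the paper's actual loss and architecture are not specified here):

```python
import numpy as np

def metric_loss(z_a, z_b, action_gap):
    """Penalize mismatch between latent distance and the minimum number of
    actions separating two states.

    z_a, z_b:   (B, D) arrays of latent embeddings for state pairs
    action_gap: (B,) array of minimum action counts between the pairs
    """
    d = np.linalg.norm(z_a - z_b, axis=-1)   # distance in the learned space
    return np.mean((d - action_gap) ** 2)    # match distance to action count
```

Under such an objective, distance in the learned space directly encodes reachability, which is what gives uncertainty a geometric rather than probabilistic interpretation.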

Original Abstract

Estimating the state of an environment from high-dimensional, multimodal, and noisy observations is a fundamental challenge in reinforcement learning (RL). Traditional approaches rely on probabilistic models to account for the uncertainty, but often require explicit noise assumptions, in turn limiting generalization. In this work, we contribute a novel method to learn a structured latent representation, in which distances between states directly correlate with the minimum number of actions required to transition between them. The proposed metric space formulation provides a geometric interpretation of uncertainty without the need for explicit probabilistic modeling. To achieve this, we introduce a multimodal latent transition model and a sensor fusion mechanism based on inverse distance weighting, allowing for the adaptive integration of multiple sensor modalities without prior knowledge of noise distributions. We empirically validate the approach on a range of multimodal RL tasks, demonstrating improved robustness to sensor noise and superior state estimation compared to baseline methods. Our experiments show enhanced performance of an RL agent via the learned representation, eliminating the need for explicit noise augmentation. The presented results suggest that leveraging transition-aware metric spaces provides a principled and scalable solution for robust state estimation in sequential decision-making.

Tags

Reinforcement Learning, State Estimation, Multimodal Learning, Metric Learning, Sensor Fusion

arXiv Categories

cs.LG