Multimodal Learning Relevance: 8/10

IRIS: A Real-World Benchmark for Inverse Recovery and Identification of Physical Dynamic Systems from Monocular Video

Rasul Khanbayov, Mohamed Rayan Barhdadi, Erchin Serpedin, Hasan Kurban
arXiv: 2603.16432v1 Published: 2026-03-17 Updated: 2026-03-17

AI Summary

IRIS is a benchmark dataset for studying the inverse recovery and identification of physical dynamic systems from video.

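Key Contributions

  • Builds IRIS, a high-fidelity dataset of real-world videos
  • Defines a standardized evaluation protocol
  • Provides multiple baseline methods with performance evaluations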

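Methodology

Collect real-world videos, measure the ground-truth parameters, define evaluation metrics, and evaluate baseline methods.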
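
As a rough illustration of what a parameter-accuracy check against measured ground truth might look like, the sketch below computes a relative error and tests whether an estimate falls within a multiple of the measurement uncertainty. The function names, the threshold `k`, and the example numbers are all hypothetical; they are not taken from the IRIS evaluation toolkit.

```python
def relative_error(estimate, truth):
    """Relative absolute error of an estimated physical parameter."""
    return abs(estimate - truth) / abs(truth)


def within_uncertainty(estimate, truth, sigma, k=2.0):
    """True if the estimate lies within k standard deviations of the measured value.

    `sigma` stands in for the reported measurement uncertainty; k=2.0 is an
    arbitrary choice made for this sketch.
    """
    return abs(estimate - truth) <= k * sigma


# Hypothetical values for a single parameter (not from the dataset).
estimate, truth, sigma = 9.70, 9.81, 0.05
err = relative_error(estimate, truth)
ok = within_uncertainty(estimate, truth, sigma)
print(f"relative error = {err:.4f}, within 2 sigma: {ok}")
```

Reporting both numbers separates "how far off is the estimate" from "is the gap explainable by measurement noise", which is one plausible use of ground truth that carries uncertainty estimates.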

Original Abstract

Unsupervised physical parameter estimation from video lacks a common benchmark: existing methods evaluate on non-overlapping synthetic data, the sole real-world dataset is restricted to single-body systems, and no established protocol addresses governing-equation identification. This work introduces IRIS, a high-fidelity benchmark comprising 220 real-world videos captured at 4K resolution and 60 fps, spanning both single- and multi-body dynamics with independently measured ground-truth parameters and uncertainty estimates. Each dynamical system is recorded under controlled laboratory conditions and paired with its governing equations, enabling principled evaluation. A standardized evaluation protocol is defined encompassing parameter accuracy, identifiability, extrapolation, robustness, and governing-equation selection. Multiple baselines are evaluated, including a multi-step physics loss formulation and four complementary equation-identification strategies (VLM temporal reasoning, describe-then-classify prompting, CNN-based classification, and path-based labelling), establishing reference performance across all IRIS scenarios and exposing systematic failure modes that motivate future research. The dataset, annotations, evaluation toolkit, and all baseline implementations are publicly released.
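
The abstract does not spell out the multi-step physics loss, so the following is only a generic sketch of the idea and not the paper's formulation: roll a candidate parameter forward through the governing equation for several steps from each observed state, and penalize the mismatch with the observed trajectory. Everything here is an assumption made for illustration: the system (a unit-mass spring, x'' = -k x), the semi-implicit Euler integrator, the availability of velocities in the observed states, and the grid search over `k`.

```python
def step(x, v, k, dt):
    """One semi-implicit Euler step for x'' = -k * x (unit mass assumed)."""
    v = v - k * x * dt
    x = x + v * dt
    return x, v


def rollout(x, v, k, dt, n_steps):
    """Integrate n_steps forward and return the list of (x, v) states."""
    states = []
    for _ in range(n_steps):
        x, v = step(x, v, k, dt)
        states.append((x, v))
    return states


def multi_step_loss(observed, k, dt, horizon):
    """Mean squared position error of horizon-step rollouts from each start state."""
    total, count = 0.0, 0
    for t in range(len(observed) - horizon):
        x0, v0 = observed[t]
        predicted = rollout(x0, v0, k, dt, horizon)
        for j, (px, _) in enumerate(predicted):
            total += (px - observed[t + 1 + j][0]) ** 2
            count += 1
    return total / count


# Synthetic "observations" at 60 fps, generated with a known stiffness.
dt = 1.0 / 60.0
true_k = 4.0
observed = [(1.0, 0.0)] + rollout(1.0, 0.0, true_k, dt, 120)

# Grid search stands in for gradient-based optimization in this sketch.
candidates = [1.0 + 0.5 * i for i in range(15)]  # 1.0, 1.5, ..., 8.0
losses = [multi_step_loss(observed, k, dt, horizon=10) for k in candidates]
best_k = candidates[losses.index(min(losses))]
print("recovered k:", best_k)
```

Compared with a one-step loss, a longer horizon lets small parameter errors accumulate into larger trajectory errors, which tends to make the loss more sensitive to the parameter. In a real video setting the states would come from a tracker or learned encoder instead of a simulator.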

Tags

Physical parameter estimation, Video analysis, Dynamical systems, Benchmark dataset, Inverse recovery

arXiv Categories

cs.CV cs.LG