Multimodal Learning Relevance: 8/10

Contextual Safety Reasoning and Grounding for Open-World Robots

Zachary Ravichandran, David Snyder, Alexander Robey, Hamed Hassani, Vijay Kumar, George J. Pappas
arXiv: 2602.19983v1 Published: 2026-02-23 Updated: 2026-02-23

AI Summary

The CORE framework uses a VLM for online contextual reasoning and environment perception, enabling situational safety for robots in open-world settings.

Key Contributions

  • Proposes CORE, a safety framework enabling VLM-based contextual safety reasoning
  • Grounds contextual safety rules in the physical environment via spatial localization
  • Enforces situational safety through control barrier functions, with probabilistic safety guarantees

Methodology

Uses a VLM to infer contextual safety rules from visual observations, enforces safe behavior through control barrier functions, and provides a probabilistic safety analysis.
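To make the enforcement step concrete, here is a minimal sketch of a control-barrier-function (CBF) safety filter. This is our illustration, not the paper's implementation: it assumes a single-integrator robot and a VLM-derived rule already grounded as a circular keep-out region (center `c`, radius `r`). The barrier h(x) = ||x − c||² − r² is nonnegative exactly on the safe set, and the filter minimally corrects a nominal control so that ḣ ≥ −αh.

```python
import numpy as np

def cbf_filter(x, u_nom, c, r, alpha=1.0):
    """Project nominal control u_nom onto the CBF constraint
    grad_h(x) . u + alpha * h(x) >= 0. With a single linear
    constraint, the QP has this closed-form solution."""
    h = np.dot(x - c, x - c) - r**2          # barrier value
    grad_h = 2.0 * (x - c)                   # barrier gradient
    slack = grad_h @ u_nom + alpha * h
    if slack >= 0.0:                         # nominal control already safe
        return u_nom
    # minimal correction along grad_h that restores the constraint
    return u_nom - slack * grad_h / (grad_h @ grad_h)

# Usage: the nominal controller drives straight at a goal placed inside
# the keep-out zone; the filter keeps the robot outside the unsafe disk.
c, r = np.array([2.0, 0.0]), 1.0             # grounded unsafe region
x = np.array([0.0, 0.1])                     # initial (safe) state
for _ in range(400):
    u = cbf_filter(x, c - x, c, r)           # nominal control: go to c
    x = x + 0.02 * u                         # forward-Euler step
    assert np.dot(x - c, x - c) >= r**2      # robot stays outside
```

The robot approaches the boundary of the grounded region and slides to a stop there rather than entering it; the paper's framework additionally accounts for perceptual uncertainty in where that boundary lies, which this sketch omits.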

Original Abstract

Robots are increasingly operating in open-world environments where safe behavior depends on context: the same hallway may require different navigation strategies when crowded versus empty, or during an emergency versus normal operations. Traditional safety approaches enforce fixed constraints in user-specified contexts, limiting their ability to handle the open-ended contextual variability of real-world deployment. We address this gap via CORE, a safety framework that enables online contextual reasoning, grounding, and enforcement without prior knowledge of the environment (e.g., maps or safety specifications). CORE uses a vision-language model (VLM) to continuously reason about context-dependent safety rules directly from visual observations, grounds these rules in the physical environment, and enforces the resulting spatially-defined safe sets via control barrier functions. We provide probabilistic safety guarantees for CORE that account for perceptual uncertainty, and we demonstrate through simulation and real-world experiments that CORE enforces contextually appropriate behavior in unseen environments, significantly outperforming prior semantic safety methods that lack online contextual reasoning. Ablation studies validate our theoretical guarantees and underscore the importance of both VLM-based reasoning and spatial grounding for enforcing contextual safety in novel settings. We provide additional resources at https://zacravichandran.github.io/CORE.

Tags

Robot Safety  Vision-Language Models  Contextual Reasoning  Control Barrier Functions

arXiv Categories

cs.RO cs.AI