AI Agents relevance: 8/10

Robust Intervention Learning from Emergency Stop Interventions

Ethan Pronovost, Khimya Khetarpal, Siddhartha Srinivasa
arXiv: 2602.03825v1 Published: 2026-02-03 Updated: 2026-02-03

AI Summary

Proposes the Residual Intervention Fine-Tuning algorithm for robust learning from emergency stop interventions, improving the performance of autonomous systems.

Main Contributions

  • Defines the Robust Intervention Learning (RIL) problem
  • Proposes the Residual Intervention Fine-Tuning (RIFT) algorithm
  • Provides theoretical analysis characterizing the conditions under which the algorithm yields improvement

Methodology

Casts intervention learning as a fine-tuning problem: intervention feedback is treated as an incomplete learning signal and combined, via residual fine-tuning, with the information encoded in a prior policy.
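The residual idea above can be illustrated with a toy sketch. Everything below is a hypothetical construction for intuition, not the paper's actual algorithm or setup: a fixed prior policy over four discrete actions, a zero-initialized residual added to the prior's logits, and e-stop interventions observed on one action. Penalizing the intervened action only shapes the residual, so the prior's preferences fill in the behavior the intervention signal leaves under-specified.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Hypothetical toy setup (not from the paper): 4 discrete actions,
# a fixed prior policy, and e-stop interventions on action 2.
prior_logits = np.array([1.0, 0.5, 2.0, 0.2])  # prior prefers action 2
residual = np.zeros(4)                          # policy starts at the prior
intervened_action = 2                           # e-stops were triggered here
lr = 0.5

for _ in range(50):
    probs = softmax(prior_logits + residual)
    # Descend on log pi(a_estop): the gradient of log pi(a) w.r.t. the
    # logits is one_hot(a) - probs, so this lowers the intervened
    # action's probability while touching the residual only.
    grad = -probs.copy()
    grad[intervened_action] += 1.0
    residual -= lr * grad

final = softmax(prior_logits + residual)
print(final.argmax())  # the prior's second-favorite action takes over
```

Because only the residual is updated and the prior logits are left intact, the relative ordering of the non-intervened actions is inherited from the prior, which is the ambiguity-resolution role the prior policy plays in this framing.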

Original Abstract

Human interventions are a common source of data in autonomous systems during testing. These interventions provide an important signal about where the current policy needs improvement, but are often noisy and incomplete. We define Robust Intervention Learning (RIL) as the problem of learning from intervention data while remaining robust to the quality and informativeness of the intervention signal. In the best case, interventions are precise and avoiding them is sufficient to solve the task, but in many realistic settings avoiding interventions is necessary but not sufficient for achieving good performance. We study robust intervention learning in the context of emergency stop interventions and propose Residual Intervention Fine-Tuning (RIFT), a residual fine-tuning algorithm that treats intervention feedback as an incomplete learning signal and explicitly combines it with a prior policy. By framing intervention learning as a fine-tuning problem, our approach leverages structure encoded in the prior policy to resolve ambiguity when intervention signals under-specify the task. We provide theoretical analysis characterizing conditions under which this formulation yields principled policy improvement, and identify regimes where intervention learning is expected to fail. Our experiments reveal that residual fine-tuning enables robust and consistent policy improvement across a range of intervention strategies and prior policy qualities, and highlight robust intervention learning as a promising direction for future work.

Tags

Reinforcement Learning, Intervention Learning, Autonomous Driving, Fine-Tuning

arXiv Categories

cs.LG