Agent Tuning & Optimization (Relevance: 8/10)

Self-Correcting VLA: Online Action Refinement via Sparse World Imagination

Chenyv Liu, Wentao Tan, Lei Zhu, Fengling Li, Jingjing Li, Guoli Yang, Heng Tao Shen
arXiv: 2602.21633v1 Published: 2026-02-25 Updated: 2026-02-25

AI Summary

SC-VLA performs online action refinement via sparse world imagination, improving the performance of VLA models on robot manipulation tasks.

Key Contributions

  • Proposes the Self-Correcting VLA (SC-VLA) framework
  • Designs a sparse world imagination module that forecasts current task progress and future trajectory trends
  • Introduces an online action refinement module that adjusts the trajectory direction based on these predictions

Methodology

Sparse world imagination is realized by integrating auxiliary predictive heads into the policy, and an online action refinement module adjusts the trajectory based on the resulting predictions, enabling self-improvement.
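To make the two-head design above concrete, here is a minimal, hypothetical sketch of a backbone feature vector feeding an action head plus the two auxiliary heads (a scalar task-progress head and a sparse future-trajectory-trend head). All dimensions, layer names, and the random stand-in weights are illustrative assumptions, not the paper's actual architecture or code.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM = 32   # latent feature size from the VLA backbone (assumed)
ACT_DIM = 7     # e.g. a 7-DoF arm action (assumed)
HORIZON = 4     # number of sparse imagined future steps (assumed)

# Random weights stand in for trained parameters in this sketch.
W_act = rng.normal(size=(FEAT_DIM, ACT_DIM))
W_prog = rng.normal(size=(FEAT_DIM, 1))
W_trend = rng.normal(size=(FEAT_DIM, HORIZON * 3))  # 3D position deltas

def forward(feat):
    """Return an action, a predicted task progress in (0, 1), and a sparse
    future trajectory trend (HORIZON x 3 position deltas)."""
    action = np.tanh(feat @ W_act)                        # bounded control output
    progress = 1.0 / (1.0 + np.exp(-(feat @ W_prog)[0]))  # sigmoid scalar
    trend = (feat @ W_trend).reshape(HORIZON, 3)          # imagined short-term motion
    return action, progress, trend

feat = rng.normal(size=FEAT_DIM)
action, progress, trend = forward(feat)
print(action.shape, float(progress), trend.shape)
```

The auxiliary heads are trained jointly with the policy, which is one way to constrain the shared features to encode short-term physical evolution as the summary describes.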

Original Abstract

Standard vision-language-action (VLA) models rely on fitting statistical data priors, limiting their robust understanding of underlying physical dynamics. Reinforcement learning enhances physical grounding through exploration yet typically relies on external reward signals that remain isolated from the agent's internal states. World action models have emerged as a promising paradigm that integrates imagination and control to enable predictive planning. However, they rely on implicit context modeling, lacking explicit mechanisms for self-improvement. To solve these problems, we propose Self-Correcting VLA (SC-VLA), which achieves self-improvement by intrinsically guiding action refinement through sparse imagination. We first design sparse world imagination by integrating auxiliary predictive heads to forecast current task progress and future trajectory trends, thereby constraining the policy to encode short-term physical evolution. Then we introduce the online action refinement module to reshape progress-dependent dense rewards, adjusting trajectory orientation based on the predicted sparse future states. Evaluations on challenging robot manipulation tasks from simulation benchmarks and real-world settings demonstrate that SC-VLA achieves state-of-the-art performance, yielding the highest task throughput with 16% fewer steps and a 9% higher success rate than the best-performing baselines, alongside a 14% gain in real-world experiments. Code is available at https://github.com/Kisaragi0/SC-VLA.
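The abstract's "progress-dependent dense rewards" and refinement of "trajectory orientation based on the predicted sparse future states" can be sketched as follows. The exact reward formula and the blending weight `alpha` are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def shaped_reward(progress_now, progress_prev, success):
    """Dense reward sketch: sparse success bonus plus the delta of
    predicted task progress between consecutive steps (assumed form)."""
    return float(success) + (progress_now - progress_prev)

def refine_action(action_xyz, predicted_trend, alpha=0.3):
    """Nudge the translational action toward the first imagined future
    displacement; alpha is a hypothetical blending weight."""
    direction = predicted_trend[0]
    norm = np.linalg.norm(direction)
    if norm > 1e-8:
        direction = direction / norm
    return (1 - alpha) * action_xyz + alpha * direction

# Progress rose from 0.40 to 0.55 without task success yet.
r = shaped_reward(progress_now=0.55, progress_prev=0.40, success=False)
# Action pointing along x gets pulled toward an imagined y-direction motion.
refined = refine_action(np.array([0.1, 0.0, 0.0]),
                        np.array([[0.0, 1.0, 0.0]]))
print(round(r, 2), refined)
```

In this sketch the shaped reward densifies the sparse success signal using the agent's own progress predictions, which mirrors the abstract's point that the correction signal is intrinsic rather than an external reward.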

Tags

VLA, Robot Manipulation, Reinforcement Learning, World Model, Self-Improvement

arXiv Categories

cs.RO cs.AI cs.CV