Pretrained Vision-Language-Action Models are Surprisingly Resistant to Forgetting in Continual Learning
AI Summary
Pretrained VLA models show surprisingly strong resistance to forgetting in continual learning; simple experience replay alone is effective.
Key Contributions
- Shows that pretrained VLA models are strongly resistant to catastrophic forgetting
- Validates the effectiveness of simple Experience Replay (ER) on VLAs
- Analyzes the critical role of pretraining in continual learning performance
Methodology
Uses Experience Replay (ER), comparing pretrained VLA models against policy models trained from scratch on sequences of continual learning tasks, and analyzes how pretraining affects forgetting and forward transfer.
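The ER setup described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the `ReplayBuffer` class, reservoir-sampling policy, and `mixed_batch` helper are all hypothetical names chosen here to show the core idea of mixing a small buffer of prior-task data into each new-task training batch.

```python
import random

class ReplayBuffer:
    """Fixed-size buffer of examples from earlier tasks.

    Reservoir sampling keeps an approximately uniform sample over
    everything seen so far, even when the stream of past examples
    far exceeds the buffer capacity.
    """
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.seen = 0  # total examples ever offered to the buffer

    def add(self, example):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            # Replace a random slot with probability capacity / seen.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = example

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))

def mixed_batch(new_task_batch, buffer, replay_ratio=0.5):
    """One ER training batch: current-task examples plus replayed ones."""
    n_replay = int(len(new_task_batch) * replay_ratio)
    return new_task_batch + buffer.sample(n_replay)
```

With a small `capacity`, each gradient step on a new task also rehearses a handful of old-task examples, which is the mechanism the paper finds sufficient to largely prevent forgetting in pretrained VLAs.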
Original Abstract
Continual learning is a long-standing challenge in robot policy learning, where a policy must acquire new skills over time without catastrophically forgetting previously learned ones. While prior work has extensively studied continual learning in relatively small behavior cloning (BC) policy models trained from scratch, its behavior in modern large-scale pretrained Vision-Language-Action (VLA) models remains underexplored. In this work, we found that pretrained VLAs are remarkably resistant to forgetting compared with smaller policy models trained from scratch. Simple Experience Replay (ER) works surprisingly well on VLAs, sometimes achieving zero forgetting even with a small replay data size. Our analysis reveals that pretraining plays a critical role in downstream continual learning performance: large pretrained models mitigate forgetting with a small replay buffer size while maintaining strong forward learning capabilities. Furthermore, we found that VLAs can retain relevant knowledge from prior tasks despite performance degradation during learning new tasks. This knowledge retention enables rapid recovery of seemingly forgotten skills through finetuning. Together, these insights imply that large-scale pretraining fundamentally changes the dynamics of continual learning, enabling models to continually acquire new skills over time with simple replay. Code and more information can be found at https://ut-austin-rpl.github.io/continual-vla