Agent Tuning & Optimization Relevance: 8/10

CGL: Advancing Continual GUI Learning via Reinforcement Fine-Tuning

Zhenquan Yao, Zitong Huang, Yihan Zeng, Jianhua Han, Hang Xu, Chun-Mei Feng, Jianwei Ma, Wangmeng Zuo
arXiv: 2603.02951v1 Published: 2026-03-03 Updated: 2026-03-03

AI Summary

The CGL framework combines SFT and RL to improve a GUI agent's adaptability and skill retention in continual learning.

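Main Contributions

  • Proposes the CGL framework, which balances SFT and RL
  • Introduces a policy-entropy-guided mechanism for adjusting the SFT proportion
  • Develops a GRPO-based gradient surgery strategy
  • Establishes the AndroidControl-CL benchmark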

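Methodology

Policy entropy dynamically sets the weighting between SFT and RL, and gradient surgery removes the components of the SFT gradient that conflict with the GRPO gradient, improving continual learning performance.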
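
The entropy-guided weighting can be illustrated with a minimal sketch. The summary does not give the actual formula, so everything here is an assumption: the linear mapping, the `low` and `high` bounds, the direction (higher entropy gets more SFT weight), and the hypothetical names `policy_entropy`, `sft_proportion`, and `mixed_loss`.

```python
import math


def policy_entropy(probs):
    """Shannon entropy (in nats) of a single action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sft_proportion(probs, low=0.1, high=0.9):
    """Map normalized policy entropy to an SFT weight in [low, high].

    Assumption for illustration only: a less certain policy (higher
    entropy) receives a larger SFT proportion. The paper may use a
    different rule or the opposite direction.
    """
    max_entropy = math.log(len(probs))  # entropy of the uniform distribution
    normalized = policy_entropy(probs) / max_entropy if max_entropy > 0 else 0.0
    return low + (high - low) * normalized


def mixed_loss(sft_loss, rl_loss, alpha):
    """Blend the two objectives, with alpha as the SFT proportion."""
    return alpha * sft_loss + (1.0 - alpha) * rl_loss


# A peaked policy gets the lower bound; a uniform policy gets the upper bound.
alpha_peaked = sft_proportion([1.0, 0.0, 0.0, 0.0])
alpha_uniform = sft_proportion([0.25, 0.25, 0.25, 0.25])
```

In a training loop, `alpha` would be recomputed from the current policy's action distribution at each step, so the SFT/RL balance shifts as the policy becomes more or less certain.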

Original Abstract

Graphical User Interface (GUI) Agents, benefiting from recent advances in multimodal large language models (MLLM), have achieved significant development. However, due to the frequent updates of GUI applications, adapting to new tasks without forgetting old tasks in GUI continual learning remains an open problem. In this work, we reveal that while Supervised Fine-Tuning (SFT) facilitates fast adaptation, it often triggers knowledge overwriting, whereas Reinforcement Learning (RL) demonstrates an inherent resilience that shields prior interaction logic from erasure. Based on this insight, we propose a Continual GUI Learning (CGL) framework that dynamically balances adaptation efficiency and skill retention by enhancing the synergy between SFT and RL. Specifically, we introduce an SFT proportion adjustment mechanism guided by policy entropy to dynamically control the weight allocation between the SFT and RL training phases. To resolve explicit gradient interference, we further develop a specialized gradient surgery strategy. By projecting exploratory SFT gradients onto GRPO-based anchor gradients, our method explicitly clips the components of SFT gradients that conflict with GRPO. On top of that, we establish an AndroidControl-CL benchmark, which divides GUI applications into distinct task groups to effectively simulate and evaluate the performance of continual GUI learning. Experimental results demonstrate the effectiveness of our proposed CGL framework across continual learning scenarios. The benchmark, code, and model will be made publicly available.
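
The gradient surgery step described in the abstract can be sketched as a projection in the style of PCGrad. This is an assumption about the general shape of the rule, not the authors' implementation: gradients are flattened into plain lists, and `dot` and `project_sft_gradient` are hypothetical names.

```python
def dot(a, b):
    """Inner product of two equal-length gradient vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_sft_gradient(g_sft, g_anchor, eps=1e-12):
    """Remove the part of the SFT gradient that opposes the anchor gradient.

    Assumed rule: if the two gradients have a negative inner product,
    subtract the projection of g_sft onto g_anchor; otherwise return
    g_sft unchanged. Here g_anchor stands in for the GRPO gradient.
    """
    overlap = dot(g_sft, g_anchor)
    if overlap >= 0.0:
        return list(g_sft)  # no conflict, keep the SFT gradient as is
    scale = overlap / (dot(g_anchor, g_anchor) + eps)
    return [s - scale * a for s, a in zip(g_sft, g_anchor)]


# Conflicting case: the second component of g_sft points against the anchor.
clipped = project_sft_gradient([1.0, -1.0], [0.0, 1.0])

# Non-conflicting case: the SFT gradient passes through untouched.
kept = project_sft_gradient([1.0, 1.0], [0.0, 1.0])
```

After the projection, the clipped SFT gradient is orthogonal to the anchor (up to the `eps` term), so an SFT update no longer moves directly against the direction the GRPO objective prefers.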

Tags

GUI Agent, Continual Learning, Reinforcement Learning, Supervised Fine-Tuning, Gradient Surgery

arXiv Categories

cs.LG, cs.CV