AI Agents relevance: 9/10

Agent-Omit: Training Efficient LLM Agents for Adaptive Thought and Observation Omission via Agentic Reinforcement Learning

Yansong Ning, Jun Fang, Naiqiang Tan, Hao Liu
arXiv: 2602.04284v1 Published: 2026-02-04 Updated: 2026-02-04

AI Summary

Agent-Omit uses reinforcement learning to train LLM agents to adaptively omit redundant thoughts and observations, improving efficiency.

Key Contributions

  • Proposes the Agent-Omit framework, enabling LLM agents to adaptively omit thoughts and observations.
  • Introduces an omit-aware agentic reinforcement learning method with a dual sampling mechanism and a tailored omission reward.
  • Theoretically proves an upper bound on the deviation of the omission policy.
  • Experiments show Agent-Omit outperforms other methods in both efficiency and effectiveness.

Methodology

The agent is first fine-tuned on a small amount of cold-start data, then trained with an RL-based agentic learning method that uses a dual sampling mechanism and a tailored omission reward.
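The paper does not spell out the reward's exact functional form here, so the following is only an illustrative sketch of what a "tailored omission reward" could look like: task success dominates, and omission bonuses are granted only on successful trajectories so the agent is not paid for omitting its way into failure. All names and weights (`alpha`, `beta`) are assumptions, not the authors' implementation.

```python
def omission_reward(task_success, n_turns, n_thoughts_kept, n_obs_kept,
                    alpha=0.1, beta=0.05):
    """Hypothetical omission-aware reward: combine task outcome with a
    bonus for omitting redundant thoughts/observations across turns."""
    if n_turns == 0:
        return 0.0
    # Fraction of turns where the agent chose to omit its thought / observation.
    thought_omit_rate = 1.0 - n_thoughts_kept / n_turns
    obs_omit_rate = 1.0 - n_obs_kept / n_turns
    # Success gates everything: failed episodes earn no omission bonus.
    base = 1.0 if task_success else 0.0
    return base + (alpha * thought_omit_rate + beta * obs_omit_rate) * base

# A trajectory that succeeds while omitting half its thoughts scores
# higher than one that keeps everything.
r_full = omission_reward(True, n_turns=8, n_thoughts_kept=8, n_obs_kept=8)
r_omit = omission_reward(True, n_turns=8, n_thoughts_kept=4, n_obs_kept=6)
```

Under this shaping, `r_omit > r_full`, so within the set of successful trajectories the policy gradient favors shorter, more heavily omitted interactions.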

Original Abstract

Managing agent thought and observation during multi-turn agent-environment interactions is an emerging strategy to improve agent efficiency. However, existing studies treat entire interaction trajectories equally, overlooking that thought necessity and observation utility vary across turns. To this end, we first conduct quantitative investigations into how thought and observation affect agent effectiveness and efficiency. Based on our findings, we propose Agent-Omit, a unified training framework that empowers LLM agents to adaptively omit redundant thoughts and observations. Specifically, we first synthesize a small amount of cold-start data, including both single-turn and multi-turn omission scenarios, to fine-tune the agent for omission behaviors. Furthermore, we introduce an omit-aware agentic reinforcement learning approach, incorporating a dual sampling mechanism and a tailored omission reward to incentivize the agent's adaptive omission capability. Theoretically, we prove that the deviation of our omission policy is upper-bounded by KL-divergence. Experimental results on five agent benchmarks show that our constructed Agent-Omit-8B obtains performance comparable to seven frontier LLM agents, and achieves the best effectiveness-efficiency trade-off among seven efficient LLM agent methods. Our code and data are available at https://github.com/usail-hkust/Agent-Omit.
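The abstract's theoretical claim bounds the omission policy's deviation by a KL divergence. The paper's proof is not reproduced here; as a minimal numeric illustration of the quantity involved, the sketch below computes KL(p || q) for two discrete action distributions (the three-way action split is an assumed example, not the paper's action space).

```python
import math

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Assumed action split per turn: [think+observe, omit-thought, omit-observation].
base_policy = [0.5, 0.3, 0.2]
omit_policy = [0.2, 0.4, 0.4]  # shifts probability mass toward omission

d = kl_divergence(omit_policy, base_policy)
```

KL is zero only when the two policies coincide and grows as the omission policy drifts from the base policy, which is why it serves as a natural measure for bounding that drift.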

Tags

LLM Agent, Reinforcement Learning, Efficiency, Omission, Adaptive

arXiv Categories

cs.AI cs.LG