Bifrost: Steering Strategic Trajectories to Bridge Contextual Gaps for Self-Improving Agents
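AI Summary
Bifrost steers previously solved trajectories to bridge contextual gaps between tasks, improving the performance of self-improving agents.
Key Contributions
- Reveals a correlation between context and trajectory
- Proposes Bifrost, a training-free method that uses context differences to guide trajectory adaptation
- Performs trajectory adaptation at the representation level, ensuring alignment with the target context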
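Methodology
Bifrost uses context differences to guide previously solved trajectories, adjusting them at the hidden-state level so that they fit the target task.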
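The summary does not give the exact update rule, so the following is only a minimal sketch of what hidden-state steering by a context difference could look like. It assumes each context is summarized by one vector and that the adjustment is a simple additive shift; the function name `steer_trajectory` and the scaling parameter `alpha` are hypothetical and are not taken from the paper.

```python
def steer_trajectory(traj_hidden, src_ctx_hidden, tgt_ctx_hidden, alpha=1.0):
    """Shift every trajectory hidden state by the scaled context difference.

    traj_hidden:    list of per-step hidden-state vectors from a solved task
    src_ctx_hidden: hidden-state vector summarizing the source context
    tgt_ctx_hidden: hidden-state vector summarizing the target context
    alpha:          steering strength (hypothetical knob, not from the paper)
    """
    if len(src_ctx_hidden) != len(tgt_ctx_hidden):
        raise ValueError("context vectors must have the same dimension")

    # Direction pointing from the source context to the target context.
    delta = [t - s for s, t in zip(src_ctx_hidden, tgt_ctx_hidden)]

    steered = []
    for step in traj_hidden:
        if len(step) != len(delta):
            raise ValueError("trajectory and context dimensions differ")
        # Assumed additive rule: move the step along the context direction.
        steered.append([h + alpha * d for h, d in zip(step, delta)])
    return steered


if __name__ == "__main__":
    trajectory = [[1.0, 2.0], [0.0, 0.0]]
    print(steer_trajectory(trajectory, [1.0, 1.0], [2.0, 0.0]))
```

Setting `alpha` to 0 leaves the trajectory unchanged, which makes it easy to compare steered and unsteered reuse under the same assumed rule.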
Original Abstract
Autonomous agents excel in self-improvement through reflection and iterative refinement, which reuse successful task trajectories as in-context examples to assist subsequent reasoning. However, shifting across tasks often introduces a context mismatch. Hence, existing approaches either discard the trajectories or manipulate them using heuristics, leading to a non-negligible fine-tuning cost or unguaranteed performance. To bridge this gap, we reveal a context-trajectory correlation, where shifts of context are highly parallel with shifts of trajectory. Based on this finding, we propose BrIdge contextual gap FoR imprOvised trajectory STeering (Bifrost), a training-free method that leverages context differences to precisely guide the adaptation of previously solved trajectories towards the target task, mitigating the misalignment caused by context shifts. Our trajectory adaptation is conducted at the representation level using agent hidden states, ensuring trajectory transformation accurately aligns with the target context in a shared space. Across diverse benchmarks, Bifrost consistently outperforms existing trajectory reuse and finetuned self-improvement methods, demonstrating that agents can effectively leverage past experiences despite substantial context shifts.
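One way to make the claim that context shifts are "highly parallel" with trajectory shifts concrete is to compare the two difference vectors by cosine similarity. The sketch below is illustrative only: it assumes each context and each trajectory has already been pooled into a single vector, and the helper names `cosine_similarity` and `shift_parallelism` are hypothetical rather than the authors' measurement procedure.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def shift_parallelism(ctx_a, ctx_b, traj_a, traj_b):
    """Compare the context shift (a -> b) with the trajectory shift (a -> b).

    Values near 1 mean the two shifts point in nearly the same direction;
    values near 0 mean they are close to orthogonal.
    """
    ctx_shift = [b - a for a, b in zip(ctx_a, ctx_b)]
    traj_shift = [b - a for a, b in zip(traj_a, traj_b)]
    return cosine_similarity(ctx_shift, traj_shift)


if __name__ == "__main__":
    # Toy vectors, not real hidden states.
    print(shift_parallelism([0.0, 0.0], [1.0, 2.0], [1.0, 1.0], [3.0, 5.0]))
```

In practice the vectors would come from the agent's hidden states over pairs of tasks; how those states are pooled is left open here because the abstract does not specify it.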