Knowledge Integration Decay in Search-Augmented Reasoning of Large Language Models
AI Summary
The paper identifies a Knowledge Integration Decay (KID) problem in search-augmented reasoning with LLMs and proposes the SAKE method to mitigate it.
Key Contributions
- Identifies the Knowledge Integration Decay (KID) problem
- Proposes the Self-Anchored Knowledge Encoding (SAKE) method
- Shows experimentally that SAKE effectively mitigates KID and improves performance
Methodology
Proposes SAKE, a training-free method that anchors retrieved knowledge at both the beginning and the end of the reasoning process during inference, preserving the integrity of the knowledge.
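The anchoring idea can be illustrated as a simple prompt-assembly step. This is a minimal sketch under stated assumptions, not the paper's actual implementation; the function and section labels below are hypothetical:

```python
def anchor_knowledge(reasoning_so_far: str, retrieved_docs: list[str]) -> str:
    """Sketch of a SAKE-style anchoring step (hypothetical illustration):
    place the retrieved evidence both before and after the accumulated
    chain of thought, so the evidence is not overshadowed by a long
    prior context when decoding continues."""
    evidence = "\n".join(f"[Doc {i + 1}] {doc}" for i, doc in enumerate(retrieved_docs))
    return (
        f"Retrieved evidence:\n{evidence}\n\n"            # anchor at the beginning
        f"Reasoning so far:\n{reasoning_so_far}\n\n"
        f"Recall the evidence before continuing:\n{evidence}\n"  # anchor at the end
    )

# Example: the same evidence appears once before and once after the reasoning trace.
prompt = anchor_knowledge(
    "Step 1: the question asks which country's capital hosts the museum...",
    ["Paris is the capital of France."],
)
```

The key design point is that the duplicated evidence block at the end sits closest to the next generated tokens, which is where a long intervening reasoning trace would otherwise dilute the retrieved knowledge.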
Original Abstract
Modern Large Language Models (LLMs) have demonstrated remarkable capabilities in complex tasks by employing search-augmented reasoning to incorporate external knowledge into long chains of thought. However, we identify a critical yet underexplored bottleneck in this paradigm, termed Knowledge Integration Decay (KID). Specifically, we observe that as the length of reasoning generated before search grows, models increasingly fail to integrate retrieved evidence into subsequent reasoning steps, limiting performance even when relevant information is available. To address this, we propose Self-Anchored Knowledge Encoding (SAKE), a training-free inference-time strategy designed to stabilize knowledge utilization. By anchoring retrieved knowledge at both the beginning and end of the reasoning process, SAKE prevents it from being overshadowed by prior context, thereby preserving its semantic integrity. Extensive experiments on multi-hop QA and complex reasoning benchmarks demonstrate that SAKE significantly mitigates KID and improves performance, offering a lightweight yet effective solution for knowledge integration in agentic LLMs.