SPD-RAG: Sub-Agent Per Document Retrieval-Augmented Generation
AI Summary
SPD-RAG splits multi-document QA across cooperating agents, improving performance and efficiency while reducing API cost.
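Key Contributions
- Proposes the SPD-RAG framework, which uses multiple agents to handle multi-document QA
- Assigns each document a dedicated agent that focuses only on its own content, improving retrieval precision
- Aggregates agent outputs through a token-bounded synthesis layer, improving scalability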
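Methodology
Builds a hierarchical multi-agent framework: each document is handled by a dedicated agent, a coordinator dispatches tasks and aggregates the results, and a token-bounded synthesis layer merges the partial answers.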
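The division of labor described above can be illustrated with a minimal sketch. It is not the authors' implementation: `DocumentAgent`, `coordinate`, and the word-overlap matching are hypothetical stand-ins for the LLM-backed retrieval and reasoning a real agent would perform, and relevance is decided here by letting every agent answer and dropping those that return nothing.

```python
from dataclasses import dataclass
from typing import List, Optional


def _words(text: str) -> set:
    # Lowercased words with trailing punctuation removed.
    return {w.strip(".,?!").lower() for w in text.split()}


@dataclass
class DocumentAgent:
    """Hypothetical agent that can see only its own document."""
    doc_id: str
    text: str

    def answer(self, question: str) -> Optional[str]:
        # Stand-in for retrieval plus LLM reasoning (an assumption): keep
        # sentences that share at least two words with the question.
        query = _words(question)
        hits = [s.strip() for s in self.text.split(".")
                if s.strip() and len(query & _words(s)) >= 2]
        return f"[{self.doc_id}] " + ". ".join(hits) if hits else None


def coordinate(question: str, agents: List[DocumentAgent]) -> List[str]:
    # The coordinator sends the question to each agent and collects the
    # partial answers; agents with no evidence return None and are dropped.
    partials = [agent.answer(question) for agent in agents]
    return [p for p in partials if p is not None]


agents = [
    DocumentAgent("a", "Acme revenue in 2023 was 5M. The CEO is Jo."),
    DocumentAgent("b", "Weather was mild. Cats sleep a lot."),
]
partial_answers = coordinate("What was the revenue in 2023?", agents)
```

Because each agent holds exactly one document, adding a document only adds an agent; the coordinator logic stays unchanged, which is the modularity the summary refers to.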
Original Abstract
Answering complex, real-world queries often requires synthesizing facts scattered across vast document corpora. In these settings, standard retrieval-augmented generation (RAG) pipelines suffer from incomplete evidence coverage, while long-context large language models (LLMs) struggle to reason reliably over massive inputs. We introduce SPD-RAG, a hierarchical multi-agent framework for exhaustive cross-document question answering that decomposes the problem along the document axis. Each document is processed by a dedicated document-level agent operating only on its own content, enabling focused retrieval, while a coordinator dispatches tasks to relevant agents and aggregates their partial answers. Agent outputs are synthesized by merging partial answers through a token-bounded synthesis layer (which supports recursive map-reduce for massive corpora). This document-level specialization with centralized fusion improves scalability and answer quality in heterogeneous multi-document settings while yielding a modular, extensible retrieval pipeline. On the LOONG benchmark (EMNLP 2024) for long-context multi-document QA, SPD-RAG achieves an Avg Score of 58.1 (GPT-5 evaluation), outperforming Normal RAG (33.0) and Agentic RAG (32.8) while using only 38% of the API cost of a full-context baseline (68.0).
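The abstract does not spell out how the token-bounded synthesis layer works, so the following is only one plausible reading of "recursive map-reduce": pack partial answers into batches that fit a token budget, merge each batch, and repeat on the merged outputs until one answer remains. The names `synthesize`, `count_tokens`, and `merge` are hypothetical, whitespace splitting is an assumed approximation of a real tokenizer, and `merge` stands in for an LLM call.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate model tokens.
    return len(text.split())


def synthesize(partials: List[str], budget: int,
               merge: Callable[[List[str]], str]) -> str:
    """Recursively merge partial answers, keeping each merge input within budget."""
    if not partials:
        return ""
    if len(partials) == 1:
        return partials[0]

    # Map step: greedily pack partial answers into batches of <= budget tokens.
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for p in partials:
        n = count_tokens(p)
        if current and used + n > budget:
            batches.append(current)
            current, used = [], 0
        current.append(p)
        used += n
    batches.append(current)

    # Reduce step: merge each batch, then recurse on the merged outputs.
    merged = [merge(batch) for batch in batches]
    if len(merged) >= len(partials):
        # No batch held two items, so another round would not shrink the input.
        raise ValueError("token budget too small to combine any two answers")
    return synthesize(merged, budget, merge)


def keep_first_words(batch: List[str]) -> str:
    # Toy merge function: keep only the first word of each partial answer.
    return " ".join(p.split()[0] for p in batch)


partials = ["a1 x x", "b1 y y", "c1 z z", "d1 w w"]
final = synthesize(partials, budget=6, merge=keep_first_words)
```

With a budget of 6, the four three-token answers are merged in two pairs, and the two merged results then fit in a single batch for the final pass. The number of items shrinks every round, so the recursion terminates regardless of corpus size.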