LLM Memory & RAG relevance: 9/10

IndexRAG: Bridging Facts for Cross-Document Reasoning at Index Time

Zhenghua Bao, Yi Shi
arXiv: 2603.16415v1 · Published: 2026-03-17 · Updated: 2026-03-17

AI Summary

IndexRAG improves retrieval-augmented generation for cross-document reasoning by constructing bridging facts offline at index time, requiring no additional training.

Key Contributions

  • Proposes IndexRAG, a novel retrieval-augmented generation method for cross-document reasoning
  • Shifts cross-document reasoning from online inference to offline indexing
  • Generates bridging facts via bridge entities and treats them as independently retrievable units

Methodology

IndexRAG identifies bridge entities shared across documents, generates bridging facts from them, and stores these facts as independent retrieval units that are matched against queries at retrieval time.
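The index-time pipeline described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy `extract_entities` stands in for a real entity linker, and the bridging-fact text is a simple concatenation where the paper would use an LLM to generate it. All function and field names here are hypothetical.

```python
from collections import defaultdict

def extract_entities(doc: str) -> set[str]:
    # Toy stand-in for a real entity linker: treat capitalized
    # tokens as entity mentions.
    return {tok.strip(".,") for tok in doc.split() if tok[0].isupper()}

def build_bridging_index(docs: dict[str, str]) -> list[dict]:
    # Step 1 (offline): map each entity to the documents mentioning it,
    # so entities appearing in >= 2 documents become bridge entities.
    entity_to_docs: dict[str, list[str]] = defaultdict(list)
    for doc_id, text in docs.items():
        for ent in extract_entities(text):
            entity_to_docs[ent].append(doc_id)

    # Step 2 (offline): for each bridge entity, emit a bridging fact
    # as its own retrievable unit. Here the fact text just concatenates
    # the source passages; the paper generates it with an LLM.
    units = []
    for ent, doc_ids in entity_to_docs.items():
        if len(doc_ids) >= 2:
            fact = f"{ent} links: " + " | ".join(docs[d] for d in doc_ids)
            units.append({"bridge_entity": ent,
                          "source_docs": doc_ids,
                          "text": fact})
    return units

docs = {
    "d1": "Alan Turing worked at Bletchley Park.",
    "d2": "Bletchley Park is located in Milton Keynes.",
}
index = build_bridging_index(docs)
# "Bletchley" and "Park" appear in both d1 and d2, so each yields
# one bridging-fact unit spanning the two documents.
```

At query time, these units sit in the same flat index as ordinary passages, which is what allows single-pass retrieval to surface cross-document evidence without graph traversal or iterative reasoning.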

Original Abstract

Multi-hop question answering (QA) requires reasoning across multiple documents, yet existing retrieval-augmented generation (RAG) approaches address this either through graph-based methods requiring additional online processing or iterative multi-step reasoning. We present IndexRAG, a novel approach that shifts cross-document reasoning from online inference to offline indexing. IndexRAG identifies bridge entities shared across documents and generates bridging facts as independently retrievable units, requiring no additional training or fine-tuning. Experiments on three widely-used multi-hop QA benchmarks (HotpotQA, 2WikiMultiHopQA, MuSiQue) show that IndexRAG improves F1 over Naive RAG by 4.6 points on average, while requiring only single-pass retrieval and a single LLM call at inference time. When combined with IRCoT, IndexRAG outperforms all graph-based baselines on average, including HippoRAG and FastGraphRAG, while relying solely on flat retrieval. Our code will be released upon acceptance.

Tags

RAG · Cross-document reasoning · Multi-hop QA · Indexing

arXiv Categories

cs.CL cs.AI cs.IR