AI Agents Relevance: 9/10

Courtroom-Style Multi-Agent Debate with Progressive RAG and Role-Switching for Controversial Claim Verification

Masnun Nuha Chowdhury, Nusrat Jahan Beg, Umme Hunny Khan, Syed Rifat Raiyan, Md Kamrul Hasan, Hasan Mahmud
arXiv: 2603.28488v1 Published: 2026-03-30 Updated: 2026-03-30

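AI Summary

Proposes the PROClaim framework, which uses simulated courtroom debate and progressive RAG to improve the accuracy and reliability of LLMs in verifying controversial claims.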

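Key Contributions

  • Proposes PROClaim, a courtroom-style multi-agent debate framework
  • Introduces Progressive RAG (P-RAG) to dynamically expand and refine the evidence pool
  • Uses evidence negotiation, self-reflection, and heterogeneous multi-judge aggregation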
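The multi-judge aggregation step in the list above can be illustrated with a minimal sketch. This summary does not state the paper's actual aggregation rule, so plain majority voting is an assumption here, and the function name `aggregate_verdicts`, the label strings, and the tie-breaking fallback are all hypothetical.

```python
from collections import Counter


def aggregate_verdicts(verdicts, tie_label="NOT ENOUGH INFO"):
    """Combine verdict labels from several judges by majority vote.

    Assumption: majority voting stands in for the paper's aggregation
    rule, which this summary does not specify. Labels are hypothetical.
    """
    if not verdicts:
        raise ValueError("need at least one verdict")
    ranked = Counter(verdicts).most_common()
    top_label, top_count = ranked[0]
    # If two labels share the highest count, fall back to a neutral label.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return tie_label
    return top_label


# Three hypothetical judges backed by different models.
print(aggregate_verdicts(["SUPPORT", "REFUTE", "SUPPORT"]))
```

Using judges backed by different models means a bias shared by one model family is less likely to dominate the vote, which is the intuition behind the "heterogeneous" qualifier.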

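Methodology

Builds a simulated courtroom environment that uses role-playing, progressive information retrieval, and multi-agent debate to strengthen LLM reasoning.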
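To make the contrast with one-pass retrieval concrete, here is a minimal sketch of progressive retrieval: the evidence pool starts from the claim alone and grows as each debate round contributes a new argument to the query. The word-overlap retriever, the function names `retrieve` and `progressive_rag`, and the toy corpus are all assumptions for illustration, not the paper's implementation.

```python
def retrieve(query, corpus, k=2):
    """Toy retriever (assumption): rank documents by word overlap with the query."""
    q = set(query.lower().split())

    def overlap(doc):
        return len(q & set(doc.lower().split()))

    ranked = sorted(corpus, key=overlap, reverse=True)
    # Keep only the top-k documents that share at least one word with the query.
    return [doc for doc in ranked[:k] if overlap(doc) > 0]


def progressive_rag(claim, corpus, round_arguments, k=2):
    """Grow the evidence pool round by round instead of retrieving once."""
    pool = retrieve(claim, corpus, k)  # round 0: equivalent to one-pass retrieval
    for argument in round_arguments:
        # Each round's argument refines the query for the next retrieval.
        query = f"{claim} {argument}"
        for doc in retrieve(query, corpus, k):
            if doc not in pool:  # add only evidence that is new to the pool
                pool.append(doc)
    return pool


corpus = [
    "masks reduce transmission of virus",
    "vaccines lower hospitalization rates",
    "weather in paris is mild",
]
pool = progressive_rag(
    "masks reduce transmission",
    corpus,
    ["hospitalization rates fell after vaccines"],
)
print(pool)
```

In this toy run the second document only enters the pool after the debate argument mentions hospitalization, which is the behavior a single up-front retrieval would miss.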

Original Abstract

Large language models (LLMs) remain unreliable for high-stakes claim verification due to hallucinations and shallow reasoning. While retrieval-augmented generation (RAG) and multi-agent debate (MAD) address this, they are limited by one-pass retrieval and unstructured debate dynamics. We propose a courtroom-style multi-agent framework, PROClaim, that reformulates verification as a structured, adversarial deliberation. Our approach integrates specialized roles (e.g., Plaintiff, Defense, Judge) with Progressive RAG (P-RAG) to dynamically expand and refine the evidence pool during the debate. Furthermore, we employ evidence negotiation, self-reflection, and heterogeneous multi-judge aggregation to enforce calibration, robustness, and diversity. In zero-shot evaluations on the Check-COVID benchmark, PROClaim achieves 81.7% accuracy, outperforming standard multi-agent debate by 10.0 percentage points, with P-RAG driving the primary performance gains (+7.5 pp). We ultimately demonstrate that structural deliberation and model heterogeneity effectively mitigate systematic biases, providing a robust foundation for reliable claim verification. Our code and data are publicly available at https://github.com/mnc13/PROClaim.

Tags

Multi-Agent RAG Trustworthy-AI Claim-Verification

arXiv Categories

cs.CL cs.AI cs.MA