AI Agents Relevance: 9/10

CoE: Collaborative Entropy for Uncertainty Quantification in Agentic Multi-LLM Systems

Kangkang Sun, Jun Wu, Jianhua Li, Minyi Guo, Xiuzhen Che, Jianwei Huang
arXiv: 2603.28360v1 Published: 2026-03-30 Updated: 2026-03-30

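AI Summary

Proposes Collaborative Entropy (CoE), a metric for quantifying uncertainty in multi-LLM systems that improves system-level estimation of semantic uncertainty.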

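Main Contributions

  • Proposes CoE, a metric for semantic uncertainty in multi-LLM collaboration
  • CoE combines intra-model semantic entropy with inter-model divergence to assess system-level uncertainty
  • CoE can guide training-free post-hoc coordination, improving the reliability of multi-LLM systems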

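Methodology

CoE is defined on a shared semantic cluster space. It quantifies the uncertainty of a multi-LLM system by combining each model's intra-model semantic entropy with the divergence of each model from the ensemble mean.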
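
To make the two components concrete, here is a minimal Python sketch. It is not the paper's exact formula, which this summary does not state. It assumes each model's answers have already been mapped to a probability distribution over the same semantic clusters, uses Shannon entropy for the intra-model term, uses KL divergence to the unweighted ensemble mean for the inter-model term, and combines them with a hypothetical weight `lam`. The function name `collaborative_entropy_sketch` is likewise illustrative.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p(x) == 0 contribute nothing."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)


def collaborative_entropy_sketch(dists, lam=1.0):
    """Illustrative CoE-style score (assumed form, not the paper's definition).

    dists: one distribution per model, all over the same semantic clusters.
    lam:   hypothetical weight on the inter-model divergence term.
    """
    k = len(dists)
    # Unweighted ensemble mean over the shared cluster space.
    mean = [sum(col) / k for col in zip(*dists)]
    # Average intra-model semantic entropy.
    intra = sum(entropy(p) for p in dists) / k
    # Average divergence of each model from the ensemble mean.
    inter = sum(kl_divergence(p, mean) for p in dists) / k
    return intra + lam * inter


# Perfect semantic consensus: every model puts all mass on the same cluster.
consensus = collaborative_entropy_sketch([[1.0, 0.0], [1.0, 0.0]])

# Confident disagreement: each model is a delta on a different cluster.
disagreement = collaborative_entropy_sketch([[1.0, 0.0], [0.0, 1.0]])
```

Under these assumptions the consensus case scores zero, while the disagreement case keeps a positive score even though each model's own entropy is zero. This mirrors the abstract's point that reducing per-model uncertainty alone does not remove inter-model disagreement.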

Original Abstract

Uncertainty estimation in multi-LLM systems remains largely single-model-centric: existing methods quantify uncertainty within each model but do not adequately capture semantic disagreement across models. To address this gap, we propose Collaborative Entropy (CoE), a unified information-theoretic metric for semantic uncertainty in multi-LLM collaboration. CoE is defined on a shared semantic cluster space and combines two components: intra-model semantic entropy and inter-model divergence to the ensemble mean. CoE is not a weighted ensemble predictor; it is a system-level uncertainty measure that characterizes collaborative confidence and disagreement. We analyze several core properties of CoE, including non-negativity, zero-value certainty under perfect semantic consensus, and the behavior of CoE when individual models collapse to delta distributions. These results clarify when reducing per-model uncertainty is sufficient and when residual inter-model disagreement remains. We also present a simple CoE-guided, training-free post-hoc coordination heuristic as a practical application of the metric. Experiments on TriviaQA and SQuAD with LLaMA-3.1-8B-Instruct, Qwen-2.5-7B-Instruct, and Mistral-7B-Instruct show that CoE provides stronger uncertainty estimation than standard entropy- and divergence-based baselines, with gains becoming larger as additional heterogeneous models are introduced. Overall, CoE offers a useful uncertainty-aware perspective on multi-LLM collaboration.

Tags

multi-LLM, uncertainty quantification, information theory, semantic disagreement

arXiv Categories

cs.AI