AI Agents 相关度: 10/10

I Can't Believe It's Corrupt: Evaluating Corruption in Multi-Agent Governance Systems

Vedanta S P, Ponnurangam Kumaraguru

arXiv: 2603.18894v1 发布: 2026-03-19 更新: 2026-03-19

下载 PDF arXiv 页面

AI 摘要

研究了LLM在多智能体治理系统中腐败问题，强调制度设计的重要性。

主要贡献

评估了LLM在多智能体治理中的腐败现象
发现治理结构比模型本身更能影响腐败结果
强调了制度设计和安全保障的重要性

方法论

通过模拟多智能体治理系统，评估LLM在不同权力结构下的规则遵守情况，并进行人工评估。

原文摘要

Large language models are increasingly proposed as autonomous agents for high-stakes public workflows, yet we lack systematic evidence about whether they would follow institutional rules when granted authority. We present evidence that integrity in institutional AI should be treated as a pre-deployment requirement rather than a post-deployment assumption. We evaluate multi-agent governance simulations in which agents occupy formal governmental roles under different authority structures, and we score rule-breaking and abuse outcomes with an independent rubric-based judge across 28,112 transcript segments. While we advance this position, the core contribution is empirical: among models operating below saturation, governance structure is a stronger driver of corruption-related outcomes than model identity, with large differences across regimes and model--governance pairings. Lightweight safeguards can reduce risk in some settings but do not consistently prevent severe failures. These results imply that institutional design is a precondition for safe delegation: before real authority is assigned to LLM agents, systems should undergo stress testing under governance-like constraints with enforceable rules, auditable logs, and human oversight on high-impact actions.

arXiv 分类

cs.AI cs.MA

AI 摘要

主要贡献

方法论

原文摘要

标签

arXiv 分类