AI Agents Relevance: 9/10

The Specification Gap: Coordination Failure Under Partial Knowledge in Code Agents

Camilo Chacón Sartori
arXiv: 2603.24284v1 Published: 2026-03-25 Updated: 2026-03-25

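AI Summary

Studies the specification gap that arises when code agents collaborate on the same code with incomplete information, and shows why specification completeness matters.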

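Key Contributions

  • Reveals the specification gap in multi-agent code generation and its impact
  • Proposes an AST-based conflict detector
  • Analyzes how coordination cost and information asymmetry affect performance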

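Methodology

Varies the level of detail in the code specification, compares single-agent and multi-agent code-generation accuracy, and runs conflict-detection and recovery experiments.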

Original Abstract

When multiple LLM-based code agents independently implement parts of the same class, they must agree on shared internal representations, even when the specification leaves those choices implicit. We study this coordination problem across 51 class-generation tasks, progressively stripping specification detail from full docstrings (L0) to bare signatures (L3), and introducing opposing structural biases (lists vs. dictionaries) to stress-test integration. Three findings emerge. First, a persistent specification gap: two-agent integration accuracy drops from 58% to 25% as detail is removed, while a single-agent baseline degrades more gracefully (89% to 56%), leaving a 25–39 pp coordination gap that is consistent across two Claude models (Sonnet, Haiku) and three independent runs. Second, an AST-based conflict detector achieves 97% precision at the weakest specification level without additional LLM calls, yet a factorial recovery experiment shows that restoring the full specification alone recovers the single-agent ceiling (89%), while providing conflict reports adds no measurable benefit. Third, decomposing the gap into coordination cost (+16 pp) and information asymmetry (+11 pp) suggests that the two effects are independent and approximately additive. The gap is not merely a consequence of hidden information, but reflects the difficulty of producing compatible code without shared decisions. These results support a specification-first view of multi-agent code generation: richer specifications are both the primary coordination mechanism and the sufficient recovery instrument.
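
The abstract does not describe how the AST-based conflict detector works, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes that a "conflict" means two agents assign different literal container kinds (list vs. dictionary) to the same `self.<attr>`. The names `infer_attr_kinds`, `find_conflicts`, and the `Inventory` fragments are hypothetical and chosen for illustration.

```python
import ast

# Hypothetical fragments: two agents implement different methods of one class
# and silently pick different representations for `self.items`.
AGENT_A = '''
class Inventory:
    def __init__(self):
        self.items = []
        self.count = 0
'''

AGENT_B = '''
class Inventory:
    def reset(self):
        self.items = {}
'''


def _kind_of(value):
    """Return 'list' or 'dict' for an obvious container expression, else None."""
    if isinstance(value, (ast.List, ast.ListComp)):
        return "list"
    if isinstance(value, (ast.Dict, ast.DictComp)):
        return "dict"
    # Also recognise the constructor calls list(...) and dict(...).
    if (isinstance(value, ast.Call) and isinstance(value.func, ast.Name)
            and value.func.id in ("list", "dict")):
        return value.func.id
    return None


def infer_attr_kinds(source):
    """Map each `self.<attr>` to the set of container kinds assigned to it."""
    kinds = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            targets, value = node.targets, node.value
        elif isinstance(node, ast.AnnAssign) and node.value is not None:
            targets, value = [node.target], node.value
        else:
            continue
        kind = _kind_of(value)
        if kind is None:
            continue  # not a recognisable container (e.g. `self.count = 0`)
        for target in targets:
            if (isinstance(target, ast.Attribute)
                    and isinstance(target.value, ast.Name)
                    and target.value.id == "self"):
                kinds.setdefault(target.attr, set()).add(kind)
    return kinds


def find_conflicts(source_a, source_b):
    """Report attributes whose inferred container kinds differ between agents."""
    a, b = infer_attr_kinds(source_a), infer_attr_kinds(source_b)
    return {
        attr: (sorted(a[attr]), sorted(b[attr]))
        for attr in sorted(a.keys() & b.keys())
        if a[attr] != b[attr]
    }


if __name__ == "__main__":
    print(find_conflicts(AGENT_A, AGENT_B))
```

A purely syntactic check like this needs no additional LLM calls, which matches the property the abstract highlights. It only sees literal assignments, though, so it would miss representations chosen indirectly (for example through a helper function or a `defaultdict`).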

Tags

code agents, collaborative development, specification gap, conflict detection

arXiv Categories

cs.SE cs.AI cs.MA