AI Agents relevance: 9/10

Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?

Thibaud Gloaguen, Niels Mündler, Mark Müller, Veselin Raychev, Martin Vechev
arXiv: 2602.11988v1 Published: 2026-02-12 Updated: 2026-02-12

AI Summary

The study finds that repository-level context files (such as AGENTS.md) actually reduce coding agents' task success rates while increasing inference cost.

Main Contributions

  • First systematic evaluation of how repository-level context files affect coding agent performance
  • Found that both LLM-generated and developer-provided context files reduce task success rates
  • Found that context files broaden agents' exploration but also increase inference cost

Methodology

Context-file effectiveness is evaluated by comparing coding agent performance with and without context files, on both SWE-bench tasks and issues from real-world repositories.

Original Abstract

A widespread practice in software development is to tailor coding agents to repositories using context files, such as AGENTS.md, by either manually or automatically generating them. Although this practice is strongly encouraged by agent developers, there is currently no rigorous investigation into whether such context files are actually effective for real-world tasks. In this work, we study this question and evaluate coding agents' task completion performance in two complementary settings: established SWE-bench tasks from popular repositories, with LLM-generated context files following agent-developer recommendations, and a novel collection of issues from repositories containing developer-committed context files. Across multiple coding agents and LLMs, we find that context files tend to reduce task success rates compared to providing no repository context, while also increasing inference cost by over 20%. Behaviorally, both LLM-generated and developer-provided context files encourage broader exploration (e.g., more thorough testing and file traversal), and coding agents tend to respect their instructions. Ultimately, we conclude that unnecessary requirements from context files make tasks harder, and human-written context files should describe only minimal requirements.

Tags

AI Agents Code Generation Contextual Information Evaluation

arXiv Categories

cs.SE cs.AI