AI Agents Relevance: 9/10

The Kitchen Loop: User-Spec-Driven Development for a Self-Evolving Codebase

Yannick Roy
arXiv: 2603.25697v1 Published: 2026-03-26 Updated: 2026-03-26

AI Summary

Proposes the Kitchen Loop framework, which enables an autonomously evolving codebase driven by user-requirement specifications.

Main Contributions

  • The Kitchen Loop framework
  • A unified trust model
  • Validation on production systems

Methodology

Uses an LLM as a simulated user, verifies changes with Unbeatable Tests, and measures quality continuously through Drift Control to achieve autonomous evolution.
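
As a rough illustration of how the pieces described here might fit together, the sketch below wires a simulated user, a ground-truth check, and a quality gate into one loop. It is not the paper's implementation: `DriftGate`, `run_loop`, the `max_drop` tolerance, and all callback names are hypothetical, and the LLM agent and test oracle are stand-in callables.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class DriftGate:
    """Hypothetical pause gate: blocks the loop when quality drops too far."""
    max_drop: float = 0.05  # assumed tolerance, not a value from the paper
    history: List[float] = field(default_factory=list)

    def allows(self, score: float) -> bool:
        # Compare the new score with the best one seen so far.
        best = max(self.history, default=score)
        self.history.append(score)
        return best - score <= self.max_drop


def run_loop(
    spec_surface: List[str],
    act_as_user: Callable[[str], str],           # stand-in for the LLM user
    ground_truth_check: Callable[[str], bool],   # stand-in for verification
    measure_quality: Callable[[List[str]], float],
    gate: DriftGate,
    max_iterations: int,
) -> Tuple[List[str], str]:
    merged: List[str] = []
    for i in range(max_iterations):
        # Cycle through the scenarios the product claims to support.
        scenario = spec_surface[i % len(spec_surface)]
        change = act_as_user(scenario)
        if not ground_truth_check(change):
            continue  # a change that fails verification is never merged
        score = measure_quality(merged + [change])
        if not gate.allows(score):
            return merged, "paused"  # stop and wait for intervention
        merged.append(change)
    return merged, "completed"
```

In this sketch the gate compares each new quality score with the best score seen so far, so a gradual decline across several iterations still trips the pause rather than being absorbed one small step at a time.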

Original Abstract

Code production is now a commodity; the bottleneck is knowing what to build and proving it works. We present the Kitchen Loop, a framework for autonomous, self-evolving software built on a unified trust model: (1) a specification surface enumerating what the product claims to support; (2) 'As a User x 1000', where an LLM agent exercises that surface as a synthetic power user at 1,000x human cadence; (3) Unbeatable Tests, ground-truth verification the code author cannot fake; and (4) Drift Control, continuous quality measurement with automated pause gates. We validate across two production systems over 285+ iterations, producing 1,094+ merged pull requests with zero regressions detected by the regression oracle (methodology in Section 6.1). We observe emergent properties at scale: multi-iteration self-correction chains, autonomous infrastructure healing, and monotonically improving quality gates. The primitives are not new; our contribution is their composition into a production-tested system with the operational discipline that makes long-running autonomous evolution safe.

Tags

Autonomous evolution, LLM, Code generation, Automated testing

arXiv Categories

cs.SE cs.AI