Agent Tuning & Optimization 相关度: 8/10

RocqSmith: Can Automatic Optimization Forge Better Proof Agents?

Andrei Kozyrev, Nikita Khramov, Denis Lochmelis, Valerio Morelli, Gleb Solovev, Anton Podkopaev
arXiv: 2602.05762v1 发布: 2026-02-05 更新: 2026-02-05

AI 摘要

研究AI自动优化方法在Rocq定理证明Agent中的应用,评估其优化Agent策略的能力。

主要贡献

  • 评估了不同优化器在Rocq定理证明Agent上的效果
  • 发现few-shot bootstrapping方法效果较好
  • 发现自动优化方法与人工设计的Agent相比仍有差距

方法论

采用实验方法,将不同的自动Agent优化器应用于Rocq证明生成Agent,并评估其性能。

原文摘要

This work studies the applicability of automatic AI agent optimization methods to real-world agents in formal verification settings, focusing on automated theorem proving in Rocq as a representative and challenging domain. We evaluate how different automatic agent optimizers perform when applied to the task of optimizing a Rocq proof-generation agent, and assess whether parts of the fine-grained tuning of agentic systems, such as prompt design, contextual knowledge, and control strategies, can be automated. Our results show that while several optimizers yield measurable improvements, simple few-shot bootstrapping is the most consistently effective; however, none of the studied methods matches the performance of a carefully engineered state-of-the-art proof agent.

标签

自动定理证明 AI Agent优化 Rocq Agent tuning

arXiv 分类

cs.AI cs.LG cs.LO cs.SE