Bilevel Autoresearch: Meta-Autoresearching Itself
AI Summary
Proposes the Bilevel Autoresearch framework, which meta-optimizes the inner loop's search mechanisms and significantly improves LLM pretraining results.
Key Contributions
- Proposes the Bilevel Autoresearch framework
- Improves the LLM's autoresearch capability by meta-optimizing the inner loop's search mechanisms
- Autonomously discovers mechanisms from combinatorial optimization, multi-armed bandits, and related domains, strengthening the LLM's exploration (see the bandit sketch after this list)
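To make the bandit contribution concrete, below is a minimal sketch of how a discovered multi-armed-bandit mechanism could allocate trials across competing search strategies. The strategy names, reward model, and UCB1 rule here are illustrative assumptions, not code or mechanisms taken from the paper.

```python
# Hypothetical sketch: a UCB1 bandit choosing among search strategies.
# STRATEGIES and the reward distributions below are invented for illustration.
import math
import random

STRATEGIES = ["local_perturb", "random_restart", "crossover"]  # assumed arms

counts = {s: 0 for s in STRATEGIES}
rewards = {s: 0.0 for s in STRATEGIES}

def pick_strategy(t):
    # Pull each arm once, then balance exploitation and exploration via UCB1.
    for s in STRATEGIES:
        if counts[s] == 0:
            return s
    return max(STRATEGIES,
               key=lambda s: rewards[s] / counts[s]
                             + math.sqrt(2 * math.log(t) / counts[s]))

def update(strategy, improvement):
    counts[strategy] += 1
    rewards[strategy] += improvement  # e.g., reduction in val_bpb

# Toy run: pretend random_restart yields the largest average improvement.
for t in range(1, 101):
    s = pick_strategy(t)
    mean = {"local_perturb": 0.2, "random_restart": 0.5, "crossover": 0.3}[s]
    update(s, random.gauss(mean, 0.1))
print(counts)  # random_restart should accumulate the most pulls
```

The point of such a mechanism is exactly what the paper claims: it forces trials of strategies the LLM's greedy priors would otherwise never pick.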
Methodology
Uses a bilevel loop: the outer loop generates new search mechanisms and injects them into the inner loop as code at runtime. The inner loop optimizes the task, while the outer loop optimizes how the inner loop searches.
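The sketch below shows one plausible shape for this generate-inject-evaluate cycle, assuming a toy objective and a stubbed LLM call. Every name here (DEFAULT_MECHANISM, llm_generate_mechanism, the synthetic val_bpb) is a hypothetical stand-in, not code from the paper.

```python
"""Minimal, hypothetical sketch of the bilevel loop; all names are illustrative."""
import random
import types

# Baseline search mechanism, stored as Python source so the outer loop can
# rewrite it. The inner loop calls its propose(history) to get candidates.
DEFAULT_MECHANISM = """
import random
def propose(history):
    # Greedy: perturb the best configuration seen so far.
    best = min(history, key=lambda h: h[1])[0] if history else 0.01
    return best * random.uniform(0.9, 1.1)
"""

def llm_generate_mechanism(current_src, current_score):
    """Stand-in for the LLM call that writes a new mechanism as Python
    source; here it is hard-coded to an epsilon-greedy variant."""
    return """
import random
def propose(history):
    if not history or random.random() < 0.3:  # forced exploration
        return 10 ** random.uniform(-4, -1)
    best = min(history, key=lambda h: h[1])[0]
    return best * random.uniform(0.9, 1.1)
"""

def inner_loop(propose, steps=50):
    """Toy inner loop: a synthetic 'val_bpb' minimized at lr = 0.003."""
    history = []
    for _ in range(steps):
        lr = propose(history)
        history.append((lr, (lr - 0.003) ** 2))
    return min(score for _, score in history)

def run_mechanism(src):
    module = types.ModuleType("mechanism")  # inject generated code at runtime
    exec(src, module.__dict__)
    return inner_loop(module.propose)

def bilevel(outer_steps=5):
    mechanism_src = DEFAULT_MECHANISM
    best = run_mechanism(mechanism_src)  # score the baseline first
    for _ in range(outer_steps):
        # Outer loop: propose a new mechanism; keep it only if it improves
        # the inner loop's measurable objective.
        candidate_src = llm_generate_mechanism(mechanism_src, best)
        score = run_mechanism(candidate_src)
        if score < best:
            best, mechanism_src = score, candidate_src
    return best

if __name__ == "__main__":
    print(f"best synthetic val_bpb: {bilevel():.8f}")
```

The key design choice this illustrates is that mechanisms live as source text, so the same LLM that runs the inner search can also rewrite how that search works, with no stronger model at the meta level.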
Original Abstract
If autoresearch is itself a form of research, then autoresearch can be applied to research itself. We take this idea literally: we use an autoresearch loop to optimize the autoresearch loop. Every existing autoresearch system -- from Karpathy's single-track loop to AutoResearchClaw's multi-batch extension and EvoScientist's persistent memory -- was improved by a human who read the code, identified a bottleneck, and wrote new code. We ask whether an LLM can do the same, autonomously. We present Bilevel Autoresearch, a bilevel framework where an outer loop meta-optimizes the inner autoresearch loop by generating and injecting new search mechanisms as Python code at runtime. The inner loop optimizes the task; the outer loop optimizes how the inner loop searches. Both loops use the same LLM -- no stronger model is needed at the meta level. On Karpathy's GPT pretraining benchmark, the meta-autoresearch outer loop achieves a 5x improvement over the standard inner loop alone (-0.045 vs. -0.009 val_bpb), while parameter-level adjustment without mechanism change yields no reliable gain. The outer loop autonomously discovers mechanisms from combinatorial optimization, multi-armed bandits, and design of experiments -- without human specification of which domains to explore. These mechanisms succeed by breaking the inner loop's deterministic search patterns, forcing exploration of directions the LLM's priors systematically avoid. The core principle is simple: if autoresearch can meta-autoresearch itself, it can, in principle, meta-autoresearch anything with a measurable objective.
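A note on the headline number: if the figures in parentheses are changes in validation bits per byte, the outer loop's improvement of 0.045 divided by the inner loop's 0.009 gives exactly the stated 5x factor.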