Agent Tuning & Optimization Relevance: 9/10

DARWIN: Dynamic Agentically Rewriting Self-Improving Network

Henry Jiang
arXiv: 2602.05848v1 Published: 2026-02-05 Updated: 2026-02-05

AI Summary

DARWIN uses a genetic algorithm to optimize GPT models, enabling self-improvement and boosting model performance.

Key Contributions

  • Proposes DARWIN, a genetic-algorithm-based framework for optimizing GPT models
  • Uses GPT agents to modify the training code of other agents
  • Experimentally validates the method's effectiveness at improving MFU and perplexity

Methodology

Within a genetic-algorithm-like structure, multiple GPT agents are trained independently and modify one another's training code; the agents are then evaluated on performance and selected for the next generation.
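The loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `train`, `benchmark`, and `llm_propose_patch` are hypothetical callables standing in for nanoGPT training, evaluation, and the OpenAI-API prompting step, respectively.

```python
import random

def evolve(agents, train, benchmark, llm_propose_patch, iterations=5, k=2):
    """DARWIN-style evolutionary loop (sketch).

    agents: list of dicts, each holding a "train_code" string.
    train / benchmark / llm_propose_patch: hypothetical helpers supplied
    by the caller (assumptions, not the paper's actual API).
    """
    for _ in range(iterations):
        # Mutation: each agent's training code is rewritten by a peer agent.
        mutated = []
        for agent in agents:
            peer = random.choice([a for a in agents if a is not agent])
            new_code = llm_propose_patch(peer, agent["train_code"])
            mutated.append({"train_code": new_code})

        # Train and evaluate every candidate (parents and mutants).
        pool = agents + mutated
        for cand in pool:
            train(cand)
            cand["score"] = benchmark(cand)  # e.g. MFU, or negative perplexity

        # Selection: keep the top-k candidates for the next generation.
        agents = sorted(pool, key=lambda c: c["score"], reverse=True)[:k]
    return agents
```

In the paper's setup the selection criterion combines benchmarked model quality and efficiency; here a single scalar `score` stands in for that choice.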

Original Abstract

DARWIN is an evolutionary GPT model, utilizing a genetic-algorithm-like optimization structure with several independent GPT agents being trained individually using unique training code. Each iteration, the GPT models are prompted to modify the training code of one another in an attempt to improve their performance in a mutation-like manner, and the best GPT agents are then benchmarked and selected for the next iteration by genetic algorithm. For demonstration purposes and due to budget and time constraints, the OpenAI API is used to prompt training code improvements and the nanoGPT framework is used as the training code. DARWIN also utilizes persistent JSON-based memory files to track previous reasoning and changes to code, correlating them with improvements to model performance, and a bidirectional interface for HITL intervention allowing the model to request upgrades such as additional datasets, training scripts, and restructuring of file hierarchies. In experiments, DARWIN achieved a 1.26 percent improvement in model FLOPS utilization (MFU) and a 2.07 percent improvement to perplexity in 5 iterations of training over baseline configurations, demonstrating promising capabilities as a foundation for scaling evolutionary GPT training.
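The persistent JSON-based memory the abstract mentions could look something like the sketch below: an append-only log tying each iteration's reasoning and code change to the resulting metrics, so later prompts can correlate edits with performance. The file name, schema, and helper names are all assumptions for illustration, not the paper's actual format.

```python
import json
import os

MEMORY_PATH = "darwin_memory.json"  # hypothetical filename

def load_memory(path=MEMORY_PATH):
    """Load the persistent reasoning/change log, or start a fresh one."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"iterations": []}

def record_iteration(memory, reasoning, code_diff, metrics, path=MEMORY_PATH):
    """Append one iteration's reasoning, code change, and metrics,
    then write the log back to disk (assumed schema)."""
    memory["iterations"].append(
        {"reasoning": reasoning, "code_diff": code_diff, "metrics": metrics}
    )
    with open(path, "w") as f:
        json.dump(memory, f, indent=2)
```

A log like this can be serialized back into the next round of prompts so the LLM sees which past edits helped or hurt MFU and perplexity.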

Tags

Self-Improvement Genetic Algorithms GPT Agents

arXiv Categories

cs.NE cs.AI cs.CL