Agent Tuning & Optimization Relevance: 6/10

Understanding the Curse of Unrolling

Sheheryar Mehmood, Florian Knoll, Peter Ochs
arXiv: 2602.19733v1 Published: 2026-02-23 Updated: 2026-02-23

AI Summary

This paper analyzes the "curse of unrolling" — the phenomenon in which the derivative iterates of algorithm unrolling initially diverge from the true Jacobian — and proposes strategies to mitigate it.

Key Contributions

  • Explains the origin of the curse of unrolling and the algorithmic factors that govern it
  • Proposes truncating early iterations of the derivative computation to mitigate the curse while reducing memory requirements
  • Shows that warm-starting in bilevel optimization induces an implicit form of truncation

Methodology

A non-asymptotic analysis, combining theoretical derivations with validating numerical experiments.
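As a sketch of the kind of recursion such an analysis studies (notation assumed here, not taken from the paper): for an inner gradient-descent update, differentiating through the iteration yields a linear time-varying recursion for the Jacobian iterates.

```latex
% Inner update and its forward-mode derivative w.r.t. theta:
x_{k+1} = x_k - \alpha \nabla_x f(x_k, \theta)
\quad\Longrightarrow\quad
J_{k+1} = \bigl(I - \alpha \nabla^2_{xx} f(x_k, \theta)\bigr) J_k
          - \alpha \nabla^2_{x\theta} f(x_k, \theta),
\qquad J_k = \frac{\partial x_k}{\partial \theta}.
```

Away from the fixed point, the Hessian terms differ from their limiting values, which is one way the derivative iterates can transiently diverge before converging.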

Original Abstract

Algorithm unrolling is ubiquitous in machine learning, particularly in hyperparameter optimization and meta-learning, where Jacobians of solution mappings are computed by differentiating through iterative algorithms. Although unrolling is known to yield asymptotically correct Jacobians under suitable conditions, recent work has shown that the derivative iterates may initially diverge from the true Jacobian, a phenomenon known as the curse of unrolling. In this work, we provide a non-asymptotic analysis that explains the origin of this behavior and identifies the algorithmic factors that govern it. We show that truncating early iterations of the derivative computation mitigates the curse while simultaneously reducing memory requirements. Finally, we demonstrate that warm-starting in bilevel optimization naturally induces an implicit form of truncation, providing a practical remedy. Our theoretical findings are supported by numerical experiments on representative examples.
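To make truncation concrete, here is a minimal NumPy sketch of unrolling on a toy quadratic (the example problem and all names are assumptions for illustration, not the paper's setup, and it does not exhibit the curse itself): forward-mode derivatives of the gradient-descent iterates are propagated only after a chosen iteration, so the early steps contribute neither derivative computation nor memory cost.

```python
import numpy as np

# Toy inner problem (assumed example): minimize f(x; theta) = 0.5 x^T A x - theta b^T x.
# Its solution mapping is x*(theta) = theta * A^{-1} b, with Jacobian dx*/dtheta = A^{-1} b.
A = np.diag([1.0, 100.0])        # ill-conditioned quadratic
b = np.array([1.0, 1.0])
theta = 2.0
alpha = 1.0 / 100.0              # step size 1/L for gradient descent

J_true = np.linalg.solve(A, b)   # exact Jacobian of the solution mapping

def unrolled_jacobian(num_steps, truncate=0):
    """Forward-mode derivative of unrolled gradient descent w.r.t. theta.

    Derivative propagation starts only at iteration `truncate`; earlier
    iterates are treated as constants (truncated unrolling).
    """
    x = np.zeros(2)
    J = np.zeros(2)              # dx_k / dtheta
    for k in range(num_steps):
        x = x - alpha * (A @ x - theta * b)   # gradient step on f
        if k >= truncate:
            # differentiate the update: J_{k+1} = (I - alpha*A) J_k + alpha*b
            J = (np.eye(2) - alpha * A) @ J + alpha * b
    return J

err_full = np.linalg.norm(unrolled_jacobian(500) - J_true)
err_trunc = np.linalg.norm(unrolled_jacobian(500, truncate=100) - J_true)
```

Both variants converge to the true Jacobian; the truncated run simply skips derivative propagation through the first 100 iterations, mirroring the implicit truncation that warm-starting provides in the bilevel setting.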

Tags

Algorithm Unrolling · Jacobians · Bilevel Optimization

arXiv Categories

cs.LG math.OC