LLM Reasoning Relevance: 9/10

From Growing to Looping: A Unified View of Iterative Computation in LLMs

Ferdinand Kapl, Emmanouil Angelis, Kaitlin Maile, Johannes von Oswald, Stefan Bauer

arXiv: 2602.16490v1 Published: 2026-02-18 Updated: 2026-02-18

AI Summary

The paper unifies two approaches to iterative computation in LLMs, looping and depth growing, and shows that they are complementary.

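Main Contributions

Proposes a unified view of looped and depth-grown models
Shows that looped and depth-grown models exhibit convergent depth-wise signatures
Demonstrates that the two techniques are adaptable and composable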

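Methodology

Analyzes the internal mechanisms of looped and depth-grown models experimentally, and evaluates their performance at inference time and under fine-tuning.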

Original Abstract

Looping, reusing a block of layers across depth, and depth growing, training shallow-to-deep models by duplicating middle layers, have both been linked to stronger reasoning, but their relationship remains unclear. We provide a mechanistic unification: looped and depth-grown models exhibit convergent depth-wise signatures, including increased reliance on late layers and recurring patterns aligned with the looped or grown block. These shared signatures support the view that their gains stem from a common form of iterative computation. Building on this connection, we show that the two techniques are adaptable and composable: applying inference-time looping to the middle blocks of a depth-grown model improves accuracy on some reasoning primitives by up to $2\times$, despite the model never being trained to loop. Both approaches also adapt better than the baseline when given more in-context examples or additional supervised fine-tuning data. Additionally, depth-grown models achieve the largest reasoning gains when using higher-quality, math-heavy cooldown mixtures, which can be further boosted by adapting a middle block to loop. Overall, our results position depth growth and looping as complementary, practical methods for inducing and scaling iterative computation to improve reasoning.
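The two operations named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `ToyLayer`, `run`, `loop_middle`, and `grow_depth` are hypothetical names, each "layer" is a scalar residual update rather than a transformer block, and the training that would follow depth growth is omitted. The sketch only shows the structural difference: looping reuses the same middle layers several times, while depth growing inserts independent copies of them.

```python
import copy
from dataclasses import dataclass
from typing import List


@dataclass
class ToyLayer:
    """Hypothetical stand-in for a transformer layer: x -> x + weight * x."""
    weight: float

    def __call__(self, x: float) -> float:
        return x + self.weight * x


def run(layers: List[ToyLayer], x: float) -> float:
    """Apply the layers in order, like a forward pass through the stack."""
    for layer in layers:
        x = layer(x)
    return x


def loop_middle(layers: List[ToyLayer], start: int, end: int, n_loops: int) -> List[ToyLayer]:
    """Looping: apply layers[start:end] n_loops times, sharing the same objects."""
    return layers[:start] + layers[start:end] * n_loops + layers[end:]


def grow_depth(layers: List[ToyLayer], start: int, end: int) -> List[ToyLayer]:
    """Depth growing: duplicate layers[start:end] as independent copies.

    In a real setup the deeper model would then continue training,
    so the copies could diverge; that step is not modeled here.
    """
    duplicate = [copy.deepcopy(layer) for layer in layers[start:end]]
    return layers[:end] + duplicate + layers[end:]


base = [ToyLayer(0.5), ToyLayer(1.0), ToyLayer(0.5)]
looped = loop_middle(base, 1, 2, n_loops=3)  # middle layer reused 3 times
grown = grow_depth(base, 1, 2)               # middle layer copied once
```

In this sketch `looped` holds the same middle-layer object three times (shared weights), whereas `grown` holds two equal but distinct objects. Applying `loop_middle` to the output of `grow_depth` would correspond, very loosely, to the abstract's inference-time looping of the middle block of a depth-grown model.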

arXiv Categories

cs.CL cs.AI cs.LG
