AI Agents Relevance: 7/10

Mechanistic Foundations of Goal-Directed Control

Alma Lago
arXiv: 2603.15248v1 Published: 2026-03-16 Updated: 2026-03-16

AI Summary

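The paper extends mechanistic interpretability to embodied control systems, using infant motor learning to study the mechanistic foundations of goal-directed control.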

Main Contributions

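  • Applies mechanistic interpretability to embodied control systems
  • Shows how inductive biases and learned gating mechanisms shape the formation of control circuits
  • Identifies the mechanism by which reactive and prospective control strategies compete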

Methodology

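Analyzes infant motor learning to identify causal control circuits, and characterizes learned gating mechanisms that converge toward theoretically motivated uncertainty thresholds.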

Original Abstract

Mechanistic interpretability has transformed the analysis of transformer circuits by decomposing model behavior into competing algorithms, identifying phase transitions during training, and deriving closed-form predictions for when and why strategies shift. However, this program has remained largely confined to sequence-prediction architectures, leaving embodied control systems without comparable mechanistic accounts. Here we extend this framework to sensorimotor-cognitive development, using infant motor learning as a model system. We show that foundational inductive biases give rise to causal control circuits, with learned gating mechanisms converging toward theoretically motivated uncertainty thresholds. The resulting dynamics reveal a clean phase transition in the arbitration gate whose commitment behavior is well described by a closed-form exponential moving-average surrogate. We identify context window k as the critical parameter governing circuit formation: below a minimum threshold (k ≤ 4) the arbitration mechanism cannot form; above it (k ≥ 8), gate confidence scales asymptotically as log k. A two-dimensional phase diagram further reveals task-demand-dependent route arbitration consistent with the prediction that prospective execution becomes advantageous only when prediction error remains within the task tolerance window. Together, these results provide a mechanistic account of how reactive and prospective control strategies emerge and compete during learning. More broadly, this work sharpens mechanistic accounts of cognitive development and provides principled guidance for the design of interpretable embodied agents.
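
To make the phrase "closed-form exponential moving-average surrogate" concrete, the following is a minimal, generic sketch of an EMA gate and its closed form under a constant input. It is not the paper's model: the abstract does not give the surrogate's inputs, parameters, or relation to the context window k, so the function names, the smoothing factor `alpha`, the commitment threshold `theta`, and the constant evidence value `x` are all assumptions chosen for illustration.

```python
import math

def ema_trace(signal, alpha, g0=0.0):
    """Iteratively update a gate value g with an exponential moving average.

    Hypothetical illustration: `signal` stands in for whatever evidence
    drives the gate; the paper's actual input is not specified here.
    """
    g = g0
    trace = []
    for x in signal:
        g = (1.0 - alpha) * g + alpha * x
        trace.append(g)
    return trace

def ema_closed_form(t, x, alpha, g0=0.0):
    """Gate value after t updates when the input is held constant at x."""
    return x + (g0 - x) * (1.0 - alpha) ** t

def commitment_step(x, alpha, theta, g0=0.0):
    """Smallest number of updates t at which the gate reaches theta.

    Solves x + (g0 - x) * (1 - alpha)**t >= theta for t, assuming
    g0 < theta < x and 0 < alpha < 1 (assumed conditions, not the paper's).
    """
    if not (g0 < theta < x) or not (0.0 < alpha < 1.0):
        raise ValueError("requires g0 < theta < x and 0 < alpha < 1")
    return math.ceil(math.log((x - theta) / (x - g0)) / math.log(1.0 - alpha))

if __name__ == "__main__":
    trace = ema_trace([1.0] * 10, alpha=0.2)
    step = commitment_step(x=1.0, alpha=0.2, theta=0.5)
    print("commitment step:", step)
    print("gate before/after:", trace[step - 2], trace[step - 1])
```

With a constant input the iterative update and the closed form agree at every step, which is what allows the commitment time to be predicted analytically rather than by simulation.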

Tags

mechanistic interpretability, embodied control, infant motor learning, goal-directed control

arXiv Categories

cs.LG eess.SY