Telogenesis: Goal Is All U Need
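AI Summary
This paper proposes driving goal-directed systems with the agent's intrinsic cognitive state, so that adaptive priorities are generated without any external reward.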
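Main Contributions
- Proposes a priority function based on cognitive state, built from three components: ignorance, surprise, and staleness.
- Validates the priority function in two environments.
- Identifies a metric-dependent reversal: the best-performing strategy differs depending on the evaluation metric.
- Demonstrates that the system can recover the latent structure of the environment without supervision.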
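Methodology
The authors construct a priority function that generates observation targets from three epistemic gaps (ignorance, surprise, staleness) and validate it experimentally in two environments.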
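To make the idea of a priority function concrete, here is a minimal sketch of how three epistemic gaps could be combined into a score and used to pick observation targets under a fixed attention budget. The additive form, the weights, the exponential staleness term, and all names (`VariableState`, `priority`, `select_targets`, `decay_rate`) are assumptions for illustration; the summary does not specify the paper's actual functional form.

```python
import math
from dataclasses import dataclass


@dataclass
class VariableState:
    """Hypothetical per-variable bookkeeping (not taken from the paper)."""
    variance: float            # posterior variance, used as the ignorance term
    prediction_error: float    # most recent prediction error, used as surprise
    steps_since_observed: int  # time since last observation, drives staleness


def priority(state, decay_rate=0.1, weights=(1.0, 1.0, 1.0)):
    """Score one variable; the additive combination is an assumption."""
    w_ign, w_sur, w_sta = weights
    # Assumed staleness model: confidence decays exponentially while the
    # variable goes unobserved, so staleness rises from 0 toward 1.
    staleness = 1.0 - math.exp(-decay_rate * state.steps_since_observed)
    return (w_ign * state.variance
            + w_sur * abs(state.prediction_error)
            + w_sta * staleness)


def select_targets(states, budget):
    """Return indices of the `budget` highest-priority variables."""
    ranked = sorted(range(len(states)),
                    key=lambda i: priority(states[i]),
                    reverse=True)
    return ranked[:budget]


states = [
    VariableState(variance=0.1, prediction_error=0.0, steps_since_observed=1),
    VariableState(variance=0.9, prediction_error=0.2, steps_since_observed=3),
    VariableState(variance=0.1, prediction_error=0.0, steps_since_observed=50),
]
targets = select_targets(states, budget=2)
print(targets)
```

In this toy example the third variable has the same variance and prediction error as the first, yet it outranks it purely because it has gone unobserved for longer. That is the role the staleness term is meant to play.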
Original Abstract
Goal-conditioned systems assume goals are provided externally. We ask whether attentional priorities can emerge endogenously from an agent's internal cognitive state. We propose a priority function that generates observation targets from three epistemic gaps: ignorance (posterior variance), surprise (prediction error), and staleness (temporal decay of confidence in unobserved variables). We validate this in two systems: a minimal attention-allocation environment (2,000 runs) and a modular, partially observable world (500 runs). Ablation shows each component is necessary. A key finding is metric-dependent reversal: under global prediction error, coverage-based rotation wins; under change detection latency, priority-guided allocation wins, with advantage growing monotonically with dimensionality (d = -0.95 at N=48, p < 10^-6). Detection latency follows a power law in attention budget, with a steeper exponent for priority-guided allocation (0.55 vs. 0.40). When the decay rate is made learnable per variable, the system spontaneously recovers environmental volatility structure without supervision (t = 22.5, p < 10^-6). We demonstrate that epistemic gaps alone, without external reward, suffice to generate adaptive priorities that outperform fixed strategies and recover latent environmental structure.
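The power-law claim in the abstract can be read as latency ≈ c · budget^(−α), where a larger α means latency falls faster as the attention budget grows. The sketch below estimates α by ordinary least squares in log-log space. The sign convention, the function name `fit_power_law_exponent`, and the data are assumptions: the latencies are synthetic, generated from the two exponents quoted in the abstract, and are not the paper's measurements.

```python
import math


def fit_power_law_exponent(budgets, latencies):
    """Fit latency ~ c * budget**(-alpha) via least squares on logs; return alpha."""
    xs = [math.log(b) for b in budgets]
    ys = [math.log(t) for t in latencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    # The log-log slope is -alpha under the assumed sign convention.
    return -sxy / sxx


# Synthetic, noise-free curves for illustration only.
budgets = [1.0, 2.0, 4.0, 8.0, 16.0]
priority_latency = [10.0 * b ** -0.55 for b in budgets]
rotation_latency = [10.0 * b ** -0.40 for b in budgets]

alpha_priority = fit_power_law_exponent(budgets, priority_latency)
alpha_rotation = fit_power_law_exponent(budgets, rotation_latency)
print(round(alpha_priority, 2), round(alpha_rotation, 2))
```

With real measurements the fit would carry noise, so comparing the two exponents would also need confidence intervals, not just point estimates.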