LLM Reasoning relevance: 9/10

I Think, Therefore I Am

Esakkivel Esakkiraja, Sai Rajeswar, Denis Akhiyarov, Rajagopal Venkatesaramani
arXiv: 2604.01202v1 Published: 2026-04-01 Updated: 2026-04-01

AI Summary

Large language reasoning models commit to a decision before reasoning begins; the subsequent chain-of-thought tends to rationalize that predetermined choice.

Key Contributions

  • Revealed that decisions are encoded early, before the reasoning process unfolds
  • Verified, via activation steering, that these decisions causally shape the reasoning process
  • Analyzed how models rationalize decision reversals

Methodology

A linear probe is trained to decode decision information from pre-generation activations; activation steering is then used to perturb the decision direction, and the resulting changes in model behavior are observed.
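The two-step methodology can be illustrated with a minimal, self-contained sketch on synthetic data (the paper's actual models and activations are not reproduced here; the activation matrix, the probe, and the steering coefficient below are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 200 synthetic "pre-generation activations" (d=64)
# in which a binary tool-calling decision is linearly encoded along one
# hidden direction, standing in for real residual-stream activations.
d = 64
true_dir = rng.normal(size=d)
true_dir /= np.linalg.norm(true_dir)
labels = rng.integers(0, 2, size=200)                    # 1 = "call tool"
acts = rng.normal(size=(200, d)) + np.outer(2 * labels - 1, 3 * true_dir)

# Step 1 -- linear probe: logistic regression fit by gradient descent.
w, b = np.zeros(d), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-np.clip(acts @ w + b, -30, 30)))
    w -= 0.5 * acts.T @ (p - labels) / len(labels)
    b -= 0.5 * (p - labels).mean()

def probe(x):
    """Decode the decision from an activation vector."""
    return 1 / (1 + np.exp(-np.clip(x @ w + b, -30, 30)))

acc = ((probe(acts) > 0.5) == labels).mean()
print(f"probe accuracy: {acc:.2f}")

# Step 2 -- activation steering: push an activation along the learned
# decision direction and check whether the decoded decision flips.
x = acts[labels == 1][0]                  # decoded as "call tool"
steered = x - 8 * w / np.linalg.norm(w)   # perturb against the direction
print("before:", probe(x) > 0.5, "after:", probe(steered) > 0.5)
```

In the paper this intervention is applied to the model itself rather than to a standalone classifier, so a flipped probe reading corresponds to a flipped behavior, whose rationalization in the chain-of-thought is then analyzed.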

Original Abstract

We consider the question: when a large language reasoning model makes a choice, did it think first and then decide, or decide first and then think? In this paper, we present evidence that detectable, early-encoded decisions shape chain-of-thought in reasoning models. Specifically, we show that a simple linear probe successfully decodes tool-calling decisions from pre-generation activations with very high confidence, and in some cases, even before a single reasoning token is produced. Activation steering supports this causally: perturbing the decision direction leads to inflated deliberation, and flips behavior in many examples (between 7% and 79%, depending on model and benchmark). We also show through behavioral analysis that, when steering changes the decision, the chain-of-thought process often rationalizes the flip rather than resisting it. Together, these results suggest that reasoning models can encode action choices before they begin to deliberate in text.

Tags

Large Language Models, Reasoning, Causality, Activation Steering, Decision Encoding

arXiv Categories

cs.AI