Breaking the Chain: A Causal Analysis of LLM Faithfulness to Intermediate Structures
AI Summary
This work finds that in schema-guided reasoning, the intermediate structures an LLM produces exert only weak causal influence on its final output, functioning more as contextual information than as true mediators.
Main Contributions
- Proposes a causal evaluation protocol for measuring an LLM's faithfulness to its intermediate structures.
- Shows that apparent faithfulness to intermediate structures is fragile: after the structure is edited, models frequently fail to update their predictions.
- Verifies that delegating the final decision to an external tool improves faithfulness.
Methodology
Apply controlled edits to the intermediate structure and observe whether the LLM's output changes accordingly, thereby measuring the structure's causal effect on the final decision.
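The protocol above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the intermediate structure is modeled as a checklist of booleans, the deterministic decision function is "accept iff all checks pass", and the "models" are stubs standing in for an LLM's final-decision step. All names here are hypothetical.

```python
def decision_rule(checklist):
    """Deterministic map from intermediate structure to final decision
    (assumed form: accept iff every checklist item is satisfied)."""
    return "accept" if all(checklist.values()) else "reject"

def intervene(checklist, item):
    """Controlled edit: flip one checklist item, leaving the rest intact."""
    edited = dict(checklist)
    edited[item] = not edited[item]
    return edited

def faithfulness_score(model_decide, checklists):
    """Fraction of interventions after which the model's decision matches
    the unique decision implied by the edited structure."""
    total = hits = 0
    for checklist in checklists:
        for item in checklist:
            edited = intervene(checklist, item)
            implied = decision_rule(edited)    # unique correct output
            predicted = model_decide(edited)   # model sees only the edited structure
            hits += (predicted == implied)
            total += 1
    return hits / total

# A perfectly faithful "model" derives its decision from the structure;
# an unfaithful one ignores the structure entirely (stubs, not real LLMs).
faithful = decision_rule
unfaithful = lambda checklist: "accept"

examples = [{"a": True, "b": True}, {"a": True, "b": False}]
print(faithfulness_score(faithful, examples))    # 1.0
print(faithfulness_score(unfaithful, examples))  # 0.25
```

Because the decision function is deterministic, every intervention has exactly one correct answer, so faithfulness can be scored mechanically rather than judged. Delegating the final step to an external tool corresponds to literally calling `decision_rule` instead of `model_decide`.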
Original Abstract
Schema-guided reasoning pipelines ask LLMs to produce explicit intermediate structures -- rubrics, checklists, verification queries -- before committing to a final decision. But do these structures causally determine the output, or merely accompany it? We introduce a causal evaluation protocol that makes this directly measurable: by selecting tasks where a deterministic function maps intermediate structures to decisions, every controlled edit implies a unique correct output. Across eight models and three benchmarks, models appear self-consistent with their own intermediate structures but fail to update predictions after intervention in up to 60% of cases -- revealing that apparent faithfulness is fragile once the intermediate structure changes. When derivation of the final decision from the structure is delegated to an external tool, this fragility largely disappears; however, prompts which ask to prioritize the intermediate structure over the original input do not materially close the gap. Overall, intermediate structures in schema-guided pipelines function as influential context rather than stable causal mediators.