LLM Reasoning Relevance: 8/10

Do Language Models Know Theo Has a Wife? Investigating the Proviso Problem

Tara Azin, Daniel Dumitrescu, Diana Inkpen, Raj Singh

arXiv: 2603.08358v1 Published: 2026-03-09 Updated: 2026-03-09


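AI Summary

This paper investigates how language models handle presuppositions in conditional sentences and finds that the models rely mainly on shallow pattern matching.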

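Main Contributions

Introduces a diagnostic dataset for probing presupposition projection
Evaluates RoBERTa, DeBERTa, LLaMA, and Gemma
Provides a framework for assessing pragmatic competence and context-dependent meaning in language models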

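Methodology

Reformulates the proviso problem as a Natural Language Inference task, constructs a diagnostic dataset, and evaluates the models using explainability analyses.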
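
To make the NLI reformulation concrete, here is a minimal sketch of how one conditional sentence might be turned into premise/hypothesis pairs. It is not the authors' dataset format: the `NLIItem` class, the `build_proviso_items` helper, and the `reading` field are hypothetical names, and the sentence is the textbook illustration of the proviso problem rather than an item taken from the paper. The two hypotheses encode the two competing readings: the unconditional presupposition that human readers tend to accept, and the conditional presupposition that projection theories typically predict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NLIItem:
    """One premise/hypothesis pair (hypothetical schema, not the paper's)."""
    premise: str
    hypothesis: str
    reading: str  # which presupposition reading the hypothesis encodes


def build_proviso_items(antecedent: str, consequent: str,
                        presupposition: str) -> list[NLIItem]:
    """Pair a conditional premise with both candidate presuppositions."""
    premise = f"If {antecedent}, then {consequent}."
    return [
        # Unconditional reading: the presupposition holds outright.
        NLIItem(premise, f"{presupposition}.", "unconditional"),
        # Conditional reading: the presupposition holds only given the antecedent.
        NLIItem(premise, f"If {antecedent}, then {presupposition}.", "conditional"),
    ]


items = build_proviso_items(
    antecedent="Theo hates sonnets",
    consequent="so does his wife",
    presupposition="Theo has a wife",
)
for item in items:
    print(f"[{item.reading}] premise={item.premise!r} hypothesis={item.hypothesis!r}")
```

Each pair could then be scored by an NLI classifier; comparing the predicted labels for the two readings is one simple way to see which interpretation a model favors.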

Original Abstract

We investigate how language models handle the proviso problem, an unresolved issue in pragmatics where presuppositions in conditional sentences diverge between theoretical and human interpretations. We reformulate this phenomenon as a Natural Language Inference task and introduce a diagnostic dataset designed to probe presupposition projection in conditionals. We evaluate RoBERTa, DeBERTa, LLaMA, and Gemma using explainability analyses. The results show that models broadly align with human judgments but rely on shallow pattern matching rather than semantic or pragmatic reasoning. Our work provides the first computational evaluation framework for the proviso problem and highlights the need for diagnostic, multi-method approaches to assess pragmatic competence and context-dependent meaning in language models.

arXiv Categories

cs.CL
