LLM Reasoning Relevance: 9/10

Can LLMs Model Incorrect Student Reasoning? A Case Study on Distractor Generation

Yanick Zengaffinen, Andreas Opedal, Donya Rooein, Kv Aditya Srivatsa, Shashank Sonkar, Mrinmaya Sachan
arXiv: 2603.15547v1 Published: 2026-03-16 Updated: 2026-03-16

AI Summary

The paper examines LLMs' ability to simulate incorrect student reasoning when generating multiple-choice distractors, and analyzes their strategies and failure modes.

Key Contributions

  • Introduces a taxonomy for analyzing the strategies LLMs use to generate distractors
  • Analyzes LLMs' procedures and failure modes when simulating incorrect student reasoning
  • Finds that providing the correct solution in the prompt significantly improves distractor quality (an 8% gain in alignment with human-authored distractors)

Methodology

Analyzes the strategies LLMs use when generating distractors, compares their procedures against established best practices from the learning sciences, and categorizes the models' error types.
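The summary does not reproduce the paper's actual taxonomy labels, but the abstract describes a three-stage procedure (solve, simulate misconceptions, select) and four failure modes. A minimal sketch of how such annotations might be encoded for trace analysis; all class and field names below are illustrative stand-ins, not the authors' scheme:

```python
from dataclasses import dataclass
from enum import Enum, auto


class Stage(Enum):
    """Reasoning stages described in the abstract."""
    SOLVE = auto()     # recover the correct solution first
    SIMULATE = auto()  # articulate and simulate student misconceptions
    SELECT = auto()    # choose the final set of distractors


class FailureMode(Enum):
    """Failure modes named in the abstract."""
    SOLUTION_RECOVERY = auto()    # failed to recover the correct solution
    CANDIDATE_SELECTION = auto()  # poor choice among response candidates
    ERROR_SIMULATION = auto()     # implausible simulated misconception
    PROCESS_STRUCTURE = auto()    # disorganized overall procedure


@dataclass
class AnnotatedTrace:
    """One annotated LLM reasoning trace for a distractor-generation item."""
    question: str
    stages_observed: list[Stage]
    failures: list[FailureMode]


# Example annotation: the model structured its process correctly but
# mis-solved the problem, the dominant failure mode per the abstract.
trace = AnnotatedTrace(
    question="What is 3/4 + 1/2?",
    stages_observed=[Stage.SOLVE, Stage.SIMULATE, Stage.SELECT],
    failures=[FailureMode.SOLUTION_RECOVERY],
)
```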

Original Abstract

Modeling plausible student misconceptions is critical for AI in education. In this work, we examine how large language models (LLMs) reason about misconceptions when generating multiple-choice distractors, a task that requires modeling incorrect yet plausible answers by coordinating solution knowledge, simulating student misconceptions, and evaluating plausibility. We introduce a taxonomy for analyzing the strategies used by state-of-the-art LLMs, examining their reasoning procedures and comparing them to established best practices in the learning sciences. Our structured analysis reveals a surprising alignment between their processes and best practices: the models typically solve the problem correctly first, then articulate and simulate multiple potential misconceptions, and finally select a set of distractors. An analysis of failure modes reveals that errors arise primarily from failures in recovering the correct solution and selecting among response candidates, rather than simulating errors or structuring the process. Consistent with these results, we find that providing the correct solution in the prompt improves alignment with human-authored distractors by 8%, highlighting the critical role of anchoring to the correct solution when generating plausible incorrect student reasoning. Overall, our analysis offers a structured and interpretable lens into LLMs' ability to model incorrect student reasoning and produce high-quality distractors.
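The 8% improvement from supplying the correct solution suggests a simple prompting intervention: anchor the model to a worked solution so it need not recover it on its own. A minimal sketch of the two prompt conditions, assuming hypothetical wording (the paper's actual prompts are not given in this summary):

```python
def build_prompt(question: str, correct_solution: str | None = None) -> str:
    """Build a distractor-generation prompt, optionally anchored to the
    correct solution as in the paper's prompting intervention."""
    parts = [
        "You are generating multiple-choice distractors.",
        f"Question: {question}",
    ]
    if correct_solution is not None:
        # Anchored condition: provide the correct solution up front, so
        # failures in solution recovery cannot propagate downstream.
        parts.append(f"Correct solution: {correct_solution}")
    parts.append(
        "Simulate plausible student misconceptions and propose three "
        "incorrect but plausible answer options."
    )
    return "\n".join(parts)


# Baseline vs. anchored variants of the same item.
print(build_prompt("What is 3/4 + 1/2?"))
print(build_prompt("What is 3/4 + 1/2?", correct_solution="5/4"))
```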

Tags

LLM Education Reasoning Distractor Generation Student Modeling

arXiv Categories

cs.CL cs.AI cs.HC