AI Agents relevance: 9/10

Cognitively Diverse Multiple-Choice Question Generation: A Hybrid Multi-Agent Framework with Large Language Models

Yu Tian, Linh Huynh, Katerina Christhilf, Shubham Chakraborty, Micah Watanabe, Tracy Arner, Danielle McNamara
arXiv: 2602.03704v1 · Published: 2026-02-03 · Updated: 2026-02-03

AI Summary

The ReQUESTA framework uses multiple agents and LLMs to generate cognitively diverse, high-quality multiple-choice questions.

Key Contributions

  • Proposes ReQUESTA, a framework for generating cognitively diverse multiple-choice questions
  • Combines LLMs with rule-based components to enable a controllable question-generation workflow
  • Experiments show that ReQUESTA-generated questions are higher in quality, more challenging, and more discriminative

Methodology

Builds a hybrid multi-agent framework that decomposes the question-generation task, coordinating LLM-powered agents with rule-based components to perform planning, controlled generation, iterative evaluation, and post-processing.
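The planning → generation → evaluation → post-processing loop described above can be sketched as a minimal pipeline. This is an illustrative mock-up, not the paper's implementation: the agent functions, the stubbed generator (standing in for LLM calls), and the simple rule-based checks are all hypothetical placeholders.

```python
from dataclasses import dataclass

@dataclass
class MCQ:
    question: str
    options: list
    answer: str
    cognitive_level: str  # e.g. "text-based", "inferential", "main-idea"

def plan(text: str) -> list:
    """Planner agent: choose which cognitive levels to target (stubbed rule)."""
    return ["text-based", "inferential", "main-idea"]

def generate(text: str, level: str) -> MCQ:
    """Generator agent: in the real framework this step would call an LLM."""
    return MCQ(
        question=f"[{level}] question about: {text[:30]}...",
        options=["A", "B", "C", "D"],
        answer="A",
        cognitive_level=level,
    )

def evaluate(item: MCQ) -> bool:
    """Rule-based evaluator: trivial structural checks stand in for the
    paper's iterative LLM + rule evaluation."""
    return len(item.options) == 4 and item.answer in item.options

def post_process(item: MCQ) -> MCQ:
    """Post-processor: e.g. option shuffling or wording cleanup (no-op here)."""
    return item

def requesta_pipeline(text: str, max_retries: int = 2) -> list:
    """Hypothetical orchestration: regenerate an item until it passes checks."""
    items = []
    for level in plan(text):
        for _ in range(max_retries + 1):
            item = generate(text, level)
            if evaluate(item):  # iterate until the item passes the evaluator
                items.append(post_process(item))
                break
    return items

items = requesta_pipeline(
    "Photosynthesis converts light energy into chemical energy stored in glucose."
)
print([i.cognitive_level for i in items])
# → ['text-based', 'inferential', 'main-idea']
```

The key design point mirrored here is that generation and evaluation are separate agents in a retry loop, so items that fail rule-based checks never reach post-processing.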

Original Abstract

Recent advances in large language models (LLMs) have made automated multiple-choice question (MCQ) generation increasingly feasible; however, reliably producing items that satisfy controlled cognitive demands remains a challenge. To address this gap, we introduce ReQUESTA, a hybrid, multi-agent framework for generating cognitively diverse MCQs that systematically target text-based, inferential, and main idea comprehension. ReQUESTA decomposes MCQ authoring into specialized subtasks and coordinates LLM-powered agents with rule-based components to support planning, controlled generation, iterative evaluation, and post-processing. We evaluated the framework in a large-scale reading comprehension study using academic expository texts, comparing ReQUESTA-generated MCQs with those produced by a single-pass GPT-5 zero-shot baseline. Psychometric analyses of learner responses assessed item difficulty and discrimination, while expert raters evaluated question quality across multiple dimensions, including topic relevance and distractor quality. Results showed that ReQUESTA-generated items were consistently more challenging, more discriminative, and more strongly aligned with overall reading comprehension performance. Expert evaluations further indicated stronger alignment with central concepts and superior distractor linguistic consistency and semantic plausibility, particularly for inferential questions. These findings demonstrate that hybrid, agentic orchestration can systematically improve the reliability and controllability of LLM-based generation, highlighting workflow design as a key lever for structured artifact generation beyond single-pass prompting.

Tags

Multiple-Choice Question Generation · Large Language Models · Agents · Reading Comprehension · Cognitive Assessment

arXiv Categories

cs.CL cs.AI