LLM Reasoning 相关度: 9/10

Bidirectional Curriculum Generation: A Multi-Agent Framework for Data-Efficient Mathematical Reasoning

Boren Hu, Xiao Liu, Boci Peng, Xinping Zhao, Xiaoran Shang, Yun Zhu, Lijun Wu

arXiv: 2603.05120v1 发布: 2026-03-05 更新: 2026-03-05

下载 PDF arXiv 页面

AI 摘要

提出双向课程生成框架，通过多智能体自适应调整问题难度，提升LLM数学推理的数据效率。

主要贡献

提出双向课程生成框架
构建多智能体生态模拟自适应教学
优化学习轨迹，提升数据效率

方法论

构建多智能体环境，动态生成由难到易或由易到难的问题，基于模型反馈循环优化训练数据。

原文摘要

Enhancing mathematical reasoning in Large Language Models typically demands massive datasets, yet data efficiency remains a critical bottleneck. While Curriculum Learning attempts to structure this process, standard unidirectional approaches (simple-to-complex) suffer from inefficient sample utilization: they blindly escalate complexity even when foundational gaps persist, leading to wasted computation on unsolvable problems. To maximize the instructional value of every training sample, we introduce a novel Bidirectional Curriculum Generation framework. Unlike rigid trajectories, our multi-agent ecosystem mimics adaptive pedagogy to establish a closed feedback loop. It dynamically generates data by either complicating problems to challenge the model or, crucially, simplying them to repair specific reasoning failures. This mechanism ensures that the model consumes only the most effective data at any given stage. Grounded in the Optimal Pacing Theorem, our approach optimizes the learning trajectory, significantly outperforming baselines while achieving superior reasoning performance with substantially fewer instruction samples.

arXiv 分类

cs.AI

AI 摘要

主要贡献

方法论

原文摘要

标签

arXiv 分类