LLM Reasoning relevance: 9/10

Sci-CoE: Co-evolving Scientific Reasoning LLMs via Geometric Consensus with Sparse Supervision

Xiaohan He, Shiyang Feng, Songtao Huang, Lei Bai, Bin Wang, Bo Zhang
arXiv: 2602.12164v1 Published: 2026-02-12 Updated: 2026-02-12

AI Summary

Sci-CoE improves the robustness and diversity of LLMs on scientific reasoning tasks through geometric consensus and sparse supervision.

Key Contributions

  • Proposes the Sci-CoE framework, enabling LLMs to self-evolve on scientific reasoning.
  • Introduces a geometric reward mechanism that jointly considers consensus, reliability, and diversity.
  • Demonstrates experimentally that Sci-CoE enhances complex reasoning capability and scales well.

Methodology

A two-stage framework: first, a small set of annotated data establishes correctness-judgment anchors for the verifier; then, a geometric reward mechanism drives large-scale self-iteration on unlabeled data.
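The digest does not specify how the geometric reward combines its three factors. A minimal sketch, assuming (hypothetically) that per-solution consensus, reliability, and diversity scores in [0, 1] are combined via a geometric mean; the function name, score ranges, and epsilon floor are illustrative assumptions, not the paper's implementation:

```python
import math


def geometric_reward(consensus: float, reliability: float, diversity: float,
                     eps: float = 1e-8) -> float:
    """Hypothetical geometric combination of the three factors named in the
    abstract. Each score is assumed to lie in [0, 1]; the geometric mean
    yields a high reward only when all three factors are simultaneously high,
    since any factor near zero pulls the product toward zero."""
    scores = [consensus, reliability, diversity]
    # Floor each score at eps so a single zero does not zero out gradients
    # or rankings entirely (an illustrative choice, not from the paper).
    return math.prod(max(s, eps) for s in scores) ** (1.0 / len(scores))
```

One motivation for a multiplicative form over an arithmetic mean: a solution with strong consensus but zero diversity, e.g. `geometric_reward(0.9, 0.9, 0.0)`, scores near zero, whereas an arithmetic mean would still award it 0.6.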

Original Abstract

Large language models (LLMs) have demonstrated exceptional reasoning capabilities, and co-evolving paradigms have shown promising results in domains such as code and math. However, in scientific reasoning tasks, these models remain fragile due to unreliable solution evaluation and limited diversity in verification strategies. In this work, we propose Sci-CoE, a two-stage scientific co-evolving framework that enables models to self-evolve as both solver and verifier through a transition from sparse supervision to unsupervised learning. In the first stage, the model uses a small set of annotated data to establish fundamental correctness judgment anchors for the Verifier. In the second stage, we introduce a geometric reward mechanism that jointly considers consensus, reliability, and diversity, driving large-scale self-iteration on unlabeled data. Experiments on several general scientific benchmarks demonstrate that Sci-CoE enhances complex reasoning capabilities and exhibits strong scalability, facilitating the construction of more robust and diverse evaluation systems. Codes are available at https://github.com/InternScience/Sci-CoE.

Tags

Scientific Reasoning · Self-Evolution · Geometric Consensus · LLM

arXiv Category

cs.AI