LLM Reasoning Relevance: 9/10

Explore-on-Graph: Incentivizing Autonomous Exploration of Large Language Models on Knowledge Graphs with Path-refined Reward Modeling

Shiqi Yan, Yubo Chen, Ruiqi Zhou, Zhengxi Yao, Shuai Chen, Tianyi Zhang, Shijie Zhang, Wei Qiang Zhang, Yongfeng Huang, Haixin Duan, Yunqi Zhang
arXiv: 2602.21728v1 Published: 2026-02-25 Updated: 2026-02-25

AI Summary

Proposes the Explore-on-Graph framework, which uses reinforcement learning to encourage LLMs to autonomously explore reasoning paths on knowledge graphs, improving their reasoning ability.

Main Contributions

  • Proposes the Explore-on-Graph (EoG) framework
  • Introduces reinforcement learning to incentivize LLM exploration
  • Uses path information as an additional reward signal to refine the exploration process

Methodology

Trains the LLM with reinforcement learning, where the reward is the correctness of the reasoning path's final answer, combined with path information as an additional reward signal.
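The reward described above can be sketched as a simple shaping function. This is a minimal illustration, not the paper's implementation: the function name `path_refined_reward`, the weight `alpha`, and the hop-validity bonus are assumptions about how answer correctness and path information might be combined.

```python
def path_refined_reward(predicted_answer, gold_answer, path, kg_edges, alpha=0.5):
    """Combine final-answer correctness with a path-based bonus.

    path is a list of (head, relation, tail) hops; kg_edges is the set of
    triples present in the knowledge graph. The bonus rewards paths whose
    hops are actually grounded in the KG, discouraging futile exploration.
    """
    # Terminal reward: 1 if the path's final answer matches the gold answer.
    answer_reward = 1.0 if predicted_answer == gold_answer else 0.0
    # Path-refinement signal: fraction of hops that exist in the KG.
    valid_hops = sum(1 for hop in path if hop in kg_edges)
    path_reward = valid_hops / len(path) if path else 0.0
    return answer_reward + alpha * path_reward
```

With a fully grounded one-hop path and a correct answer, the reward is 1.0 + 0.5 = 1.5; an incorrect answer still earns partial credit for a valid path, which keeps the exploration signal dense.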

Original Abstract

The reasoning process of Large Language Models (LLMs) is often plagued by hallucinations and missing facts in question-answering tasks. A promising solution is to ground LLMs' answers in verifiable knowledge sources, such as Knowledge Graphs (KGs). Prevailing KG-enhanced methods typically constrain LLM reasoning either by enforcing rules during generation or by imitating paths from a fixed set of demonstrations. However, they naturally confine the reasoning patterns of LLMs within the scope of prior experience or fine-tuning data, limiting their generalizability to out-of-distribution graph reasoning problems. To tackle this problem, in this paper, we propose Explore-on-Graph (EoG), a novel framework that encourages LLMs to autonomously explore a more diverse reasoning space on KGs. To incentivize exploration and discovery of novel reasoning paths, we introduce reinforcement learning during training, whose reward is the correctness of the reasoning paths' final answers. To enhance the efficiency and meaningfulness of the exploration, we incorporate path information as additional reward signals to refine the exploration process and reduce futile efforts. Extensive experiments on five KGQA benchmark datasets demonstrate that, to the best of our knowledge, our method achieves state-of-the-art performance, outperforming not only open-source but also closed-source LLMs.
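The autonomous exploration the abstract describes can be pictured as sampling multi-hop paths over the KG. The sketch below stands in for the policy's action sampling with a random walk; the dict-based KG representation, `explore_path` name, and `max_hops` parameter are illustrative assumptions, not the paper's API.

```python
import random

def explore_path(kg, start_entity, max_hops=3, seed=0):
    """Sample a reasoning path from start_entity over a toy KG.

    kg maps each entity to a list of (relation, tail_entity) edges.
    In EoG-style training, a learned policy would score these candidate
    hops; here a seeded random choice stands in for that policy.
    """
    rng = random.Random(seed)
    path, current = [], start_entity
    for _ in range(max_hops):
        candidates = kg.get(current, [])
        if not candidates:  # dead end: no outgoing edges to explore
            break
        relation, tail = rng.choice(candidates)
        path.append((current, relation, tail))
        current = tail
    return path
```

During RL training, many such rollouts would be sampled per question and scored by the answer-correctness and path-based rewards, so that paths ending at correct answers are reinforced.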

Tags

Knowledge Graphs LLM Reinforcement Learning Reasoning Exploration

arXiv Categories

cs.CL