LLM Reasoning Relevance: 9/10

InjectRBP: Steering Large Language Model Reasoning Behavior via Pattern Injection

Xiuping Wu, Zhao Yu, Yuxin Cheng, Ngai Wong, Liangjun Ke, Tapas Mishra, Konstantinos V. Katsikopoulos
arXiv: 2602.12013v1 Published: 2026-02-12 Updated: 2026-02-12

AI Summary

Steers the reasoning process of large language models by injecting behavioral patterns, improving reasoning performance without any updates to model parameters.

Main Contributions

  • Observes that models exhibit adaptive distributions of reasoning behaviors
  • Proposes InjectCorrect and InjectRLOpt, two reasoning-steering methods that require no parameter updates
  • Demonstrates the effectiveness of both methods experimentally

Methodology

Generates behavioral injectants to steer reasoning, either by imitating the behavior patterns of the model's own past correct answers or by learning a value function over behavior patterns.
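The paper's exact formulation is not reproduced here; as a rough illustration only, the value-function route might be sketched as softmax sampling over learned pattern values, with each value down-weighted by an estimated reliability (all pattern names, the table, and the reliability weighting below are assumptions standing in for the paper's Reliability-Aware Softmax Policy):

```python
import math
import random

def softmax(scores, temperature=1.0):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical value table learned from historical behavior-pattern data:
# pattern -> (estimated value, reliability in [0, 1], e.g. from sample counts).
value_table = {
    "decompose-then-verify": (0.82, 0.9),
    "backtrack-on-conflict": (0.74, 0.6),
    "enumerate-cases":       (0.65, 0.3),
}

def sample_injectant(value_table, temperature=1.0):
    """Sample a behavior pattern to inject into the prompt, weighting each
    learned value by its reliability so poorly estimated patterns are
    down-weighted (an assumed reliability-aware variant of softmax sampling)."""
    patterns = list(value_table)
    scores = [value * reliability for value, reliability in value_table.values()]
    probs = softmax(scores, temperature)
    return random.choices(patterns, weights=probs, k=1)[0]

# At inference time, the sampled pattern would be prepended to the question
# to steer the reasoning process without touching model parameters.
injectant = sample_injectant(value_table)
prompt = f"Use the following reasoning behavior: {injectant}.\nQuestion: ..."
```

This is a minimal sketch of the steering idea, not the authors' implementation; the paper presumably injects richer structured patterns than a single label.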

Original Abstract

Reasoning can significantly enhance the performance of Large Language Models. While recent studies have exploited behavior-related prompt adjustment to enhance reasoning, these designs remain largely intuitive and lack a systematic analysis of the underlying behavioral patterns. Motivated by this, we investigate how models' reasoning behaviors shape reasoning from the perspective of behavioral patterns. We observe that models exhibit adaptive distributions of reasoning behaviors when responding to specific types of questions, and that structurally injecting these patterns can substantially influence the quality of the models' reasoning processes and outcomes. Building on these findings, we propose two optimization methods that require no parameter updates: InjectCorrect and InjectRLOpt. InjectCorrect guides the model by imitating behavioral patterns derived from its own past correct answers. InjectRLOpt learns a value function from historical behavior-pattern data and, via our proposed Reliability-Aware Softmax Policy, generates behavioral injectants during inference to steer the reasoning process. Our experiments demonstrate that both methods can improve model performance across various reasoning tasks without requiring any modifications to model parameters, achieving gains of up to 5.34% and 8.67%, respectively.

Tags

Reasoning · Behavioral Patterns · Prompt Adjustment · Parameter-Free Optimization

arXiv Category

cs.AI