LLM Reasoning (Relevance: 9/10)

When to Trust the Cheap Check: Weak and Strong Verification for Reasoning

Shayan Kiyani, Sima Noorani, George Pappas, Hamed Hassani
arXiv: 2602.17633v1 · Published: 2026-02-19 · Updated: 2026-02-19

AI Summary

Proposes a weak-strong verification framework that balances cost and reliability in LLM reasoning, together with an online algorithm that provably controls errors.

Key Contributions

  • Formalizes weak-strong verification policies that balance cost and reliability
  • Introduces evaluation metrics: incorrect-acceptance rate, incorrect-rejection rate, and strong-verification frequency
  • Designs an online algorithm that controls acceptance and rejection errors without distributional assumptions

Methodology

Defines weak-strong verification policies, designs metrics to evaluate them, and on that basis constructs an online algorithm that achieves error control.
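The optimal policies in the paper admit a two-threshold structure: accept when the weak verifier's score is high, reject when it is low, and defer to costly strong verification in between. A minimal sketch of that decision rule (function and threshold names are illustrative, not from the paper):

```python
def weak_strong_policy(weak_score: float, t_accept: float, t_reject: float) -> str:
    """Two-threshold weak-strong verification policy (illustrative sketch).

    Accept when the weak verifier is confident the answer is correct,
    reject when it is confident the answer is wrong, and defer to the
    expensive strong verifier in the ambiguous middle band.
    """
    assert t_reject <= t_accept, "thresholds must be ordered"
    if weak_score >= t_accept:
        return "accept"
    if weak_score <= t_reject:
        return "reject"
    return "defer"  # escalate to strong verification
```

Widening the gap between the two thresholds lowers both error rates at the cost of more frequent strong verification, which is exactly the trade-off the paper's metrics quantify.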

Original Abstract

Reasoning with LLMs increasingly unfolds inside a broader verification loop. Internally, systems use cheap checks, such as self-consistency or proxy rewards, which we call weak verification. Externally, users inspect outputs and steer the model through feedback until results are trustworthy, which we call strong verification. These signals differ sharply in cost and reliability: strong verification can establish trust but is resource-intensive, while weak verification is fast and scalable but noisy and imperfect. We formalize this tension through weak-strong verification policies, which decide when to accept or reject based on weak verification and when to defer to strong verification. We introduce metrics capturing incorrect acceptance, incorrect rejection, and strong-verification frequency. At the population level, we show that optimal policies admit a two-threshold structure and that calibration and sharpness govern the value of weak verifiers. Building on this, we develop an online algorithm that provably controls acceptance and rejection errors without assumptions on the query stream, the language model, or the weak verifier.
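The digest does not spell out the online algorithm. One common way to control a long-run error rate without distributional assumptions is a gradient-style threshold update, in the spirit of online conformal prediction; the sketch below is a hypothetical illustration of that idea, not the paper's actual algorithm:

```python
def update_accept_threshold(t: float, accepted_was_wrong: bool,
                            alpha: float, lr: float = 0.05) -> float:
    """Hypothetical online update for the acceptance threshold.

    After each strong-verification outcome, raise the threshold when an
    accepted answer turned out wrong and lower it slightly otherwise, so
    that the empirical incorrect-acceptance rate tracks the target alpha.
    """
    return t + lr * (float(accepted_was_wrong) - alpha)
```

Because the update depends only on observed errors, not on any model of the query stream or the weak verifier, the average error it steers toward alpha holds for arbitrary (even adversarial) inputs, matching the assumption-free flavor of the abstract's guarantee.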

Tags

LLM Reasoning Verification Online Algorithm

arXiv Categories

cs.LG cs.AI stat.ML