LLM Reasoning Relevance: 9/10

NEX: Neuron Explore-Exploit Scoring for Label-Free Chain-of-Thought Selection and Model Ranking

Kang Chen, Zhuoka Feng, Sihan Zhao, Kai Xiong, Junjie Nian, Yaoning Wang, Changyi Xiao, Yixin Cao
arXiv: 2602.05805v1 Published: 2026-02-05 Updated: 2026-02-05

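AI Summary

NEX is an unsupervised framework for chain-of-thought (CoT) selection and model ranking that identifies exploration and exploitation phases from neuron activation patterns.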

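Key Contributions

  • Proposes the NEX framework for unsupervised CoT selection and model ranking
  • Uses neuron activation patterns to identify exploration and exploitation phases
  • Validates NEX on reasoning benchmarks and on model merging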

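Methodology

NEX uses sparse activation caches to detect newly activated MLP neurons at each token, infers exploration (E) and exploitation (X) phases with a sticky two-state HMM, and scores each neuron introduced during exploration by whether it is reused in the following exploitation span.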
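
The pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the spike threshold, the binary "spike / no spike" observation model, and the values of stay_prob and hit_prob are all assumptions made for illustration, and each token's activations are represented simply as a set of active neuron ids.

```python
import math

def new_neuron_counts(active_sets):
    """Per token, collect neurons that were never active at an earlier token."""
    seen, counts, new_sets = set(), [], []
    for active in active_sets:
        fresh = set(active) - seen
        new_sets.append(fresh)
        counts.append(len(fresh))
        seen |= fresh
    return counts, new_sets

def sticky_viterbi(counts, threshold, stay_prob=0.8, hit_prob=0.95):
    """Decode a two-state HMM (0 = X/exploit, 1 = E/explore) with Viterbi.

    Assumed observation model: a token is a "spike" if its count of new
    neurons reaches `threshold`; E emits spikes and X emits non-spikes
    with probability `hit_prob`. `stay_prob` makes the chain sticky.
    """
    if not counts:
        return []
    obs = [1 if c >= threshold else 0 for c in counts]
    stay, switch = math.log(stay_prob), math.log(1 - stay_prob)
    trans = [[stay, switch], [switch, stay]]

    def emit(state, o):
        return math.log(hit_prob if state == o else 1 - hit_prob)

    score = [math.log(0.5) + emit(s, obs[0]) for s in (0, 1)]
    back = []
    for o in obs[1:]:
        prev, score, ptrs = score, [], []
        for s in (0, 1):
            best = max((0, 1), key=lambda p: prev[p] + trans[p][s])
            ptrs.append(best)
            score.append(prev[best] + trans[best][s] + emit(s, o))
        back.append(ptrs)
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptrs in reversed(back):
        state = ptrs[state]
        path.append(state)
    path.reverse()
    return path

def credit_reuse(active_sets, new_sets, phases):
    """Mark each neuron introduced in an E span as reused (True) or not
    (False) in the X span that directly follows that E span."""
    credit, t, n = {}, 0, len(phases)
    while t < n:
        if phases[t] != 1:
            t += 1
            continue
        e_start = t
        while t < n and phases[t] == 1:
            t += 1
        x_start = t
        while t < n and phases[t] == 0:
            t += 1
        introduced = set().union(*new_sets[e_start:x_start])
        reused = set().union(*active_sets[x_start:t])
        for neuron in introduced:
            credit[neuron] = neuron in reused
    return credit

# Toy trace: ten tokens, each given as a set of active neuron ids (made up).
trace = [{1, 2}, {3, 4, 1}, {1, 3}, {3}, {1},
         {5, 6}, {7, 8, 5}, {5, 7}, {7}, {5}]
counts, new_sets = new_neuron_counts(trace)
phases = sticky_viterbi(counts, threshold=2)
credit = credit_reuse(trace, new_sets, phases)
```

In this toy trace, neurons 1, 3, 5 and 7 are credited as reused while 2, 4, 6 and 8 are not. How the paper turns such credits into neuron weights and the final Good-Mass Fraction score is not specified in this summary, so that step is left out.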

Original Abstract

Large language models increasingly spend inference compute sampling multiple chain-of-thought traces or searching over merged checkpoints. This shifts the bottleneck from generation to selection, often without supervision on the target distribution. We show entropy-based exploration proxies follow an inverted-U with accuracy, suggesting extra exploration can become redundant and induce overthinking. We propose NEX, a white-box label-free unsupervised scoring framework that views reasoning as alternating E-phase (exploration) and X-phase (exploitation). NEX detects E-phase as spikes in newly activated MLP neurons per token from sparse activation caches, then uses a sticky two-state HMM to infer E-X phases and credits E-introduced neurons by whether they are reused in the following X span. These signals yield interpretable neuron weights and a single Good-Mass Fraction score to rank candidate responses and merged variants without task answers. Across reasoning benchmarks and Qwen3 merge families, NEX computed on a small unlabeled activation set predicts downstream accuracy and identifies better variants; we further validate the E-X signal with human annotations and provide causal evidence via "Effective-vs-Redundant" neuron transfer.

Tags

Chain-of-Thought, Unsupervised Learning, Model Ranking, Neuron Activation, Exploration and Exploitation

arXiv Categories

cs.AI