LLM Reasoning relevance: 9/10

Efficient Reasoning via Thought Compression for Language Segmentation

Qing Zhou, Shiyu Zhang, Yuyu Jia, Junyu Gao, Weiping Ni, Junzheng Wu, Qi Wang
arXiv: 2604.02040v1 Published: 2026-04-02 Updated: 2026-04-02

AI Summary

WISE achieves efficient reasoning through thought compression, substantially reducing reasoning length while preserving strong zero-shot segmentation performance.

Key Contributions

  • Proposes the WISE framework, which accelerates inference by compressing the reasoning process.
  • Introduces a concise-rationale structure and a self-distillation objective.
  • Develops the WISE-S inference strategy, which uses prompting to handle the resulting distribution shift.

Methodology

The model is trained to generate a structured sequence: concise rationale -> answer -> detailed explanation. A self-distillation objective compels the model to compress its detailed reasoning into the concise form; at inference, the detailed explanation is omitted.
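The training/inference split above can be sketched in a few lines. This is a minimal illustration, not the released implementation: the tag names, the brevity instruction, and all function names are assumptions for the sake of the example.

```python
def build_training_target(concise: str, answer: str, explanation: str) -> str:
    """Training-time target: the concise rationale comes first, so
    autoregressive conditioning forces it to act as a sufficient
    summary for generating the detailed explanation."""
    return (
        f"<rationale>{concise}</rationale>"
        f"<answer>{answer}</answer>"
        f"<explanation>{explanation}</explanation>"
    )


def build_inference_prompt(user_query: str) -> str:
    """WISE-S: inject a brevity-focused instruction into the user's
    query to robustly activate the learned concise policy.
    (The exact wording of the instruction is a placeholder.)"""
    return f"{user_query} Answer with a brief rationale."


def drop_explanation(generated: str) -> str:
    """At inference the detailed explanation is omitted: keep only
    the text up to and including the answer."""
    end = generated.find("</answer>")
    return generated[: end + len("</answer>")] if end != -1 else generated
```

In this sketch, the speedup comes entirely from `drop_explanation`: the verbose explanation tokens are generated only during training, where they supervise the concise rationale via the self-distillation objective.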

Original Abstract

Chain-of-thought (CoT) reasoning has significantly improved the performance of large multimodal models in language-guided segmentation, yet its prohibitive computational cost, stemming from generating verbose rationales, limits real-world applicability. We introduce WISE (Wisdom from Internal Self-Exploration), a novel paradigm for efficient reasoning guided by the principle of "thinking twice -- once for learning, once for speed". WISE trains a model to generate a structured sequence: a concise rationale, the final answer, and then a detailed explanation. By placing the concise rationale first, our method leverages autoregressive conditioning to enforce that the concise rationale acts as a sufficient summary for generating the detailed explanation. This structure is reinforced by a self-distillation objective that jointly rewards semantic fidelity and conciseness, compelling the model to internalize its detailed reasoning into a compact form. At inference, the detailed explanation is omitted. To address the resulting conditional distribution shift, our inference strategy, WISE-S, employs a simple prompting technique that injects a brevity-focused instruction into the user's query. This final adjustment facilitates the robust activation of the learned concise policy, unlocking the full benefits of our framework. Extensive experiments show that WISE-S achieves state-of-the-art zero-shot performance on the ReasonSeg benchmark with 58.3 cIoU, while reducing the average reasoning length by nearly 5× -- from 112 to just 23 tokens. Code is available at https://github.com/mrazhou/WISE.

Tags

Language Segmentation, Chain-of-Thought, Model Compression, Self-Distillation, Efficient Reasoning

arXiv Categories

cs.CV