LLM Reasoning Relevance: 8/10

The Silent Thought: Modeling Internal Cognition in Full-Duplex Spoken Dialogue Models via Latent Reasoning

Donghang Wu, Tianyu Zhang, Yuxin Li, Hexin Liu, Chen Chen, Eng Siong Chng, Yoshua Bengio
arXiv: 2603.17837v1 Published: 2026-03-18 Updated: 2026-03-18

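AI Summary

FLAIR uses latent reasoning to mimic the human ability to think while listening, improving the performance of full-duplex dialogue systems.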

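Key Contributions

  • Proposes FLAIR, a model that simulates the internal cognitive process that runs during conversation
  • Designs an ELBO-based objective that enables efficient supervised fine-tuning
  • Achieves competitive results on multiple speech benchmarks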
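
For orientation, the textbook form of a conditional evidence lower bound is shown below. It is a generic sketch rather than the paper's exact objective: the symbols x (user speech), y (response), z (latent thoughts), the posterior q_\phi, and the prior p_\theta(z | x) are notation assumed here for illustration.

```latex
\log p_\theta(y \mid x)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x, y)}\!\left[ \log p_\theta(y \mid x, z) \right]
  \;-\;
  \mathrm{KL}\!\left( q_\phi(z \mid x, y) \,\|\, p_\theta(z \mid x) \right)
```

Maximizing the right-hand side needs only (x, y) pairs, which is consistent with the abstract's claim that no explicit reasoning annotations are required.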

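Methodology

Proposes a full-duplex latent and internal reasoning method that performs latent reasoning while the user is speaking, trained by supervised fine-tuning with an ELBO-based objective.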
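
The think-while-listening loop can be illustrated with a toy recurrence. This is a minimal sketch under stated assumptions, not FLAIR's architecture: the function `think_while_listening`, the weight matrices `w_in` and `w_rec`, and the tanh update are all hypothetical stand-ins for the actual model. It shows only the property the summary describes, namely that each step's latent output is fed into the next step as new audio arrives, so no latent depends on future input.

```python
import numpy as np

def think_while_listening(frames, w_in, w_rec):
    """Toy causal recurrence: one latent 'thought' per incoming audio frame."""
    latent = np.zeros(w_rec.shape[0])  # initial latent state (assumed zero)
    latents = []
    for frame in frames:
        # Feed the previous step's latent output back in with the new frame.
        latent = np.tanh(w_in @ frame + w_rec @ latent)
        latents.append(latent)
    return np.stack(latents)

# Hypothetical sizes: 5 audio frames of dimension 8, latent dimension 4.
rng = np.random.default_rng(0)
frames = rng.normal(size=(5, 8))
w_in = rng.normal(size=(4, 8)) * 0.1
w_rec = rng.normal(size=(4, 4)) * 0.1
latents = think_while_listening(frames, w_in, w_rec)
```

Because one latent is produced per frame as it arrives, the reasoning state is already available when the user stops speaking, which is the sense in which the design adds no extra latency.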

Original Abstract

During conversational interactions, humans subconsciously engage in concurrent thinking while listening to a speaker. Although this internal cognitive processing may not always manifest as explicit linguistic structures, it is instrumental in formulating high-quality responses. Inspired by this cognitive phenomenon, we propose a novel Full-duplex LAtent and Internal Reasoning method named FLAIR that conducts latent thinking simultaneously with speech perception. Unlike conventional "thinking" mechanisms in NLP, which require post-hoc generation, our approach aligns seamlessly with spoken dialogue systems: during the user's speaking phase, it recursively feeds the latent embedding output from the previous step into the next step, enabling continuous reasoning that strictly adheres to causality without introducing additional latency. To enable this latent reasoning, we design an Evidence Lower Bound-based objective that supports efficient supervised finetuning via teacher forcing, circumventing the need for explicit reasoning annotations. Experiments demonstrate the effectiveness of this think-while-listening design, which achieves competitive results on a range of speech benchmarks. Furthermore, FLAIR robustly handles conversational dynamics and attains competitive performance on full-duplex interaction metrics.

Tags

Full-duplex dialogue systems, Latent reasoning, Speech recognition, Natural language processing

arXiv Categories

eess.AS cs.CL