LLM Memory & RAG Relevance: 9/10

Bounded State in an Infinite Horizon: Proactive Hierarchical Memory for Ad-Hoc Recall over Streaming Dialogues

Bingbing Wang, Jing Li, Ruifeng Xu
arXiv: 2603.04885v1 Published: 2026-03-05 Updated: 2026-03-05

AI Summary

Proposes the ProStream framework to address bounded-state memory and ad-hoc recall over infinite dialogue streams, and constructs the STEM-Bench evaluation.

Main Contributions

  • Constructs the STEM-Bench benchmark
  • Proposes the ProStream framework, balancing memory efficiency and accuracy
  • Introduces Adaptive Spatiotemporal Optimization

Methodology

Proposes ProStream, a hierarchical memory framework that achieves efficient and accurate dialogue memory with ad-hoc recall through multi-granular distillation and Adaptive Spatiotemporal Optimization.

Original Abstract

Real-world dialogue usually unfolds as an infinite stream, so it requires bounded-state memory mechanisms that operate over an infinite horizon. However, existing read-then-think memory is fundamentally misaligned with this setting, as it cannot support ad-hoc memory recall while streams unfold. To explore this challenge, we introduce STEM-Bench, the first benchmark for STreaming Evaluation of Memory. It comprises over 14K QA pairs in dialogue streams that assess perception fidelity, temporal reasoning, and global awareness under infinite-horizon constraints. Preliminary analysis on STEM-Bench reveals a critical fidelity-efficiency dilemma: retrieval-based methods rely on fragmented context, while full-context models incur unbounded latency. To resolve this, we propose ProStream, a proactive hierarchical memory framework for streaming dialogues. It enables ad-hoc memory recall on demand by reasoning over continuous streams with multi-granular distillation. Moreover, it employs Adaptive Spatiotemporal Optimization to dynamically optimize retention based on expected utility, yielding a bounded knowledge state that lowers inference latency without sacrificing reasoning fidelity. Experiments show that ProStream outperforms baselines in both accuracy and efficiency.

Tags

Dialogue Systems  Memory Mechanisms  Long-Range Dependencies  Stream Processing

arXiv Categories

cs.AI