Next Embedding Prediction Makes World Models Stronger
AI Summary
NE-Dreamer predicts next-step encoder embeddings with a temporal transformer, improving model performance in complex, partially observable environments.
Key Contributions
- Proposes NE-Dreamer, a new decoder-free MBRL agent
- Uses a temporal transformer to predict the next-step embedding
- Achieves strong results on the DeepMind Control Suite and DMLab
Methodology
NE-Dreamer uses a temporal transformer to directly optimize temporal predictive alignment in representation space, avoiding reconstruction losses and auxiliary supervision.
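The core training signal described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, shapes, and the linear stand-in for the temporal transformer are all hypothetical, and a real agent would backpropagate through a learned encoder and transformer rather than fixed NumPy arrays.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: B trajectories, T timesteps, d-dimensional embeddings.
B, T, d = 4, 10, 16

# Stand-ins for encoder embeddings e_t of each observation in a trajectory.
embeddings = rng.normal(size=(B, T, d))

def predict_next(history):
    """Stand-in for the temporal transformer: here just a linear map of the
    most recent embedding (the real model attends over the full sequence)."""
    W = np.eye(d)  # placeholder weights
    return history[:, -1] @ W

def next_embedding_loss(embeddings):
    """Mean squared error between predicted and actual next-step embeddings.
    In a real implementation the targets would typically be stop-gradient."""
    losses = []
    for t in range(1, T):
        pred = predict_next(embeddings[:, :t])  # predict e_t from e_{<t}
        target = embeddings[:, t]               # actual next embedding
        losses.append(np.mean((pred - target) ** 2))
    return float(np.mean(losses))

loss = next_embedding_loss(embeddings)
```

Because the loss is computed entirely between embeddings, no decoder or pixel reconstruction is needed, which matches the decoder-free design described above.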
Original Abstract
Capturing temporal dependencies is critical for model-based reinforcement learning (MBRL) in partially observable, high-dimensional domains. We introduce NE-Dreamer, a decoder-free MBRL agent that leverages a temporal transformer to predict next-step encoder embeddings from latent state sequences, directly optimizing temporal predictive alignment in representation space. This approach enables NE-Dreamer to learn coherent, predictive state representations without reconstruction losses or auxiliary supervision. On the DeepMind Control Suite, NE-Dreamer matches or exceeds the performance of DreamerV3 and leading decoder-free agents. On a challenging subset of DMLab tasks involving memory and spatial reasoning, NE-Dreamer achieves substantial gains. These results establish next-embedding prediction with temporal transformers as an effective, scalable framework for MBRL in complex, partially observable environments.