Neural ODE and SDE Models for Adaptation and Planning in Model-Based Reinforcement Learning
AI Summary
Neural ODE and SDE models are used for adaptation and planning in stochastic, dynamic environments within model-based reinforcement learning.
Key Contributions
- Demonstrates that neural SDE models effectively capture the stochasticity of transition dynamics
- Achieves efficient policy adaptation to changes in environment dynamics via inverse models
- Proposes a GAN-based latent SDE model to handle partial observability
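The inverse-model adaptation idea can be illustrated with a minimal sketch: the old policy's intended next state (predicted by a forward model of the old dynamics) is mapped back, via an inverse model of the new dynamics, to the action that achieves it. All function interfaces here (`old_policy`, `old_forward`, `new_inverse`) are hypothetical placeholders, not the paper's actual API.

```python
def adapt_policy(old_policy, old_forward, new_inverse):
    """Sketch of policy adaptation via an inverse dynamics model.

    old_policy(s)       -> action chosen under the old dynamics (hypothetical)
    old_forward(s, a)   -> next state predicted by the old dynamics model
    new_inverse(s, s')  -> action that produces transition s -> s' under
                           the new dynamics (learned from limited data)
    """
    def adapted(state):
        a_old = old_policy(state)            # what the old policy would do
        s_target = old_forward(state, a_old) # the outcome it intended
        return new_inverse(state, s_target)  # action achieving that outcome now
    return adapted
```

The appeal of this scheme is that only the inverse model of the new environment must be learned, which the paper argues requires far fewer interactions than retraining the policy from scratch.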
Methodology
Environment dynamics are modeled with neural ODEs and SDEs; inverse models enable policy adaptation to changed dynamics; and a GAN-trained latent SDE model handles partial observability.
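At the core of the SDE-based dynamics model is numerical simulation of ds = f(s, a) dt + g(s, a) dW, where drift f and diffusion g would be neural networks in the paper. A minimal Euler-Maruyama rollout sketch, with `drift`, `diffusion`, and `policy` as stand-in callables (not the paper's implementation):

```python
import math
import random

def euler_maruyama_step(state, action, drift, diffusion, dt, rng):
    """One Euler-Maruyama step of ds = f(s, a) dt + g(s, a) dW,
    with state, drift, and diffusion as lists of floats."""
    noise = [rng.gauss(0.0, math.sqrt(dt)) for _ in state]  # dW ~ N(0, dt)
    f = drift(state, action)
    g = diffusion(state, action)
    return [s + fi * dt + gi * n for s, fi, gi, n in zip(state, f, g, noise)]

def rollout(state, policy, drift, diffusion, horizon, dt=0.05, seed=0):
    """Roll out a trajectory by repeatedly sampling stochastic
    transitions from the learned SDE model (a planning primitive)."""
    rng = random.Random(seed)
    traj = [state]
    for _ in range(horizon):
        action = policy(traj[-1])
        traj.append(euler_maruyama_step(traj[-1], action, drift, diffusion, dt, rng))
    return traj
```

With the diffusion term set to zero this reduces to an Euler-integrated neural ODE, which mirrors the paper's comparison between deterministic and stochastic dynamics models.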
Original Abstract
We investigate neural ordinary and stochastic differential equations (neural ODEs and SDEs) to model stochastic dynamics in fully and partially observed environments within a model-based reinforcement learning (RL) framework. Through a sequence of simulations, we show that neural SDEs more effectively capture the inherent stochasticity of transition dynamics, enabling high-performing policies with improved sample efficiency in challenging scenarios. We leverage neural ODEs and SDEs for efficient policy adaptation to changes in environment dynamics via inverse models, requiring only limited interactions with the new environment. To address partial observability, we introduce a latent SDE model that combines an ODE with a GAN-trained stochastic component in latent space. Policies derived from this model provide a strong baseline, outperforming or matching general model-based and model-free approaches across stochastic continuous-control benchmarks. This work demonstrates the applicability of action-conditional latent SDEs for RL planning in environments with stochastic transitions. Our code is available at: https://github.com/ChaoHan-UoS/NeuralRL