Grounding LTL Tasks in Sub-Symbolic RL Environments for Zero-Shot Generalization
AI Summary
Proposes a reinforcement learning method for following LTL-specified tasks in sub-symbolic environments, achieving zero-shot generalization.
Key Contributions
- Jointly trains a multi-task policy and a symbol grounder on the same experience
- Uses Neural Reward Machines for semi-supervised learning of the symbol grounding
- Achieves performance on vision-based environments comparable to using the true symbol grounding
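Since the contributions center on reward machines for LTL tasks, a minimal symbolic reward machine may help fix ideas. This is a sketch under stated assumptions: the task F(a ∧ F b) ("eventually a, then eventually b"), the state names `u0`/`u1`/`u2`, and the `step` API are illustrative, not the paper's Neural Reward Machine (which replaces the known symbol grounding with a learned one).

```python
# A minimal (symbolic) reward machine: a finite-state machine over symbols
# that emits reward on transitions. Illustrative sketch, not the paper's code.

class RewardMachine:
    def __init__(self, transitions, initial, terminal):
        # transitions: {(state, symbol): (next_state, reward)}
        self.transitions = transitions
        self.state = initial
        self.terminal = terminal

    def step(self, symbol):
        """Advance on an observed symbol; unknown pairs self-loop with 0 reward."""
        next_state, reward = self.transitions.get(
            (self.state, symbol), (self.state, 0.0)
        )
        self.state = next_state
        return reward

    def done(self):
        return self.state == self.terminal


# Machine for F(a & F b):  u0 --a--> u1 --b--> u2, reward 1 on completion.
rm = RewardMachine(
    transitions={("u0", "a"): ("u1", 0.0), ("u1", "b"): ("u2", 1.0)},
    initial="u0",
    terminal="u2",
)

rewards = [rm.step(s) for s in ["c", "a", "c", "b"]]
print(rewards, rm.done())  # reward is sparse: nonzero only when the task completes
```

The sparse terminal reward is exactly the kind of weak supervision the method uses to train the grounder without access to the true symbol labels.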
Methodology
A multi-task policy and a symbol grounder are trained jointly on the same experience, using Neural Reward Machines driven only by raw observations and sparse rewards.
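The semi-supervised grounding idea can be sketched in a toy form: a grounder maps raw observations to symbol probabilities, and the only training signal is the sparse episode reward, which is matched against the grounder's differentiable belief that the task was achieved. Everything here is an illustrative assumption, not the paper's architecture: observations are 2-D points, the hidden true grounding is "symbol a holds in the left half-plane", the grounder is a logistic classifier, the policy is replaced by random rollouts, and the belief aggregation `1 - prod(1 - p_t)` is one possible surrogate for "a was visited at some step".

```python
import numpy as np

rng = np.random.default_rng(0)

def grounder_prob(w, obs):
    # Logistic grounder: P(symbol "a" holds | raw observation). Toy stand-in.
    return 1.0 / (1.0 + np.exp(-obs @ w))

w = np.zeros(2)   # grounder parameters, learned only from sparse rewards
lr = 0.5
for episode in range(200):
    traj = rng.uniform(-1, 1, size=(5, 2))      # random rollout (fixed policy)
    visited_a = bool((traj[:, 0] < 0).any())    # hidden true grounding
    sparse_reward = 1.0 if visited_a else 0.0   # only signal the learner sees

    # Grounder's belief that "a" was visited at some step: 1 - prod(1 - p_t).
    p = grounder_prob(w, traj)
    belief = 1.0 - np.prod(1.0 - p)

    # Align belief with the sparse reward: gradient of 0.5 * (belief - r)^2.
    err = belief - sparse_reward
    # d belief / d p_t = prod_{s != t} (1 - p_s)
    prods = np.prod(1.0 - p) / np.clip(1.0 - p, 1e-8, None)
    grad_w = ((err * prods * p * (1.0 - p))[:, None] * traj).sum(axis=0)
    w -= lr * grad_w

# The grounder should now rate left-half points as more likely to satisfy "a".
print(grounder_prob(w, np.array([-0.8, 0.0])),
      grounder_prob(w, np.array([0.8, 0.0])))
```

The point of the sketch is the supervision structure: the grounder never sees symbol labels, only whether whole episodes were rewarded, which is enough to push its per-observation predictions toward the true grounding.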
Original Abstract
In this work we address the problem of training a Reinforcement Learning agent to follow multiple temporally-extended instructions expressed in Linear Temporal Logic in sub-symbolic environments. Previous multi-task work has mostly relied on knowledge of the mapping between raw observations and symbols appearing in the formulae. We drop this unrealistic assumption by jointly training a multi-task policy and a symbol grounder with the same experience. The symbol grounder is trained only from raw observations and sparse rewards via Neural Reward Machines in a semi-supervised fashion. Experiments on vision-based environments show that our method achieves performance comparable to using the true symbol grounding and significantly outperforms state-of-the-art methods for sub-symbolic environments.