AI Agents relevance: 7/10

CRoSS: A Continual Robotic Simulation Suite for Scalable Reinforcement Learning with High Task Diversity and Realistic Physics Simulation

Yannick Denker, Alexander Gepperth
arXiv: 2602.04868v1 Published: 2026-02-04 Updated: 2026-02-04

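AI Summary

CRoSS is a Gazebo-based benchmark suite for continual reinforcement learning with simulated robots, offering high task diversity and physical realism.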

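Key Contributions

  • Introduces CRoSS, a new benchmark for continual reinforcement learning in robotics
  • Built on the Gazebo simulator, with two robotic platforms and multiple task scenarios
  • Provides an easily extensible, reproducible containerized environment and performance reports for standard RL algorithms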

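Methodology

Builds simulation environments covering two robotic platforms and a range of tasks, then evaluates and compares standard reinforcement learning algorithms in continual learning settings.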
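To make "comparing algorithms in a continual learning setting" concrete, here is a minimal sketch of two summary metrics often used in the continual learning literature: average final performance and average forgetting. This is not confirmed to be the evaluation protocol used by CRoSS. The function names, the layout of the `perf` matrix, and the numbers in the usage example are all hypothetical and are not results from the paper.

```python
from typing import List


def average_final_performance(perf: List[List[float]]) -> float:
    """Mean performance over all tasks after training on the last task.

    perf[i][j] is assumed to hold the performance on task j measured
    after training on task i (hypothetical layout, tasks indexed from 0).
    """
    final_row = perf[-1]
    return sum(final_row) / len(final_row)


def average_forgetting(perf: List[List[float]]) -> float:
    """Mean drop from the best earlier performance to the final performance.

    For each task j except the last, take the best score observed between
    training on task j and training on the second-to-last task, then
    subtract the score measured after training on the last task.
    """
    n_tasks = len(perf)
    if n_tasks < 2:
        return 0.0  # nothing can be forgotten with a single task
    drops = []
    for j in range(n_tasks - 1):
        best_earlier = max(perf[i][j] for i in range(j, n_tasks - 1))
        drops.append(best_earlier - perf[n_tasks - 1][j])
    return sum(drops) / len(drops)


if __name__ == "__main__":
    # Made-up success rates for three tasks, for illustration only.
    perf = [
        [0.9, 0.1, 0.0],
        [0.6, 0.8, 0.1],
        [0.5, 0.6, 0.9],
    ]
    print("final performance:", average_final_performance(perf))
    print("forgetting:", average_forgetting(perf))
```

With the made-up matrix above, average forgetting comes out to roughly 0.3: task 0 drops from 0.9 to 0.5 and task 1 from 0.8 to 0.6. A method that retains earlier policies well would push this value toward zero.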

Original Abstract

Continual reinforcement learning (CRL) requires agents to learn from a sequence of tasks without forgetting previously acquired policies. In this work, we introduce a novel benchmark suite for CRL based on realistically simulated robots in the Gazebo simulator. Our Continual Robotic Simulation Suite (CRoSS) benchmarks rely on two robotic platforms: a two-wheeled differential-drive robot with lidar, camera and bumper sensor, and a robotic arm with seven joints. The former represents an agent in line-following and object-pushing scenarios, where variation of visual and structural parameters yields a large number of distinct tasks, whereas the latter is used in two goal-reaching scenarios with high-level Cartesian hand position control (modeled after the Continual World benchmark), and low-level control based on joint angles. For the robotic arm benchmarks, we provide additional kinematics-only variants that bypass the need for physical simulation (as long as no sensor readings are required), and which can be run two orders of magnitude faster. CRoSS is designed to be easily extensible and enables controlled studies of continual reinforcement learning in robotic settings with high physical realism, and in particular allows the use of almost arbitrary simulated sensors. To ensure reproducibility and ease of use, we provide a containerized setup (Apptainer) that runs out-of-the-box, and report performances of standard RL algorithms, including Deep Q-Networks (DQN) and policy gradient methods. This highlights its suitability as a scalable and reproducible benchmark for CRL research.

Tags

Robotics, Reinforcement Learning, Continual Learning, Simulation

arXiv Categories

cs.LG cs.AI