ManeuverNet: A Soft Actor-Critic Framework for Precise Maneuvering of Double-Ackermann-Steering Robots with Optimized Reward Functions
AI Summary
ManeuverNet uses reinforcement learning to improve precise maneuvering of double-Ackermann-steering robots.
Main Contributions
- Proposes the ManeuverNet framework, combining SAC with CrossQ
- Designs four reward functions tailored to maneuver learning (a hypothetical shaping sketch follows this list)
- Experiments show it outperforms DRL baselines and the TEB planner
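The paper's four reward functions are not reproduced on this page, so the snippet below is only a minimal Python sketch of the general idea: combining goal-distance, heading-alignment, steering-smoothness, and terminal-success terms into one shaping reward for a non-holonomic maneuvering task. The function name, terms, and weights are illustrative assumptions, not ManeuverNet's actual rewards.

```python
def maneuver_reward(dist_to_goal, heading_error, steering_rate, reached_goal,
                    w_dist=1.0, w_head=0.5, w_smooth=0.1, success_bonus=100.0):
    """Hypothetical composite shaping reward for a maneuvering task.
    All weights are placeholders, not values from the paper."""
    r = -w_dist * dist_to_goal          # pull the robot toward the goal position
    r -= w_head * abs(heading_error)    # align heading with the goal pose
    r -= w_smooth * abs(steering_rate)  # discourage jerky steering commands
    if reached_goal:
        r += success_bonus              # sparse bonus on reaching the goal pose
    return r

# Example: 2 m from the goal, 0.3 rad heading error, mild steering change
print(maneuver_reward(2.0, 0.3, 0.05, False))
```

Keeping the terms separate like this makes each component easy to reweight or ablate during training.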
Methodology
Adopts the Soft Actor-Critic algorithm, customized for double-Ackermann systems, with optimized reward functions designed for reinforcement learning.
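As a concrete reference for the double-Ackermann plant the policy controls, here is a minimal kinematic step under the common simplification of symmetric four-wheel steering (the rear axle mirrors the front, roughly halving the turning radius of single-Ackermann steering). The wheelbase value and Euler integration are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

def double_ackermann_step(state, v, delta, dt, wheelbase=1.2):
    """One Euler step of a simplified symmetric four-wheel-steering bicycle model.
    state = (x, y, yaw); v = linear velocity [m/s]; delta = front steering angle [rad].
    The wheelbase is a placeholder value."""
    x, y, yaw = state
    yaw_rate = 2.0 * v * np.tan(delta) / wheelbase  # rear axle steers opposite to front
    x += v * np.cos(yaw) * dt
    y += v * np.sin(yaw) * dt
    yaw += yaw_rate * dt
    return np.array([x, y, yaw])
```

Wrapped in a Gymnasium-style environment, such a model can be driven by any off-the-shelf SAC implementation.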
Original Abstract
Autonomous control of double-Ackermann-steering robots is essential in agricultural applications, where robots must execute precise and complex maneuvers within a limited space. Classical methods, such as the Timed Elastic Band (TEB) planner, can address this problem, but they rely on parameter tuning, making them highly sensitive to changes in robot configuration or environment and impractical to deploy without constant recalibration. At the same time, end-to-end deep reinforcement learning (DRL) methods often fail due to unsuitable reward functions for non-holonomic constraints, resulting in sub-optimal policies and poor generalization. To address these challenges, this paper presents ManeuverNet, a DRL framework tailored for double-Ackermann systems, combining Soft Actor-Critic with CrossQ. Furthermore, ManeuverNet introduces four specifically designed reward functions to support maneuver learning. Unlike prior work, ManeuverNet does not depend on expert data or handcrafted guidance. We extensively evaluate ManeuverNet against both state-of-the-art DRL baselines and the TEB planner. Experimental results demonstrate that our framework substantially improves maneuverability and success rates, achieving more than a 40% gain over DRL baselines. Moreover, ManeuverNet effectively mitigates the strong parameter sensitivity observed in the TEB planner. In real-world trials, ManeuverNet achieved up to a 90% increase in maneuvering trajectory efficiency, highlighting its robustness and practical applicability.
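For readers who want a starting point for the training setup described above, the following is a hedged sketch of instantiating an SAC agent with stable-baselines3. The environment is a stand-in continuous-control task (Pendulum-v1), since ManeuverNet's simulator, CrossQ integration, and hyperparameters are not given on this page.

```python
import gymnasium as gym
from stable_baselines3 import SAC

# Stand-in environment; the real setup would wrap a double-Ackermann
# simulator together with the maneuvering reward instead.
env = gym.make("Pendulum-v1")

# Generic SAC hyperparameters; ManeuverNet's actual settings are not public here.
model = SAC("MlpPolicy", env, learning_rate=3e-4, verbose=1)
model.learn(total_timesteps=10_000)
```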