Agent Tuning & Optimization (relevance: 7/10)

Continual uncertainty learning

Heisei Yonezawa, Ansei Yonezawa, Itsuro Kajiwara
arXiv: 2602.17174v1 Published: 2026-02-19 Updated: 2026-02-19

AI Summary

Proposes a curriculum-based continual uncertainty learning framework for robust control of complex nonlinear systems.

Key Contributions

  • Proposes a new continual learning framework for handling multiple superimposed uncertainties
  • Decomposes a complex control problem into a sequence of continual learning tasks
  • Incorporates a model-based controller into deep reinforcement learning to improve learning efficiency
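The task decomposition described above can be sketched as follows. This is an illustrative outline only, not the paper's code: the uncertainty names, the one-plant-per-task simplification, and the placeholder "update" counter are all assumptions. The key structural point it shows is that each task adds one uncertainty while the plant sets from earlier tasks stay in the training mix, which is how continual learning guards against catastrophic forgetting.

```python
# Hypothetical sketch of sequential uncertainty tasks (illustrative names,
# not from the paper). One policy is carried across all tasks; earlier
# plant sets are revisited so new learning does not erase old behaviour.

UNCERTAINTIES = ["nonlinearity", "mass_variation", "load_disturbance"]

def plant_set(active):
    """Plant configs whose listed uncertainties are randomized in training."""
    return [{"randomized": list(active)}]

def train_continually():
    policy_updates = 0
    accumulated = []                     # plant sets from all tasks so far
    for k in range(1, len(UNCERTAINTIES) + 1):
        active = UNCERTAINTIES[:k]       # curriculum: one more uncertainty
        accumulated += plant_set(active)
        for plant in accumulated:        # revisit earlier plants each task
            policy_updates += 1          # placeholder for a DRL update pass
    return policy_updates, accumulated

updates, plants = train_continually()
```

With three tasks the policy is updated on 1 + 2 + 3 = 6 plant visits, reflecting that each new task trains over the union of all plant sets seen so far.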

Methodology

A curriculum gradually increases the complexity of the uncertainties, while a model-based controller provides a shared baseline performance across the plant sets, accelerating the convergence of the deep reinforcement learning agent via a residual learning scheme.
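A minimal sketch of the two ingredients above, under assumptions of my own: a mass-spring-damper stand-in plant, a PD controller standing in for the model-based baseline, and a curriculum that widens the range of an uncertain mass parameter stage by stage. The function names and constants are illustrative, not the paper's.

```python
import random

def mbc_baseline(state, kp=2.0, kd=0.5):
    """PD law on (position, velocity) -- stands in for the model-based controller."""
    pos, vel = state
    return -kp * pos - kd * vel

def make_curriculum(n_stages=3):
    """Each stage widens the range of the uncertain mass multiplier."""
    stages = []
    for k in range(1, n_stages + 1):
        spread = 0.1 * k  # uncertainty grows with the stage index
        stages.append([1.0 - spread, 1.0 + spread])
    return stages

def rollout(policy_residual, mass_scale, steps=200, dt=0.01):
    """Simulate a mass-spring-damper with scaled mass; return final |position|."""
    pos, vel = 1.0, 0.0
    for _ in range(steps):
        state = (pos, vel)
        # Residual scheme: the learned policy only corrects the MBC action.
        u = mbc_baseline(state) + policy_residual(state)
        acc = (u - 1.0 * pos - 0.2 * vel) / mass_scale  # k=1.0, c=0.2
        vel += acc * dt
        pos += vel * dt
    return abs(pos)

# A zero residual already inherits the MBC's stabilizing behaviour on every
# stage of the curriculum -- the "shared baseline" that makes task-specific
# DRL fine-tuning sample-efficient.
zero_residual = lambda state: 0.0
for lo, hi in make_curriculum():
    final_err = rollout(zero_residual, mass_scale=random.uniform(lo, hi))
```

The design point: because the baseline already stabilizes every plant in every stage, the DRL agent starts each task from non-catastrophic behaviour and only needs to learn a correction for the newly introduced uncertainty.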

Original Abstract

Robust control of mechanical systems with multiple uncertainties remains a fundamental challenge, particularly when nonlinear dynamics and operating-condition variations are intricately intertwined. While deep reinforcement learning (DRL) combined with domain randomization has shown promise in mitigating the sim-to-real gap, simultaneously handling all sources of uncertainty often leads to sub-optimal policies and poor learning efficiency. This study formulates a new curriculum-based continual learning framework for robust control problems involving nonlinear dynamical systems in which multiple sources of uncertainty are simultaneously superimposed. The key idea is to decompose a complex control problem with multiple uncertainties into a sequence of continual learning tasks, in which strategies for handling each uncertainty are acquired sequentially. The original system is extended into a finite set of plants whose dynamic uncertainties are gradually expanded and diversified as learning progresses. The policy is stably updated across the entire plant sets associated with tasks defined by different uncertainty configurations without catastrophic forgetting. To ensure learning efficiency, we jointly incorporate a model-based controller (MBC), which guarantees a shared baseline performance across the plant sets, into the learning process to accelerate the convergence. This residual learning scheme facilitates task-specific optimization of the DRL agent for each uncertainty, thereby enhancing sample efficiency. As a practical industrial application, this study applies the proposed method to designing an active vibration controller for automotive powertrains. We verified that the resulting controller is robust against structural nonlinearities and dynamic variations, realizing successful sim-to-real transfer.

Tags

continual learning · reinforcement learning · robust control · uncertainty · sim-to-real

arXiv Categories

cs.LG cs.AI eess.SY