Efficient Controller Learning from Human Preferences and Numerical Data Via Multi-Modal Surrogate Models
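AI Summary
Proposes a multi-fidelity, multi-modal Bayesian optimization framework that combines numerical data with human preferences for efficient controller learning.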
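Main Contributions
- Proposes a multi-modal Bayesian optimization framework
- Uses Gaussian process surrogate models to integrate data of different fidelities
- Demonstrates the framework's effectiveness on trajectory planning for an autonomous vehicle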
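Methodology
Uses Gaussian process surrogate models with both hierarchical (autoregressive) and non-hierarchical (coregionalization-based) structures to learn efficiently from mixed-modality data and drive Bayesian optimization.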
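To make the two surrogate structures concrete, the following is a minimal sketch of the prior covariance each one induces over a low-fidelity and a high-fidelity function. It assumes the textbook forms: a first-order autoregressive model f_high(x) = rho * f_low(x) + delta(x), and an intrinsic coregionalization model cov(f_i(x), f_j(x')) = B[i, j] * k(x, x'). The function names, the squared-exponential kernel, and all hyperparameter values are illustrative assumptions, not taken from the paper, and the sketch covers only the joint prior, not the preference likelihood or the inference step.

```python
import numpy as np

def rbf(xa, xb, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = xa[:, None] - xb[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def ar1_cov(x_low, x_high, rho=0.8, ls_low=1.0, ls_delta=0.5):
    """Joint prior covariance of [f_low(x_low), f_high(x_high)] under the
    assumed hierarchy f_high = rho * f_low + delta, with f_low and delta
    independent zero-mean GPs."""
    k_ll = rbf(x_low, x_low, ls_low)
    k_lh = rho * rbf(x_low, x_high, ls_low)
    k_hh = rho**2 * rbf(x_high, x_high, ls_low) + rbf(x_high, x_high, ls_delta)
    return np.block([[k_ll, k_lh], [k_lh.T, k_hh]])

def icm_cov(x_low, x_high, B, lengthscale=1.0):
    """Joint prior covariance under an assumed intrinsic coregionalization
    model: cov(f_i(x), f_j(x')) = B[i, j] * k(x, x')."""
    x = np.concatenate([x_low, x_high])
    task = np.concatenate([np.zeros(len(x_low), dtype=int),
                           np.ones(len(x_high), dtype=int)])
    return B[task[:, None], task[None, :]] * rbf(x, x, lengthscale)

# Hypothetical inputs: five low-fidelity points, two high-fidelity points.
x_low = np.linspace(0.0, 1.0, 5)
x_high = np.array([0.2, 0.7])
B = np.array([[1.0, 0.6], [0.6, 1.0]])  # illustrative task covariance

K_ar = ar1_cov(x_low, x_high)
K_icm = icm_cov(x_low, x_high, B)
```

In the autoregressive form the high-fidelity function inherits structure from the low-fidelity one through rho, so the ordering of fidelities is built in. The coregionalization form treats both outputs symmetrically and lets B encode how strongly they are correlated.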
Original Abstract
Tuning control policies manually to meet high-level objectives is often time-consuming. Bayesian optimization provides a data-efficient framework for automating this process using numerical evaluations of an objective function. However, many systems, particularly those involving humans, require optimization based on subjective criteria. Preferential Bayesian optimization addresses this by learning from pairwise comparisons instead of quantitative measurements, but relying solely on preference data can be inefficient. We propose a multi-fidelity, multi-modal Bayesian optimization framework that integrates low-fidelity numerical data with high-fidelity human preferences. Our approach employs Gaussian process surrogate models with both hierarchical, autoregressive and non-hierarchical, coregionalization-based structures, enabling efficient learning from mixed-modality data. We illustrate the framework by tuning an autonomous vehicle's trajectory planner, showing that combining numerical and preference data significantly reduces the need for experiments involving the human decision maker while effectively adapting driving style to individual preferences.
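As an illustration of how pairwise comparisons can inform a latent objective, the sketch below uses the probit preference likelihood that is common in preferential Bayesian optimization: P(a preferred over b) = Phi((f(a) - f(b)) / (sqrt(2) * sigma)). This is an assumed, standard model and may differ from the likelihood used in the paper. The function names, the convention that a higher latent value is preferred, and the example values are hypothetical.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def preference_prob(f_a, f_b, noise_std=1.0):
    """Probability that option a is preferred over option b, assuming each
    latent value is perturbed by independent Gaussian noise of scale
    noise_std and the higher noisy value wins."""
    return std_normal_cdf((f_a - f_b) / (math.sqrt(2.0) * noise_std))

def log_likelihood(f, comparisons, noise_std=1.0):
    """Log-likelihood of observed comparisons given latent values f.
    Each comparison is a (winner_index, loser_index) pair."""
    return sum(math.log(preference_prob(f[w], f[l], noise_std))
               for w, l in comparisons)

# Hypothetical latent values for three controller settings and two
# comparisons: setting 2 beat setting 0, and setting 2 beat setting 1.
latent = [0.1, 0.5, 1.2]
duels = [(2, 0), (2, 1)]
ll = log_likelihood(latent, duels)
```

Each comparison carries less information than a numerical evaluation, which is consistent with the abstract's point that relying on preference data alone can be inefficient.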