AI Agents relevance: 8/10

What Matters for Scalable and Robust Learning in End-to-End Driving Planners?

David Holtz, Niklas Hanselmann, Simon Doll, Marius Cordts, Bernt Schiele
arXiv: 2603.15185v1 Published: 2026-03-16 Updated: 2026-03-16

AI Summary

The paper revisits end-to-end driving architectures and proposes BevAD, a high-performance and highly scalable architecture.

Key Contributions

  • Systematically analyzes the key architectural patterns that affect the closed-loop performance of end-to-end driving.
  • Reveals unexpected limitations of these patterns as well as underexplored synergies among them.
  • Proposes BevAD, a lightweight and highly scalable end-to-end driving architecture.

Methodology

Experimentally re-evaluates the impact of common architectural patterns on closed-loop performance, then designs a new architecture based on the findings of this analysis.

Original Abstract

End-to-end autonomous driving has gained significant attention for its potential to learn robust behavior in interactive scenarios and scale with data. Popular architectures often build on separate modules for perception and planning connected through latent representations, such as bird's eye view feature grids, to maintain end-to-end differentiability. This paradigm emerged mostly on open-loop datasets, with evaluation focusing not only on driving performance, but also on intermediate perception tasks. Unfortunately, architectural advances that excel in open-loop settings often fail to translate to scalable learning of robust closed-loop driving. In this paper, we systematically re-examine the impact of common architectural patterns on closed-loop performance: (1) high-resolution perceptual representations, (2) disentangled trajectory representations, and (3) generative planning. Crucially, our analysis evaluates the combined impact of these patterns, revealing both unexpected limitations as well as underexplored synergies. Building on these insights, we introduce BevAD, a novel lightweight and highly scalable end-to-end driving architecture. BevAD achieves a 72.7% success rate on the Bench2Drive benchmark and demonstrates strong data-scaling behavior using pure imitation learning. Our code and models are publicly available here: https://dmholtz.github.io/bevad/

Tags

autonomous driving · end-to-end learning · imitation learning · architecture design

arXiv Categories

cs.RO · cs.AI · cs.CV