Multimodal Learning (Relevance: 9/10)

ALOOD: Exploiting Language Representations for LiDAR-based Out-of-Distribution Object Detection

Michael Kösel, Marcel Schreiber, Michael Ulrich, Claudius Gläser, Klaus Dietmayer
arXiv: 2603.08180v1 Published: 2026-03-09 Updated: 2026-03-09

AI Summary

Proposes ALOOD, a method that leverages language representations for LiDAR-based OOD object detection, improving the safety of autonomous driving systems.

Key Contributions

  • Proposes ALOOD, a LiDAR OOD object detection method based on language representations
  • Recasts OOD detection as a zero-shot classification task
  • Achieves competitive performance on the nuScenes OOD benchmark

Methodology

Aligns the features produced by the LiDAR object detector with the feature space of a vision-language model (VLM), which makes it possible to treat OOD object detection as a zero-shot classification problem against text embeddings of the known classes.
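A minimal sketch of the zero-shot scoring step this implies: once an object's LiDAR feature has been aligned to the VLM space, its cosine similarity to the text embeddings of the known classes can serve as an OOD score (low maximum similarity suggests the object belongs to none of the known categories). This is an illustrative assumption about the scoring rule, not the paper's actual implementation; the function names and toy embeddings below are hypothetical.

```python
import numpy as np

def _normalize(x: np.ndarray) -> np.ndarray:
    """L2-normalize along the last axis so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def ood_scores(object_feats: np.ndarray, known_text_embs: np.ndarray) -> np.ndarray:
    """Zero-shot OOD score per detected object (hypothetical scoring rule).

    object_feats:    (N, D) LiDAR object features, assumed already aligned to the VLM space.
    known_text_embs: (K, D) VLM text embeddings of the K known class names.
    Returns:         (N,) scores in [0, 2]; higher = less similar to any known class.
    """
    sims = _normalize(object_feats) @ _normalize(known_text_embs).T  # (N, K) cosine sims
    return 1.0 - sims.max(axis=-1)  # distance to the best-matching known class
```

With this rule, an object whose aligned feature points in the same direction as some known-class text embedding scores near 0, while one orthogonal to all of them scores near 1; thresholding the score then flags OOD objects.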

Original Abstract

LiDAR-based 3D object detection plays a critical role for reliable and safe autonomous driving systems. However, existing detectors often produce overly confident predictions for objects not belonging to known categories, posing significant safety risks. This is caused by so-called out-of-distribution (OOD) objects, which were not part of the training data, resulting in incorrect predictions. To address this challenge, we propose ALOOD (Aligned LiDAR representations for Out-Of-Distribution Detection), a novel approach that incorporates language representations from a vision-language model (VLM). By aligning the object features from the object detector to the feature space of the VLM, we can treat the detection of OOD objects as a zero-shot classification task. We demonstrate competitive performance on the nuScenes OOD benchmark, establishing a novel approach to OOD object detection in LiDAR using language representations. The source code is available at https://github.com/uulm-mrm/mmood3d.

Tags

LiDAR OOD Detection Vision-Language Model Zero-Shot Learning Autonomous Driving

arXiv Categories

cs.CV cs.LG