LLM Reasoning relevance: 7/10

Talk2DM: Enabling Natural Language Querying and Commonsense Reasoning for Vehicle-Road-Cloud Integrated Dynamic Maps with Large Language Models

Lu Tao, Jinxuan Luo, Yousuke Watanabe, Zhengshu Zhou, Yuhuan Lu, Shen Ying, Pan Zhang, Fei Zhao, Hiroaki Takada
arXiv: 2602.11860v1 Published: 2026-02-12 Updated: 2026-02-12

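AI Summary

Proposes Talk2DM, an LLM-based module that adds natural-language querying and commonsense reasoning to vehicle-road-cloud (VRC) integrated dynamic maps (DM).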

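Main Contributions

  • Builds VRCsim, a simulation framework for VRC cooperative perception.
  • Creates VRC-QA, a question-answering dataset focused on spatial querying and reasoning in mixed-traffic scenes.
  • Proposes the Talk2DM module, which uses a chain-of-prompt (CoP) mechanism to give DM systems a natural-language interface.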

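Methodology

Generates data with VRCsim, builds the VRC-QA dataset from it, and constructs Talk2DM with a CoP mechanism that combines human-defined rules with the commonsense knowledge of LLMs.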
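The summary does not spell out how the CoP stages are organized, so the following is only a minimal sketch of the general idea of chaining prompts, where each stage's output feeds the next and human-defined rules are injected up front. The three stages, the function name `chain_of_prompts`, the scene record fields (`id`, `type`, `x`, `y`), and the `llm` callable are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for any LLM backend: takes a prompt, returns text.
LLM = Callable[[str], str]


def chain_of_prompts(question: str, scene: List[Dict], rules: List[str], llm: LLM) -> str:
    """Run three illustrative prompt stages, feeding each output into the next.

    This staging is an assumption made for illustration, not the paper's design.
    """
    # Stage 1: restate the user question under the human-defined rules.
    rule_text = "\n".join(f"- {r}" for r in rules)
    intent = llm(
        f"Rules:\n{rule_text}\nRestate the question as a spatial query: {question}"
    )

    # Stage 2: ask the model which scene objects matter for that query.
    scene_text = "\n".join(
        f"{o['id']}: {o['type']} at ({o['x']}, {o['y']})" for o in scene
    )
    selected = llm(
        f"Query: {intent}\nObjects:\n{scene_text}\nList the relevant object ids."
    )

    # Stage 3: produce the final answer from the earlier stage outputs.
    return llm(f"Query: {intent}\nRelevant objects: {selected}\nAnswer concisely.")
```

Because the backend is passed in as a plain callable, the same chain can be pointed at different models without changing the stages, which loosely mirrors the plug-and-play claim in the abstract.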

Original Abstract

Dynamic maps (DM) serve as the fundamental information infrastructure for vehicle-road-cloud (VRC) cooperative autonomous driving in China and Japan. By providing comprehensive traffic scene representations, DM overcome the limitations of standalone autonomous driving systems (ADS), such as physical occlusions. Although DM-enhanced ADS have been successfully deployed in real-world applications in Japan, existing DM systems still lack a natural-language-supported (NLS) human interface, which could substantially enhance human-DM interaction. To address this gap, this paper introduces VRCsim, a VRC cooperative perception (CP) simulation framework designed to generate streaming VRC-CP data. Based on VRCsim, we construct a question-answering data set, VRC-QA, focused on spatial querying and reasoning in mixed-traffic scenes. Building upon VRCsim and VRC-QA, we further propose Talk2DM, a plug-and-play module that extends VRC-DM systems with NLS querying and commonsense reasoning capabilities. Talk2DM is built upon a novel chain-of-prompt (CoP) mechanism that progressively integrates human-defined rules with the commonsense knowledge of large language models (LLMs). Experiments on VRC-QA show that Talk2DM can seamlessly switch across different LLMs while maintaining high NLS query accuracy, demonstrating strong generalization capability. Although larger models tend to achieve higher accuracy, they incur significant efficiency degradation. Our results reveal that Talk2DM, powered by Qwen3:8B, Gemma3:27B, and GPT-oss models, achieves over 93% NLS query accuracy with an average response time of only 2-5 seconds, indicating strong practical potential.

Tags

Large Language Models, Autonomous Driving, Natural Language Processing, Dynamic Maps

arXiv Categories

cs.AI