Multimodal Learning Relevance: 8/10

UAV-CB: A Complex-Background RGB-T Dataset and Local Frequency Bridge Network for UAV Detection

Shenghui Huang, Menghao Hu, Longkun Zou, Hongyu Chi, Zekai Li, Feng Gao, Fan Yang, Qingyao Wu, Ke Chen
arXiv: 2603.17492v1 Published: 2026-03-18 Updated: 2026-03-18

AI Summary

Introduces UAV-CB, an RGB-T dataset for UAV detection against complex backgrounds, and the Local Frequency Bridge Network (LFBNet).

Key Contributions

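  • Constructs UAV-CB, a new RGB-T UAV detection dataset
  • Proposes the Local Frequency Bridge Network (LFBNet) for RGB-T fusion
  • Validates the effectiveness of LFBNet on UAV-CB and public datasets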

Methodology

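Constructs an RGB-T dataset and designs LFBNet to model features in a localized frequency space, achieving frequency-spatial and cross-modality fusion.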
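
To make the idea of a "localized frequency space" concrete, the following is a minimal sketch, not the LFBNet architecture itself, whose details are not given in this summary. It assumes NumPy, single-channel inputs, and non-overlapping square patches; the function names (local_fft, local_ifft, fuse_max_magnitude) and the max-magnitude fusion rule are hypothetical choices made purely for illustration.

```python
import numpy as np

def local_fft(x, patch=8):
    """Split a 2D map into non-overlapping patch x patch blocks and FFT each block."""
    h, w = x.shape
    if h % patch or w % patch:
        raise ValueError("height and width must be multiples of patch")
    # Result is indexed as [block_row, block_col, row_in_block, col_in_block].
    blocks = x.reshape(h // patch, patch, w // patch, patch).transpose(0, 2, 1, 3)
    return np.fft.fft2(blocks, axes=(-2, -1))

def local_ifft(spec):
    """Invert local_fft and stitch the blocks back into one 2D map."""
    blocks = np.fft.ifft2(spec, axes=(-2, -1)).real
    nh, nw, p, _ = blocks.shape
    return blocks.transpose(0, 2, 1, 3).reshape(nh * p, nw * p)

def fuse_max_magnitude(rgb_gray, thermal, patch=8):
    """Toy cross-modal rule (an assumption, not the paper's method):
    per block and frequency bin, keep the coefficient with larger magnitude."""
    s_rgb = local_fft(rgb_gray, patch)
    s_th = local_fft(thermal, patch)
    fused = np.where(np.abs(s_rgb) >= np.abs(s_th), s_rgb, s_th)
    return local_ifft(fused)
```

Because each block is transformed independently, the spectrum stays tied to a spatial location, which is one simple way to read "frequency-spatial" processing. A learned network would replace the hand-written selection rule with trainable operations.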

Original Abstract

Detecting Unmanned Aerial Vehicles (UAVs) in low-altitude environments is essential for perception and defense systems but remains highly challenging due to complex backgrounds, camouflage, and multimodal interference. In real-world scenarios, UAVs are frequently visually blended with surrounding structures such as buildings, vegetation, and power lines, resulting in low contrast, weak boundaries, and strong confusion with cluttered background textures. Existing UAV detection datasets, though diverse, are not specifically designed to capture these camouflage and complex-background challenges, which limits progress toward robust real-world perception. To fill this gap, we construct UAV-CB, a new RGB-T UAV detection dataset deliberately curated to emphasize complex low-altitude backgrounds and camouflage characteristics. Furthermore, we propose the Local Frequency Bridge Network (LFBNet), which models features in localized frequency space to bridge both the frequency-spatial fusion gap and the cross-modality discrepancy gap in RGB-T fusion. Extensive experiments on UAV-CB and public benchmarks demonstrate that LFBNet achieves state-of-the-art detection performance and strong robustness under camouflaged and cluttered conditions, offering a frequency-aware perspective on multimodal UAV perception in real-world applications.

Tags

UAV-detection RGB-T dataset frequency-domain

arXiv Categories

cs.CV