BUSSARD: Normalizing Flows for Bijective Universal Scene-Specific Anomalous Relationship Detection
AI Summary
BUSSARD uses normalizing flows to detect anomalous relationships in scene graphs, outperforming existing methods while being more robust.
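Key Contributions
- Proposes BUSSARD, a normalizing flow-based model for anomalous relationship detection
- Achieves better AUROC on the SARD dataset than the current state of the art
- Shows better robustness and generalization than the baseline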
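Methodology
Objects and relationships in the scene graph are embedded with a language model. A normalizing flow then learns a bijective transformation that maps object-relation-object triplets to a Gaussian distribution, so that anomalies can be detected from their likelihood.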
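The likelihood that this kind of detection relies on comes from the standard change-of-variables identity for normalizing flows. The notation below is an assumption made for illustration and is not taken from the paper: $x$ is the embedded triplet, $f_\theta$ the learned bijection, $z = f_\theta(x)$ its image, and $d$ the embedding dimension.

```latex
\log p_X(x) = \log p_Z\bigl(f_\theta(x)\bigr)
            + \log \left| \det \frac{\partial f_\theta(x)}{\partial x} \right|,
\qquad
\log p_Z(z) = -\tfrac{1}{2}\lVert z \rVert^2 - \tfrac{d}{2}\log(2\pi)
```

A triplet with a low $\log p_X(x)$ (equivalently, a high negative log-likelihood) would be flagged as anomalous.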
Original Abstract
We propose Bijective Universal Scene-Specific Anomalous Relationship Detection (BUSSARD), a normalizing flow-based model for detecting anomalous relations in scene graphs, generated from images. Our work follows a multimodal approach, embedding object and relationship tokens from scene graphs with a language model to leverage semantic knowledge from the real world. A normalizing flow model is used to learn bijective transformations that map object-relation-object triplets from scene graphs to a simple base distribution (typically Gaussian), allowing anomaly detection through likelihood estimation. We evaluate our approach on the SARD dataset containing office and dining room scenes. Our method achieves around 10% better AUROC results compared to the current state-of-the-art model, while simultaneously being five times faster. Through ablation studies, we demonstrate superior robustness and universality, particularly regarding the use of synonyms, with our model maintaining stable performance while the baseline shows 17.5% deviation. This work demonstrates the strong potential of learning-based methods for relationship anomaly detection in scene graphs. Our code is available at https://github.com/mschween/BUSSARD .
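The likelihood-based scoring described in the abstract can be illustrated with a deliberately simplified sketch. It is not the BUSSARD architecture: it swaps the learned flow for a single affine bijection fitted in closed form (a Cholesky whitening), and it replaces language-model embeddings of triplets with random stand-in vectors. All function names, the embedding size `dim`, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def fit_affine_flow(x, eps=1e-6):
    """Fit the bijection z = L^{-1}(x - mu) so training data maps to ~N(0, I)."""
    mu = x.mean(axis=0)
    cov = np.cov(x, rowvar=False) + eps * np.eye(x.shape[1])
    chol = np.linalg.cholesky(cov)  # lower-triangular L with cov = L @ L.T
    return mu, chol


def forward(x, mu, chol):
    """Map data to the base space; also return log|det dz/dx|."""
    z = np.linalg.solve(chol, (x - mu).T).T
    log_det = -np.sum(np.log(np.diag(chol)))
    return z, log_det


def inverse(z, mu, chol):
    """Map base-space points back to data space (x = L z + mu)."""
    return z @ chol.T + mu


def log_likelihood(x, mu, chol):
    """Change of variables: log p(x) = log N(z; 0, I) + log|det dz/dx|."""
    z, log_det = forward(x, mu, chol)
    d = x.shape[1]
    log_base = -0.5 * np.sum(z**2, axis=1) - 0.5 * d * np.log(2 * np.pi)
    return log_base + log_det


def anomaly_score(x, mu, chol):
    """Higher score means less likely under the fitted flow."""
    return -log_likelihood(x, mu, chol)


def auroc(scores_neg, scores_pos):
    """Fraction of (anomalous, normal) pairs ranked correctly (ties ignored)."""
    return float((scores_pos[:, None] > scores_neg[None, :]).mean())


rng = np.random.default_rng(0)
dim = 8  # hypothetical size of each token embedding

# Stand-ins for concatenated (object, relation, object) embeddings.
normal_train = rng.normal(0.0, 1.0, size=(500, 3 * dim))
normal_test = rng.normal(0.0, 1.0, size=(100, 3 * dim))
anomalous_test = rng.normal(3.0, 1.0, size=(100, 3 * dim))  # shifted on purpose

mu, chol = fit_affine_flow(normal_train)
scores_normal = anomaly_score(normal_test, mu, chol)
scores_anomalous = anomaly_score(anomalous_test, mu, chol)

print("mean score (normal):   ", scores_normal.mean())
print("mean score (anomalous):", scores_anomalous.mean())
print("AUROC on synthetic data:", auroc(scores_normal, scores_anomalous))
```

The flow is fitted on normal samples only, and the shifted test samples should receive higher negative log-likelihoods. A real flow would stack several learned nonlinear bijections, but the scoring step would keep the same form.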