Multimodal Learning 相关度: 9/10

Reason-IAD: Knowledge-Guided Dynamic Latent Reasoning for Explainable Industrial Anomaly Detection

Peng Chen, Chao Huang, Yunkang Cao, Chengliang Liu, Wenqiang Wang, Mingbo Yang, Li Shen, Wenqi Ren, Xiaochun Cao

arXiv: 2602.09850v1 发布: 2026-02-10 更新: 2026-02-10

下载 PDF arXiv 页面

AI 摘要

Reason-IAD通过知识引导和动态推理提升工业异常检测的准确性和可解释性。

主要贡献

提出了一个知识引导的检索增强模块，融入领域知识。
设计了一个基于熵的潜在推理机制，鼓励稳定预测。
实现了动态视觉注入策略，关注异常检测关键区域。

方法论

构建包含检索增强知识模块、熵驱动潜在推理机制和动态视觉注入策略的框架，进行工业异常检测。

原文摘要

Industrial anomaly detection demands precise reasoning over fine-grained defect patterns. However, existing multimodal large language models (MLLMs), pretrained on general-domain data, often struggle to capture category-specific anomalies, thereby limiting both detection accuracy and interpretability. To address these limitations, we propose Reason-IAD, a knowledge-guided dynamic latent reasoning framework for explainable industrial anomaly detection. Reason-IAD comprises two core components. First, a retrieval-augmented knowledge module incorporates category-specific textual descriptions into the model input, enabling context-aware reasoning over domain-specific defects. Second, an entropy-driven latent reasoning mechanism conducts iterative exploration within a compact latent space using optimizable latent think tokens, guided by an entropy-based reward that encourages confident and stable predictions. Furthermore, a dynamic visual injection strategy selectively incorporates the most informative image patches into the latent sequence, directing the reasoning process toward regions critical for anomaly detection. Extensive experimental results demonstrate that Reason-IAD consistently outperforms state-of-the-art methods. The code will be publicly available at https://github.com/chenpeng052/Reason-IAD.

arXiv 分类

cs.CV

AI 摘要

主要贡献

方法论

原文摘要

标签

arXiv 分类