Multimodal Learning Relevance: 7/10

Zero-shot System for Automatic Body Region Detection for Volumetric CT and MR Images

Farnaz Khun Jush, Grit Werner, Mark Klemens, Matthias Lenga

arXiv: 2602.08717v1 Published: 2026-02-09 Updated: 2026-02-09

AI Summary

Proposes zero-shot methods, built on pre-trained models, for automatic body region detection in CT and MR images.

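Key Contributions

Proposes three zero-shot pipelines for body region detection.
Evaluates a segmentation-driven rule-based system, an MLLM, and a segmentation-aware MLLM.
Shows experimentally that the segmentation-driven rule-based system performs best, with weighted F1-scores of 0.947 (CT) and 0.914 (MR).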

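Methodology

Uses pre-trained segmentation models and MLLMs, combined with rules or anatomical knowledge, to detect body regions without any training.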
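
To make the segmentation-driven idea concrete, here is a minimal sketch of how organ segmentation output could be mapped to body regions with simple rules. It is an illustration under stated assumptions, not the authors' rule set: the organ names, the `ORGAN_TO_REGION` table, the `detect_body_regions` helper, and the `min_voxels` threshold are all hypothetical, and the input is assumed to be per-organ voxel counts already produced by some pre-trained multi-organ segmentation model.

```python
from collections import defaultdict

# Hypothetical organ-to-region lookup. The paper's actual rules are not
# reproduced here; these entries only illustrate the general idea.
ORGAN_TO_REGION = {
    "brain": "head",
    "lung": "chest",
    "heart": "chest",
    "liver": "abdomen",
    "kidney": "abdomen",
    "urinary_bladder": "pelvis",
}


def detect_body_regions(organ_voxels, min_voxels=1000):
    """Return the sorted regions whose mapped organs total at least min_voxels.

    organ_voxels: dict mapping an organ name to its segmented voxel count.
    min_voxels: assumed threshold that filters out spurious tiny segments.
    """
    totals = defaultdict(int)
    for organ, count in organ_voxels.items():
        region = ORGAN_TO_REGION.get(organ)
        if region is not None:  # ignore organs the lookup does not know
            totals[region] += count
    return sorted(r for r, n in totals.items() if n >= min_voxels)


# Example: a scan dominated by thoracic organs, with a sliver of liver.
example = {"lung": 250_000, "heart": 40_000, "liver": 900, "unknown": 5_000}
regions = detect_body_regions(example)
```

In this example only the chest passes the threshold, because the liver segment is too small to count as abdominal coverage. A real system would likely need richer rules, for instance to handle partially covered regions.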

Original Abstract

Reliable identification of anatomical body regions is a prerequisite for many automated medical imaging workflows, yet existing solutions remain heavily dependent on unreliable DICOM metadata. Current solutions mainly use supervised learning, which limits their applicability in many real-world scenarios. In this work, we investigate whether body region detection in volumetric CT and MR images can be achieved in a fully zero-shot manner by using knowledge embedded in large pre-trained foundation models. We propose and systematically evaluate three training-free pipelines: (1) a segmentation-driven rule-based system leveraging pre-trained multi-organ segmentation models, (2) a Multimodal Large Language Model (MLLM) guided by radiologist-defined rules, and (3) a segmentation-aware MLLM that combines visual input with explicit anatomical evidence. All methods are evaluated on 887 heterogeneous CT and MR scans with manually verified anatomical region labels. The segmentation-driven rule-based approach achieves the strongest and most consistent performance, with weighted F1-scores of 0.947 (CT) and 0.914 (MR), demonstrating robustness across modalities and atypical scan coverage. The MLLM performs competitively in visually distinctive regions, while the segmentation-aware MLLM reveals fundamental limitations.
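
For readers unfamiliar with the reported metric, the sketch below shows how a support-weighted F1-score is commonly computed: the per-class F1 is averaged with weights proportional to each class's number of true samples. It assumes a single-label setting and toy labels; the abstract does not state whether the paper's evaluation is single-label or multi-label, and the `weighted_f1` helper is a hypothetical name used only for illustration.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 (single-label case)."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight by class support
    return score


# Toy labels, not data from the paper.
y_true = ["chest", "chest", "chest", "abdomen"]
y_pred = ["chest", "chest", "abdomen", "abdomen"]
score = weighted_f1(y_true, y_pred)
```

Here the chest class has F1 = 0.8 with weight 3/4 and the abdomen class has F1 = 2/3 with weight 1/4, so the weighted score is about 0.767.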

arXiv Categories

cs.CV cs.AI
