EI: Early Intervention for Multimodal Imaging based Disease Recognition
AI Summary
Proposes an Early Intervention framework for disease recognition from multimodal medical images, addressing the problems of information fusion and data scarcity.
Main Contributions
- Proposes the Early Intervention (EI) framework, which uses reference modalities to guide the embedding of the target modality
- Proposes Mixture of Low-varied-Ranks Adaptation (MoR), a parameter-efficient method for fine-tuning vision foundation models
- Validates the method's effectiveness on multiple public medical imaging datasets
Methodology
High-level semantic information from the reference modality intervenes early in the target modality's embedding process, and a low-rank adaptation scheme is used to fine-tune the vision foundation model.
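The intervention idea can be illustrated with a minimal numpy sketch: high-level tokens from the reference modality are injected into the target token sequence before an early encoder layer, so target tokens attend to them during embedding. All names, dimensions, and the single-head attention layer here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, Wq, Wk, Wv):
    # single-head scaled dot-product attention
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return scores @ v

d = 16
target_tokens = rng.normal(size=(8, d))  # patch tokens of the target modality
ref_tokens = rng.normal(size=(2, d))     # hypothetical high-level semantic tokens from the reference modality

Wq, Wk, Wv = [rng.normal(size=(d, d)) * 0.1 for _ in range(3)]

# early intervention: prepend reference tokens before an early encoder layer,
# letting target tokens attend to reference semantics while being embedded
mixed = np.concatenate([ref_tokens, target_tokens], axis=0)
out = self_attention(mixed, Wq, Wk, Wv)
target_out = out[ref_tokens.shape[0]:]   # keep only target positions for the classifier head
print(target_out.shape)                  # (8, 16)
```

The key contrast with "fusion after unimodal embedding" is that cross-modal information enters during embedding rather than after each modality has been encoded independently.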
Original Abstract
Current methods for multimodal medical imaging based disease recognition face two major challenges. First, the prevailing "fusion after unimodal image embedding" paradigm cannot fully leverage the complementary and correlated information in the multimodal data. Second, the scarcity of labeled multimodal medical images, coupled with their significant domain shift from natural images, hinders the use of cutting-edge Vision Foundation Models (VFMs) for medical image embedding. To jointly address the challenges, we propose a novel Early Intervention (EI) framework. Treating one modality as target and the rest as reference, EI harnesses high-level semantic tokens from the reference as intervention tokens to steer the target modality's embedding process at an early stage. Furthermore, we introduce Mixture of Low-varied-Ranks Adaptation (MoR), a parameter-efficient fine-tuning method that employs a set of low-rank adapters with varied ranks and a weight-relaxed router for VFM adaptation. Extensive experiments on three public datasets for retinal disease, skin lesion, and knee anomaly classification verify the effectiveness of the proposed method against a number of competitive baselines.
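The MoR idea described in the abstract can be sketched as a frozen linear layer plus several LoRA-style adapters of different ranks, combined by a soft router. This is a reading of the abstract, not the paper's code: the rank values, the zero-initialization of B, and the interpretation of "weight-relaxed" as soft (non-top-k) gates are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

d = 16
ranks = [2, 4, 8]                   # adapters with varied ranks (illustrative values)

W0 = rng.normal(size=(d, d)) * 0.1  # frozen pretrained VFM weight
# each adapter is a low-rank pair (A, B); B is zero-initialized as in LoRA
adapters = [(rng.normal(size=(d, r)) * 0.1, np.zeros((r, d))) for r in ranks]
W_router = rng.normal(size=(d, len(ranks))) * 0.1

def mor_linear(x):
    # soft per-token routing over the adapters ("weight-relaxed" read as
    # soft gates rather than hard top-k selection)
    gates = softmax(x @ W_router)
    out = x @ W0
    for i, (A, B) in enumerate(adapters):
        out = out + gates[:, i:i + 1] * (x @ A @ B)
    return out

x = rng.normal(size=(4, d))
y = mor_linear(x)
# with B zero-initialized, the adapted layer matches the frozen layer at init
assert np.allclose(y, x @ W0)
```

Only the adapter pairs and router would be trained; the pretrained weight `W0` stays frozen, which is what makes the scheme parameter-efficient.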