HyPCA-Net: Advancing Multimodal Fusion in Medical Image Analysis
AI Summary
HyPCA-Net proposes a hybrid parallel-fusion cascaded attention network that improves both the performance and the efficiency of multimodal medical image analysis.
Key Contributions
- Proposes a computationally efficient residual adaptive learning attention block for capturing refined modality-specific representations.
- Proposes a dual-view cascaded attention block for learning robust shared representations across modalities.
- Experiments on ten public datasets show that HyPCA-Net significantly outperforms existing methods, with performance gains of up to 5.2% and computational-cost reductions of up to 73.1%.
Methodology
HyPCA-Net combines residual adaptive learning attention with dual-view cascaded attention, handling modality-specific information and cross-modal shared information, respectively.
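The summary does not specify the internal design of the two blocks, but the overall flow can be sketched: each modality passes through its own residual adaptive attention branch in parallel, while a cascaded cross-attention path builds a shared representation, and the outputs are fused. The sketch below is a minimal illustration under assumed simplifications (a squeeze-style channel gate for the residual block, two chained scaled dot-product cross-attentions for the dual-view cascade); the function and parameter names (`residual_channel_attention`, `hypca_fusion`, `w_a`, `w_b`) are hypothetical and not from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def residual_channel_attention(x, w):
    # Assumed stand-in for the residual adaptive learning attention block:
    # a learned per-channel gate applied with a residual connection.
    gate = softmax(x.mean(axis=0) @ w)   # per-channel attention weights
    return x + x * gate                  # residual adaptive reweighting

def cross_attention(q_feat, kv_feat):
    # Scaled dot-product attention from one modality's view into the other's.
    d = q_feat.shape[-1]
    attn = softmax(q_feat @ kv_feat.T / np.sqrt(d))
    return attn @ kv_feat

def hypca_fusion(x_a, x_b, w_a, w_b):
    # Parallel branch: modality-specific refinement, one branch per modality.
    spec_a = residual_channel_attention(x_a, w_a)
    spec_b = residual_channel_attention(x_b, w_b)
    # Cascaded dual-view branch: view A attends to B, then B attends to the result.
    shared = cross_attention(x_a, x_b)
    shared = cross_attention(x_b, shared)
    # Fuse modality-specific and shared representations.
    return np.concatenate([spec_a, spec_b, shared], axis=-1)
```

With two modalities of shape `(tokens, channels) = (4, 8)` and gate weights of shape `(8, 8)`, the fused output has shape `(4, 24)`.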
Original Abstract
Multimodal fusion frameworks, which integrate diverse medical imaging modalities (e.g., MRI, CT), have shown great potential in applications such as skin cancer detection, dementia diagnosis, and brain tumor prediction. However, existing multimodal fusion methods face significant challenges. First, they often rely on computationally expensive models, limiting their applicability in low-resource environments. Second, they often employ cascaded attention modules, which potentially increase the risk of information loss during inter-module transitions and hinder their capacity to effectively capture robust shared representations across modalities. This restricts their generalization in multi-disease analysis tasks. To address these limitations, we propose a Hybrid Parallel-Fusion Cascaded Attention Network (HyPCA-Net), composed of two novel core blocks: (a) a computationally efficient residual adaptive learning attention block for capturing refined modality-specific representations, and (b) a dual-view cascaded attention block aimed at learning robust shared representations across diverse modalities. Extensive experiments on ten publicly available datasets show that HyPCA-Net significantly outperforms existing leading methods, with improvements of up to 5.2% in performance and reductions of up to 73.1% in computational cost. Code: https://github.com/misti1203/HyPCA-Net.