Multimodal Learning (relevance: 8/10)

Multimodal Connectome Fusion via Cross-Attention for Autism Spectrum Disorder Classification Using Graph Learning

Ansar Rahman, Hassan Shojaee-Mend, Sepideh Hatamikia
arXiv: 2603.15168v1 Published: 2026-03-16 Updated: 2026-03-16

AI Summary

The paper proposes a multimodal fusion framework based on graph learning and cross-attention for classifying autism spectrum disorder (ASD).

Key Contributions

  • A graph-learning-based multimodal fusion framework
  • An asymmetric transformer cross-attention mechanism
  • Experimental validation on the ABIDE-I dataset

Methodology

A population graph is built with subjects as nodes; functional and structural features are extracted as modality-specific node attributes; Edge Variational GCNs learn subject-level embeddings; cross-attention then fuses the multimodal information for ASD classification.
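The digest does not spell out how the pairwise association encoder (PAE) turns phenotypic information into inter-subject edge weights. A minimal sketch of one common choice, a Gaussian kernel over phenotypic distance, is shown below; the function name, the `sigma` bandwidth, and the kernel itself are illustrative assumptions, not the paper's actual PAE.

```python
import numpy as np

def phenotypic_edge_weights(pheno: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Hypothetical inter-subject edge weights from phenotypic vectors.

    pheno: (n_subjects, n_features) array of phenotypic attributes
           (e.g. encoded site, sex, age), one row per subject.
    Returns a symmetric (n_subjects, n_subjects) weight matrix where
    similar subjects get edge weights close to 1 and the diagonal is 0.
    """
    # Squared Euclidean distance between every pair of subjects
    diff = pheno[:, None, :] - pheno[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    # Gaussian kernel: identical phenotypes -> weight 1, distant -> weight ~0
    weights = np.exp(-dist2 / (2.0 * sigma ** 2))
    # No self-loops in the population graph
    np.fill_diagonal(weights, 0.0)
    return weights
```

In a population-graph setup like this, the resulting matrix would serve as the (weighted) adjacency over which the GCNs propagate subject features.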

Original Abstract

Autism spectrum disorder (ASD) is a complex neurodevelopmental condition characterized by atypical functional brain connectivity and subtle structural alterations. Resting-state fMRI (rs-fMRI) has been widely used to identify disruptions in large-scale brain networks, while structural MRI provides complementary information about morphological organization. Despite their complementary nature, effectively integrating these heterogeneous imaging modalities within a unified framework remains challenging. This study proposes a multimodal graph learning framework that preserves the dominant role of functional connectivity while integrating structural imaging and phenotypic information for ASD classification. The proposed framework is evaluated on the ABIDE-I dataset. Each subject is represented as a node within a population graph. Functional and structural features are extracted as modality-specific node attributes, while inter-subject relationships are modeled using a pairwise association encoder (PAE) based on phenotypic information. Two Edge Variational GCNs are trained to learn subject-level embeddings. To enable effective multimodal integration, we introduce a novel asymmetric transformer-based cross-attention mechanism that allows functional embeddings to selectively incorporate complementary structural information while preserving functional dominance. The fused embeddings are then passed to an MLP for ASD classification. Using stratified 10-fold cross-validation, the framework achieved an AUC of 87.3% and an accuracy of 84.4%. Under leave-one-site-out cross-validation (LOSO-CV), the model achieved an average cross-site accuracy of 82.0%, outperforming existing methods by approximately 3% under 10-fold cross-validation and 7% under LOSO-CV. The proposed framework effectively integrates heterogeneous multimodal data from the multi-site ABIDE-I dataset, improving automated ASD classification across imaging sites.
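The asymmetric cross-attention described in the abstract, where functional embeddings act as queries over structural keys and values and a residual path preserves functional dominance, can be sketched roughly as follows. The single-head form, the projection matrices `Wq`/`Wk`/`Wv`, and the additive residual are assumptions for illustration; the paper's transformer block will differ in detail.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def asymmetric_cross_attention(func_emb: np.ndarray,
                               struct_emb: np.ndarray,
                               Wq: np.ndarray,
                               Wk: np.ndarray,
                               Wv: np.ndarray) -> np.ndarray:
    """One-directional cross-attention: functional queries, structural keys/values.

    func_emb:   (n_subjects, d) functional subject embeddings
    struct_emb: (n_subjects, d) structural subject embeddings
    Wq, Wk, Wv: (d, d) projection matrices (hypothetical, learned in practice)
    """
    Q = func_emb @ Wq      # queries come only from the functional modality
    K = struct_emb @ Wk    # keys and values come only from the structural modality
    V = struct_emb @ Wv
    d_k = Q.shape[-1]
    # Scaled dot-product attention scores between functional and structural embeddings
    attn = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)
    # Residual connection: structural information is added on top of the
    # functional embedding, so the functional signal remains dominant
    return func_emb + attn @ V
```

Because the structural modality never queries the functional one, the flow of information is asymmetric by construction; the fused output would then feed the MLP classifier.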

Tags

Autism spectrum disorder · Multimodal fusion · Graph learning · Cross-attention · Brain networks

arXiv Categories

cs.CV cs.AI