Multimodal Learning Relevance: 9/10

Modality-Guided Mixture of Graph Experts with Entropy-Triggered Routing for Multimodal Recommendation

Ji Dai, Quan Fang, Dengsheng Cai
arXiv: 2602.20723v1 Published: 2026-02-24 Updated: 2026-02-24

AI Summary

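Proposes MAGNET, a model that improves multimodal recommendation through a modality-guided mixture of graph experts and entropy-triggered routing.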

Key Contributions

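  • Proposes MAGNET, a modality-guided mixture of graph experts network
  • Introduces interaction-conditioned expert routing and structure-aware graph augmentation
  • Designs a two-stage entropy-weighting mechanism to stabilize routing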
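
The structure-aware graph augmentation in the second contribution is described in the abstract as adding content-induced edges to the interaction graph. A minimal sketch of one way to build such edges is shown here. It assumes k-nearest-neighbor linking over cosine similarity of item content features; the function names, the choice of kNN, and the toy data are all illustrative assumptions, not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def content_knn_edges(item_features, k):
    """Link each item to its k most content-similar items.

    Returns a set of undirected item-item edges (i, j) with i < j.
    kNN over cosine similarity is an assumption made for illustration.
    """
    n = len(item_features)
    edges = set()
    for i in range(n):
        sims = [(cosine(item_features[i], item_features[j]), j)
                for j in range(n) if j != i]
        # Highest similarity first; ties broken by item index for determinism.
        sims.sort(key=lambda t: (-t[0], t[1]))
        for _, j in sims[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

# Toy data (hypothetical): four items with 2-d content features.
item_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# User-item interaction view; item 3 has no interactions (a long-tail item).
interaction_edges = {("u0", 0), ("u0", 1), ("u1", 2)}

# Content view: item 3 becomes reachable through its similarity to item 2.
content_edges = content_knn_edges(item_features, k=1)
print(sorted(content_edges))
```

In this sketch the two edge sets are kept as separate views rather than merged, which loosely mirrors the parallel encoding the abstract mentions; how the two views are encoded and fused is not specified in this summary.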

Methodology

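Uses graph neural networks to fuse user-item interaction information with item content features, and achieves adaptive modality fusion through expert networks and entropy-triggered routing.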
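
To make the adaptive fusion step concrete, here is a minimal sketch of gated expert mixing over three modality embeddings (behavioral, visual, textual). It rests on several assumptions: each expert is reduced to a fixed mixing profile, the profile values are invented, and the gate logits are passed in directly rather than produced by an interaction-conditioned network. None of these details come from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(vectors, weights):
    """Element-wise weighted sum of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(w * v[d] for v, w in zip(vectors, weights)) for d in range(dim)]

# Hypothetical experts: each is a fixed mixing profile over
# (behavioral, visual, textual) embeddings. The numbers are illustrative only.
EXPERT_PROFILES = {
    "dominant":      [0.8, 0.1, 0.1],
    "balanced":      [1 / 3, 1 / 3, 1 / 3],
    "complementary": [0.1, 0.45, 0.45],
}

def route_and_fuse(modal_embs, gate_logits):
    """Mix expert outputs with softmax gate weights.

    modal_embs:  [behavioral, visual, textual] embedding vectors.
    gate_logits: one score per expert; in a real model these would come
                 from a learned, interaction-conditioned router (assumed here).
    """
    names = list(EXPERT_PROFILES)
    gate = softmax(gate_logits)
    expert_outputs = [weighted_sum(modal_embs, EXPERT_PROFILES[n]) for n in names]
    fused = weighted_sum(expert_outputs, gate)
    return fused, dict(zip(names, gate))

# Toy example: a user whose logits favor the "dominant" expert.
modal_embs = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
fused, gate = route_and_fuse(modal_embs, gate_logits=[2.0, 0.0, 0.0])
print(gate)
print(fused)
```

Returning the gate weights alongside the fused vector is a deliberate choice in this sketch: it lets the per-expert contribution be inspected directly for each user or item.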

Original Abstract

Multimodal recommendation enhances ranking by integrating user-item interactions with item content, which is particularly effective under sparse feedback and long-tail distributions. However, multimodal signals are inherently heterogeneous and can conflict in specific contexts, making effective fusion both crucial and challenging. Existing approaches often rely on shared fusion pathways, leading to entangled representations and modality imbalance. To address these issues, we propose MAGNET, a Modality-Guided Mixture of Adaptive Graph Experts Network with Progressive Entropy-Triggered Routing for Multimodal Recommendation, designed to enhance controllability, stability, and interpretability in multimodal fusion. MAGNET couples interaction-conditioned expert routing with structure-aware graph augmentation, so that both what to fuse and how to fuse are explicitly controlled and interpretable. At the representation level, a dual-view graph learning module augments the interaction graph with content-induced edges, improving coverage for sparse and long-tail items while preserving collaborative structure via parallel encoding and lightweight fusion. At the fusion level, MAGNET employs structured experts with explicit modality roles -- dominant, balanced, and complementary -- enabling a more interpretable and adaptive combination of behavioral, visual, and textual cues. To further stabilize sparse routing and prevent expert collapse, we introduce a two-stage entropy-weighting mechanism that monitors routing entropy. This mechanism automatically transitions training from an early coverage-oriented regime to a later specialization-oriented regime, progressively balancing expert utilization and routing confidence. Extensive experiments on public benchmarks demonstrate consistent improvements over strong baselines.
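
The two-stage entropy-weighting mechanism can be illustrated with a small sketch that monitors routing entropy and switches regimes once. The abstract does not state the trigger rule, so everything below is an assumption: the monitored quantity (normalized entropy of batch-averaged expert usage), the threshold, the patience counter, and the loss weights are hypothetical choices made for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def normalized_usage_entropy(batch_gates):
    """Entropy of the batch-averaged gate distribution, scaled to roughly [0, 1].

    Values near 1 mean experts are used evenly; values near 0 mean collapse.
    """
    n_experts = len(batch_gates[0])
    usage = [sum(g[e] for g in batch_gates) / len(batch_gates)
             for e in range(n_experts)]
    return entropy(usage) / math.log(n_experts)

class EntropyTrigger:
    """One-way switch from a coverage stage to a specialization stage.

    Assumed rule: switch after `patience` consecutive batches whose usage
    entropy reaches `threshold`. Both values are hypothetical.
    """

    def __init__(self, threshold=0.9, patience=2):
        self.threshold = threshold
        self.patience = patience
        self.hits = 0
        self.stage = "coverage"

    def update(self, batch_gates):
        if self.stage == "coverage":
            if normalized_usage_entropy(batch_gates) >= self.threshold:
                self.hits += 1
            else:
                self.hits = 0  # require consecutive balanced batches
            if self.hits >= self.patience:
                self.stage = "specialization"
        return self.stage

    def loss_weights(self):
        """(weight on usage-balance term, weight on routing-confidence term)."""
        return (1.0, 0.0) if self.stage == "coverage" else (0.1, 1.0)

# Toy gate outputs (hypothetical): one collapsed batch, then balanced batches.
collapsed = [[0.9, 0.05, 0.05], [0.9, 0.05, 0.05]]
balanced = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]

trigger = EntropyTrigger(threshold=0.9, patience=2)
stages = [trigger.update(b) for b in (collapsed, balanced, balanced)]
print(stages)
print(trigger.loss_weights())
```

The patience counter is there so that a single well-balanced batch does not end the coverage stage prematurely; whether the paper uses a similar smoothing device is not stated in this summary.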

Tags

Multimodal Recommendation, Graph Neural Networks, Expert Networks, Entropy Regularization

arXiv Categories

cs.AI