DReX: An Explainable Deep Learning-based Multimodal Recommendation Framework
AI Summary
DReX is an explainable deep learning-based multimodal recommendation framework that refines user and item representations through incremental updates.
Key Contributions
- Proposes DReX, a unified multimodal recommendation framework
- Incrementally refines user and item representations using interaction-level multimodal feedback
- Automatically generates interpretable keyword profiles for both users and items
Methodology
Gated recurrent units (GRUs) selectively integrate fine-grained interaction features into global representations, incrementally updating the user and item embeddings.
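The GRU-based incremental update described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dimensions, initialization, and the choice of treating the global embedding as the GRU hidden state and each interaction-level feature vector as the input are assumptions made for clarity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell. Hypothetical stand-in for DReX's update module:
    the global user/item embedding plays the role of the hidden state,
    and each interaction's multimodal feature vector is the input."""

    def __init__(self, in_dim, hid_dim, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(hid_dim)
        shape_w, shape_u = (hid_dim, in_dim), (hid_dim, hid_dim)
        self.Wz, self.Uz = rng.uniform(-s, s, shape_w), rng.uniform(-s, s, shape_u)
        self.Wr, self.Ur = rng.uniform(-s, s, shape_w), rng.uniform(-s, s, shape_u)
        self.Wh, self.Uh = rng.uniform(-s, s, shape_w), rng.uniform(-s, s, shape_u)

    def step(self, h, x):
        z = sigmoid(self.Wz @ x + self.Uz @ h)            # update gate
        r = sigmoid(self.Wr @ x + self.Ur @ h)            # reset gate
        h_cand = np.tanh(self.Wh @ x + self.Uh @ (r * h)) # candidate state
        return (1.0 - z) * h + z * h_cand                 # gated blend

# Incremental refinement: the embedding is updated once per interaction,
# so fine-grained details accumulate into the global representation.
cell = GRUCell(in_dim=8, hid_dim=16)
user_emb = np.zeros(16)                 # global user embedding (hidden state)
rng = np.random.default_rng(1)
for _ in range(5):                      # one synthetic feature per interaction
    feat = rng.standard_normal(8)
    user_emb = cell.step(user_emb, feat)
```

Because the update gate `z` interpolates between the old embedding and the candidate state, an interaction with an uninformative or missing modality can simply contribute little to the update, which is consistent with the robustness claim in the abstract.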
Original Abstract
Multimodal recommender systems leverage diverse data sources, such as user interactions, content features, and contextual information, to address challenges like cold-start and data sparsity. However, existing methods often suffer from one or more key limitations: processing different modalities in isolation, requiring complete multimodal data for each interaction during training, or independent learning of user and item representations. These factors contribute to increased complexity and potential misalignment between user and item embeddings. To address these challenges, we propose DReX, a unified multimodal recommendation framework that incrementally refines user and item representations by leveraging interaction-level features from multimodal feedback. Our model employs gated recurrent units to selectively integrate these fine-grained features into global representations. This incremental update mechanism provides three key advantages: (1) simultaneous modeling of both nuanced interaction details and broader preference patterns, (2) elimination of the need for separate user and item feature extraction processes, leading to enhanced alignment in their learned representations, and (3) inherent robustness to varying or missing modalities. We evaluate the performance of the proposed approach on three real-world datasets containing reviews and ratings as interaction modalities. By considering review text as a modality, our approach automatically generates interpretable keyword profiles for both users and items, which supplement the recommendation process with interpretable preference indicators. Experimental results demonstrate that our approach outperforms state-of-the-art methods across all evaluated datasets.
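To make the "interpretable keyword profile" idea concrete, here is a toy sketch that scores each user's review vocabulary with TF-IDF and keeps the top terms. This is purely illustrative: DReX derives profiles within the model itself, and the paper does not say it uses TF-IDF; the function and data names are invented for this example.

```python
import math
from collections import Counter

def keyword_profiles(reviews_by_user, top_k=3):
    """Toy keyword-profile extraction (hypothetical, not the DReX method):
    rank each user's review terms by term frequency times inverse
    document frequency, where each user's concatenated reviews form
    one document."""
    docs = {u: " ".join(texts).lower().split()
            for u, texts in reviews_by_user.items()}
    n = len(docs)
    df = Counter()                       # in how many users' reviews a term appears
    for tokens in docs.values():
        df.update(set(tokens))
    profiles = {}
    for u, tokens in docs.items():
        tf = Counter(tokens)
        scored = {w: tf[w] * math.log((1 + n) / (1 + df[w])) for w in tf}
        # sort by descending score, break ties alphabetically
        top = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
        profiles[u] = [w for w, _ in top]
    return profiles

reviews = {
    "u1": ["great battery battery life"],
    "u2": ["poor screen quality"],
}
profiles = keyword_profiles(reviews, top_k=2)
```

Terms a user repeats ("battery" above) score highest for that user, which is the kind of preference indicator the abstract says supplements the recommendation process.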