Multimodal Learning (relevance: 9/10)

MMKU-Bench: A Multimodal Update Benchmark for Diverse Visual Knowledge

Baochen Fu, Yuntao Du, Cheng Chang, Baihao Jin, Wenzhi Deng, Muhao Xu, Hongmei Yan, Weiye Song, Yi Wan
arXiv: 2603.15117v1 · Published: 2026-03-16 · Updated: 2026-03-16

AI Summary

Proposes MMKU-Bench, a comprehensive evaluation benchmark for multimodal knowledge updating that covers two scenarios: updated knowledge and unknown knowledge.

Main Contributions

  • Constructs MMKU-Bench, a comprehensive evaluation benchmark for multimodal knowledge updating
  • Covers two scenarios, updated knowledge and unknown knowledge, enabling comparative analysis of learning across different knowledge types
  • Evaluates a range of representative methods, revealing the limitations of existing approaches in knowledge updating

Methodology

Constructs a dataset of over 25k knowledge instances and more than 49k images, and uses it to evaluate how methods such as supervised fine-tuning, reinforcement learning from human feedback, and knowledge editing perform on multimodal knowledge updating.
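As a rough sketch (not the paper's actual data format or evaluation code), an evaluation loop over such a benchmark might look like the following, where the instance schema, the exact-match metric, and the `query_model` stub are all illustrative assumptions:

```python
# Hypothetical sketch of evaluating a model on a benchmark like MMKU-Bench.
# The field names, schema, and query_model stub are assumptions for
# illustration; they are not the paper's actual format or API.
from dataclasses import dataclass

@dataclass
class KnowledgeInstance:
    question: str            # natural-language query about the fact
    image_path: str | None   # None for a text-only probe of the same fact
    target_answer: str       # the updated (or newly introduced) answer
    knowledge_type: str      # "updated" or "unknown"

def query_model(question: str, image_path: str | None) -> str:
    """Stand-in for the multimodal model under evaluation."""
    raise NotImplementedError("plug in the model being tested")

def evaluate(instances: list[KnowledgeInstance]) -> dict[str, float]:
    """Exact-match accuracy, broken down by knowledge type and by whether
    the probe is visual or text-only."""
    hits: dict[str, list[int]] = {}
    for inst in instances:
        modality = "visual" if inst.image_path else "text"
        key = f"{inst.knowledge_type}/{modality}"
        pred = query_model(inst.question, inst.image_path)
        correct = pred.strip().lower() == inst.target_answer.strip().lower()
        hits.setdefault(key, []).append(int(correct))
    return {key: sum(v) / len(v) for key, v in hits.items()}
```

Splitting accuracy by probe modality is one simple way to surface the cross-modal consistency gap the benchmark targets: knowledge injected via one modality should also be retrievable through the other.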

Original Abstract

As real-world knowledge continues to evolve, it becomes increasingly difficult for the parametric knowledge acquired by multimodal models during pretraining to remain consistent with the real world. Existing research on multimodal knowledge updating focuses only on learning previously unknown knowledge, overlooking the need to update knowledge that the model has already mastered but that later changes; moreover, evaluation is limited to the same modality, lacking a systematic analysis of cross-modal consistency. To address these issues, this paper proposes MMKU-Bench, a comprehensive evaluation benchmark for multimodal knowledge updating, which contains over 25k knowledge instances and more than 49k images and covers two scenarios, updated knowledge and unknown knowledge, thereby enabling comparative analysis of learning across different knowledge types. On this benchmark, we evaluate a variety of representative approaches, including supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), and knowledge editing (KE). Experimental results show that SFT and RLHF are prone to catastrophic forgetting, while KE better preserves general capabilities but exhibits clear limitations in continual updating. Overall, MMKU-Bench provides a reliable and comprehensive evaluation benchmark for multimodal knowledge updating, advancing progress in this field.

Tags

Multimodal Learning · Knowledge Updating · Evaluation Benchmark · Visual Knowledge

arXiv Categories

cs.CL