Multimodal Learning (Relevance: 9/10)

When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm

Ye Leng, Junjie Chu, Mingjie Li, Chenhao Lin, Chao Shen, Michael Backes, Yun Shen, Yang Zhang
arXiv: 2603.24079v1 Published: 2026-03-25 Updated: 2026-03-25

AI Summary

The stronger semantic understanding of MLLMs introduces greater safety risks than diffusion models, including unsafe content generation and fake image synthesis.

Key Contributions

  • Systematically analyzes and compares the safety risks of MLLMs and diffusion models along two dimensions: unsafe content generation and fake image synthesis.
  • Finds that MLLMs generate unsafe images more readily than diffusion models, because MLLMs can interpret abstract prompts that diffusion models fail to parse.
  • Shows that MLLM-generated images are harder for existing detectors to identify, even after the detectors are retrained on MLLM-generated data.

Methodology

The study compares the outputs of MLLMs and diffusion models across multiple unsafe-generation benchmark datasets, then evaluates how difficult the generated images are for current fake-image detectors to identify.
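To make this protocol concrete, here is a minimal Python sketch of the two measurements the study implies: an unsafe-generation rate over benchmark prompts, and a detection rate under a fake-image detector. Every name in the sketch (unsafe_rate, detection_rate, generate_fn, safety_fn, detector_fn, and the commented placeholders) is a hypothetical illustration under assumed interfaces, not code or an API from the paper.

```python
# Hypothetical sketch of the benchmark-style comparison; all callables
# and placeholder names below are illustrative, not from the paper.
from typing import Any, Callable, Iterable


def unsafe_rate(prompts: Iterable[str],
                generate_fn: Callable[[str], Any],
                safety_fn: Callable[[Any], bool]) -> float:
    """Fraction of benchmark prompts whose generated image is flagged unsafe."""
    prompts = list(prompts)
    flagged = sum(safety_fn(generate_fn(p)) for p in prompts)
    return flagged / len(prompts)


def detection_rate(images: Iterable[Any],
                   detector_fn: Callable[[Any], bool]) -> float:
    """Fraction of generated images that a fake-image detector identifies."""
    images = list(images)
    detected = sum(detector_fn(img) for img in images)
    return detected / len(images)


# Usage sketch with placeholder generators (mllm_generate, diffusion_generate,
# BENCHMARK_PROMPTS, safety_fn, detector_fn are all assumed, undefined names):
#
# for name, gen in [("mllm", mllm_generate), ("diffusion", diffusion_generate)]:
#     imgs = [gen(p) for p in BENCHMARK_PROMPTS]
#     print(name, unsafe_rate(BENCHMARK_PROMPTS, gen, safety_fn))
#     print(name, detection_rate(imgs, detector_fn))
```

Comparing these two rates between the generator families gives the shape of the paper's comparison; the actual benchmarks, safety classifiers, and detectors used in the paper are not specified here.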

Original Abstract

Recently, multimodal large language models (MLLMs) have emerged as a unified paradigm for language and image generation. Compared with diffusion models, MLLMs possess a much stronger capability for semantic understanding, enabling them to process more complex textual inputs and comprehend richer contextual meanings. However, this enhanced semantic ability may also introduce new and potentially greater safety risks. Taking diffusion models as a reference point, we systematically analyze and compare the safety risks of emerging MLLMs along two dimensions: unsafe content generation and fake image synthesis. Across multiple unsafe generation benchmark datasets, we observe that MLLMs tend to generate more unsafe images than diffusion models. This difference partly arises because diffusion models often fail to interpret abstract prompts, producing corrupted outputs, whereas MLLMs can comprehend these prompts and generate unsafe content. For current advanced fake image detectors, MLLM-generated images are also notably harder to identify. Even when detectors are retrained with MLLM-specific data, they can still be bypassed by simply providing MLLMs with longer and more descriptive inputs. Our measurements indicate that the emerging safety risks of the cutting-edge generative paradigm, MLLMs, have not been sufficiently recognized, posing new challenges to real-world safety.

Tags

MLLM, Safety Risks, Image Generation, Adversarial Attacks, Deepfakes

arXiv Categories

cs.CV cs.AI cs.CR