Multimodal Learning Relevance: 10/10

PDA: Text-Augmented Defense Framework for Robust Vision-Language Models against Adversarial Image Attacks

Jingning Xu, Haochen Luo, Chen Liu
arXiv: 2604.01010v1 Published: 2026-04-01 Updated: 2026-04-01

AI Summary

The PDA framework uses text augmentation to improve the robustness of vision-language models against adversarial image attacks, without any training.

Key Contributions

  • Proposes the PDA framework to improve VLM robustness
  • Leverages text augmentation (prompt paraphrasing, question decomposition, consistency aggregation)
  • Training-free; applied only at test time

Methodology

PDA improves model robustness to adversarial examples through three text-augmentation strategies applied at test time: prompt paraphrasing, question decomposition, and consistency aggregation.
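The test-time pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `vlm_answer` query function, the fixed paraphrase list, and the majority-vote aggregation rule are all assumptions for the sake of the example (the paper may decompose questions and aggregate differently).

```python
from collections import Counter
from typing import Callable, List

def pda_answer(
    question: str,
    image: object,
    vlm_answer: Callable[[str, object], str],  # hypothetical VLM query function
    paraphrases: List[str],
) -> str:
    """PDA sketch: query the VLM once per paraphrased prompt, then apply
    consistency aggregation (here: majority vote over the answers)."""
    prompts = [question] + paraphrases
    answers = [vlm_answer(p, image) for p in prompts]
    # Consistency aggregation: return the most frequent answer.
    return Counter(answers).most_common(1)[0][0]

# Toy stand-in for a VLM under an adversarial image: the attack flips the
# answer for the original prompt wording, but paraphrases recover the label.
def toy_vlm(prompt: str, image: object) -> str:
    if prompt == "What animal is shown?":
        return "dog"  # fooled on the exact attacked wording
    return "cat"      # robust under paraphrased prompts

answer = pda_answer(
    "What animal is shown?",
    image=None,
    vlm_answer=toy_vlm,
    paraphrases=["Which animal appears in the picture?",
                 "Name the animal in this image."],
)  # majority vote over ["dog", "cat", "cat"] yields "cat"
```

The intuition the sketch captures: an adversarial perturbation tuned against one prompt-image pairing rarely fools the model under every rewording, so aggregating over paraphrases filters out the attack's effect.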

Original Abstract

Vision-language models (VLMs) are vulnerable to adversarial image perturbations. Existing works based on adversarial training against task-specific adversarial examples are computationally expensive and often fail to generalize to unseen attack types. To address these limitations, we introduce Paraphrase-Decomposition-Aggregation (PDA), a training-free defense framework that leverages text augmentation to enhance VLM robustness under diverse adversarial image attacks. PDA performs prompt paraphrasing, question decomposition, and consistency aggregation entirely at test time, thus requiring no modification to the underlying models. To balance robustness and efficiency, we instantiate PDA as variants that reduce the inference cost while retaining most of its robustness gains. Experiments on multiple VLM architectures and benchmarks for visual question answering, classification, and captioning show that PDA achieves consistent robustness gains against various adversarial perturbations while maintaining competitive clean accuracy, establishing a generic, strong, and practical defense framework for VLMs during inference.
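The abstract mentions cost-reduced PDA variants but does not specify them. One plausible instantiation, sketched here purely as an assumption, is early-exit aggregation: stop querying the VLM as soon as enough prompts agree, rather than always running the full paraphrase set.

```python
from collections import Counter
from typing import Callable, List

def pda_early_exit(
    question: str,
    image: object,
    vlm_answer: Callable[[str, object], str],  # hypothetical VLM query function
    paraphrases: List[str],
    agree_k: int = 2,
) -> str:
    """Hypothetical cost-reduced PDA variant: query prompts one at a time
    and return early once `agree_k` answers coincide."""
    counts: Counter = Counter()
    for prompt in [question] + paraphrases:
        answer = vlm_answer(prompt, image)
        counts[answer] += 1
        if counts[answer] >= agree_k:
            return answer  # early exit: consensus reached, skip remaining queries
    # No early consensus: fall back to a majority vote over all answers.
    return counts.most_common(1)[0][0]
```

Because clean inputs usually produce agreement within the first few queries, this trades a small amount of aggregation evidence for fewer VLM forward passes per question.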

Tags

Adversarial Attacks · Robustness · Vision-Language Models · Text Augmentation

arXiv Categories

cs.CV cs.MM