Agent Tuning & Optimization Relevance: 6/10

LeakBoost: Perceptual-Loss-Based Membership Inference Attack

Amit Kravchik Taub, Fred M. Grabovski, Guy Amit, Yisroel Mirsky
arXiv: 2602.05748v1 Published: 2026-02-05 Updated: 2026-02-05

AI Summary

LeakBoost actively probes a model using a perceptual loss, strengthening the effectiveness of membership inference attacks.

Key Contributions

  • Proposes the LeakBoost framework, which optimizes inputs via a perceptual loss
  • Substantially improves the success rate of membership inference attacks
  • Provides a detailed analysis of how different parameters affect attack effectiveness

Methodology

LeakBoost synthesizes an interrogation image by optimizing a perceptual loss, amplifying the representational differences between members and non-members, and then hands the result to an existing membership detector for the final decision.
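The paper does not spell out the exact interrogation objective here, so the following is a minimal NumPy sketch under one plausible reading: the "perceptual (activation-space) objective" is taken to be matching a deep layer's activations for the candidate input, optimized with a short gradient-descent loop. The one-layer model, tanh activation, and hyperparameters are toy stand-ins, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy white-box "model": a single tanh layer stands in for the deep
# layer whose activations define the perceptual objective.
W = rng.standard_normal((16, 8)) * 0.3

def activations(x):
    return np.tanh(W @ x)

def perceptual_loss(z, x):
    d = activations(z) - activations(x)
    return 0.5 * float(d @ d)

def interrogate(x, steps=100, lr=0.1):
    """Synthesize an interrogation input z by gradient descent on the
    activation-matching objective (one plausible perceptual loss)."""
    target = activations(x)
    z = 0.1 * rng.standard_normal(x.shape)  # start from faint noise
    for _ in range(steps):
        h = activations(z)
        # d/dz of 0.5 * ||h - target||^2, using tanh' = 1 - tanh^2
        grad = W.T @ ((h - target) * (1.0 - h ** 2))
        z = z - lr * grad
    return z

candidate = rng.standard_normal(8)
z = interrogate(candidate)
# z now matches the candidate's deep-layer activations far better than
# noise does; in LeakBoost the synthesized z, not the raw candidate, is
# scored by an unmodified off-the-shelf membership detector.
```

The modularity claimed in the abstract corresponds to the last step: any existing detector can consume `z` unchanged, since the interrogation only transforms the input.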

Original Abstract

Membership inference attacks (MIAs) aim to determine whether a sample was part of a model's training set, posing serious privacy risks for modern machine-learning systems. Existing MIAs primarily rely on static indicators, such as loss or confidence, and do not fully leverage the dynamic behavior of models when actively probed. We propose LeakBoost, a perceptual-loss-based interrogation framework that actively probes a model's internal representations to expose hidden membership signals. Given a candidate input, LeakBoost synthesizes an interrogation image by optimizing a perceptual (activation-space) objective, amplifying representational differences between members and non-members. This image is then analyzed by an off-the-shelf membership detector, without modifying the detector itself. When combined with existing membership inference methods, LeakBoost achieves substantial improvements at low false-positive rates across multiple image classification datasets and diverse neural network architectures. In particular, it raises AUC from near-chance levels (0.53-0.62) to 0.81-0.88, and increases TPR at 1 percent FPR by over an order of magnitude compared to strong baseline attacks. A detailed sensitivity analysis reveals that deeper layers and short, low-learning-rate optimization produce the strongest leakage, and that improvements concentrate in gradient-based detectors. LeakBoost thus offers a modular and computationally efficient way to assess privacy risks in white-box settings, advancing the study of dynamic membership inference.
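The headline metrics in the abstract (AUC and TPR at 1% FPR) are standard for membership inference and can be computed directly from raw attack scores. A minimal sketch, assuming higher scores indicate "member" (the score convention is an assumption, not stated in the abstract):

```python
import numpy as np

def tpr_at_fpr(member_scores, nonmember_scores, fpr=0.01):
    """TPR when the threshold is set so that at most `fpr` of
    non-members are (falsely) flagged as members."""
    thresh = np.quantile(nonmember_scores, 1.0 - fpr)
    return float(np.mean(np.asarray(member_scores) > thresh))

def auc(member_scores, nonmember_scores):
    """AUC as the probability that a random member's score outranks a
    random non-member's score (ties count half)."""
    m = np.asarray(member_scores)[:, None]
    n = np.asarray(nonmember_scores)[None, :]
    return float(np.mean(m > n) + 0.5 * np.mean(m == n))
```

Reporting TPR at a low fixed FPR matters because an attack with decent average AUC can still be useless when false accusations must be rare; the abstract's order-of-magnitude TPR gain at 1% FPR is the stronger claim.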

Tags

membership inference attack, privacy, machine learning security, perceptual loss

arXiv Categories

cs.AI