LLM Memory & RAG relevance: 7/10

Rethinking Machine Unlearning: Models Designed to Forget via Key Deletion

Sonia Laguna, Jorge da Silva Goncalves, Moritz Vandenhirtz, Alain Ryser, Irene Cannistraci, Julia E. Vogt
arXiv: 2603.15033v1 Published: 2026-03-16 Updated: 2026-03-16

AI Summary

Proposes "unlearning by design", a new machine-unlearning paradigm that achieves efficient, zero-shot forgetting via key deletion.

Key Contributions

  • Proposes the "unlearning by design" paradigm for machine unlearning
  • Designs MUNKEY, a model that unlearns by deleting instance keys
  • Experimentally shows that MUNKEY outperforms existing methods across multiple datasets

Methodology

Designs a memory-augmented Transformer that decouples instance-specific memorization from the model weights; unlearning is performed by deleting the corresponding instance key.
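To make the key-deletion idea concrete, here is a minimal, hypothetical sketch (not the authors' MUNKEY architecture): a toy key-value memory stores one entry per training instance, retrieval works by cosine similarity over stored embeddings, and "unlearning" an instance is just deleting its key, with no weight updates and no access to the original sample or label. All names (`KeyValueMemory`, `write`, `forget`, `read`) are illustrative assumptions.

```python
import math

class KeyValueMemory:
    """Toy external memory: deleting a key removes that instance's influence.

    Hypothetical sketch of the key-deletion mechanism, not MUNKEY itself.
    """

    def __init__(self):
        self.store = {}  # instance key -> (embedding, label)

    def write(self, key, embedding, label):
        # Memorize one training instance under its own key.
        self.store[key] = (embedding, label)

    def forget(self, key):
        # Zero-shot unlearning: drop the key; the instance becomes unreachable.
        self.store.pop(key, None)

    def read(self, query):
        # Return the label of the stored embedding most similar to the query.
        best_label, best_sim = None, -math.inf
        for emb, label in self.store.values():
            dot = sum(q * e for q, e in zip(query, emb))
            norm = (math.sqrt(sum(q * q for q in query))
                    * math.sqrt(sum(e * e for e in emb)))
            sim = dot / norm if norm else 0.0
            if sim > best_sim:
                best_sim, best_label = sim, label
        return best_label

mem = KeyValueMemory()
mem.write("img_001", [1.0, 0.0], "cat")
mem.write("img_002", [0.0, 1.0], "dog")
print(mem.read([0.9, 0.1]))  # "cat": nearest stored instance wins
mem.forget("img_001")        # unlearning request: delete the key only
print(mem.read([0.9, 0.1]))  # "dog": the forgotten instance no longer contributes
```

The design point the paper makes is visible even in this toy: because per-instance information lives in the memory rather than the weights, forgetting needs neither retraining nor the original data.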

Original Abstract

Machine unlearning is rapidly becoming a practical requirement, driven by privacy regulations, data errors, and the need to remove harmful or corrupted training samples. Despite this, most existing methods tackle the problem purely from a post-hoc perspective. They attempt to erase the influence of targeted training samples through parameter updates that typically require access to the full training data. This creates a mismatch with real deployment scenarios where unlearning requests can be anticipated, revealing a fundamental limitation of post-hoc approaches. We propose \textit{unlearning by design}, a novel paradigm in which models are directly trained to support forgetting as an inherent capability. We instantiate this idea with Machine UNlearning via KEY deletion (MUNKEY), a memory augmented transformer that decouples instance-specific memorization from model weights. Here, unlearning corresponds to removing the instance-identifying key, enabling direct zero-shot forgetting without weight updates or access to the original samples or labels. Across natural image benchmarks, fine-grained recognition, and medical datasets, MUNKEY outperforms all post-hoc baselines. Our results establish that unlearning by design enables fast, deployment-oriented unlearning while preserving predictive performance.

Tags

machine unlearning, privacy, transformer, key deletion

arXiv Categories

cs.LG