LLM Reasoning relevance: 7/10

Good-Enough LLM Obfuscation (GELO)

Anatoly Belikov, Ilya Fedotov
arXiv: 2603.05035v1 Published: 2026-03-05 Updated: 2026-03-05

AI Summary

GELO is a lightweight LLM obfuscation method that protects prompt privacy during inference by dynamically mixing hidden states.

Key Contributions

  • Proposes GELO, an obfuscation method that protects privacy during LLM inference
  • Designs two defense strategies: non-orthogonal mixing and orthogonal mixing
  • Experimentally shows GELO is effective on Llama-2 7B with acceptable overhead

Methodology

The method dynamically mixes hidden states with a fresh random invertible matrix at each inference, preventing information leakage to the untrusted accelerator.

Original Abstract

Large Language Models (LLMs) are increasingly served on shared accelerators where an adversary with read access to device memory can observe KV caches and hidden states, threatening prompt privacy for open-source models. Cryptographic protections such as MPC and FHE offer strong guarantees but remain one to two orders of magnitude too slow for interactive inference, while static obfuscation schemes break under multi-run statistical attacks once the model is known. We present GELO (Good-Enough LLM Obfuscation), a lightweight protocol for privacy-preserving inference that limits information leakage from untrusted accelerator observations by hiding hidden states with fresh, per-batch invertible mixing. For each offloaded projection, the TEE samples a random matrix A, forms $U = AH$, offloads U and weights W to the accelerator, and then applies $A^{-1}$ on return, so that $A^{-1}((AH)W) = HW$ and outputs are unchanged. Because mixing is never reused across batches, the attacker faces only a single-batch blind source separation problem. We analyze information leakage and introduce two practical defenses: (i) non-orthogonal mixing to mask Gram matrices, and (ii) orthogonal mixing augmented with a small fraction of high-energy "shield" vectors that pollute higher-order statistics. On Llama-2 7B, GELO preserves float32 outputs exactly, closely matches low-precision baselines, offloads the dominant matrix multiplications with about 20-30% latency overhead, and defeats a range of ICA/BSS and anchor-based attacks.
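The per-batch mixing step described in the abstract can be sketched in NumPy. This is a minimal illustration, not the paper's implementation: the dimensions, variable names, and the use of a plain Gaussian matrix for A are assumptions; the paper also discusses non-orthogonal and orthogonal variants with shield vectors that this sketch does not cover.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, m = 4, 8, 6                  # tokens in batch, hidden dim, output dim (illustrative)
H = rng.standard_normal((n, d))    # hidden states, kept inside the TEE
W = rng.standard_normal((d, m))    # projection weights, offloaded to the accelerator

# TEE samples a fresh random mixing matrix A for this batch.
# A Gaussian matrix is invertible with probability 1; a careful
# implementation would also check its condition number.
A = rng.standard_normal((n, n))

U = A @ H          # mixed hidden states sent to the untrusted accelerator
Y = U @ W          # accelerator computes (AH)W without seeing H

# On return, the TEE applies A^{-1}: A^{-1}((AH)W) = HW, so the
# output is exactly what the unprotected projection would produce.
out = np.linalg.solve(A, Y)
assert np.allclose(out, H @ W)
```

Because A is resampled for every batch and never reused, an observer of U and Y across runs cannot accumulate statistics against a fixed transform, which is what reduces the attack to a single-batch blind source separation problem.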

Tags

LLM · Privacy Preservation · Model Obfuscation · Secure Inference

arXiv Categories

cs.CR cs.LG