Agent Tuning & Optimization Relevance: 8/10

PERSONA: Dynamic and Compositional Inference-Time Personality Control via Activation Vector Algebra

Xiachong Feng, Liang Zhao, Weihong Zhong, Yichong Huang, Yuxuan Gu, Lingpeng Kong, Xiaocheng Feng, Bing Qin
arXiv: 2602.15669v1 Published: 2026-02-17 Updated: 2026-02-17

AI Summary

The PERSONA framework uses activation vector algebra to achieve dynamic, composable personality control of LLMs without fine-tuning.

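Key Contributions

  • Proposes the PERSONA framework for personality control in LLMs
  • Achieves dynamic and compositional personality control through activation vector algebra
  • Requires no fine-tuning, yet performs close to supervised fine-tuning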

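Methodology

Persona-Base extracts orthogonal trait vectors, Persona-Algebra applies vector arithmetic to them, and Persona-Flow composes them dynamically in a context-aware manner.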
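A minimal sketch of the contrastive extraction idea, assuming a difference-of-means estimator over activations collected from trait-positive and trait-negative prompts. The summary does not specify the estimator PERSONA actually uses, so the function names (`trait_direction`, `cosine`) and the toy 3-dimensional activations below are hypothetical and purely illustrative.

```python
from math import sqrt


def mean_vector(rows):
    """Element-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def trait_direction(pos_acts, neg_acts):
    """Assumed estimator: unit-normalized difference of class means."""
    pos_mean = mean_vector(pos_acts)
    neg_mean = mean_vector(neg_acts)
    diff = [p - q for p, q in zip(pos_mean, neg_mean)]
    norm = sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]


def cosine(u, v):
    """Cosine similarity; values near 0 indicate near-orthogonal directions."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


# Toy activations standing in for hidden states (hypothetical data).
extraversion = trait_direction(
    pos_acts=[[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]],
    neg_acts=[[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],
)
agreeableness = trait_direction(
    pos_acts=[[1.0, 3.0, 0.0], [1.0, 5.0, 0.0]],
    neg_acts=[[1.0, 1.0, 0.0], [1.0, 1.0, 0.0]],
)

# Check how close the two extracted directions are to orthogonal.
overlap = cosine(extraversion, agreeableness)
```

In this toy setup the two directions come out exactly orthogonal by construction; with real model activations the overlap would only be approximately zero, which is the property the abstract reports.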

Original Abstract

Current methods for personality control in Large Language Models rely on static prompting or expensive fine-tuning, failing to capture the dynamic and compositional nature of human traits. We introduce PERSONA, a training-free framework that achieves fine-tuning level performance through direct manipulation of personality vectors in activation space. Our key insight is that personality traits appear as extractable, approximately orthogonal directions in the model's representation space that support algebraic operations. The framework operates through three stages: Persona-Base extracts orthogonal trait vectors via contrastive activation analysis; Persona-Algebra enables precise control through vector arithmetic (scalar multiplication for intensity, addition for composition, subtraction for suppression); and Persona-Flow achieves context-aware adaptation by dynamically composing these vectors during inference. On PersonalityBench, our approach achieves a mean score of 9.60, nearly matching the supervised fine-tuning upper bound of 9.61 without any gradient updates. On our proposed Persona-Evolve benchmark for dynamic personality adaptation, we achieve up to 91% win rates across diverse model families. These results provide evidence that aspects of LLM personality are mathematically tractable, opening new directions for interpretable and efficient behavioral control.
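The three Persona-Algebra operations named in the abstract (scalar multiplication for intensity, addition for composition, subtraction for suppression) can be sketched on plain vectors. This is an illustration under stated assumptions, not the authors' implementation: the trait vectors, the coefficients, and the `steer` helper that adds the combined vector to a hidden state are all hypothetical.

```python
def scale(v, alpha):
    """Scalar multiplication: controls the intensity of a trait."""
    return [alpha * x for x in v]


def add(u, v):
    """Addition: composes two trait vectors."""
    return [a + b for a, b in zip(u, v)]


def subtract(u, v):
    """Subtraction: suppresses the trait carried by v."""
    return [a - b for a, b in zip(u, v)]


def steer(hidden, steering):
    """Assumed intervention: add the steering vector to a hidden state."""
    return add(hidden, steering)


# Hypothetical unit trait vectors in a 3-dimensional activation space.
extraversion = [1.0, 0.0, 0.0]
neuroticism = [0.0, 1.0, 0.0]

# Strong extraversion combined with mild suppression of neuroticism.
steering = subtract(scale(extraversion, 2.0), scale(neuroticism, 0.5))

hidden = [0.1, 0.2, 0.3]           # toy hidden state for one token
steered = steer(hidden, steering)  # hidden state after intervention
```

Because the operations are linear, the coefficients could in principle be recomputed at each step of generation, which is one way to read the context-aware composition attributed to Persona-Flow.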

Tags

LLM, personality control, activation vector, algebraic operations

arXiv Categories

cs.AI