Runtime Governance for AI Agents: Policies on Paths
AI Summary
The paper proposes a runtime governance framework for AI agents based on execution paths, addressing the unpredictability of agent behavior.
Key Contributions
- Proposed a governance framework for AI agents based on execution paths
- Formalized compliance policies as functions returning a policy-violation probability
- Analyzed the limitations of prompt-level instructions and access control
Methodology
Formally defines runtime governance for AI agents and demonstrates the application of policies through concrete examples.
Original Abstract
AI agents -- systems that plan, reason, and act using large language models -- produce non-deterministic, path-dependent behavior that cannot be fully governed at design time; by "governed" we mean striking the right balance between the highest achievable task-completion rate and the legal, data-breach, reputational, and other costs of running agents. We argue that the execution path is the central object for effective runtime governance and formalize compliance policies as deterministic functions mapping agent identity, partial path, proposed next action, and organizational state to a policy-violation probability. We show that prompt-level instructions (including "system prompts") and static access control are special cases of this framework: the former shape the distribution over paths without actually evaluating them; the latter evaluates deterministic policies that ignore the path (i.e., it can account for only a specific subset of all possible paths). In our view, runtime evaluation is the general case, and it is necessary for any path-dependent policy. We develop the formal framework for analyzing AI agent governance, present concrete policy examples (inspired by the AI Act), discuss a reference implementation, and identify open problems including risk calibration and the limits of enforced compliance.
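To make the formalization concrete, here is a minimal Python sketch of the policy signature the abstract describes: a deterministic function from (agent identity, partial path, proposed next action, organizational state) to a violation probability in [0, 1]. All names (`Action`, `export_after_pii_read`, the PII example) are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Action:
    name: str    # e.g. "read", "export"
    target: str  # e.g. a resource or recipient (illustrative)

# Policy: (agent_id, partial path, proposed action, org state) -> P(violation)
Policy = Callable[[str, tuple[Action, ...], Action, dict], float]

def export_after_pii_read(agent_id: str,
                          path: tuple[Action, ...],
                          proposed: Action,
                          org_state: dict) -> float:
    """A path-dependent policy: exporting data is flagged as a likely
    violation only if the partial path already read a PII-tagged resource.
    A path-ignoring (static access control) policy could not express this."""
    pii = org_state.get("pii_resources", set())
    touched_pii = any(a.name == "read" and a.target in pii for a in path)
    if proposed.name == "export" and touched_pii:
        return 0.9  # high probability: PII may leave the organization
    return 0.0

path = (Action("read", "customer_db"), Action("summarize", "customer_db"))
state = {"pii_resources": {"customer_db"}}
print(export_after_pii_read("agent-7", path, Action("export", "partner_api"), state))  # 0.9
print(export_after_pii_read("agent-7", (), Action("export", "partner_api"), state))    # 0.0
```

The same export action yields different risk scores depending on the path that preceded it, which is exactly why the abstract argues runtime evaluation of the execution path is the general case.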