Behavioral Fingerprints for LLM Endpoint Stability and Identity
AI Summary
Proposes Stability Monitor, a system that uses behavioral fingerprints to detect changes in the stability and identity of LLM endpoints.
Key Contributions
- Proposes the Stability Monitor system
- Detects model changes from differences in output distributions
- Evaluates model stability across different providers
Methodology
Samples model outputs over a fixed prompt set, computes the energy distance between output distributions, and uses a permutation test to detect change events.
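The per-prompt statistic described above can be sketched in a few lines. This is a minimal illustration, assuming each sampled output has been reduced to a scalar feature (e.g., response length or mean token log-probability); the paper's actual output representation is not specified in this summary:

```python
import random

def energy_distance(xs, ys):
    """Sample energy distance between two 1-D samples:
    2*E|X - Y| - E|X - X'| - E|Y - Y'|; zero when the samples match."""
    def mean_abs_diff(a, b):
        # Mean pairwise absolute difference over all pairs (V-statistic form).
        return sum(abs(u - v) for u in a for v in b) / (len(a) * len(b))
    return (2 * mean_abs_diff(xs, ys)
            - mean_abs_diff(xs, xs)
            - mean_abs_diff(ys, ys))

def permutation_pvalue(xs, ys, n_perm=500, seed=0):
    """p-value for the null that xs and ys come from the same distribution,
    estimated by shuffling the pooled sample and recomputing the statistic."""
    rng = random.Random(seed)
    observed = energy_distance(xs, ys)
    pooled = list(xs) + list(ys)
    n = len(xs)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if energy_distance(pooled[:n], pooled[n:]) >= observed:
            exceed += 1
    # +1 smoothing keeps the estimated p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

A large energy distance with a small permutation p-value for a given prompt is evidence that the endpoint's output distribution for that prompt has shifted between the two fingerprinting rounds.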
Original Abstract
The consistency of AI-native applications depends on the behavioral consistency of the model endpoints that power them. Traditional reliability metrics such as uptime, latency and throughput do not capture behavioral change, and an endpoint can remain "healthy" while its effective model identity changes due to updates to weights, tokenizers, quantization, inference engines, kernels, caching, routing, or hardware. We introduce Stability Monitor, a black-box stability monitoring system that periodically fingerprints an endpoint by sampling outputs from a fixed prompt set and comparing the resulting output distributions over time. Fingerprints are compared using a summed energy distance statistic across prompts, with permutation-test p-values as evidence of distribution shift aggregated sequentially to detect change events and define stability periods. In controlled validation, Stability Monitor detects changes to model family, version, inference stack, quantization, and behavioral parameters. In real-world monitoring of the same model hosted by multiple providers, we observe substantial provider-to-provider and within-provider stability differences.
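The abstract says the permutation-test p-values are "aggregated sequentially to detect change events," but the aggregation rule is not given in this summary. One simple illustrative possibility (an assumption, not necessarily the paper's method) is to combine the most recent p-values with Fisher's method over a sliding window and flag a change event when the combined evidence crosses a threshold:

```python
import math

def fisher_combined_pvalue(pvalues):
    """Fisher's method: -2 * sum(log p) ~ chi-square with 2k dof under the null.
    p-values must lie in (0, 1]. For even dof the survival function has a
    closed form: P(X > s) = exp(-s/2) * sum_{i=0}^{k-1} (s/2)^i / i!."""
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    k = len(pvalues)
    term, total = 1.0, 1.0  # i = 0 term
    for i in range(1, k):
        term *= (stat / 2.0) / i
        total += term
    return math.exp(-stat / 2.0) * total

def detect_change(pvalue_stream, window=5, alpha=0.01):
    """Flag indices where the Fisher-combined p-value over the last
    `window` tests drops below `alpha` (a hypothetical change event)."""
    events = []
    for t in range(window - 1, len(pvalue_stream)):
        if fisher_combined_pvalue(pvalue_stream[t - window + 1:t + 1]) < alpha:
            events.append(t)
    return events
```

Under this rule, a run of small per-round p-values produces a change event, which would end one stability period and begin another; isolated borderline p-values within the window are smoothed away.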