LLM Reasoning Relevance: 8/10

Routing, Cascades, and User Choice for LLMs

Rafid Mahmood

arXiv: 2602.09902v1 Published: 2026-02-10 Updated: 2026-02-10


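AI Summary

Studies the effect of LLM routing policies with respect to user behavior and reveals a potential conflict of interest between providers and users.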

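Main Contributions

Proposes a Stackelberg game model between an LLM provider and a user
Fully characterizes the user's best-response strategy and simplifies the provider's problem
Reveals a misalignment between the provider-optimal route and the user-preferred route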

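Methodology

Builds a Stackelberg game model, analyzes the utilities and costs of the provider and the user under different routing policies, and solves for the optimal policy.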

Original Abstract

To mitigate the trade-offs between performance and costs, LLM providers route user tasks to different models based on task difficulty and latency. We study the effect of LLM routing with respect to user behavior. We propose a game between an LLM provider with two models (standard and reasoning) and a user who can re-prompt or abandon tasks if the routed model cannot solve them. The user's goal is to maximize their utility minus the delay from using the model, while the provider minimizes the cost of servicing the user. We solve this Stackelberg game by fully characterizing the user best response and simplifying the provider problem. We observe that in nearly all cases, the optimal routing policy involves a static policy with no cascading that depends on the expected utility of the models to the user. Furthermore, we reveal a misalignment gap between the provider-optimal and user-preferred routes when the user's and provider's rankings of the models with respect to utility and cost differ. Finally, we demonstrate conditions for extreme misalignment where providers are incentivized to throttle the latency of the models to minimize their costs, consequently depressing user utility. The results yield simple threshold rules for single-provider, single-user interactions and clarify when routing, cascading, and throttling help or harm.
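The leader-follower structure described in the abstract can be illustrated with a toy sketch. This is not the paper's model: it assumes a single-shot interaction with no re-prompting, a user who participates only when expected utility is non-negative, and a provider who picks the cheapest routing probability that keeps the user participating. The `Model` class, the `solve` function, and all numeric values are hypothetical and chosen only to show how a misalignment gap can arise when the user and provider rank the two models differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    success_prob: float  # assumed chance that the model solves the task
    delay: float         # assumed latency penalty, in the user's utility units
    cost: float          # assumed provider cost per query


def user_utility(p, standard, reasoning, value):
    """Expected user utility when routed to the reasoning model with probability p."""
    u_std = standard.success_prob * value - standard.delay
    u_rsn = reasoning.success_prob * value - reasoning.delay
    return p * u_rsn + (1 - p) * u_std


def provider_cost(p, standard, reasoning):
    """Expected provider cost per query under routing probability p."""
    return p * reasoning.cost + (1 - p) * standard.cost


def solve(standard, reasoning, value, grid_size=11):
    """Grid search over routing probabilities (toy leader-follower game).

    Returns (provider_p, user_p, gap), or None if the user never participates.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Follower step (assumption): the user participates only if utility >= 0.
    feasible = [p for p in grid if user_utility(p, standard, reasoning, value) >= 0]
    if not feasible:
        return None
    # Leader step: the cheapest policy that still keeps the user on board.
    provider_p = min(feasible, key=lambda p: provider_cost(p, standard, reasoning))
    # The route the user would choose if they controlled the routing.
    user_p = max(grid, key=lambda p: user_utility(p, standard, reasoning, value))
    gap = (user_utility(user_p, standard, reasoning, value)
           - user_utility(provider_p, standard, reasoning, value))
    return provider_p, user_p, gap


# Hypothetical parameters: the reasoning model is better for the user but costlier.
standard = Model(success_prob=0.6, delay=0.1, cost=1.0)
reasoning = Model(success_prob=0.9, delay=0.3, cost=5.0)
result = solve(standard, reasoning, value=1.0)
print(result)
```

With these made-up numbers the user prefers the reasoning model while the provider prefers the standard one, so both choices land on static routes at opposite ends of the grid and the gap is positive. Swapping the cost values aligns the two rankings and the gap disappears.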

arXiv Categories

cs.GT cs.AI cs.LG
