AI-Generated Prior Authorization Letters: Strong Clinical Content, Weak Administrative Scaffolding
AI Summary
Evaluates the ability of LLMs to generate prior authorization letters, finding strong clinical content but weak administrative scaffolding; the generated letters are not yet ready for direct real-world use.
Key Contributions
- Systematic evaluation of LLMs' ability to generate prior authorization letters
- Identification of LLM shortcomings in handling administrative detail
- Framing administrative precision as the key challenge for clinical LLM deployment
Methodology
Three LLMs (GPT-4o, Claude Sonnet 4.5, and Gemini 2.5 Pro) generated letters across 45 medical scenarios, which were then analyzed for both clinical and administrative quality.
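A minimal sketch of how the administrative secondary check could be automated. The checklist items (billing codes, authorization duration, follow-up plan) come from the gaps reported in the abstract; the regex patterns, `ADMIN_CHECKS` dictionary, and `audit_letter` function are illustrative assumptions, not the authors' actual scoring rubric:

```python
import re

# Hypothetical administrative checklist inspired by the gaps the paper
# reports: absent billing codes, missing authorization duration requests,
# and inadequate follow-up plans. Patterns are illustrative assumptions.
ADMIN_CHECKS = {
    # ICD-10 codes look like "M05.79"; CPT codes are five digits, e.g. "96413".
    "billing_codes": re.compile(r"\b[A-TV-Z]\d{2}(?:\.\d{1,4})?\b|\b\d{5}\b"),
    # An explicit request for how long the authorization should last.
    "authorization_duration": re.compile(
        r"authoriz\w*\s+(?:for|period|duration)|\b(?:6|12)[- ]month", re.I),
    # Some statement of monitoring or follow-up.
    "follow_up_plan": re.compile(r"follow[- ]?up|monitor", re.I),
}

def audit_letter(text: str) -> dict:
    """Return which administrative elements appear present in a letter."""
    return {name: bool(rx.search(text)) for name, rx in ADMIN_CHECKS.items()}
```

A letter that passes clinical review could still fail such an audit, e.g. `audit_letter("The patient has severe arthritis.")` reports every element missing, which mirrors the paper's point that clinical scoring alone does not capture administrative readiness.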
Original Abstract
Prior authorization remains one of the most burdensome administrative processes in U.S. healthcare, consuming billions of dollars and thousands of physician hours each year. While large language models have shown promise across clinical text tasks, their ability to produce submission-ready prior authorization letters has received only limited attention, with existing work confined to single-case demonstrations rather than structured multi-scenario evaluation. We assessed three commercially available LLMs (GPT-4o, Claude Sonnet 4.5, and Gemini 2.5 Pro) across 45 physician-validated synthetic scenarios spanning rheumatology, psychiatry, oncology, cardiology, and orthopedics. All three models generated letters with strong clinical content: accurate diagnoses, well-structured medical necessity arguments, and thorough step therapy documentation. However, a secondary analysis of real-world administrative requirements revealed consistent gaps that clinical scoring alone did not capture, including absent billing codes, missing authorization duration requests, and inadequate follow-up plans. These findings reframe the question: the challenge for clinical deployment is not whether LLMs can write clinically adequate letters, but whether the systems built around them can supply the administrative precision that payer workflows require.