Modeling Expert AI Diagnostic Alignment via Immutable Inference Snapshots
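AI Summary
This paper proposes a framework that models diagnostic alignment between expert and AI by preserving the AI output as an immutable inference snapshot, improving the human-aligned evaluation of clinical decision support systems.
Key Contributions
- Proposes a diagnostic alignment framework built on immutable inference states
- Combines a VLM, BERT, and SLMI for medical entity extraction and inference
- Validates the framework through a multi-level concordance evaluation
Methodology
Uses a vision-enabled LLM, BERT for entity extraction, and SLMI for inference, followed by a multi-level concordance evaluation (PMR, AMR, CCR).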
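The idea of an immutable inference state compared against an expert outcome can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the authors' implementation: the `DiagnosticReport` class, its fields, the `normalize` rule, and the two toy cases are hypothetical, and the overlap check is only a rough stand-in for the paper's cross-category and differential analysis, whose exact definition the summary does not give.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Tuple


@dataclass(frozen=True)  # frozen=True blocks attribute assignment after creation
class DiagnosticReport:
    """Hypothetical record for one report (AI snapshot or expert outcome)."""
    case_id: str
    primary: str
    differentials: Tuple[str, ...] = ()


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(label.lower().split())


def exact_primary_match(ai: DiagnosticReport, expert: DiagnosticReport) -> bool:
    """True when the primary diagnoses agree after normalization."""
    return normalize(ai.primary) == normalize(expert.primary)


def any_overlap(ai: DiagnosticReport, expert: DiagnosticReport) -> bool:
    """True when primary plus differentials share at least one label (assumed rule)."""
    ai_labels = {normalize(x) for x in (ai.primary, *ai.differentials)}
    expert_labels = {normalize(x) for x in (expert.primary, *expert.differentials)}
    return bool(ai_labels & expert_labels)


def rate(flags) -> float:
    """Fraction of True values in an iterable of booleans."""
    flags = list(flags)
    return sum(flags) / len(flags)


# Hypothetical toy pairs (AI snapshot, expert outcome); not data from the paper.
pairs = [
    (DiagnosticReport("c1", "Psoriasis", ("eczema",)),
     DiagnosticReport("c1", "psoriasis")),
    (DiagnosticReport("c2", "Eczema", ("tinea corporis",)),
     DiagnosticReport("c2", "Tinea corporis", ("eczema",))),
]

pmr = rate(exact_primary_match(a, e) for a, e in pairs)
overlap_rate = rate(any_overlap(a, e) for a, e in pairs)

# The snapshot cannot be edited in place; a correction must be a new record.
try:
    pairs[0][0].primary = "edited"
    snapshot_is_immutable = False
except FrozenInstanceError:
    snapshot_is_immutable = True

print(pmr, overlap_rate, snapshot_is_immutable)
```

Keeping the AI report frozen means the expert's correction is stored as a separate record, so the difference between the two stays available for later analysis instead of being overwritten.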
Original Abstract
Human-in-the-loop validation is essential in safety-critical clinical AI, yet the transition between initial model inference and expert correction is rarely analyzed as a structured signal. We introduce a diagnostic alignment framework in which the AI-generated image-based report is preserved as an immutable inference state and systematically compared with the physician-validated outcome. The inference pipeline integrates a vision-enabled large language model, BERT-based medical entity extraction, and a Sequential Language Model Inference (SLMI) step to enforce domain-consistent refinement prior to expert review. Evaluation on 21 dermatological cases (21 complete AI-physician pairs) employed a four-level concordance framework comprising exact primary match rate (PMR), semantic similarity-adjusted rate (AMR), cross-category alignment, and Comprehensive Concordance Rate (CCR). Exact agreement reached 71.4% and remained unchanged under semantic similarity (t = 0.60), while structured cross-category and differential overlap analysis yielded 100% comprehensive concordance (95% CI: [83.9%, 100%]). No cases demonstrated complete diagnostic divergence. These findings show that binary lexical evaluation substantially underestimates clinically meaningful alignment. Modeling expert validation as a structured transformation enables signal-aware quantification of correction dynamics and supports traceable, human-aligned evaluation of image-based clinical decision support systems.
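The reported figures can be sanity-checked with a few lines of arithmetic. The sketch below assumes the interval is an exact (Clopper-Pearson) binomial interval, which the abstract does not state; for the special case where all n cases are concordant, the lower bound of that interval reduces to (α/2)^(1/n). The count of 15 exact matches is also an inference: it is the only whole number out of 21 that rounds to 71.4%. The function name is hypothetical.

```python
def exact_ci_all_successes(n: int, alpha: float = 0.05):
    """Two-sided Clopper-Pearson interval when all n trials succeed.

    With x == n the upper bound is 1 and the lower bound solves
    p**n == alpha / 2, giving p = (alpha / 2) ** (1 / n).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    lower = (alpha / 2) ** (1 / n)
    return lower, 1.0


n_cases = 21

# Comprehensive concordance: 21 of 21 cases.
ccr_lower, ccr_upper = exact_ci_all_successes(n_cases)
ccr_lower_pct = round(ccr_lower * 100, 1)

# Exact primary match: 15 of 21 is assumed from the reported 71.4%.
pmr_pct = round(15 / n_cases * 100, 1)

print(f"CCR 95% CI: [{ccr_lower_pct}%, {ccr_upper * 100:.0f}%]")
print(f"PMR: {pmr_pct}%")
```

Under these assumptions the computed lower bound rounds to 83.9%, matching the abstract, and it shows how wide the interval remains with only 21 cases even when every case is concordant.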