Agent Tuning & Optimization Relevance: 8/10

VERI-DPO: Evidence-Aware Alignment for Clinical Summarization via Claim Verification and Direct Preference Optimization

Weixin Liu, Congning Ni, Qingyuan Song, Susannah L. Rose, Christopher Symons, Murat Kantarcioglu, Bradley A. Malin, Zhijun Yin
arXiv: 2603.10494v1 Published: 2026-03-11 Updated: 2026-03-11

AI Summary

VERI-DPO improves the faithfulness of clinical summaries through claim verification and Direct Preference Optimization.

Main Contributions

  • Proposes the VERI-DPO framework to improve the faithfulness of clinical summaries
  • Mines preferences via claim verification and optimizes the summarizer with DPO
  • Validates effectiveness on the MIMIC-III-Ext-VeriFact-BHC dataset

Methodology

A retrieval-augmented verifier labels claim-evidence pairs, preferences are mined from the verifier's judgments, and the summary generator is optimized with DPO.
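
The single-token verdict format means the verifier emits one answer token per claim-evidence pair, so the label probabilities can be read directly from the next-token logits. Below is a minimal sketch of that step, assuming a causal-LM verifier served through Hugging Face transformers; the model name, prompt template, and letter-to-label mapping are illustrative placeholders, not the paper's released artifacts.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder verifier backbone
LABELS = {"A": "Supported", "B": "Not Supported", "C": "Not Addressed"}

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def verify_claim(claim: str, evidence: str) -> dict:
    """Score one claim-evidence pair; return probabilities over the three verdicts."""
    prompt = (
        "Given the EHR evidence, classify the claim.\n"
        f"Evidence: {evidence}\n"
        f"Claim: {claim}\n"
        "Answer with a single letter (A=Supported, B=Not Supported, C=Not Addressed): "
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]  # logits for the verdict token
    label_ids = [tokenizer.encode(letter, add_special_tokens=False)[0] for letter in LABELS]
    probs = torch.softmax(next_token_logits[label_ids], dim=-1)
    return {LABELS[letter]: p.item() for letter, p in zip(LABELS, probs)}
```

The per-claim margin, e.g. P(Supported) minus P(Not Supported), is what the preference-mining stage aggregates across a candidate summary's sentence-level claims.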

Original Abstract

Brief Hospital Course (BHC) narratives must be clinically useful yet faithful to fragmented EHR evidence. LLM-based clinical summarizers still introduce unsupported statements, and alignment can encourage omissions ("say-less" degeneration). We introduce VERI-DPO, which uses claim verification to mine preferences and distill them into the summarizer with Direct Preference Optimization (DPO). On MIMIC-III-Ext-VeriFact-BHC (100 ICU patients; patient-level splits), we train a retrieval-augmented verifier to label claim-evidence pairs as Supported, Not Supported, or Not Addressed via a single-token format. The verifier scores sentence-level claims from sampled BHC candidates and aggregates margins into a coverage-aware utility to mine length-controlled, contradiction-anchored preference pairs. On held-out patients, verifier-mined preferences separate candidates by contradiction density, and VERI-DPO reduces Not Supported claim rates from 10.7% to 1.9% (local verifier judge) and from 11.6% to 6.4% (GPT-4o judge), while improving validity from 76.7% to 82.5% and maintaining informative length.
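
The abstract describes the downstream steps only at a high level: per-claim margins are aggregated into a coverage-aware utility, preference pairs are mined under a length constraint and anchored on contradictions, and the summarizer is trained with DPO. The sketch below makes those steps concrete under stated assumptions; the utility form, the length tolerance, and the pairing rule are guesses for exposition, while the loss follows the standard DPO formulation (Rafailov et al., 2023).

```python
import torch.nn.functional as F

def coverage_aware_utility(margins, covered, n_evidence, lam_cov=1.0):
    """Aggregate per-claim verifier margins into a candidate-level score (assumed form).

    margins:    list of P(Supported) - P(Not Supported) for each sentence-level claim
    covered:    number of evidence items the candidate's claims address
    n_evidence: total evidence items available for the patient
    """
    support = sum(margins) / max(len(margins), 1)
    coverage = covered / max(n_evidence, 1)
    return support + lam_cov * coverage

def mine_preference_pairs(candidates, max_len_ratio=1.3):
    """Mine length-controlled, contradiction-anchored pairs from sampled candidates.

    Each candidate is a dict with 'text', 'utility', 'length' (tokens), and
    'n_contradictions' (count of Not Supported claims). A pair is kept only if the
    rejected side contains at least one contradiction and the two lengths are close,
    so DPO does not simply reward shorter ("say-less") outputs.
    """
    pairs = []
    for chosen in candidates:
        for rejected in candidates:
            if chosen is rejected:
                continue
            length_ok = (max(chosen["length"], rejected["length"])
                         <= max_len_ratio * max(min(chosen["length"], rejected["length"]), 1))
            if (chosen["utility"] > rejected["utility"]
                    and rejected["n_contradictions"] > 0
                    and length_ok):
                pairs.append((chosen["text"], rejected["text"]))
    return pairs

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO objective from summed sequence log-probabilities (policy vs. reference)."""
    chosen_reward = beta * (logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (logp_rejected - ref_logp_rejected)
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```

In practice the mined pairs would be distilled into the summarizer by applying a loss like the one above over minibatches of chosen/rejected completions, with the pre-DPO summarizer frozen as the reference model.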

Tags

Clinical Summarization, Claim Verification, Direct Preference Optimization, Large Language Models

arXiv Categories

cs.CL cs.LG