ECG-R1: Protocol-Guided and Modality-Agnostic MLLM for Reliable ECG Interpretation
AI Summary
ECG-R1 improves the reliability of MLLMs for ECG interpretation through protocol guidance and modality decoupling.
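Main Contributions
- Proposes a protocol-guided method for generating instruction data
- Designs a modality-decoupled architecture that improves robustness and cross-modal consistency
- Uses reinforcement learning with diagnostic evidence rewards to strengthen evidence-grounded ECG interpretation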
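Methodology
The authors build a protocol-guided instruction dataset, adopt a modality-decoupled architecture, and use reinforcement learning to optimize evidence-grounded ECG interpretation.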
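To make the idea of an evidence-based reward concrete, here is a minimal sketch. The summary does not specify the reward actually used in ECG-R1, so everything below is an assumption for illustration: the function name `evidence_reward`, the `evidence_weight` parameter, exact string matching of evidence items, and the linear mix of evidence recall with diagnosis correctness are all hypothetical.

```python
from typing import Iterable


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not matter."""
    return " ".join(text.lower().split())


def evidence_reward(
    predicted_evidence: Iterable[str],
    reference_evidence: Iterable[str],
    predicted_dx: str,
    reference_dx: str,
    evidence_weight: float = 0.5,  # hypothetical mixing weight, not from the paper
) -> float:
    """Toy reward: weighted sum of evidence recall and diagnosis correctness."""
    pred = {_normalize(e) for e in predicted_evidence}
    ref = {_normalize(e) for e in reference_evidence}
    # Fraction of reference evidence items that the model actually cited.
    recall = len(pred & ref) / len(ref) if ref else 0.0
    # 1.0 when the final diagnosis matches the reference label, else 0.0.
    dx_correct = 1.0 if _normalize(predicted_dx) == _normalize(reference_dx) else 0.0
    return evidence_weight * recall + (1.0 - evidence_weight) * dx_correct
```

Under these assumptions, a response that reaches the right diagnosis but cites only half of the reference evidence scores lower than one that cites all of it, which is the behavior an evidence-grounded reward is meant to encourage.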
Original Abstract
Electrocardiography (ECG) serves as an indispensable diagnostic tool in clinical practice, yet existing multimodal large language models (MLLMs) remain unreliable for ECG interpretation, often producing plausible but clinically incorrect analyses. To address this, we propose ECG-R1, the first reasoning MLLM designed for reliable ECG interpretation via three innovations. First, we construct the interpretation corpus using Protocol-Guided Instruction Data Generation, grounding interpretation in measurable ECG features and monograph-defined quantitative thresholds and diagnostic logic. Second, we present a modality-decoupled architecture with Interleaved Modality Dropout to improve robustness and cross-modal consistency when either the ECG signal or ECG image is missing. Third, we present Reinforcement Learning with ECG Diagnostic Evidence Rewards to strengthen evidence-grounded ECG interpretation. Additionally, we systematically evaluate the ECG interpretation capabilities of proprietary, open-source, and medical MLLMs, and provide the first quantitative evidence that severe hallucinations are widespread, suggesting that the public should not directly trust these outputs without independent verification. Code and data are publicly available at https://github.com/PKUDigitalHealth/ECG-R1, and an online platform can be accessed at http://ai.heartvoice.com.cn/ECG-R1/.
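
The abstract says the architecture is trained to stay robust when either the ECG signal or the ECG image is missing. A generic way to get that behavior is modality dropout during training, sketched below. This is not the paper's Interleaved Modality Dropout, whose schedule the abstract does not describe; the function name `modality_dropout`, the `p_drop` parameter, and the 50/50 choice of which modality to hide are assumptions for illustration only.

```python
import random
from typing import Optional, Sequence, Tuple

Embedding = Sequence[float]


def modality_dropout(
    signal: Optional[Embedding],
    image: Optional[Embedding],
    p_drop: float,  # hypothetical probability of hiding one modality
    rng: random.Random,
) -> Tuple[Optional[Embedding], Optional[Embedding]]:
    """Randomly hide at most one modality for a training example; never hide both."""
    # If a modality is already missing, keep the remaining one untouched.
    if signal is None or image is None:
        return signal, image
    # With probability 1 - p_drop, keep both modalities.
    if rng.random() >= p_drop:
        return signal, image
    # Otherwise hide exactly one modality, chosen uniformly (an assumption).
    if rng.random() < 0.5:
        return None, image
    return signal, None
```

A training loop would call this once per example before the encoders, so the model regularly sees signal-only, image-only, and paired inputs.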