Is Clinical Text Enough? A Multimodal Study on Mortality Prediction in Heart Failure Patients
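AI Summary
Studies multimodal Transformers for short-term mortality prediction in heart failure patients and compares them against LLM-based approaches.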
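Key Contributions
- Evaluates text-only, structured-only, multimodal, and LLM-based approaches for heart failure mortality prediction
- Shows that enriching clinical text with entity-level representations improves prediction over CLS embeddings alone
- Finds that supervised multimodal fusion performs best overall, while LLM performance is inconsistent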
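To make the second contribution concrete, here is a minimal sketch of one way an entity-aware text representation could be built: the CLS vector is concatenated with the mean of the token embeddings that fall inside recognized entity spans. The function names, the mean-pooling choice, and the half-open `(start, end)` span format are assumptions made for illustration; the abstract does not specify how the paper combines entity and CLS representations.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a non-empty sequence of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def entity_aware_representation(
    token_embeddings: Sequence[Vector],
    entity_spans: Sequence[Tuple[int, int]],
) -> Vector:
    """Concatenate the CLS vector with a pooled entity vector.

    Assumptions (hypothetical, not taken from the paper):
    - token_embeddings[0] is the CLS embedding;
    - entity_spans are half-open (start, end) token index ranges;
    - entity tokens are combined by simple mean pooling.
    """
    cls_vec = list(token_embeddings[0])
    entity_tokens = [
        token_embeddings[i]
        for start, end in entity_spans
        for i in range(start, end)
    ]
    if entity_tokens:
        entity_vec = mean_pool(entity_tokens)
    else:
        # No entities found: fall back to a zero vector of the same size.
        entity_vec = [0.0] * len(cls_vec)
    return cls_vec + entity_vec


# Toy example: 4 tokens with 2-dimensional embeddings, one entity on tokens 1-2.
tokens = [[1.0, 0.0], [2.0, 4.0], [4.0, 8.0], [9.0, 9.0]]
representation = entity_aware_representation(tokens, [(1, 3)])
```

In practice the token embeddings would come from a clinical text encoder and the spans from a named-entity recognizer; both are outside the scope of this sketch.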
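Methodology
Transformer-based models are compared across data modalities (clinical text and structured data) and their fusion, alongside LLM prompting approaches.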
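The fusion step described above can be sketched as follows. This is a simplified illustration, assuming a concatenation-based fusion followed by a logistic prediction head; the abstract does not state which fusion architecture the authors use, and every name here (`standardize`, `fuse`, `predict_mortality_risk`) is hypothetical.

```python
import math
from typing import List, Sequence


def standardize(
    values: Sequence[float], means: Sequence[float], stds: Sequence[float]
) -> List[float]:
    """Z-score structured variables using statistics from the training set."""
    return [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(values, means, stds)]


def fuse(text_vec: Sequence[float], structured_vec: Sequence[float]) -> List[float]:
    """Fusion by concatenation (an assumption; other fusion schemes exist)."""
    return list(text_vec) + list(structured_vec)


def predict_mortality_risk(
    fused: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Logistic head over the fused vector; returns a probability in [0, 1]."""
    if len(fused) != len(weights):
        raise ValueError("fused vector and weights must have the same length")
    logit = bias + sum(w * x for w, x in zip(weights, fused))
    # Numerically stable sigmoid.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


# Toy example with made-up numbers: a 1-dimensional text vector and two
# structured variables, standardized with made-up training statistics.
structured = standardize([70.0, 30.0], means=[60.0, 40.0], stds=[10.0, 5.0])
fused = fuse([0.5], structured)
risk = predict_mortality_risk(fused, weights=[0.0, 0.0, 0.0], bias=0.0)
```

In a supervised setting, the weights and bias would be learned jointly with (or on top of) the text encoder from labeled outcomes, which is what distinguishes this setup from prompting an LLM with the same inputs.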
Original Abstract
Accurate short-term mortality prediction in heart failure (HF) remains challenging, particularly when relying on structured electronic health record (EHR) data alone. We evaluate transformer-based models on a French HF cohort, comparing text-only, structured-only, multimodal, and LLM-based approaches. Our results show that enriching clinical text with entity-level representations improves prediction over CLS embeddings alone, and that supervised multimodal fusion of text and structured variables achieves the best overall performance. In contrast, large language models perform inconsistently across modalities and decoding strategies, with text-only prompts outperforming structured or multimodal inputs. These findings highlight that entity-aware multimodal transformers offer the most reliable solution for short-term HF outcome prediction, while current LLM prompting remains limited for clinical decision support.