Leveraging Large Language Models for Causal Discovery: a Constraint-based, Argumentation-driven Approach
AI Summary
Uses LLMs as imperfect experts, combined with the Causal ABA framework, for causal discovery, and proposes an evaluation protocol.
Key Contributions
- Proposes using LLMs as a source of semantic structural priors in Causal ABA
- Integrates conditional-independence evidence to improve causal discovery performance
- Introduces an evaluation protocol to mitigate memorisation bias when evaluating LLMs for causal discovery
Methodology
Combines LLM-elicited semantic priors with conditional-independence evidence, constructing causal graphs through the Causal ABA framework.
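To make the combination concrete, here is a minimal sketch, not the paper's actual Causal ABA encoding: it treats an LLM-elicited prior and a CI-test result as hard constraints and keeps only the candidate DAGs over three hypothetical variables {X, Y, Z} that satisfy both. All variable names and constraints are illustrative assumptions.

```python
# Hypothetical sketch: filter candidate DAGs by an LLM prior and CI evidence.
from itertools import chain, combinations

NODES = ["X", "Y", "Z"]
ALL_EDGES = [(a, b) for a in NODES for b in NODES if a != b]

def is_acyclic(edges):
    # Kahn-style check: repeatedly remove nodes with no incoming edges.
    edges, nodes = set(edges), set(NODES)
    while nodes:
        sources = {n for n in nodes if not any(v == n for _, v in edges)}
        if not sources:
            return False  # every remaining node has an incoming edge: cycle
        nodes -= sources
        edges = {(u, v) for u, v in edges if u not in sources}
    return True

def satisfies(edges):
    # Illustrative LLM semantic prior: the edge X -> Y must be present.
    if ("X", "Y") not in edges:
        return False
    # Illustrative CI evidence: a test found Z independent of Y,
    # so any Z-Y adjacency is forbidden.
    if ("Z", "Y") in edges or ("Y", "Z") in edges:
        return False
    return True

# Enumerate every subset of directed edges and keep the consistent DAGs.
candidates = [
    es for es in chain.from_iterable(
        combinations(ALL_EDGES, k) for k in range(len(ALL_EDGES) + 1))
    if is_acyclic(es) and satisfies(es)
]
print(len(candidates))  # DAGs consistent with both evidence sources
```

The actual framework performs this reconciliation via symbolic argumentation rather than brute-force enumeration, but the sketch shows the shared idea: both knowledge sources shrink the space of admissible graphs.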
Original Abstract
Causal discovery seeks to uncover causal relations from data, typically represented as causal graphs, and is essential for predicting the effects of interventions. While expert knowledge is required to construct principled causal graphs, many statistical methods have been proposed to leverage observational data with varying formal guarantees. Causal Assumption-based Argumentation (ABA) is a framework that uses symbolic reasoning to ensure correspondence between input constraints and output graphs, while offering a principled way to combine data and expertise. We explore the use of large language models (LLMs) as imperfect experts for Causal ABA, eliciting semantic structural priors from variable names and descriptions and integrating them with conditional-independence evidence. Experiments on standard benchmarks and semantically grounded synthetic graphs demonstrate state-of-the-art performance, and we additionally introduce an evaluation protocol to mitigate memorisation bias when assessing LLMs for causal discovery.