ContextClaim: A Context-Driven Paradigm for Verifiable Claim Detection
AI Summary
ContextClaim improves verifiable claim detection by incorporating external knowledge, and is evaluated across different datasets and models.
Main Contributions
- Proposes the ContextClaim paradigm, which brings retrieval into the claim detection stage
- Retrieves contextual information from Wikipedia to support judgments of claim verifiability
- Validates the effectiveness of context augmentation across different datasets and models
Methodology
Entities are extracted from the claim, relevant information is retrieved from Wikipedia, and a large language model generates a concise contextual summary, which is then used for claim classification.
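The pipeline described above can be sketched as a chain of pluggable components. This is a minimal illustration, not the authors' implementation: the naive regex entity extractor and the `retrieve`/`summarize`/`classify` callables are placeholders for a real NER model, a Wikipedia lookup (e.g. via the MediaWiki API), an LLM summarization prompt, and a trained classifier.

```python
import re

def extract_entities(claim: str) -> list[str]:
    # Placeholder: capitalized spans stand in for a real NER model.
    return re.findall(r"[A-Z][a-z]+(?:\s[A-Z][a-z]+)*", claim)

def contextclaim(claim, retrieve, summarize, classify):
    """Sketch of the ContextClaim pipeline with pluggable components."""
    entities = extract_entities(claim)
    # Retrieve one passage per entity (e.g. the lead section of its Wikipedia page).
    passages = [retrieve(entity) for entity in entities]
    # Condense the retrieved passages into a short, claim-focused context summary.
    context = summarize(claim, passages)
    # Judge verifiability from the claim plus its augmented context.
    return classify(claim, context)
```

A toy run with stub components shows the data flow: each entity yields a passage, the passages are summarized, and the classifier sees both the claim and the summary.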
Original Abstract
Verifiable claim detection asks whether a claim expresses a factual statement that can, in principle, be assessed against external evidence. As an early filtering stage in automated fact-checking, it plays an important role in reducing the burden on downstream verification components. However, existing approaches to claim detection, whether based on check-worthiness or verifiability, rely solely on the claim text itself. This is a notable limitation for verifiable claim detection in particular, where determining whether a claim is checkable may benefit from knowing what entities and events it refers to and whether relevant information exists to support verification. Inspired by the established role of evidence retrieval in later-stage claim verification, we propose Context-Driven Claim Detection (ContextClaim), a paradigm that advances retrieval to the detection stage. ContextClaim extracts entity mentions from the input claim, retrieves relevant information from Wikipedia as a structured knowledge source, and employs large language models to produce concise contextual summaries for downstream classification. We evaluate ContextClaim on two datasets covering different topics and text genres, the CheckThat! 2022 COVID-19 Twitter dataset and the PoliClaim political debate dataset, across encoder-only and decoder-only models under fine-tuning, zero-shot, and few-shot settings. Results show that context augmentation can improve verifiable claim detection, although its effectiveness varies across domains, model architectures, and learning settings. Through component analysis, human evaluation, and error analysis, we further examine when and why the retrieved context contributes to more reliable verifiability judgments.