LLM Memory & RAG Relevance: 8/10

GitSearch: Enhancing Community Notes Generation with Gap-Informed Targeted Search

Sahajpreet Singh, Kokil Jaidka, Min-Yen Kan
arXiv: 2602.08945v1 Published: 2026-02-09 Updated: 2026-02-09

AI Summary

GitSearch improves Community Notes generation by identifying information gaps and retrieving relevant information to fill them.

Key Contributions

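  • Proposes the GitSearch framework to improve Community Notes generation
  • Builds PolBench, a benchmark of 78,698 U.S. political tweets with their associated Community Notes
  • Shows experimentally that GitSearch outperforms existing methods and human-authored helpful notes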

Methodology

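GitSearch has three stages: identifying information deficits, real-time targeted web retrieval to resolve them, and synthesis of platform-compliant notes.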
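
The three stages can be sketched as a small pipeline. This is only an illustrative outline, not the authors' implementation: the names `Gap`, `identify_gaps`, `retrieve`, `synthesize_note`, and `generate_note`, the link-based gap heuristic, and the 280-character limit are all assumptions made for the example. A real system would presumably use a language model for stages 1 and 3 and a live web search for stage 2; here the search function is passed in by the caller.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumed note length limit for illustration; not taken from the paper.
MAX_NOTE_CHARS = 280


@dataclass
class Gap:
    kind: str   # hypothetical gap label, e.g. "missing_context"
    query: str  # search query intended to resolve the gap


def identify_gaps(tweet: str) -> List[Gap]:
    """Stage 1 (placeholder heuristic): treat a tweet with no link as lacking context."""
    gaps: List[Gap] = []
    if "http" not in tweet:
        gaps.append(Gap(kind="missing_context", query=tweet[:80]))
    return gaps


def retrieve(gaps: List[Gap], search: Callable[[str], List[str]]) -> List[str]:
    """Stage 2: run one targeted search per gap and collect the returned snippets."""
    evidence: List[str] = []
    for gap in gaps:
        evidence.extend(search(gap.query))
    return evidence


def synthesize_note(evidence: List[str]) -> str:
    """Stage 3: join the evidence into a note and truncate it to the assumed limit."""
    if not evidence:
        return ""
    note = "Context: " + " ".join(evidence)
    return note[:MAX_NOTE_CHARS]


def generate_note(tweet: str, search: Callable[[str], List[str]]) -> str:
    """Chain the three stages; returns an empty string when no gap is found."""
    return synthesize_note(retrieve(identify_gaps(tweet), search))
```

Calling `generate_note(tweet, search)` with any function that maps a query string to a list of text snippets runs all three stages. Keeping the search backend as a parameter makes the sketch testable without network access.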

Original Abstract

Community-based moderation offers a scalable alternative to centralized fact-checking, yet it faces significant structural challenges, and existing AI-based methods fail in "cold start" scenarios. To tackle these challenges, we introduce GitSearch (Gap-Informed Targeted Search), a framework that treats human-perceived quality gaps, such as missing context, etc., as first-class signals. GitSearch has a three-stage pipeline: identifying information deficits, executing real-time targeted web-retrieval to resolve them, and synthesizing platform-compliant notes. To facilitate evaluation, we present PolBench, a benchmark of 78,698 U.S. political tweets with their associated Community Notes. We find GitSearch achieves 99% coverage, almost doubling coverage over the state-of-the-art. GitSearch surpasses human-authored helpful notes with a 69% win rate and superior helpfulness scores (3.87 vs. 3.36), demonstrating retrieval effectiveness that balanced the trade-off between scale and quality.

Tags

Community Notes, Information Retrieval, Information Gaps, Automatic Generation

arXiv Categories

cs.CL cs.CY