AI Agents Relevance: 9/10

WebChain: A Large-Scale Human-Annotated Dataset of Real-World Web Interaction Traces

Sicheng Fan, Rui Wan, Yifei Leng, Gaoning Liang, Li Ling, Yanyi Shang, Dehan Kong
arXiv: 2603.05295v1 Published: 2026-03-05 Updated: 2026-03-05

AI Summary

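The WebChain dataset provides large-scale, human-annotated interaction trajectories from real-world websites to accelerate web agent research, and the authors propose a Dual Mid-Training recipe built on it.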

Key Contributions

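  • Builds WebChain, a large-scale, human-annotated dataset of web interactions
  • Provides multimodal supervision through Triple Alignment of visual, structural, and action data
  • Proposes the Dual Mid-Training strategy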

Methodology

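The authors collect interaction data from real websites, annotate each step with a triple alignment of visual, structural, and action information, and train with Dual Mid-Training, which decouples spatial grounding from planning.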
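To make the triple alignment and the grounding/planning split concrete, here is a minimal Python sketch. It is an illustration only, not the WebChain schema or the authors' training recipe: the `Step` and `Trajectory` classes, all field names, the file names, and the two helper functions are hypothetical and were chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Step:
    """One interaction step with three aligned views (hypothetical schema)."""
    screenshot_path: str   # visual: the page as the annotator saw it
    dom_snapshot: str      # structural: serialized DOM or similar tree
    action_type: str       # action: e.g. "click", "type", "scroll"
    # Target element box as (x, y, width, height) in screenshot pixels.
    target_bbox: Optional[Tuple[int, int, int, int]] = None
    text: Optional[str] = None  # typed text, if the action has any


@dataclass
class Trajectory:
    task: str
    steps: List[Step] = field(default_factory=list)


def grounding_examples(traj: Trajectory):
    """Spatial-grounding view: (screenshot, target box) pairs, no task context."""
    return [
        (s.screenshot_path, s.target_bbox)
        for s in traj.steps
        if s.target_bbox is not None
    ]


def planning_examples(traj: Trajectory):
    """Planning view: (task, prior action types, next action type) triples."""
    examples = []
    for i, s in enumerate(traj.steps):
        history = [p.action_type for p in traj.steps[:i]]
        examples.append((traj.task, history, s.action_type))
    return examples


# Hypothetical two-step trajectory used only to exercise the helpers.
demo = Trajectory(
    task="Search for a laptop",
    steps=[
        Step("step0.png", "<input id='q'>", "click", (10, 20, 200, 30)),
        Step("step1.png", "<input id='q'>", "type", None, "laptop"),
    ],
)
```

In this sketch, `grounding_examples(demo)` yields only the steps that have a target box, while `planning_examples(demo)` yields one example per step with the preceding action history, so the same annotated trajectory can feed two separate training signals.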

Original Abstract

We introduce WebChain, the largest open-source dataset of human-annotated trajectories on real-world websites, designed to accelerate reproducible research in web agents. It contains 31,725 trajectories and 318k steps, featuring a core Triple Alignment of visual, structural, and action data to provide rich, multi-modal supervision. The data is collected via a scalable pipeline that ensures coverage of complex, high-value tasks often missed by synthetic methods. Leveraging this dataset, we propose a Dual Mid-Training recipe that decouples spatial grounding from planning, achieving state-of-the-art performance on our proposed WebChainBench and other public GUI benchmarks. Our work provides the data and insights necessary to build and rigorously evaluate the next generation of scalable web agents.

Tags

Web Agent, Dataset, Multimodal Learning, Reinforcement Learning

arXiv Categories

cs.AI cs.CV