Safe and Scalable Web Agent Learning via Recreated Websites
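AI Summary
Proposes VeriEnv, a framework that clones websites into verifiable synthetic environments so that web agents can be trained safely and efficiently.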
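Key Contributions
- Proposes the VeriEnv framework for creating safe, verifiable training environments for web agents
- Uses language models to automatically clone real-world websites into synthetic environments
- Exposes controlled internal access via a Python SDK, enabling self-generated tasks and programmatic rewards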
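Methodology
Uses LLMs to clone real-world websites into controllable synthetic environments, and designs programmatic reward mechanisms on top of them for agent training.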
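
To make the idea of a programmatic reward concrete, here is a minimal Python sketch. It is not the VeriEnv SDK, whose API is not described in this summary. `ToyShopEnv`, `Task`, and `reward` are hypothetical names introduced only for illustration, and the "website" is reduced to an in-memory cart. It shows how a task can carry a deterministic check on the environment's internal state, so that success is decided by code rather than by a heuristic or an LLM judge.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ToyShopEnv:
    """Hypothetical stand-in for the backend state of a cloned website."""
    cart: Dict[str, int] = field(default_factory=dict)

    def reset(self) -> None:
        # A synthetic environment can be reset cheaply, unlike a live site.
        self.cart.clear()

    def add_to_cart(self, item: str, qty: int = 1) -> None:
        # Stands in for the side effect of an agent's UI action.
        self.cart[item] = self.cart.get(item, 0) + qty


@dataclass
class Task:
    """A task paired with a deterministic verifier (assumed structure)."""
    instruction: str
    verify: Callable[[ToyShopEnv], bool]


def reward(env: ToyShopEnv, task: Task) -> float:
    # Programmatic reward: 1.0 if the verifier passes on the final state, else 0.0.
    return 1.0 if task.verify(env) else 0.0


task = Task(
    instruction="Add two units of 'usb-cable' to the cart.",
    verify=lambda env: env.cart.get("usb-cable") == 2,
)

env = ToyShopEnv()

# Episode 1: the (pretend) agent completes the task correctly.
env.reset()
env.add_to_cart("usb-cable", 2)
success = reward(env, task)

# Episode 2: the agent adds the wrong quantity.
env.reset()
env.add_to_cart("usb-cable", 1)
failure = reward(env, task)
```

Because the verifier inspects internal state directly, the same episode always receives the same reward. The summary highlights this property when it contrasts programmatic rewards with judge-based feedback.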
Original Abstract
Training autonomous web agents is fundamentally limited by the environments they learn from: real-world websites are unsafe to explore, hard to reset, and rarely provide verifiable feedback. We propose VeriEnv, a framework that treats language models as environment creators, automatically cloning real-world websites into fully executable, verifiable synthetic environments. By exposing controlled internal access via a Python SDK, VeriEnv enables agents to self-generate tasks with deterministic, programmatically verifiable rewards, eliminating reliance on heuristic or LLM-based judges. This design decouples agent learning from unsafe real-world interaction while enabling scalable self-evolution through environment expansion. Through experiments on web agent benchmarks, we show that agents trained with VeriEnv generalize to unseen websites, achieve site-specific mastery through self-evolving training, and benefit from scaling the number of training environments. Code and resources will be released at https://github.com/kyle8581/VeriEnv upon acceptance.