AI Agents Relevance: 9/10

PersonaTrace: Synthesizing Realistic Digital Footprints with LLM Agents

Minjia Wang, Yunfeng Wang, Xiao Ma, Dexin Lv, Qifan Guo, Lynn Zheng, Benliang Wang, Lei Wang, Jiannan Li, Yongwei Xing, David Xu, Zheng Sun
arXiv: 2603.11955v1 Published: 2026-03-12 Updated: 2026-03-12

AI Summary

Uses LLM agents to synthesize realistic digital footprints, addressing data scarcity and improving model performance on real-world tasks.

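Key Contributions

  • Proposes PersonaTrace, a method for generating realistic digital footprints
  • The synthesized dataset is more diverse and realistic than existing baselines
  • Models fine-tuned on the synthetic data outperform those trained on other synthetic datasets on real-world out-of-distribution tasks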

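Methodology

Starting from a structured user profile, LLM agents generate a sequence of user events and then produce the corresponding digital artifacts (emails, messages, calendar entries, reminders).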
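
The profile-to-events-to-artifacts flow can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the paper's implementation: the names UserProfile, Event, Artifact, generate_events, generate_artifacts, and synthesize_footprint are hypothetical, the prompts are invented, and the LLM is abstracted as any function from a prompt string to text. A deterministic stub stands in for a real model so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical abstraction: any function that maps a prompt to generated text.
LLM = Callable[[str], str]


@dataclass
class UserProfile:
    # Assumed profile fields; the paper's actual schema may differ.
    name: str
    occupation: str
    interests: List[str]


@dataclass
class Event:
    description: str


@dataclass
class Artifact:
    kind: str      # e.g. "email", "message", "calendar entry", "reminder"
    content: str


def generate_events(profile: UserProfile, llm: LLM, n_events: int) -> List[Event]:
    """Ask the LLM for up to n_events plausible events, one per line."""
    prompt = (
        f"List {n_events} plausible events, one per line, for {profile.name}, "
        f"a {profile.occupation} interested in {', '.join(profile.interests)}."
    )
    lines = [ln.strip() for ln in llm(prompt).splitlines() if ln.strip()]
    return [Event(ln) for ln in lines[:n_events]]


def generate_artifacts(
    profile: UserProfile, event: Event, llm: LLM, kinds: Sequence[str]
) -> List[Artifact]:
    """Produce one artifact of each requested kind for a single event."""
    return [
        Artifact(kind, llm(f"Write a {kind} for {profile.name} about: {event.description}"))
        for kind in kinds
    ]


def synthesize_footprint(
    profile: UserProfile,
    llm: LLM,
    n_events: int = 3,
    kinds: Sequence[str] = ("email", "reminder"),
) -> List[Artifact]:
    """Profile -> events -> artifacts, flattened into one list in event order."""
    footprint: List[Artifact] = []
    for event in generate_events(profile, llm, n_events):
        footprint.extend(generate_artifacts(profile, event, llm, kinds))
    return footprint


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model, for demonstration only."""
    if prompt.startswith("List"):
        return "Team meeting\nDentist appointment\nWeekend hike"
    return f"[generated text for: {prompt}]"
```

Calling synthesize_footprint(UserProfile("Alex", "engineer", ["hiking"]), stub_llm) returns a flat list of Artifact objects, two per event with the default kinds. Passing the model in as a plain callable keeps the sketch independent of any particular LLM client.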

Original Abstract

Digital footprints (records of individuals' interactions with digital systems) are essential for studying behavior, developing personalized applications, and training machine learning models. However, research in this area is often hindered by the scarcity of diverse and accessible data. To address this limitation, we propose a novel method for synthesizing realistic digital footprints using large language model (LLM) agents. Starting from a structured user profile, our approach generates diverse and plausible sequences of user events, ultimately producing corresponding digital artifacts such as emails, messages, calendar entries, reminders, etc. Intrinsic evaluation results demonstrate that the generated dataset is more diverse and realistic than existing baselines. Moreover, models fine-tuned on our synthetic data outperform those trained on other synthetic datasets when evaluated on real-world out-of-distribution tasks.

Tags

LLM Agent, Data Synthesis, Digital Footprints, Behavior Modeling

arXiv Categories

cs.CL