AI Agents relevance: 6/10

Toward Scalable Automated Repository-Level Datasets for Software Vulnerability Detection

Amine Lbath

arXiv: 2603.17974v1 Published: 2026-03-18 Updated: 2026-03-18

AI Summary

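Proposes an automated method for generating repository-level vulnerability benchmarks, used to train and evaluate vulnerability detection models.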

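Main Contributions

Automated generation of repository-level vulnerability datasets
Injection of realistic vulnerabilities with reproducible proof-of-vulnerability (PoV) exploits
Adversarial co-evolution to improve detection robustness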

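Methodology

Builds an automated benchmark generator that injects vulnerabilities into real-world repositories and synthesizes a PoV for each one, then pairs injection and detection agents in an adversarial co-evolution loop.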

Original Abstract

Software vulnerabilities continue to grow in volume and remain difficult to detect in practice. Although learning-based vulnerability detection has progressed, existing benchmarks are largely function-centric and fail to capture realistic, executable, interprocedural settings. Recent repo-level security benchmarks demonstrate the importance of realistic environments, but their manual curation limits scale. This doctoral research proposes an automated benchmark generator that injects realistic vulnerabilities into real-world repositories and synthesizes reproducible proof-of-vulnerability (PoV) exploits, enabling precisely labeled datasets for training and evaluating repo-level vulnerability detection agents. We further investigate an adversarial co-evolution loop between injection and detection agents to improve robustness under realistic constraints.
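
One way to read "reproducible PoV" and "precisely labeled" is as a differential check: a candidate sample is kept only if its PoV input triggers the fault on the injected repository and does not trigger it on the original. The Python sketch below illustrates that idea under stated assumptions. The abstract does not describe the generator's data model or validation step, so `InjectedSample`, `is_reproducible`, `toy_triggers`, the field names, and the file `src/parse.c` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical: a repository snapshot is modelled as {relative_path: source_text}.
Repo = Dict[str, str]


@dataclass(frozen=True)
class InjectedSample:
    """One candidate dataset entry (all field names are illustrative)."""
    original: Repo        # repository before injection
    injected: Repo        # repository after injection
    cwe_id: str           # label for the injected weakness, e.g. "CWE-787"
    vulnerable_file: str  # file-level location label
    pov_input: bytes      # input that is supposed to trigger the vulnerability


def is_reproducible(sample: InjectedSample,
                    triggers: Callable[[Repo, bytes], bool]) -> bool:
    """Keep a sample only if the PoV fires on the injected repo and not on the original.

    `triggers` stands in for building and running the project (for example under
    a sanitizer). It is an assumption of this sketch, not an API from the paper.
    """
    return triggers(sample.injected, sample.pov_input) and not triggers(
        sample.original, sample.pov_input
    )


def toy_triggers(repo: Repo, pov_input: bytes) -> bool:
    """Toy oracle: the 'crash' needs a missing bounds check and an over-long input."""
    source = repo.get("src/parse.c", "")
    return "if (len > 16) return;" not in source and len(pov_input) > 16


original = {"src/parse.c": "void parse(char *s, int len) { if (len > 16) return; copy(s); }"}
injected = {"src/parse.c": "void parse(char *s, int len) { copy(s); }"}

sample = InjectedSample(original, injected, "CWE-787", "src/parse.c", b"A" * 32)
print(is_reproducible(sample, toy_triggers))
```

In a real pipeline the toy oracle would be replaced by an actual build-and-run harness. The point of the sketch is only that the label is attached to a sample after the PoV has been shown to separate the two versions.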

arXiv Categories

cs.SE cs.AI
