AI Agents Relevance: 7/10

DataJoint 2.0: A Computational Substrate for Agentic Scientific Workflows

Dimitri Yatsenko, Thinh T. Nguyen

arXiv: 2602.16585v1 Published: 2026-02-18 Updated: 2026-02-18


AI Summary

DataJoint 2.0 builds a computational substrate for scientific workflows, enabling SciOps that is queryable, enforceable, and machine-readable.

Key Contributions

Relational workflow model
Object-augmented schemas
Semantic matching
Extensible type system

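Methodology

Uses a relational database and extensions to it to unify data structure, data, and computational transformations, so that scientific workflows run under explicit control.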

Original Abstract

Operational rigor determines whether human-agent collaboration succeeds or fails. Scientific data pipelines need the equivalent of DevOps -- SciOps -- yet common approaches fragment provenance across disconnected systems without transactional guarantees. DataJoint 2.0 addresses this gap through the relational workflow model: tables represent workflow steps, rows represent artifacts, foreign keys prescribe execution order. The schema specifies not only what data exists but how it is derived -- a single formal system where data structure, computational dependencies, and integrity constraints are all queryable, enforceable, and machine-readable. Four technical innovations extend this foundation: object-augmented schemas integrating relational metadata with scalable object storage, semantic matching using attribute lineage to prevent erroneous joins, an extensible type system for domain-specific formats, and distributed job coordination designed for composability with external orchestration. By unifying data structure, data, and computational transformations, DataJoint creates a substrate for SciOps where agents can participate in scientific workflows without risking data corruption.
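
The core idea in the abstract (tables as workflow steps, rows as artifacts, foreign keys prescribing execution order) can be illustrated with a minimal sketch. This is not DataJoint's API: it uses only Python's standard-library `sqlite3` and `graphlib` (Python 3.9+), and the three-step pipeline with its table and column names (`session`, `recording`, `spike_sorting`) is a hypothetical example invented for illustration.

```python
import sqlite3
from graphlib import TopologicalSorter

# Hypothetical pipeline: each table is a workflow step, each row an artifact.
# Foreign keys declare which upstream step a row depends on.
SCHEMA = """
CREATE TABLE session (
    session_id INTEGER PRIMARY KEY
);
CREATE TABLE recording (
    session_id INTEGER PRIMARY KEY REFERENCES session(session_id),
    raw_path   TEXT NOT NULL
);
CREATE TABLE spike_sorting (
    session_id INTEGER PRIMARY KEY REFERENCES recording(session_id),
    n_units    INTEGER NOT NULL
);
"""


def execution_order(conn):
    """Derive a valid step order by reading foreign keys back from the schema."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    deps = {}
    for table in tables:
        # Each row of foreign_key_list is (id, seq, parent_table, from, to, ...).
        rows = conn.execute(f"PRAGMA foreign_key_list('{table}')")
        deps[table] = {fk[2] for fk in rows}
    return list(TopologicalSorter(deps).static_order())


def orphan_is_rejected(conn):
    """Return True if a downstream row without its upstream row is refused."""
    try:
        conn.execute("INSERT INTO spike_sorting VALUES (1, 12)")
    except sqlite3.IntegrityError:
        return True
    return False


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(SCHEMA)

order = execution_order(conn)
print(order)  # ['session', 'recording', 'spike_sorting']
rejected = orphan_is_rejected(conn)
print(rejected)  # True
```

Because the dependency graph is read back from the schema itself, the same declarations that enforce integrity also answer the question "what must run before what", which is the sense in which the abstract calls the schema queryable, enforceable, and machine-readable.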

arXiv Categories

cs.DB cs.AI
