LLM Reasoning relevance: 6/10

FlexMS is a flexible framework for benchmarking deep learning-based mass spectrum prediction tools in metabolomics

Yunhua Zhong, Yixuan Tang, Yifan Li, Jie Yang, Pan Liu, Jun Xia
arXiv: 2602.22822v1 Published: 2026-02-26 Updated: 2026-02-26

AI Summary

FlexMS is a flexible benchmark framework for evaluating deep learning-based mass spectrum prediction tools in metabolomics.

Main Contributions

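  • Builds FlexMS, a benchmark framework for mass spectrum prediction
  • Supports dynamic construction and evaluation of many model architectures
  • Analyzes the factors that influence performance and offers practical guidance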

Methodology

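The authors build a benchmark framework, train and test models on preprocessed public datasets, evaluate model performance with several metrics, and carry out a cross-validation analysis.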
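The summary does not name the metrics used. As an illustration only, the sketch below shows one metric commonly used to compare a predicted spectrum with a measured one: cosine similarity over binned peak intensities. The function names, the 1.0 bin width, the 1000 m/z range, and the toy peak lists are all assumptions made for this example and are not taken from FlexMS.

```python
import math


def bin_spectrum(peaks, bin_width=1.0, max_mz=1000.0):
    """Sum (m/z, intensity) peaks into fixed-width m/z bins.

    bin_width and max_mz are illustrative defaults, not FlexMS settings.
    """
    n_bins = int(math.ceil(max_mz / bin_width))
    vec = [0.0] * n_bins
    for mz, intensity in peaks:
        if 0.0 <= mz < max_mz:
            # Clamp the index to guard against floating-point edge cases.
            idx = min(int(mz // bin_width), n_bins - 1)
            vec[idx] += intensity
    return vec


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy spectra (hypothetical values): lists of (m/z, intensity) pairs.
predicted = [(50.2, 10.0), (77.1, 100.0), (105.0, 40.0)]
measured = [(50.4, 12.0), (77.3, 90.0), (120.0, 5.0)]

score = cosine_similarity(bin_spectrum(predicted), bin_spectrum(measured))
print(f"cosine similarity: {score:.3f}")
```

Binning makes the comparison tolerant of small m/z shifts at the cost of resolution, so the bin width is itself a parameter a benchmark would need to fix across models.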

Original Abstract

The identification and property prediction of chemical molecules is of central importance in the advancement of drug discovery and material science, where the tandem mass spectrometry technology gives valuable fragmentation cues in the form of mass-to-charge ratio peaks. However, the lack of experimental spectra hinders the attachment of each molecular identification, and thus urges the establishment of prediction approaches for computational models. Deep learning models appear promising for predicting molecular structure spectra, but overall assessment remains challenging as a result of the heterogeneity in methods and the lack of well-defined benchmarks. To address this, our contribution is the creation of benchmark framework FlexMS for constructing and evaluating diverse model architectures in mass spectrum prediction. With its easy-to-use flexibility, FlexMS supports the dynamic construction of numerous distinct combinations of model architectures, while assessing their performance on preprocessed public datasets using different metrics. In this paper, we provide insights into factors influencing performance, including the structural diversity of datasets, hyperparameters like learning rate and data sparsity, pretraining effects, metadata ablation settings and cross-domain transfer learning analysis. This provides practical guidance in choosing suitable models. Moreover, retrieval benchmarks simulate practical identification scenarios and score potential matches based on predicted spectra.
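The retrieval setting mentioned at the end of the abstract can be pictured roughly as follows: a measured query spectrum is compared against the spectra predicted for each candidate molecule, and the candidates are ranked by score. This is a minimal sketch under stated assumptions, not the FlexMS implementation. The scoring function (binned cosine similarity), the candidate names, and the peak values are all hypothetical.

```python
import math


def to_vector(peaks, bin_width=1.0, max_mz=500.0):
    """Bin (m/z, intensity) peaks into a fixed-length vector (illustrative defaults)."""
    n_bins = int(math.ceil(max_mz / bin_width))
    vec = [0.0] * n_bins
    for mz, intensity in peaks:
        if 0.0 <= mz < max_mz:
            vec[min(int(mz // bin_width), n_bins - 1)] += intensity
    return vec


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(query_peaks, predicted_by_candidate):
    """Return candidate names sorted by descending similarity to the query.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    query_vec = to_vector(query_peaks)
    scored = [
        (cosine(query_vec, to_vector(peaks)), name)
        for name, peaks in predicted_by_candidate.items()
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored]


# Hypothetical query spectrum and model-predicted spectra for three candidates.
query = [(51.0, 20.0), (77.0, 100.0), (105.0, 60.0)]
candidates = {
    "candidate_A": [(51.0, 15.0), (77.0, 95.0), (105.0, 70.0)],
    "candidate_B": [(43.0, 100.0), (58.0, 30.0)],
    "candidate_C": [(77.0, 50.0), (91.0, 100.0)],
}

ranking = rank_candidates(query, candidates)
print(ranking)
```

A retrieval benchmark would then typically report how often the true molecule appears among the top-ranked candidates, which ties prediction quality directly to the identification task.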

Tags

Mass spectrum prediction, Metabolomics, Deep learning, Benchmarking, Drug discovery

arXiv Categories

cs.AI cs.LG