Multimodal Learning · Relevance: 9/10

TimeOmni-VL: Unified Models for Time Series Understanding and Generation

Tong Guan, Sheng Pan, Johan Barthelemy, Zhao Li, Yujun Cai, Cesare Alippi, Ming Jin, Shirui Pan
arXiv: 2602.17149v1 · Published: 2026-02-19 · Updated: 2026-02-19

AI Summary

TimeOmni-VL proposes a vision-centric unified model for time series understanding and generation tasks, introducing Bi-TSI and the TSUMM-Suite.

Key Contributions

  • Proposes the TimeOmni-VL framework, unifying time series understanding and generation
  • Introduces Bi-TSI, a fidelity-preserving bidirectional mapping for converting between time series and images
  • Builds the TSUMM-Suite dataset, covering both understanding and generation tasks

Methodology

Leverages bidirectional time series-image conversion (Bi-TSI) and an understanding-guided generation approach, combined with Chain-of-Thought reasoning, to improve both time series understanding and generation.
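The paper does not specify how Bi-TSI achieves its near-lossless TS2I/I2TS round trip, but the core idea of a fidelity-preserving bidirectional mapping can be illustrated with a minimal sketch: quantize each normalized value to a pixel row in a plot-style image, then invert by locating the marked pixel per column. The function names `ts_to_image` and `image_to_ts` and the rendering scheme here are illustrative assumptions, not the authors' method.

```python
import numpy as np

def ts_to_image(series, height=256):
    """Hypothetical TS2I sketch: render a 1-D series as a binary plot image.
    Each column marks the quantized value of one time step."""
    s = np.asarray(series, dtype=np.float64)
    lo, hi = s.min(), s.max()
    norm = (s - lo) / (hi - lo) if hi > lo else np.zeros_like(s)
    rows = np.round(norm * (height - 1)).astype(int)
    img = np.zeros((height, len(s)), dtype=np.uint8)
    # Plot convention: row 0 is the top of the image, so flip vertically.
    img[height - 1 - rows, np.arange(len(s))] = 255
    return img, (lo, hi)

def image_to_ts(img, value_range):
    """Hypothetical I2TS sketch: recover the series from the marked pixel
    in each column, undoing the vertical flip and normalization."""
    height = img.shape[0]
    rows = img.argmax(axis=0)          # one marked pixel per column
    norm = (height - 1 - rows) / (height - 1)
    lo, hi = value_range
    return norm * (hi - lo) + lo
```

With 256 quantization levels, the round-trip error is bounded by half a level times the value range, which is what "near-lossless" would amount to under this toy encoding; higher image resolution tightens the bound.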

Original Abstract

Recent time series modeling faces a sharp divide between numerical generation and semantic understanding, with research showing that generation models often rely on superficial pattern matching, while understanding-oriented models struggle with high-fidelity numerical output. Although unified multimodal models (UMMs) have bridged this gap in vision, their potential for time series remains untapped. We propose TimeOmni-VL, the first vision-centric framework that unifies time series understanding and generation through two key innovations: (1) Fidelity-preserving bidirectional mapping between time series and images (Bi-TSI), which advances Time Series-to-Image (TS2I) and Image-to-Time Series (I2TS) conversions to ensure near-lossless transformations. (2) Understanding-guided generation. We introduce TSUMM-Suite, a novel dataset consisting of six understanding tasks rooted in time series analytics that are coupled with two generation tasks. With a calibrated Chain-of-Thought, TimeOmni-VL is the first to leverage time series understanding as an explicit control signal for high-fidelity generation. Experiments confirm that this unified approach significantly improves both semantic understanding and numerical precision, establishing a new frontier for multimodal time series modeling.

Tags

Time Series  Multimodal Learning  Vision  Generative Models  Understanding Models

arXiv Categories

cs.LG cs.AI