LLM Memory & RAG relevance: 6/10

One-for-All: A Lightweight Stabilized and Parameter-Efficient Pre-trained LLM for Time Series Forecasting

Prasanjit Dey, Soumyabrata Dev, Bianca Schoen-Phelan
arXiv: 2603.29756v1 Published: 2026-03-31 Updated: 2026-03-31

AI Summary

Proposes a parameter-efficient, stable, and lightweight pre-trained LLM for time series forecasting.

Key Contributions

  • Proposes Gaussian Rank-Stabilized Low-Rank Adapters (rsLoRA) for parameter-efficient fine-tuning
  • Introduces a mathematically provable rank-stabilization mechanism that yields stable gradients
  • Achieves state-of-the-art efficiency-accuracy trade-offs across multiple time-series tasks

Methodology

Fine-tunes a frozen LLM via rsLoRA, training only the low-rank decomposition matrices in the positional embeddings and output layer, which reduces the trainable parameter count and memory footprint.
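The adapter scheme above can be sketched in a few lines of NumPy. This is a minimal illustration of the rank-stabilized scaling idea (dividing the low-rank update by sqrt(r) instead of LoRA's r); the dimensions, initialization, and variable names are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

# Minimal sketch of a rank-stabilized low-rank adapter (rsLoRA-style):
# a frozen weight W plus a trainable low-rank update B @ A at rank r = 16.
# All sizes here are illustrative assumptions.

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 16, 32

W = rng.standard_normal((d_out, d_in))      # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection (zero init)

def adapter_forward(x, scale):
    # y = W x + scale * B (A x); only A and B would receive gradients.
    return W @ x + scale * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Vanilla LoRA scales the update by alpha / r; rsLoRA uses alpha / sqrt(r),
# which keeps the update's magnitude (and its gradients) from shrinking
# as the rank grows.
y_lora = adapter_forward(x, alpha / r)              # scale = 2.0
y_rslora = adapter_forward(x, alpha / np.sqrt(r))   # scale = 8.0
```

With B zero-initialized, both variants start from the frozen model's output; the two scalings only diverge once B is trained.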

Original Abstract

We address the challenge of adapting pre-trained Large Language Models (LLMs) for multivariate time-series analysis, where their deployment is often hindered by prohibitive computational and memory demands. Our solution, One-for-All, introduces Gaussian Rank-Stabilized Low-Rank Adapters (rsLoRA) to enable parameter-efficient fine-tuning of frozen LLMs. While inspired by LoRA, rsLoRA introduces a mathematically grounded rank-stabilization mechanism that enables provable gradient stability at low ranks, a novel contribution absent in prior PEFT methods. Our framework injects trainable rank decomposition matrices (rank 16) into positional embeddings and output layers, while keeping self-attention weights fixed. This design reduces trainable parameters by 6.8$\times$ (vs. TimesNet), 21$\times$ (vs. GPT4TS), and 11.8$\times$ (vs. TIME-LLM), while achieving a 168-1,776$\times$ smaller memory footprint (2.2MiB vs. 340MiB-4.18GiB in SOTA models). Rigorous evaluation across six time-series tasks demonstrates that One-for-All achieves state-of-the-art efficiency-accuracy trade-offs: 5.5$\times$ higher parameter efficiency (MSE=5.50) than TimesNet and 21$\times$ better than GPT4TS, while matching their forecasting accuracy (MSE=0.33). The framework's stability is validated through consistent performance across diverse horizons (96-720 steps) and datasets (ETT, Weather, M3, M4), with 98.3% fewer parameters than conventional transformers. These advances enable deployment on edge devices for healthcare, finance, and environmental monitoring without compromising performance.

Tags

Time Series Forecasting · Pre-trained LLM · Parameter-Efficient Fine-Tuning · LoRA · Low-Rank Adapters

arXiv Category

cs.LG