Agent Tuning & Optimization (relevance: 6/10)

TimelyFreeze: Adaptive Parameter Freezing Mechanism for Pipeline Parallelism

Seonghye Cho, Jaemin Han, Hyunjin Kim, Euisoo Jung, Jae-Gil Lee
arXiv: 2602.05754v1 Published: 2026-02-05 Updated: 2026-02-05

AI Summary

TimelyFreeze adaptively freezes parameters to optimize pipeline-parallel training, improving throughput while preserving accuracy.

Main Contributions

  • Proposes TimelyFreeze, a new parameter freezing mechanism
  • Solves a linear program to find optimal freeze ratios
  • Significantly improves pipeline-parallel training throughput (up to 40% on LLaMA-8B)

Methodology

Models the pipeline schedule as a directed acyclic graph and solves a linear program to compute optimal parameter freeze ratios, minimizing batch execution time under accuracy constraints.
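To make the optimization concrete, here is a minimal sketch of a linear program in this spirit, not the paper's exact formulation: each pipeline stage has an assumed forward time f_i and backward time b_i, freezing a ratio r_i of a stage's parameters skips that fraction of its backward computation, and a total freeze budget R stands in for the accuracy constraint. The stage times, budget, and cap values are illustrative assumptions.

```python
# Sketch of an LP for freeze ratios (assumed formulation, illustrative numbers):
# minimize the bottleneck stage time T subject to a total freeze budget.
import numpy as np
from scipy.optimize import linprog

f = np.array([2.0, 2.0, 2.0])   # forward time per stage (assumed)
b = np.array([4.0, 6.0, 8.0])   # backward time per stage (assumed)
n = len(f)
R = 1.0                          # total freeze budget (accuracy proxy, assumed)
r_max = 0.6                      # per-stage freeze cap (assumed)

# Decision variables: x = [r_1, ..., r_n, T]; objective: minimize T.
c = np.zeros(n + 1)
c[-1] = 1.0

# Stage-time constraints: f_i + (1 - r_i) * b_i <= T
#   rearranged to LP form:  -b_i * r_i - T <= -(f_i + b_i)
A_stage = np.hstack([-np.diag(b), -np.ones((n, 1))])
b_stage = -(f + b)

# Budget constraint: sum_i r_i <= R
A_budget = np.append(np.ones(n), 0.0).reshape(1, -1)

A_ub = np.vstack([A_stage, A_budget])
b_ub = np.append(b_stage, R)

bounds = [(0.0, r_max)] * n + [(0.0, None)]  # 0 <= r_i <= r_max, T >= 0
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")

freeze_ratios, T = res.x[:n], res.fun
print("freeze ratios:", np.round(freeze_ratios, 4), "min batch time:", round(T, 4))
```

The solver balances the stages: heavier backward passes receive larger freeze ratios until stage times equalize or a cap binds, which mirrors the paper's goal of shrinking the bottleneck without over-freezing.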

Original Abstract

Pipeline parallelism enables training models that exceed single-device memory, but practical throughput remains limited by pipeline bubbles. Although parameter freezing can improve training throughput by adaptively skipping backward computation, existing methods often over-freeze parameters, resulting in unnecessary accuracy degradation. To address this issue, we propose TimelyFreeze, which models the pipeline schedule as a directed acyclic graph and solves a linear program to compute optimal freeze ratios that minimize batch execution time under accuracy constraints. Experiments show that TimelyFreeze achieves up to 40% training throughput improvement on LLaMA-8B with comparable accuracy. Overall, it enables faster large-scale model training without compromising convergence and generalizes across diverse pipeline-parallel settings.

Tags

Pipeline Parallelism · Parameter Freezing · Linear Programming · Model Training Optimization

arXiv Categories

cs.DC cs.AI