TimelyFreeze: Adaptive Parameter Freezing Mechanism for Pipeline Parallelism
AI Summary
TimelyFreeze adaptively freezes parameters to optimize pipeline-parallel training, improving throughput while preserving accuracy.
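Main Contributions
- Proposes TimelyFreeze, a new parameter freezing mechanism
- Solves a linear program to obtain optimal freeze ratios
- Improves pipeline-parallel training throughput by up to 40% on LLaMA-8B with comparable accuracy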
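Methodology
Models the pipeline schedule as a directed acyclic graph and solves a linear program to compute the optimal parameter freeze ratios, minimizing execution time under accuracy constraints.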
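To make the graph view concrete, here is a minimal sketch of how batch execution time can be read off a schedule DAG as its longest path. It does not solve the linear program and is not the authors' implementation; it only evaluates the quantity such a program would minimize for a given set of freeze ratios. The function name `batch_time`, the toy two-stage schedule, the node names, the durations, and the assumption that a fully frozen backward step still costs half its original time (`min_backward_frac`) are all hypothetical and chosen for illustration.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def batch_time(durations, deps, freeze=None, min_backward_frac=0.5):
    """Return the makespan (longest path) of a pipeline schedule DAG.

    durations: {node: duration with no freezing}
    deps: {node: set of predecessor nodes}
    freeze: {node: freeze ratio in [0, 1]}, intended for backward nodes
    min_backward_frac: ASSUMED fraction of the duration that remains
        when a node is fully frozen (hypothetical value)
    """
    freeze = freeze or {}
    finish = {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(deps).static_order():
        r = freeze.get(node, 0.0)
        # Assumed linear model: duration shrinks linearly with the ratio.
        d = durations[node] * (1.0 - r * (1.0 - min_backward_frac))
        start = max((finish[p] for p in deps.get(node, ())), default=0.0)
        finish[node] = start + d
    return max(finish.values())


# Toy schedule: 2 stages, 2 microbatches, all forwards before backwards.
# F<stage><microbatch> / B<stage><microbatch> are hypothetical node names.
durations = {
    "F00": 1.0, "F01": 1.0, "F10": 1.0, "F11": 1.0,
    "B10": 2.0, "B11": 2.0, "B00": 2.0, "B01": 2.0,
}
deps = {
    "F00": set(),
    "F01": {"F00"},
    "F10": {"F00"},
    "F11": {"F01", "F10"},
    "B10": {"F11"},
    "B11": {"B10"},
    "B00": {"B10", "F01"},
    "B01": {"B11", "B00"},
}

baseline = batch_time(durations, deps)
off_path = batch_time(durations, deps, freeze={"B00": 1.0})
all_frozen = batch_time(durations, deps, freeze={n: 1.0 for n in deps if n[0] == "B"})
print(baseline, off_path, all_frozen)
```

In this toy schedule, fully freezing only `B00` leaves the batch time unchanged because another path through the graph is equally long, while freezing every backward step shortens it. That is the kind of trade-off an optimizer over freeze ratios would weigh against an accuracy budget.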
Original Abstract
Pipeline parallelism enables training models that exceed single-device memory, but practical throughput remains limited by pipeline bubbles. Although parameter freezing can improve training throughput by adaptively skipping backward computation, existing methods often over-freeze parameters, resulting in unnecessary accuracy degradation. To address this issue, we propose TimelyFreeze, which models the pipeline schedule as a directed acyclic graph and solves a linear program to compute optimal freeze ratios that minimize batch execution time under accuracy constraints. Experiments show that TimelyFreeze achieves up to 40% training throughput improvement on LLaMA-8B with comparable accuracy. Overall, it enables faster large-scale model training without compromising convergence and generalizes across diverse pipeline-parallel settings.