T3D: Few-Step Diffusion Language Models via Trajectory Self-Distillation with Direct Discriminative Optimization
AI Summary
Proposes the T3D framework, which improves the generation quality and efficiency of few-step decoding in diffusion language models via trajectory self-distillation and DDO optimization.
Main Contributions
- Proposes T3D, a few-step decoding optimization framework based on trajectory self-distillation
- Introduces DDO (Direct Discriminative Optimization) to promote mode-seeking distillation
- Outperforms existing few-step decoding baselines on multiple benchmarks
Methodology
The model's own generative trajectories are used for self-distillation, optimized with the DDO objective so that the student model concentrates on high-probability modes of the teacher model.
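The mode-seeking effect of the reverse-KL objective can be illustrated with a minimal toy sketch. This is not the T3D implementation; the function and the toy next-token distributions below are purely illustrative assumptions:

```python
import math

def reverse_kl(student_probs, teacher_probs, eps=1e-12):
    """Reverse KL divergence KL(q || p), with the expectation taken under
    the student q. Reverse KL is mode-seeking: the student pays a large
    penalty for placing mass where the teacher probability is low, so it
    is pushed to concentrate on high-probability teacher modes rather
    than cover the teacher's whole support."""
    return sum(q * (math.log(q + eps) - math.log(p + eps))
               for q, p in zip(student_probs, teacher_probs))

# Toy next-token distribution: a bimodal teacher over a 4-token vocabulary.
teacher = [0.48, 0.48, 0.02, 0.02]

# A mode-seeking student collapses onto one high-probability teacher mode...
mode_seeking = [0.94, 0.02, 0.02, 0.02]
# ...while a mode-covering student spreads mass over the whole vocabulary.
mode_covering = [0.25, 0.25, 0.25, 0.25]

# Under reverse KL, committing to one teacher mode is much cheaper than
# covering the teacher's low-probability regions.
assert reverse_kl(mode_seeking, teacher) < reverse_kl(mode_covering, teacher)
```

Under forward KL the preference would tend the other way, which is why a reverse-KL objective like DDO encourages the student to commit to sharp, high-probability outputs under a tight step budget.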
Original Abstract
Diffusion large language models (DLLMs) have the potential to enable fast text generation by decoding multiple tokens in parallel. However, in practice, their inference efficiency is constrained by the need for many refinement steps, while aggressively reducing the number of steps leads to a substantial degradation in generation quality. To alleviate this, we propose a trajectory self-distillation framework that improves few-step decoding by distilling the model's own generative trajectories. We incorporate Direct Discriminative Optimization (DDO), a reverse-KL objective that promotes mode-seeking distillation and encourages the student to concentrate on high-probability teacher modes. Across benchmarks, our approach consistently outperforms strong few-step baselines and standard training under tight step budgets. Although full-step decoding remains superior, we substantially narrow the gap, establishing a strong foundation towards practical few-step DLLMs. The source code is available at https://github.com/Tyrion58/T3D.