Linear Convergence in Games with Delayed Feedback via Extra Prediction
AI Summary
The paper introduces the WOGDA algorithm with extra optimism to accelerate linear convergence in games with delayed feedback.
Main Contributions
- Derives the linear convergence rate of the WOGDA algorithm under delayed feedback
- Proposes a WOGDA variant with extra optimism to accelerate convergence
- Validates the effectiveness of the extra-optimism strategy through experiments
Methodology
Interprets WOGDA as an approximation of the Extra Proximal Point (EPP) method, which updates based on farther future rewards than the classical Proximal Point (PP), and analyzes the algorithm's convergence on this basis.
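To make the setting concrete, here is a minimal sketch of a weighted optimistic update with $m$-step delayed feedback on an unconstrained bilinear game $f(x, y) = x^\top A y$. The weighting $(1+\lambda)g_{t-m} - \lambda g_{t-m-1}$ and the choice $\lambda = m + 1$ (extrapolating the stale gradient forward) are illustrative assumptions, not the paper's exact WOGDA parameterization or step-size conditions.

```python
import numpy as np

# Sketch of an optimistic gradient descent-ascent step with m-step delayed
# feedback on the unconstrained bilinear game f(x, y) = x^T A y.
# NOTE: the weights (1 + lam, -lam) and the default lam = m + 1 are
# illustrative guesses, not the paper's exact WOGDA weighting.
def wogda_bilinear(A, m=2, lam=3.0, eta=0.05, steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    n, k = A.shape
    x, y = rng.standard_normal(n), rng.standard_normal(k)
    # Keep iterate histories so that at step t only iterates up to
    # t - m are observable (delayed feedback).
    xs = [x.copy() for _ in range(m + 2)]
    ys = [y.copy() for _ in range(m + 2)]
    for _ in range(steps):
        # Delayed gradients, evaluated at the opponent's iterate m steps ago.
        gx, gy = A @ ys[-1 - m], A.T @ xs[-1 - m]
        gx_old, gy_old = A @ ys[-2 - m], A.T @ xs[-2 - m]
        # Optimistic prediction: linearly extrapolate the stale gradient.
        x = x - eta * ((1 + lam) * gx - lam * gx_old)   # x minimizes f
        y = y + eta * ((1 + lam) * gy - lam * gy_old)   # y maximizes f
        xs.append(x.copy())
        ys.append(y.copy())
    return x, y
```

With `A = np.eye(2)` and these parameters the iterates contract toward the unique equilibrium $(0, 0)$, whereas in this sketch the plain next-gradient weighting `lam = 1` tends to destabilize the iteration under the same delay, which mirrors the paper's point that stronger extrapolation is needed when feedback is delayed.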
Original Abstract
Feedback delays are inevitable in real-world multi-agent learning. They are known to severely degrade performance, and the convergence rate under delayed feedback is still unclear, even for bilinear games. This paper derives the rate of linear convergence of Weighted Optimistic Gradient Descent-Ascent (WOGDA), which predicts future rewards with extra optimism, in unconstrained bilinear games. To analyze the algorithm, we interpret it as an approximation of the Extra Proximal Point (EPP), which is updated based on farther future rewards than the classical Proximal Point (PP). Our theorems show that standard optimism (predicting the next-step reward) achieves linear convergence to the equilibrium at a rate $\exp(-\Theta(t/m^{5}))$ after $t$ iterations for delay $m$. Moreover, employing extra optimism (predicting farther future reward) tolerates a larger step size and significantly accelerates the rate to $\exp(-\Theta(t/(m^{2}\log m)))$. Our experiments also show accelerated convergence driven by the extra optimism and are qualitatively consistent with our theorems. In summary, this paper validates that extra optimism is a promising countermeasure against performance degradation caused by feedback delays.