A Robust and Efficient Multi-Agent Reinforcement Learning Framework for Traffic Signal Control
AI Summary
Proposes a robust and efficient multi-agent reinforcement learning (MARL) framework for traffic signal control that improves generalization and stability.
Main Contributions
- Turning Ratio Randomization training strategy
- Stability-oriented Exponential Phase Duration Adjustment action space
- Neighbor-Based Observation scheme combined with the MAPPO algorithm (CTDE)
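As a rough illustration of the Turning Ratio Randomization idea, the sketch below resamples a normalized left/through/right turning-probability split, which could be drawn anew per training episode to expose agents to varying traffic compositions. The function name, value ranges, and three-way split are assumptions for illustration, not the paper's actual implementation.

```python
import random

def randomize_turning_ratios(rng=None, low=0.1, high=0.8):
    """Sample a random left/through/right turning-probability split
    for one intersection approach, normalized to sum to 1.

    Hypothetical helper: the paper's randomization scheme may differ.
    """
    rng = rng or random.Random()
    # Draw an unnormalized weight per movement, then normalize.
    weights = [rng.uniform(low, high) for _ in range(3)]
    total = sum(weights)
    return [w / total for w in weights]
```

In a training loop, calling this at each episode reset would vary the demand pattern the agents see, which is the kind of exposure the strategy aims for.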
Methodology
Multi-agent reinforcement learning is performed with the MAPPO algorithm by randomizing the training traffic data, designing a well-suited action space, and exploiting neighbor information.
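One way to picture the stability-oriented Exponential Phase Duration Adjustment action space is a discrete action that nudges the current phase duration by a signed power of a base, so small actions give fine adjustments and large actions give fast ones, with clipping for stability. The function name, base, and bounds below are illustrative assumptions, not values from the paper.

```python
def adjust_phase_duration(current, action, base=2.0, min_d=5.0, max_d=60.0):
    """Map a discrete action in {-k, ..., 0, ..., k} to a phase-duration
    change: action 0 keeps the duration, action +/-n adds or subtracts
    base**n seconds, clipped to [min_d, max_d].

    Hypothetical sketch of an exponential adjustment action space;
    the paper's exact mapping and constants may differ.
    """
    if action == 0:
        delta = 0.0
    else:
        # Exponential magnitude, signed by the action's direction.
        delta = (1 if action > 0 else -1) * base ** abs(action)
    return min(max_d, max(min_d, current + delta))
```

The exponential mapping keeps the action set small while still covering both precise tweaks (±2 s) and responsive jumps (±8 s or more), which matches the responsiveness-versus-precision balance the summary describes.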
Original Abstract
Reinforcement Learning (RL) in Traffic Signal Control (TSC) faces significant hurdles in real-world deployment due to limited generalization to dynamic traffic flow variations. Existing approaches often overfit static patterns and use action spaces incompatible with driver expectations. This paper proposes a robust Multi-Agent Reinforcement Learning (MARL) framework validated in the Vissim traffic simulator. The framework integrates three mechanisms: (1) Turning Ratio Randomization, a training strategy that exposes agents to dynamic turning probabilities to enhance robustness against unseen scenarios; (2) a stability-oriented Exponential Phase Duration Adjustment action space, which balances responsiveness and precision through cyclical, exponential phase adjustments; and (3) a Neighbor-Based Observation scheme utilizing the MAPPO algorithm with Centralized Training with Decentralized Execution (CTDE). By leveraging centralized updates, this approach approximates the efficacy of global observations while maintaining scalable local communication. Experimental results demonstrate that our framework outperforms standard RL baselines, reducing average waiting time by over 10%. The proposed model exhibits superior generalization in unseen traffic scenarios and maintains high control stability, offering a practical solution for adaptive signal control.