ITSC 2025 Paper Abstract

Paper FR-EA-T38.2

Wang, Xiaoyu (University of Toronto), Smirnov, Ilia (University of Toronto), Sanner, Scott (University of Toronto), Abdulhai, Baher (University of Toronto)

Getting Ready for Deployment: Transformer-Enabled Robust Adaptive Traffic Signal Control

Scheduled for presentation during the Regular Session "S38b-Towards Scalable and Trustworthy AI in Connected Mobility" (FR-EA-T38), Friday, November 21, 2025, 13:50−14:10, Coolangata 2

2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC), November 18-21, 2025, Gold Coast, Australia

This information is tentative and subject to change. Compiled on October 18, 2025

Keywords: AI, Machine Learning for Dynamic Traffic Signal Control and Optimization, Transportation Optimization Techniques and Multi-modal Urban Mobility

Abstract

Urban traffic signal control must contend with rapidly changing demand patterns, limited sensor coverage, and strict safety regulations, challenges often overlooked by lab-focused reinforcement learning (RL) research. We frame these issues as problems of partial observability and out-of-distribution generalization, and introduce eMARLIN-Transformer-Robust, a decentralized multi-agent RL framework built for real-world deployment. To reduce partial observability, each agent applies a Transformer encoder to a brief history of local observations augmented by neighbor embeddings. This architecture captures temporal context and extends each agent's effective field of view. To achieve out-of-distribution robustness, we apply domain randomization, exposing agents to diverse traffic scenarios to prevent overfitting.
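
The two ideas in this paragraph can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Scenario` fields, the sampling ranges, and the `HistoryBuffer` class are hypothetical names and values chosen only for illustration. The sketch shows (a) drawing a randomized traffic scenario for each training episode and (b) assembling the fixed-length token sequence that a Transformer encoder would consume, where each token is a local observation concatenated with neighbor embeddings.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Scenario:
    """Hypothetical per-episode traffic settings (illustrative names only)."""
    demand_scale: float      # multiplier on nominal arrival rates
    detector_dropout: float  # probability that a detector reading is missing
    turn_ratio_shift: float  # perturbation applied to nominal turning ratios


def sample_scenario(rng: random.Random) -> Scenario:
    """Domain randomization: draw a fresh scenario for each training episode.

    The ranges below are assumptions, not values from the paper.
    """
    return Scenario(
        demand_scale=rng.uniform(0.5, 1.5),
        detector_dropout=rng.uniform(0.0, 0.3),
        turn_ratio_shift=rng.uniform(-0.1, 0.1),
    )


class HistoryBuffer:
    """Keeps the most recent `length` tokens for one agent.

    Each token is the local observation concatenated with the neighbor
    embeddings received at the same time step.
    """

    def __init__(self, length: int, token_dim: int):
        self.length = length
        self.token_dim = token_dim
        self.tokens = deque(maxlen=length)  # oldest token is dropped first

    def push(self, local_obs, neighbor_embeddings):
        token = list(local_obs)
        for embedding in neighbor_embeddings:
            token.extend(embedding)
        if len(token) != self.token_dim:
            raise ValueError(
                f"expected token of size {self.token_dim}, got {len(token)}"
            )
        self.tokens.append(token)

    def sequence(self):
        """Return a fixed-length sequence, left-padded with zero tokens."""
        missing = self.length - len(self.tokens)
        padding = [[0.0] * self.token_dim for _ in range(missing)]
        return padding + list(self.tokens)
```

In a training loop, `sample_scenario` would be called once per episode to configure the simulator, and `HistoryBuffer.sequence()` would supply the encoder input at every decision step. The encoder itself is omitted here.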

In a simulated field test spanning 75 out-of-training scenarios (210 runs), eMARLIN-Transformer-Robust reduces total stop delay by 17.7% to 27.8% compared with the expert-tuned, industry-standard TransSuite system. An ablation study shows that robust training further improves performance by 12.2% relative to agents trained without it. These results demonstrate that only the combination of Transformer-based history encoding and robust training yields a single, adaptive policy capable of seamless, plan-free operation across diverse, real-world conditions.