Paper TH-EA-T22.1
Wang, Wanshu (The University of Tokyo), WANG, Wei (The University of Tokyo), Nakano, Kimihiko (The University of Tokyo)
The Potential of Large Language Model to Enhance Multi-Agent Traffic Signal Control in Complex Environments
Scheduled for presentation during the Invited Session "S22b-Emerging Trends in AV Research" (TH-EA-T22), Thursday, November 20, 2025, 13:30−13:50, Coolangata 1
2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC), November 18-21, 2025, Gold Coast, Australia
This information is tentative and subject to change. Compiled on October 18, 2025
Keywords: AI, Machine Learning for Dynamic Traffic Signal Control and Optimization; AI, Machine Learning and Predictive Analytics for Traffic Incident Detection and Management; Digital Twin Modeling for ITS Infrastructure and Traffic Simulation
Abstract
In dynamic and complex environments, particularly when rare but realistic emergent traffic events occur, enabling existing Traffic Signal Control (TSC) systems to respond flexibly while remaining reliable is a substantial challenge. Large Language Models (LLMs) can exhibit human-level reasoning through chain-of-thought (CoT) processes and therefore show great potential for solving complex tasks. This study proposes a multi-agent proximal policy optimization (MAPPO) framework integrated with an LLM, in which the LLM evaluates and refines the initial decisions made by the MAPPO agents. This approach addresses an inherent limitation of conventional Multi-Agent Reinforcement Learning (MARL)-based TSC systems, which optimize solely for their reward functions and often neglect anomalous environmental states. Experiments are conducted on a 2×2 grid network in which three types of emergent traffic events are designed and investigated separately: emergency vehicles, sensor failures, and roadblocks. The results show that, compared with the standalone MAPPO method, the proposed MAPPO-LLM collaborative framework achieves remarkable reductions in both average travel time and average waiting time across various emergent traffic scenarios. These findings highlight the effectiveness of incorporating LLMs to enhance the adaptability and robustness of MARL-based TSC systems in the presence of real-world traffic emergencies.
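The propose-then-review flow described in the abstract can be pictured with a minimal Python sketch. It is illustrative only and is not the authors' implementation: the `Observation` fields, the `policy_action` stand-in for a trained MAPPO actor, and the `rule_based_reviewer` stand-in for the LLM are all hypothetical names, and an action is simplified to "which approach receives green". Sensor failures are left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Observation:
    """Hypothetical per-intersection state (not taken from the paper)."""
    queue_lengths: Dict[str, int]               # queued vehicles per approach
    emergency_approach: Optional[str] = None    # approach carrying an emergency vehicle
    blocked_approaches: Tuple[str, ...] = ()    # approaches closed by a roadblock


def policy_action(obs: Observation) -> str:
    """Stand-in for a trained MAPPO actor: serve the longest queue."""
    return max(obs.queue_lengths, key=obs.queue_lengths.get)


def rule_based_reviewer(obs: Observation, proposed: str) -> str:
    """Stand-in for the LLM reviewer: keep or override the proposed action."""
    # Assumption: an emergency vehicle always takes priority.
    if obs.emergency_approach is not None:
        return obs.emergency_approach
    # Assumption: never give green to a blocked approach if another is open.
    open_queues = {a: q for a, q in obs.queue_lengths.items()
                   if a not in obs.blocked_approaches}
    if proposed in obs.blocked_approaches and open_queues:
        return max(open_queues, key=open_queues.get)
    return proposed


Reviewer = Callable[[Observation, str], str]


def control_step(observations: Dict[str, Observation],
                 reviewer: Reviewer = rule_based_reviewer) -> Dict[str, str]:
    """Each agent proposes an action; the reviewer evaluates and refines it."""
    return {jid: reviewer(obs, policy_action(obs))
            for jid, obs in observations.items()}


if __name__ == "__main__":
    demo = {
        "J1": Observation({"N": 5, "E": 2}, emergency_approach="E"),
        "J2": Observation({"N": 5, "E": 2}, blocked_approaches=("N",)),
        "J3": Observation({"N": 3, "E": 1}),
    }
    print(control_step(demo))
```

In this sketch the reviewer is an ordinary callable, so a function that prompts an LLM and parses its reply could be passed to `control_step` in place of `rule_based_reviewer` without changing the agents.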