ITSC 2025 Paper Abstract


Paper TH-LM-T25.5

Liang, Qingyi (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences), Jiang, Zhengmin (City University of Hong Kong), Li, Rixin (Southern University of Science and Technology), Peng, Lei (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences), Wen, Lihua (Guangzhou Maritime University), Sun, Tianfu (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences), Liu, Jia (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences), Li, Huiyun (Shenzhen University of Advanced Technology)

Learnable Adversarial Training for Robust and Compatible Multi-Agent Reinforcement Learning in Multi-Vehicle Coordination

Scheduled for presentation during the Regular Session "S25a-Cooperative and Connected Autonomous Systems" (TH-LM-T25), Thursday, November 20, 2025, 11:50−12:10, Coolangatta 4

2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC), November 18-21, 2025, Gold Coast, Australia

This information is tentative and subject to change. Compiled on October 18, 2025

Keywords Cooperative Driving Systems and Vehicle Coordination in Multi-vehicle Scenarios, Traffic Management for Autonomous Multi-vehicle Operations

Abstract

Applying reinforcement learning to autonomous driving can enhance decision-making efficiency and adaptability in dynamic traffic scenarios. Nevertheless, robustness remains a significant challenge. Current methods, including adversarial training and policy regularization, improve stability by perturbing states, actions, and rewards. However, these approaches often rely on fixed or pre-trained adversary policies, which offer limited attack diversity and impair policy generalization. Inspired by adversarial robustness in computer vision, we propose a novel adversarial training framework called Learnable Attack Strategy (LAS). In this framework, the target network in reinforcement learning and a perturbation network engage in an adversarial training loop, in which the policy learns to defend against increasingly challenging perturbations. Furthermore, LAS is compatible with a variety of reinforcement learning methods, enhancing their overall robustness. Through iterative training, the final policy achieves stable performance and mitigates degradation across different perturbations. Experiments in the MetaDrive environment demonstrate that our method outperforms both baseline and state-of-the-art approaches in terms of robustness.
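
The adversarial training loop at the core of LAS can be pictured as alternating updates between the target policy and a learnable perturbation network. Below is a minimal, self-contained PyTorch sketch of such a loop; the network names (PolicyNet, PerturbNet), the perturbation bound EPSILON, and the surrogate objective are illustrative assumptions, since the abstract does not specify the authors' actual architectures or losses.

# Minimal sketch of an alternating adversarial training loop in the spirit
# of LAS. All names, bounds, and the surrogate loss are assumptions for
# illustration, not the authors' implementation.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, EPSILON = 8, 2, 0.1  # hypothetical dimensions/bound

class PolicyNet(nn.Module):
    """Target policy: maps (possibly perturbed) states to actions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.Tanh(),
                                 nn.Linear(64, ACTION_DIM))
    def forward(self, s):
        return self.net(s)

class PerturbNet(nn.Module):
    """Learnable adversary: emits a bounded state perturbation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.Tanh(),
                                 nn.Linear(64, STATE_DIM))
    def forward(self, s):
        # tanh keeps each perturbation component within [-EPSILON, EPSILON]
        return EPSILON * torch.tanh(self.net(s))

policy, adversary = PolicyNet(), PerturbNet()
opt_pi = torch.optim.Adam(policy.parameters(), lr=3e-4)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=3e-4)

def surrogate_loss(actions, clean_actions):
    # Stand-in objective: deviation of perturbed-state actions from
    # clean-state actions; a real setup would use an RL policy loss.
    return ((actions - clean_actions) ** 2).mean()

for step in range(1000):
    states = torch.randn(64, STATE_DIM)  # placeholder for environment states
    with torch.no_grad():
        clean_actions = policy(states)

    # Adversary step: maximize the policy's loss under bounded perturbation.
    adv_states = states + adversary(states)
    loss_adv = -surrogate_loss(policy(adv_states), clean_actions)
    opt_adv.zero_grad(); loss_adv.backward(); opt_adv.step()

    # Policy step: defend against the current (now harder) perturbations.
    adv_states = states + adversary(states).detach()
    loss_pi = surrogate_loss(policy(adv_states), clean_actions)
    opt_pi.zero_grad(); loss_pi.backward(); opt_pi.step()

In a full implementation, the surrogate loss would be replaced by the policy loss of the chosen reinforcement learning algorithm, which is what makes a framework of this kind compatible with different methods, as the abstract claims for LAS.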
