ITSC 2024 Paper Abstract

Paper WeBT1.4

Xu, Jianye (Chair of Embedded Software (Informatik 11), RWTH Aachen Universi), Hu, Pan (Chair of Embedded Software (Informatik 11), RWTH Aachen Universi), Alrifaee, Bassam (University of the Bundeswehr Munich)

SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning

Scheduled for presentation during the Invited Session "Learning-empowered Intelligent Transportation Systems: Foundation Vehicles and Coordination Technique II" (WeBT1), Wednesday, September 25, 2024, 15:30−15:50, Salon 1

2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC), September 24-27, 2024, Edmonton, Canada

This information is tentative and subject to change. Compiled on June 30, 2025

Keywords: Cooperative Techniques and Systems, Automated Vehicle Operation, Motion Planning, Navigation, Multi-autonomous Vehicle Studies, Models, Techniques and Simulations

Abstract

This paper introduces an open-source, decentralized framework named SigmaRL, designed to enhance both sample efficiency and generalization of multi-agent Reinforcement Learning (RL) for motion planning of connected and automated vehicles. Most RL agents exhibit a limited capacity to generalize, often focusing narrowly on specific scenarios, and are usually evaluated in similar or even the same scenarios seen during training. Various methods have been proposed to address these challenges, including experience replay and regularization. However, how observation design in RL affects sample efficiency and generalization remains an under-explored area. We address this gap by proposing five strategies to design information-dense observations, focusing on general features that are applicable to most traffic scenarios. We train our RL agents using these strategies on an intersection and evaluate their generalization through numerical experiments across completely unseen traffic scenarios, including a new intersection, an on-ramp, and a roundabout. Incorporating these information-dense observations reduces training times to under one hour on a single CPU, and the evaluation results reveal that our RL agents can effectively zero-shot generalize. Code: github.com/cas-lab-munich/SigmaRL