ITSC 2024 Paper Abstract

Paper WeBT13.5

Han, Ye (Tongji University), Zhang, Lijun (Tongji University), Meng, Dejian (Tongji University), Hu, Xingyu (Tongji University), Lu, Yixia (Tongji University)

SPformer: A Transformer Based DRL Decision Making Method for Connected Automated Vehicles

Scheduled for presentation during the Poster Session "Transformer networks" (WeBT13), Wednesday, September 25, 2024, 14:30−16:30, Foyer

2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC), September 24-27, 2024, Edmonton, Canada

This information is tentative and subject to change. Compiled on January 8, 2026

Keywords: Multi-autonomous Vehicle Studies, Models, Techniques and Simulations, Cooperative Techniques and Systems, Automated Vehicle Operation, Motion Planning, Navigation

Abstract

In a mixed-autonomy traffic environment, every decision made by an autonomous vehicle may have a great impact on the transportation system. Because of the complex interactions between vehicles, it is challenging to make decisions that ensure both high traffic efficiency and safety, now and in the future. Connected automated vehicles (CAVs) have great potential to improve the quality of decision-making in this continuous, highly dynamic, and interactive environment because of their stronger sensing and communication abilities. Multi-vehicle collaborative decision-making algorithms based on deep reinforcement learning (DRL) need to represent the interactions between vehicles in order to obtain interactive features. This representation directly affects the learning efficiency and the quality of the learned policy. To this end, we propose a CAV decision-making architecture based on transformer and reinforcement learning algorithms. A learnable policy token is used as the learning medium of the multi-vehicle joint policy, so that the states of all vehicles in the area of interest can be adaptively attended to in order to extract interactive features among agents. We also design an intuitive physical positional encoding, whose redundant location information improves the performance of the network. Simulations show that our model makes good use of all the vehicle state information in the traffic scenario and obtains high-quality driving decisions that meet efficiency and safety objectives. The comparison shows that our method significantly improves on existing DRL-based multi-vehicle cooperative decision-making algorithms.
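
The idea of a policy token that reads out interactive features from all vehicle states can be illustrated with a minimal sketch. The code below is not the SPformer implementation: it assumes a single attention head, randomly initialized (untrained) weights, and a physical positional encoding formed as a linear projection of each vehicle's (x, y) coordinates. The names `policy_token_attention`, `w_state`, `w_pos`, `w_q`, `w_k`, and `w_v` are hypothetical and chosen only for this illustration.

```python
import math
import random


def linear(x, w):
    """Multiply a vector x (length n) by a matrix w (n rows, m columns)."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_matrix(rows, cols, rng):
    """Small random matrix standing in for learned weights (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def policy_token_attention(policy_token, vehicle_states, positions, params):
    """Single-head attention read-out with the policy token as the query.

    vehicle_states: per-vehicle feature vectors (here: speed, heading)
    positions: per-vehicle physical (x, y) coordinates
    Returns the aggregated feature vector and the attention weights.
    """
    d = len(policy_token)
    tokens = []
    for state, pos in zip(vehicle_states, positions):
        emb = linear(state, params["w_state"])
        # Assumed form of a physical positional encoding: project (x, y)
        # into the model dimension and add it to the state embedding.
        pos_enc = linear(list(pos), params["w_pos"])
        tokens.append([e + p for e, p in zip(emb, pos_enc)])

    query = linear(policy_token, params["w_q"])
    keys = [linear(t, params["w_k"]) for t in tokens]
    values = [linear(t, params["w_v"]) for t in tokens]

    # Scaled dot-product scores between the policy token and each vehicle.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)

    # Weighted sum of value vectors gives the interactive feature.
    feature = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d)]
    return feature, weights


rng = random.Random(0)
d_model, d_state = 4, 2
params = {
    "w_state": init_matrix(d_state, d_model, rng),
    "w_pos": init_matrix(2, d_model, rng),
    "w_q": init_matrix(d_model, d_model, rng),
    "w_k": init_matrix(d_model, d_model, rng),
    "w_v": init_matrix(d_model, d_model, rng),
}
# In a trained model the policy token would be a learned parameter.
policy_token = [rng.uniform(-0.5, 0.5) for _ in range(d_model)]

# Three hypothetical vehicles: (speed in m/s, heading in rad) and (x, y) in m.
states = [[20.0, 0.0], [18.5, 0.1], [22.0, -0.1]]
positions = [(0.0, 0.0), (15.0, 3.5), (-10.0, 0.0)]

feature, weights = policy_token_attention(policy_token, states, positions, params)
print("attention weights:", weights)
print("interactive feature:", feature)
```

In a DRL setting, a vector like `feature` would typically be passed to a policy or value head; the attention weights indicate how strongly each vehicle in the area of interest contributes to the joint decision. Because the number of vehicle tokens is not fixed, the same read-out applies to scenes with any number of vehicles.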