ITSC 2025 Paper Abstract

Paper FR-LM-T31.2

Ding, Haonan (Southeast University), Zhao, Mingzhuo (Southeast University), Zhang, Sunan (Southeast University), Wang, Weihua (Southeast University), Liang, Jinhao (Southeast University), Dong, Haoxuan (National University of Singapore), Zhuang, Weichao (Southeast University), Yin, Guodong (Southeast University)

Stability-Guided Safe Reinforcement Learning: A Lyapunov-Based Approach for Autonomous Vehicle Control

Scheduled for presentation during the Regular Session "S31a-AI-Driven Motion Prediction and Safe Control for Autonomous Systems" (FR-LM-T31), Friday, November 21, 2025, 10:50−11:10, Southport 1

2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC), November 18-21, 2025, Gold Coast, Australia

This information is tentative and subject to change. Compiled on October 18, 2025

Keywords: Autonomous Vehicle Safety and Performance Testing, Real-time Motion Planning and Control for Autonomous Vehicles in ITS Networks, Safety Verification and Validation Methods for Autonomous Vehicle Technologies

Abstract

The stability of policy outputs is a key challenge in reinforcement learning (RL), because it directly affects the safety of the actions taken by autonomous systems. In this paper, we propose a Lyapunov-based RL method to address safety concerns in the decision-making and control of autonomous vehicles. Specifically, we introduce Lyapunov networks into the Soft Actor-Critic (SAC) architecture to assess the safety performance of the Actor network, with a particular focus on the stability of its policy outputs. Building on a stability theorem, we show that stability conditions can serve as critical constraints in RL that guide the learning of a safe control policy. The approach relies on the Critic’s encouragement of positive rewards to promote exploration, while Lyapunov constraints limit unsafe behaviors. In a simulated highway autonomous driving environment, experimental results demonstrate that the proposed algorithm maintains robust stability under predefined safety constraints, achieving performance metrics that exceed existing industry benchmarks while ensuring compliance with operational safety requirements.
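
The abstract does not state the exact form of the Lyapunov constraint, so the following is only a minimal sketch of one common way such a constraint can be attached to an actor objective. It assumes a discrete-time decrease condition L(s') - L(s) <= -alpha * L(s) enforced through a Lagrange multiplier. The function names, the parameter alpha, and the multiplier update rule are all illustrative assumptions, not the authors' method, and the values passed in stand in for the outputs of a learned Lyapunov network.

```python
import numpy as np


def lyapunov_violation(l_now, l_next, alpha=0.1):
    """Per-sample violation of the assumed decrease condition
    L(s') - L(s) <= -alpha * L(s). Positive entries are violations."""
    l_now = np.asarray(l_now, dtype=float)
    l_next = np.asarray(l_next, dtype=float)
    return l_next - (1.0 - alpha) * l_now


def constrained_actor_loss(sac_loss, l_now, l_next, lam, alpha=0.1):
    """Hypothetical actor objective: the usual SAC actor loss plus a
    Lagrangian penalty on the mean Lyapunov violation over the batch."""
    penalty = np.mean(lyapunov_violation(l_now, l_next, alpha))
    return float(sac_loss + lam * penalty)


def update_multiplier(lam, l_now, l_next, lr=0.01, alpha=0.1):
    """Dual ascent step (an assumption): increase lam while the condition
    is violated on average, and keep it non-negative."""
    step = lr * float(np.mean(lyapunov_violation(l_now, l_next, alpha)))
    return max(0.0, lam + step)
```

In this sketch the reward-driven SAC term and the stability penalty pull in different directions: the first encourages exploration toward high return, while the multiplier grows whenever the sampled transitions fail the decrease condition, which makes unstable actions increasingly costly.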