ITSC 2025 Paper Abstract

Paper VP-VP.121

Xi, Haoyang (Beijing Institute of Technology), Xie, Shanshan (Beijing Institute of Technology), Yang, Yi (Beijing Institute of Technology)

Continuous Drifting Control Based on Maximal Safety Probability Learning with TD3 and Lagrangian

Scheduled for presentation during the Video Session "On-Demand Video Presentations" (VP-VP), Saturday, November 22, 2025, 08:00−18:00, On-Demand Platform

2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC), November 18-21, 2025, Gold Coast, Australia

This information is tentative and subject to change. Compiled on April 2, 2026

Keywords Smart Traffic Control using AI and Augmented Reality for Navigation and Vehicle Control, Deep Learning for Scene Understanding and Semantic Segmentation in Autonomous Vehicles, Autonomous Vehicle Safety and Performance Testing

Abstract

Extreme driving maneuvers such as high-speed drifting demand precise yet verifiably safe control, presenting significant challenges for autonomous systems. Current reinforcement learning approaches typically require complex reward engineering, struggle to enforce safety constraints, and rely on discrete action spaces with limited precision. We propose TD3-Lagrangian-PIRL (Physics-Informed Reinforcement Learning), a framework that extends maximal safety probability learning to continuous action spaces for drift control. The approach introduces two key innovations. First, it embeds partial differential equation constraints into the critic network of a Twin Delayed Deep Deterministic Policy Gradient (TD3) architecture, improving tracking precision and cutting training episodes by 67% relative to the discrete-action baseline. Second, it applies a Lagrange multiplier directly to constrain the actor network, creating dual-layer safety constraints that reduce training episodes by a further 30% while improving safety performance. Experiments in CARLA demonstrate effective drifting control with consistent generalization to high-speed cornering scenarios. The method requires only binary sparse rewards targeting safety probability, eliminating complex reward shaping and predefined trajectories. These results indicate that TD3-Lagrangian-PIRL is a viable solution for safety-critical control at the handling limits.
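The dual-layer idea in the abstract can be sketched in miniature: a TD3-style critic loss augmented with a physics-informed (PDE-residual) penalty, and a Lagrange multiplier driven by dual ascent so that the actor's constraint cost stays below a threshold. This is a toy illustration under stated assumptions, not the paper's implementation; the function names, the threshold `d`, and all numeric values are hypothetical.

```python
# Toy sketch of a physics-informed critic loss plus a Lagrangian actor
# constraint, in the spirit of TD3-Lagrangian-PIRL. All names and numbers
# here are illustrative assumptions, not the authors' code.

def critic_loss(td_error, pde_residual, w_pde=0.5):
    """Standard squared TD loss plus a weighted PDE-residual penalty,
    standing in for embedding the PDE constraint into the critic."""
    return td_error ** 2 + w_pde * pde_residual ** 2

def actor_loss(q_value, constraint_cost, lam):
    """Maximize Q (minimize -Q) while paying a Lagrangian penalty
    proportional to the constraint cost (e.g. 1 - safety probability)."""
    return -q_value + lam * constraint_cost

def dual_ascent(lam, constraint_cost, d, lr=0.1):
    """Raise the multiplier while the constraint is violated
    (cost above threshold d); clip it at zero from below."""
    return max(0.0, lam + lr * (constraint_cost - d))

# A persistently violated constraint drives lambda upward, which in turn
# raises the actor's penalty for unsafe actions.
lam, d = 0.0, 0.05
for cost in [0.3, 0.25, 0.2, 0.1, 0.04]:
    lam = dual_ascent(lam, cost, d)
```

In this sketch the multiplier grows while the running constraint cost exceeds `d` and relaxes once the cost drops below it, which is the standard primal-dual mechanism behind Lagrangian-constrained RL.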

All Content © PaperCept, Inc.
