ITSC 2025 Paper Abstract

Paper FR-LM-T41.6

Khelil, Sarah Imene (ENSTA Paris, Institut Polytechnique de Paris), Taourarti, Imane (ENSTA Paris / Institut Polytechnique de Paris), Monsuez, Bruno (Ecole Nationale Supérieure des Techniques Avancées), Tapus, Adriana (ENSTA ParisTech), Geoffriault, Maud (Renault Group), Ibanez Guzman, Javier (Renault S.A.S)

A Comparison between Behavioral Cloning and Inverse Reinforcement Learning Allocation of Tire Forces for an Optimal Yaw Moment Control

Scheduled for presentation during the Regular Session "S41a-Motion Planning, Trajectory Optimization, and Control for Autonomous Vehicles" (FR-LM-T41), Friday, November 21, 2025, 12:10−12:30, Broadbeach 1&2

2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC), November 18-21, 2025, Gold Coast, Australia

This information is tentative and subject to change. Compiled on October 18, 2025

Keywords: Real-time Motion Planning and Control for Autonomous Vehicles in ITS Networks, Energy-efficient Motion Control for Autonomous Vehicles

Abstract

Modern over-actuated vehicle systems depend on precise force coordination to achieve optimal yaw moment control, critical for vehicle stability, safety, and handling. While traditional optimization-based control allocation (CA) methods are effective, they become computationally demanding as actuator complexity grows. This work explores imitation learning as a scalable alternative.

We present a comparative study between Behavioral Cloning (BC) and Maximum Entropy Inverse Reinforcement Learning (MaxEnt IRL) for neural network-based CA in over-actuated automotive systems. Both approaches are trained using real-world data from a Renault Austral prototype to imitate an optimization-based tire force allocator for yaw control. LSTM architectures are used to capture temporal dependencies. The methods are evaluated on generalization, safety, and computational performance. BC demonstrates low inference latency and strong nominal performance, while MaxEnt IRL achieves similar outcomes, even with reduced training coverage. Under actuator failure, both methods exhibit comparable behavior, consistent with training data characteristics.

These findings suggest that imitation learning could be explored as an alternative to traditional optimization-based control allocation in future high-actuation systems, particularly where computational efficiency and scalability become a concern. However, for the present case, optimization-based allocation remains the most reliable and well-performing solution. This study serves as a foundational step toward imitating computationally demanding high-level Model Predictive Control (MPC) strategies using neural networks, enabling safe and efficient deployment in real-time automotive environments.
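
As a rough illustration of the idea of imitating an optimization-based allocator, the sketch below uses a toy problem and is not the method of the paper. It assumes a single yaw-moment constraint, four longitudinal tire forces, and a weighted minimum-effort cost with a closed-form solution. The lever arms, weights, and function names are hypothetical, and a linear least-squares policy stands in for the LSTM networks mentioned in the abstract.

```python
import random

# Hypothetical signed lever arms (m) mapping each wheel's longitudinal
# force to yaw moment, in the order FL, FR, RL, RR.
LEVER_ARMS = [-0.8, 0.8, -0.8, 0.8]
# Hypothetical per-wheel effort weights (larger means penalised more).
WEIGHTS = [1.0, 1.0, 2.0, 2.0]


def expert_allocate(yaw_moment):
    """Toy 'expert': minimise sum(w_i * f_i**2) s.t. sum(b_i * f_i) == yaw_moment.

    The Lagrangian conditions give f_i = (b_i / w_i) * M / sum(b_j**2 / w_j).
    """
    denom = sum(b * b / w for b, w in zip(LEVER_ARMS, WEIGHTS))
    return [(b / w) * yaw_moment / denom for b, w in zip(LEVER_ARMS, WEIGHTS)]


def clone_policy(demos):
    """Behavioral cloning as supervised regression.

    Fits a linear policy f_i = g_i * M to (M, forces) demonstrations by
    ordinary least squares, one gain per wheel.
    """
    sxx = sum(m * m for m, _ in demos)
    n_wheels = len(demos[0][1])
    return [sum(m * f[i] for m, f in demos) / sxx for i in range(n_wheels)]


def cloned_allocate(gains, yaw_moment):
    """Evaluate the cloned policy: one multiplication per wheel."""
    return [g * yaw_moment for g in gains]


# Collect demonstrations from the expert on random yaw-moment requests (N*m).
rng = random.Random(0)
requests = [rng.uniform(-2000.0, 2000.0) for _ in range(200)]
demos = [(m, expert_allocate(m)) for m in requests]

gains = clone_policy(demos)
forces = cloned_allocate(gains, 1500.0)
achieved = sum(b * f for b, f in zip(LEVER_ARMS, forces))
print("cloned forces:", forces)
print("achieved yaw moment:", achieved)
```

Because this toy expert is linear in the requested moment, the linear clone reproduces it almost exactly. A constrained allocator with actuator limits would not be linear, which is one reason a more expressive model may be needed in practice.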