ITSC 2024 Paper Abstract


Paper FrBT3.1

Abouelnaga, Mohamed (Chemnitz University of Technology), Haberjahn, Mathias (Infineon Technologies Dresden GmbH & Co. KG), Markert, Daniel (TU Chemnitz), Masrur, Alejandro (Technische Universität Chemnitz)

Hardware-Compatible Deep Reinforcement Learning-Based Lateral Trajectory Controller

Scheduled for presentation during the Regular Session "Autonomous driving" (FrBT3), Friday, September 27, 2024, 13:30−13:50, Salon 6

2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC), September 24-27, 2024, Edmonton, Canada


Keywords: Automated Vehicle Operation, Motion Planning, Navigation

Abstract

In this paper, we exploit the generalization capabilities of Deep Reinforcement Learning (DRL) to enhance adaptive steering control of autonomous vehicles.

We investigate three different approaches to tackle the problem of lateral control of a vehicle. First, we use a pure DRL algorithm such as Twin Delayed Deep Deterministic Policy Gradient (TD3) or Soft Actor-Critic (SAC) as the sole controller. Although this approach generalizes to new roads, a controller relying on RL alone lacks the stability of classical controllers.
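
As a rough illustration of this first approach, the sketch below trains SAC as the sole steering controller on a toy Gymnasium environment; the observation layout, reward, and dynamics are hypothetical placeholders rather than the paper's setup, and stable-baselines3 is only an assumed implementation choice.

    import gymnasium as gym
    import numpy as np
    from stable_baselines3 import SAC

    class LateralControlEnv(gym.Env):
        """Toy lateral-control environment: the agent outputs a normalized
        steering command and is rewarded for keeping lateral offset and
        heading error small (placeholder dynamics, not the paper's model)."""

        def __init__(self):
            # observation: [lateral offset, heading error, speed, road curvature]
            self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(4,), dtype=np.float32)
            # action: normalized steering angle in [-1, 1]
            self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)
            self.state = np.zeros(4, dtype=np.float32)

        def reset(self, *, seed=None, options=None):
            super().reset(seed=seed)
            self.state = np.zeros(4, dtype=np.float32)
            return self.state, {}

        def step(self, action):
            # Placeholder kinematics; a real setup would integrate a bicycle model.
            self.state[0] += 0.01 * float(action[0])
            reward = -abs(float(self.state[0])) - 0.1 * abs(float(self.state[1]))
            return self.state, reward, False, False, {}

    # SAC as the sole lateral controller (TD3 would be used analogously).
    model = SAC("MlpPolicy", LateralControlEnv(), verbose=0)
    model.learn(total_timesteps=10_000)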

Therefore, we combine a Proportional-Integral-Derivative (PID) controller with RL in two different setups. The first setup uses the RL algorithm to compensate for the error left by the PID, while the PID also regularizes the RL search space to achieve more stability. The second setup relies on the PID as the only controller, with RL acting as an adaptive, online selection mechanism for the PID gains. These approaches must comply with hard runtime and memory-consumption constraints, as we deploy the RL actor on the Parallel Processing Unit (PPU) of Infineon's AURIX(TM) TC4x microcontroller. We also follow a model-based development approach, which eases the overall development process.
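
The two hybrid setups can be sketched roughly as follows; the PID implementation, the correction bound, and the gain ranges are illustrative assumptions rather than the values used in the paper, and rl_actor stands for the trained policy's forward pass.

    import numpy as np

    class PID:
        """Discrete PID controller with a hypothetical default sample time."""

        def __init__(self, kp, ki, kd, dt=0.02):
            self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
            self.integral = 0.0
            self.prev_error = 0.0

        def __call__(self, error):
            self.integral += error * self.dt
            derivative = (error - self.prev_error) / self.dt
            self.prev_error = error
            return self.kp * error + self.ki * self.integral + self.kd * derivative

    def setup1_residual(pid, rl_actor, obs, lateral_error):
        """Setup 1: a fixed-gain PID gives the baseline steering command and the
        RL actor adds a bounded correction, so the PID also limits the RL
        search space (the +/-0.1 bound is an assumption)."""
        correction = float(np.clip(np.asarray(rl_actor(obs)).ravel()[0], -0.1, 0.1))
        return pid(lateral_error) + correction

    def setup2_gain_selection(pid, rl_actor, obs, lateral_error):
        """Setup 2: the PID is the only controller; the RL actor selects its
        gains online (actions in [-1, 1] mapped to assumed gain ranges)."""
        a = np.asarray(rl_actor(obs)).ravel()
        pid.kp = float(np.interp(a[0], [-1.0, 1.0], [0.1, 2.0]))
        pid.ki = float(np.interp(a[1], [-1.0, 1.0], [0.0, 0.5]))
        pid.kd = float(np.interp(a[2], [-1.0, 1.0], [0.0, 1.0]))
        return pid(lateral_error)

In both sketches the PID keeps the closed loop well behaved, while the RL component supplies the adaptivity described above.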

We show that combining DRL with a classical controller adds adaptivity to new, complex roads and to different vehicle dynamics; it also allows the vehicle to safely reach higher speeds (more than 120 km/h), although our approaches are trained with a maximum speed of 80 km/h. The RL actor consumes less than 68% of the PPU's memory space, and its execution time for one control cycle is approximately 0.2% of the maximum acceptable execution time.

 

 
