IEEE IV 2024 Program | Monday June 3, 2024


MoOS Plenary Session, Landing Ballroom A
Opening

Chair: Sjöberg, Jonas	Chalmers University


MoF1F1 Demo Session, Landing Ballroom C
F1Tenth / DeepRacer


MoPKN Plenary Session, Landing Ballroom A
Keynote 1: Christian Gerdes

Chair: Kong, Seung-Hyun	Korea Advanced Institute for Science and Technology
Co-Chair: Vlacic, Ljubo	Griffith University

08:30-09:30, Paper MoPKN.1
Racing towards the Future of Automated Vehicles

Gerdes, J Christian	Stanford University
Abstract: For over a century, automobile manufacturers have used the challenge of racing to better understand and improve upon vehicle design. Can the development of autonomous race cars advance the development of driver assistance systems and automated vehicles in a similar way? This talk explores our work with automated race cars at Stanford’s Dynamic Design Lab, identifying the basic challenges of racing and how control systems can handle these challenges. While automated vehicles hold significant advantages in computation and response time, head-to-head comparison with expert drivers shows humans can still teach the machine a few tricks. How, then, should we close this gap? Should we rely on our knowledge of physics to harness increasingly detailed models of the vehicle dynamics? Should we instead turn to AI to learn models directly from data and potentially eliminate the need to estimate physical parameters like friction? Or is there a path forward that can leverage the benefits of these two very different approaches? The talk concludes with a look at some of our latest experiments, the current state of the art, and open questions on the road to the future.


MoAOR Plenary Session, Landing Ballroom A
Oral 1

Chair: Vlacic, Ljubo	Griffith University
Co-Chair: Sjöberg, Jonas	Chalmers University

09:30-09:45, Paper MoAOR.1
Modeling the Lane-Change Reactions to Merging Vehicles for Highway On-Ramp Simulations

Holley, Dustin	GCAPS
D'sa, Jovin	Honda Research Institute, USA
Nourkhiz Mahjoub, Hossein	Honda Research Institute, US
Ali, Gibran	Virginia Tech Transportation Institute
Naes, Tyler	Honda Research Institute, USA
Moradi-Pari, Ehsan	Honda Research Institute USA
Kallepalli, Pawan Sai	GCAPS
Keywords: Simulation and Real-World Testing Methodologies, Automated Vehicles, Human Factors for Intelligent Vehicles Abstract: Enhancing simulation environments to replicate real-world driver behavior is essential for developing Autonomous Vehicle technology. While some previous works have studied the yielding reaction of lag vehicles in response to a merging car at highway on-ramps, the possible lane-change reaction of the lag car has not been widely studied. In this work we aim to improve the simulation of the highway merge scenario by including the lane-change reaction in addition to yielding behavior of main-lane lag vehicles, and we evaluate two different models for their ability to capture this reactive lane-change behavior. To tune the payoff functions of these models, a novel naturalistic dataset was collected on U.S. highways that provided several hours of merge-specific data to learn the lane change behavior of U.S. drivers. To make sure that we are collecting a representative set of different U.S. highway geometries in our data, we surveyed 50,000 U.S. highway on-ramps and then selected eight representative sites. The data were collected using roadside-mounted lidar sensors to capture various merge driver interactions. The models were demonstrated to be configurable for both keep-straight and lane-change behavior. The models were finally integrated into a high-fidelity simulation environment and confirmed to have adequate computation time efficiency for use in large-scale simulations to support autonomous vehicle development.

09:45-10:00, Paper MoAOR.2
Simulating Road Spray Effects in Automotive Lidar Sensor Models

Scheuble, Dominik	Mercedes-Benz AG
Linnhoff, Clemens	Persival GmbH
Bijelic, Mario	Princeton University
Elster, Lukas	Technical University Darmstadt
Rosenberger, Philipp	Persival GmbH
Ritter, Werner	Mercedes-Benz AG
Winner, Hermann	Technische Universität Darmstadt
Keywords: Advanced Driver Assistance Systems (ADAS), Automated Vehicles, Sensor Signal Processing Abstract: Although lidar sensors have emerged as a cornerstone sensing modality in autonomous driving, they face significant challenges in adverse weather conditions. A particularly detrimental effect is spray, a phenomenon where water particles are whirled up by vehicles driving at high speeds on wet roads. Spray often causes clutter points in lidar data to be falsely classified as vehicles by downstream object detectors. In this work, a phenomenological spray simulation model, suitable as an augmentation method for object detection algorithms, is presented. Two distinct datasets featuring real-world spray scenarios are recorded and analyzed, with the first serving to calibrate the simulation model through extensive experiments that vary vehicle speeds, types, and pavement wetness levels. The second dataset functions as a spray test set to evaluate the effectiveness of the simulation model in the context of object detection. Employing the simulation model as an augmentation tool yields an improvement of up to 17% in Average Precision for state-of-the-art object detection methods in real spray conditions.

10:00-10:15, Paper MoAOR.3
Examining Trust's Influence on Autonomous Vehicle Perceptions

Tang, Liang	University of Illinois
Bashir, Masooda	University of Illinois at Urbana Champaign
Keywords: Human Factors for Intelligent Vehicles, Policy, Ethics, and Regulations Abstract: The advent of autonomous vehicles (AVs) represents a transformative shift in transportation, promising to redefine mobility and alter our interaction with vehicles. Understanding public perceptions of AVs is crucial, as it influences the adoption and integration of this technology into society. This research conducts a comprehensive investigation into the factors that influence people’s attitudes toward AVs, examining the associated benefits and concerns, as well as the extent of trust placed in this emerging technology. The primary objective is to gain a deeper understanding of the elements that contribute to human-machine trust in the context of AVs. The findings reveal a consistent pattern in the propensity to trust AVs and concerns regarding performance failures, both at individual and societal levels. From a societal perspective, enhanced locomotion independence is the primary benefit of AV deployment, contributing to increased accessibility and reduced reliance on conventional transportation systems. At the individual level, increased free time emerges as the foremost advantage. These findings provide AV developers and policymakers with critical insight for deploying autonomous vehicle systems.

10:15-10:30, Paper MoAOR.4
Predicting and Analyzing Pedestrian Crossing Behavior at Unsignalized Crossings

Zhang, Chi	University of Gothenburg
Sprenger, Janis	German Research Center for Artificial Intelligence (DFKI)
Ni, Zhongjun	Linköping University
Berger, Christian	Chalmers | University of Gothenburg
Keywords: Pedestrian Protection, Human Factors for Intelligent Vehicles, Automated Vehicles Abstract: Understanding and predicting pedestrian crossing behavior is essential for enhancing automated driving and improving driving safety. Predicting gap selection behavior and the use of zebra crossings enables driving systems to respond proactively and prevent potential conflicts. This task is particularly challenging at unsignalized crossings due to the ambiguous right of way, which requires pedestrians to constantly interact with vehicles and other pedestrians. This study addresses these challenges by utilizing simulator data to investigate scenarios involving multiple vehicles and pedestrians. We propose and evaluate machine learning models to predict gap selection in non-zebra scenarios and zebra crossing usage in zebra scenarios. We investigate and discuss how pedestrians' behaviors are influenced by various factors, including pedestrian waiting time, walking speed, the number of unused gaps, the largest missed gap, and the influence of other pedestrians. This research contributes to the evolution of intelligent vehicles by providing predictive models and valuable insights into pedestrian crossing behavior.


MoAM_BR Coffee Break, Foyer
Coffee 1


MoLuLu Lunch break, Foyer
Lunch


MoBOR Plenary Session, Landing Ballroom A
Oral 2

Chair: Nashashibi, Fawzi	INRIA
Co-Chair: Hu, Jia	Tongji University

14:10-14:25, Paper MoBOR.1
Which Framework Is Suitable for Online 3D Multi-Object Tracking for Autonomous Driving with Automotive 4D Imaging Radar?

Liu, Jianan	Vitalent Consulting
Ding, Guanhua	Beihang University
Xia, Yuxuan	Linköping University
Sun, Jinping	Beihang University
Huang, Tao	James Cook University
Xie, Lihua	Nanyang Technological University
Zhu, Bing	Beihang University
Keywords: Sensor Signal Processing, Perception Including Object Event Detection and Response (OEDR), Sensor Fusion for Localization Abstract: Online 3D multi-object tracking (MOT) has recently received significant research interest due to the expanding demand for 3D perception in advanced driver assistance systems (ADAS) and autonomous driving (AD). Among the existing 3D MOT frameworks for ADAS and AD, the conventional point object tracking (POT) framework using the tracking-by-detection (TBD) strategy has been well studied and accepted for LiDAR and 4D imaging radar point clouds. In contrast, extended object tracking (EOT), another important framework which adopts the joint-detection-and-tracking (JDT) strategy, has rarely been explored for online 3D MOT applications. This paper provides the first systematic investigation of the EOT framework for online 3D MOT in real-world ADAS and AD scenarios. Specifically, the widely accepted TBD-POT framework, the recently investigated JDT-EOT framework, and our proposed TBD-EOT framework are compared via extensive evaluations on two open-source 4D imaging radar datasets: View-of-Delft and TJ4DRadSet. Experimental results demonstrate that the conventional TBD-POT framework remains preferable for online 3D MOT with high tracking performance and low computational complexity, while the proposed TBD-EOT framework has the potential to outperform it in certain situations. However, the results also show that the JDT-EOT framework encounters multiple problems and performs inadequately in evaluation scenarios. After analyzing the causes of these phenomena based on various evaluation metrics and visualizations, we provide possible guidelines to improve the performance of these MOT frameworks on real-world data. These provide the first benchmark and important insights for the future development of 4D imaging radar-based online 3D MOT.

14:25-14:40, Paper MoBOR.2
Determining the Tactical Challenge of Scenarios to Efficiently Test Automated Driving Systems

Vater, Lennart	RWTH Aachen University
Tarlowski, Sven	RWTH Aachen University
Schuldes, Michael	RWTH Aachen University
Eckstein, Lutz	RWTH Aachen University
Keywords: Verification and Validation Techniques, Simulation and Real-World Testing Methodologies, Automated Vehicles Abstract: The selection of relevant test scenarios for the scenario-based testing and safety validation of automated driving systems (ADSs) remains challenging. An important aspect of the relevance of a scenario is the challenge it poses for an ADS. Existing methods for calculating the challenge of a scenario aim to express the challenge in terms of a metric value. Metric values are useful to select the least or most challenging scenario. However, they fail to provide human-interpretable information on the cause of the challenge which is critical information for the efficient selection of relevant test scenarios. Therefore, this paper presents the Challenge Description Method that mitigates this issue by analyzing scenarios and providing a description of their challenge in terms of the minimum required lane changes and their difficulty. Applying the method to different highway scenarios showed that it is capable of analyzing complex scenarios and providing easy-to-understand descriptions that can be used to select relevant test scenarios.

14:40-14:55, Paper MoBOR.3
Offline Tracking with Object Permanence

Liu, Xianzhong	Delft University of Technology
Caesar, Holger	TU Delft
Keywords: Integration of HD map and Onboard Sensors, Sensor Signal Processing Abstract: To reduce the expensive labor costs of manually labeling autonomous driving datasets, an alternative is to automatically label the datasets using an offline perception system. However, objects might be temporarily occluded. Such occlusion scenarios in the datasets are common yet underexplored in offline auto labeling. In this work, we propose an offline tracking model that focuses on occluded object tracks. It leverages the concept of object permanence, which means objects continue to exist even if they are no longer observed. The model contains three parts: a standard online tracker, a re-identification (Re-ID) module that associates tracklets before and after occlusion, and a track completion module that completes the fragmented tracks. The Re-ID module and the track completion module use the vectorized lane map as a prior to refine the tracking results under occlusion. The model can effectively recover the occluded object trajectories. It significantly improves the original online tracking result, demonstrating its potential to be applied in offline auto labeling as a useful plugin to improve tracking by recovering occlusions.

14:55-15:10, Paper MoBOR.4
SF3D: SlowFast Temporal 3D Object Detection

Wang, Renhao	UC Berkeley
Yu, Zhiding	NVIDIA
Lan, Shiyi	NVIDIA
Xie, Enze	The University of Hong Kong
Chen, Ke	Nvidia
Anandkumar, Animashree	California Institute of Technology
Alvarez, José M.	NVIDIA
Keywords: Perception Including Object Event Detection and Response (OEDR), Sensor Fusion for Localization, Sensor Signal Processing Abstract: Leveraging inputs over multiple consecutive frames has been shown to benefit 3D object detection. However, existing approaches often demonstrate unsatisfactory scaling with increasing temporal histories. In this work, we propose SF3D, a late fusion module which addresses this issue by better modeling temporal relationships via a two-stream factorization. Concretely, SF3D operates on an input sequence of consecutive bird's-eye view (BEV) features, which is partitioned into "short-term" and "long-term" frames. A more heavily parameterized short-term branch using adapters and deformable attention aggregates features closer to the current timestep. In parallel, a long-term branch composed of efficiently implemented global convolution layers aggregates a larger window of temporally distant historical features. This two-stream paradigm allows SF3D to effectively consume near-term information, while scaling to efficiently leverage longer historical windows. We show that SF3D works with arbitrary upstream BEV encoders and downstream detectors, achieving improvements over recent state-of-the-art on the Waymo Open and nuScenes benchmarks.

15:10-15:25, Paper MoBOR.5
Real-Time 3D Semantic Occupancy Prediction for Autonomous Vehicles Using Memory-Efficient Sparse Convolution

Sze, Samuel Tian Hong	University of Oxford
Kunze, Lars	University of Oxford
Keywords: Perception Including Object Event Detection and Response (OEDR), End-To-End (E2E) Autonomous Driving Abstract: In autonomous vehicles, understanding the surrounding 3D environment of the ego vehicle in real time is essential. A compact way to represent scenes while encoding geometric distances and semantic object information is via 3D semantic occupancy maps. State-of-the-art 3D mapping methods leverage transformers with cross-attention mechanisms to elevate 2D vision-centric camera features into the 3D domain. However, these methods encounter significant challenges in real-time applications due to their high computational demands during inference. This limitation is particularly problematic in autonomous vehicles, where GPU resources must be shared with other tasks such as localization and planning. In this paper, we introduce an approach that extracts features from front-view 2D camera images and LiDAR scans, then employs a sparse convolution network (Minkowski Engine) for 3D semantic occupancy prediction. Given that outdoor scenes in autonomous driving scenarios are inherently sparse, the use of sparse convolution is particularly apt. By jointly solving the problems of 3D scene completion of sparse scenes and 3D semantic segmentation, we provide a more efficient learning framework suitable for real-time applications in autonomous vehicles. We also demonstrate competitive accuracy on the nuScenes dataset.


MoPM_BR Coffee Break, Foyer
Coffee 2
