ITSC 2025 Paper Abstract


Paper WE-LA-T4.1

Polley, Nikolai (Karlsruhe Institute of Technology), Boualili, Yacin (Karlsruhe Institute of Technology), Mütsch, Ferdinand (Karlsruhe Institute of Technology), Zipfl, Maximilian (FZI Research Center for Information Technology), Fleck, Tobias (FZI Research Center for Information Technology), Zöllner, J. Marius (FZI Research Center for Information Technology; Karlsruhe Institute of Technology)

2.5D Object Detection for Intelligent Roadside Infrastructure

Scheduled for presentation during the Regular Session "S04c-Intelligent Perception and Detection Technologies for Connected Mobility" (WE-LA-T4), Wednesday, November 19, 2025, 16:00–16:20, Surfers Paradise 1

2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC), November 18-21, 2025, Gold Coast, Australia

This information is tentative and subject to change.

Keywords: Real-time Object Detection and Tracking for Dynamic Traffic Environments; AI, Machine Learning for Real-time Traffic Flow Prediction and Management; Deep Learning for Scene Understanding and Semantic Segmentation in Autonomous Vehicles

Abstract

On-board sensors of autonomous vehicles can be obstructed, occluded, or limited by restricted fields of view, complicating downstream driving decisions. Intelligent roadside infrastructure perception systems, installed at elevated vantage points, can provide wide, unobstructed intersection coverage, supplying a complementary information stream to autonomous vehicles via vehicle-to-everything (V2X) communication. However, conventional 3D object-detection algorithms struggle to generalize under the domain shift introduced by top-down perspectives and steep camera angles. We introduce a 2.5D object-detection framework tailored specifically to infrastructure-mounted roadside cameras. Unlike conventional 2D or 3D object detection, we predict the ground plane of each vehicle as a parallelogram in the image frame. The parallelogram preserves the planar position, size, and orientation of objects while omitting their height, which is unnecessary for most downstream applications. Training leverages a mix of real-world and synthetically generated scenes. We evaluate generalizability on a held-out camera viewpoint and in adverse-weather scenarios absent from the training set. Our results show high detection accuracy, strong cross-viewpoint generalization, and robustness to diverse lighting and weather conditions. Model weights and inference code are provided at https://gitlab.kit.edu/kit/aifb/ATKS/public/digit4taf/2.5d-object-detection
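To make the 2.5D representation concrete, the following Python sketch is purely illustrative (the class, its member names, and the corner-ordering convention are assumptions, not taken from the authors' released code). It shows how a vehicle ground plane given as three image-space corners of a parallelogram determines the fourth corner and yields the planar position, orientation, and footprint size described in the abstract, with height omitted.

import numpy as np

# Hypothetical sketch (not the authors' code): a vehicle ground plane
# represented as a parallelogram in the image frame, per the abstract.
class GroundParallelogram:
    """Vehicle footprint as an image-space parallelogram.

    p0, p1, p2 are three consecutive corners in pixels; the fourth
    corner is implied by the parallelogram property, so only three
    points are needed. Height is deliberately not modeled ("2.5D").
    """

    def __init__(self, p0, p1, p2):
        self.p0 = np.asarray(p0, dtype=float)
        self.p1 = np.asarray(p1, dtype=float)
        self.p2 = np.asarray(p2, dtype=float)
        # Opposite sides of a parallelogram are equal vectors.
        self.p3 = self.p0 + (self.p2 - self.p1)

    @property
    def center(self):
        # Diagonals bisect each other: center is a diagonal's midpoint.
        return 0.5 * (self.p0 + self.p2)

    @property
    def orientation(self):
        # Heading of the p0 -> p1 edge in image coordinates (radians).
        d = self.p1 - self.p0
        return float(np.arctan2(d[1], d[0]))

    @property
    def area(self):
        # Magnitude of the 2D cross product of two adjacent edges.
        e1 = self.p1 - self.p0
        e2 = self.p2 - self.p1
        return float(abs(e1[0] * e2[1] - e1[1] * e2[0]))

# Example: a footprint seen from an elevated roadside camera.
quad = GroundParallelogram((100, 220), (180, 200), (196, 240))
print(quad.p3, quad.center, quad.orientation, quad.area)

Under these assumptions, three regressed corners fully determine the footprint, which suggests why dropping the height dimension keeps the output compact for downstream V2X consumers.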
