ITSC 2025 Paper Abstract

Paper WE-EA-T4.5

Liu, Chenchen (National University of Singapore), YUAN, CHENGRAN (National University of Singapore), SUN, JIAWEI (National University of Singapore), Zhang, Zhengshen (National University of Singapore), Huang, Zefan (National University of Singapore), Guo, Sheng (NUS), Ang Jr, Marcelo H (National University of Singapore), Tay, Eng Hock Francis (National University of Singapore)

OSA-YOLO: A Lightweight YOLO Framework with Omnidirectional Selective Scanning and Area-Attention for Traffic Object

Scheduled for presentation during the Regular Session "S04b-Intelligent Perception and Detection Technologies for Connected Mobility" (WE-EA-T4), Wednesday, November 19, 2025, 14:50−14:50, Surfers Paradise 1

2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC), November 18-21, 2025, Gold Coast, Australia

This information is tentative and subject to change. Compiled on October 19, 2025

Keywords Real-time Object Detection and Tracking for Dynamic Traffic Environments, Advanced Air Traffic Management Systems for Drone Integration

Abstract

Effective traffic monitoring using Unmanned Aerial Vehicles (UAVs) requires real-time object detection under strict computational and power constraints. In this work, we propose OSA-YOLO, a lightweight detection framework that enhances the YOLO architecture with an Omnidirectional Selective Scan Module (OSSM) and an Area-Attention Concatenate-to-Fuse (A2C2f) block. OSSM improves global context modeling by incorporating horizontal, vertical, diagonal, and anti-diagonal reciprocating scans, while A2C2f strengthens local feature fusion after high-level refinement. On the VisDrone dataset, OSA-YOLO outperforms the YOLOv8n baseline with a 21.8% increase in precision, a 17.0% increase in recall, and a relative 26.2% gain in mAP@50. Even compared with the state-of-the-art Mamba YOLO, OSA-YOLO delivers slightly better performance, with relative improvements of 5.2% in mAP@50 and 3.9% in mAP@50:95. For real-world deployment, the model is optimized with TensorRT and deployed on an NVIDIA Jetson Orin Nano, achieving an inference latency of 37.9 ms per frame under FP16 half precision. Extensive ablation experiments validate the effectiveness of the proposed method for practical aerial traffic monitoring.
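The four scan directions named in the abstract (horizontal, vertical, diagonal, anti-diagonal) can be illustrated as ways of flattening a 2D feature map into 1D sequences for a selective-scan model. The sketch below is a minimal NumPy illustration under stated assumptions: the function name `omnidirectional_scans` and the reciprocating "snake" ordering for the horizontal and vertical paths are hypothetical, and the paper's actual OSSM implementation may differ.

```python
import numpy as np

def omnidirectional_scans(x):
    """Flatten a 2D feature map along four scan directions.

    Illustrative sketch only: names and orderings are assumptions,
    not the paper's OSSM implementation.
    """
    h, w = x.shape
    # Horizontal reciprocating scan: even rows left-to-right,
    # odd rows right-to-left ("snake" path).
    horiz = np.concatenate(
        [x[i] if i % 2 == 0 else x[i][::-1] for i in range(h)])
    # Vertical reciprocating scan: even columns top-to-bottom,
    # odd columns bottom-to-top.
    vert = np.concatenate(
        [x[:, j] if j % 2 == 0 else x[:, j][::-1] for j in range(w)])
    # Diagonal scan: main-diagonal direction (top-left to bottom-right),
    # sweeping offsets from the bottom-left corner to the top-right.
    diag = np.concatenate(
        [np.diagonal(x, offset=k) for k in range(-(h - 1), w)])
    # Anti-diagonal scan: flip the rows so np.diagonal traverses
    # bottom-left-to-top-right anti-diagonals of the original map.
    anti = np.concatenate(
        [np.diagonal(x[::-1], offset=k) for k in range(-(h - 1), w)])
    return horiz, vert, diag, anti

# Toy 3x4 "feature map" with values 0..11 to make the orderings visible.
feat = np.arange(12).reshape(3, 4)
h, v, d, a = omnidirectional_scans(feat)
```

Each of the four outputs is a permutation of the same 12 elements, so a state-space scan run over each sequence sees every spatial location from a different traversal direction, which is the intuition behind combining the four paths for global context.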
