ITSC 2025 Paper Abstract

Paper FR-LM-T37.1

Jain, Sandesh (Virginia Tech), Han, Heesang (Virginia Tech), Abbott, Amos (Virginia Tech), Sarkar, Abhijit (Virginia Tech)

Sparse-FS3D: A Sparse-Feature Fusion Approach for Diffusion-Enhanced Few-Shot 3D Object Detection in Outdoor Scenes

Scheduled for presentation during the Regular Session "S37a-Reliable Perception and Robust Sensing for Intelligent Vehicles" (FR-LM-T37), Friday, November 21, 2025, 10:30−10:50, Coolangatta 1

2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC), November 18-21, 2025, Gold Coast, Australia

This information is tentative and subject to change. Compiled on October 18, 2025

Keywords: Lidar-based Mapping and Environmental Perception for ITS Applications, Real-time Object Detection and Tracking for Dynamic Traffic Environments

Abstract

Few-shot 3D object detection (FS3D) has gained increasing attention due to its potential to handle limited annotated data for 3D perception tasks. However, existing methods have predominantly focused on indoor environments with dense point clouds, leaving outdoor settings with sparse, large-scale point clouds largely underexplored. In this paper, we introduce Sparse-FS3D, a novel approach to few-shot 3D object detection tailored for outdoor LiDAR scenes. Our method addresses the challenges of sparse point clouds through a combination of adaptive voxelization, Krylov subspace-based farthest point sampling (FPS), and cross-attention feature fusion. By applying finer voxel sizes to dense regions and coarser ones to sparse regions, we optimize the computational load while preserving critical features. Furthermore, we introduce a diffusion-based generative augmentation technique that produces realistic object point clouds for novel classes to offset the limited sample availability in the few-shot split. We establish a new benchmark on the KITTI and nuScenes datasets for outdoor FS3D, demonstrating that our approach significantly improves mAP performance in challenging real-world settings with fewer labeled samples. Our results indicate that the combination of adaptive voxelization, specialized sampling, and diffusion-based generative augmentation provides an effective framework for few-shot learning and scalable outdoor 3D object detection, bridging the gap between sparse data and high-performance detection in autonomous driving and other large-scale 3D perception tasks. Our code is publicly available at https://github.com/sandeshrjain/sparse-fs3d.
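The density-adaptive voxelization idea described in the abstract (finer voxel sizes in dense regions, coarser ones in sparse regions) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the grid sizes, density threshold, and function name below are illustrative assumptions.

```python
import numpy as np

def adaptive_voxelize(points, coarse=1.0, fine=0.25, density_thresh=20):
    """Assign each 3D point a voxel index, using a fine grid in dense
    regions and a coarse grid in sparse ones (illustrative sketch)."""
    # Estimate local density by counting points per coarse cell.
    coarse_idx = np.floor(points / coarse).astype(np.int64)
    _, inverse, counts = np.unique(
        coarse_idx, axis=0, return_inverse=True, return_counts=True)
    dense = counts[inverse] >= density_thresh  # per-point density flag

    # Dense regions use the fine grid, sparse regions the coarse grid.
    size = np.where(dense[:, None], fine, coarse)
    voxel_idx = np.floor(points / size).astype(np.int64)
    return voxel_idx, dense

rng = np.random.default_rng(0)
cluster = rng.normal(0.0, 0.2, size=(200, 3))   # dense cluster near origin
scatter = rng.uniform(-10, 10, size=(50, 3))    # sparse background points
pts = np.vstack([cluster, scatter])
vox, dense = adaptive_voxelize(pts)
```

Because the fine grid is only applied where the point count justifies it, the total number of occupied voxels stays far below that of a uniformly fine grid, which is the computational saving the abstract refers to.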

All Content © PaperCept, Inc.
