ITSC 2025 Paper Abstract

Paper TH-EA-T25.6

Li, Jiangpeng (Nanyang Technological University), Jia, Chengfeng (Nanyang Technological University), Lu, Yun (Nanyang Technological University), Huang, Lingying (Southeast University), Su, Rong (Nanyang Technological University)

Deep Reinforcement Learning-Based Hierarchical Framework for Collaborative Multi-UUV Underwater Target Search

Scheduled for presentation during the Regular Session "S25b-Cooperative and Connected Autonomous Systems" (TH-EA-T25), Thursday, November 20, 2025, 14:50−15:30, Coolangatta 4

2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC), November 18-21, 2025, Gold Coast, Australia

This information is tentative and subject to change. Compiled on October 18, 2025

Keywords: Autonomous Vessel Navigation Systems, Real-time Monitoring and Control of Waterborne Transport Systems, Cooperative Driving Systems and Vehicle Coordination in Multi-vehicle Scenarios

Abstract

This paper proposes a hierarchical framework for underwater target search with multiple collaborative unmanned underwater vehicles (UUVs). The upper layer handles task allocation. A computationally efficient sparse Gaussian process (SGP) model with variationally optimized inducing points updates a probabilistic map of the environment by fusing inaccurate prior knowledge with real-time partial observations. An interest point selection strategy balances exploration and exploitation during task allocation, and an auction-based mechanism then assigns tasks while accounting for acoustic signal interference. In the lower layer, a deep reinforcement learning (DRL)-based controller with two specifically designed training stabilizers performs motion control for each underactuated UUV, which is modeled with nonlinear dynamics and subject to stochastic current disturbances. Experimental results demonstrate that the proposed task allocation module effectively and efficiently refines the probabilistic belief of the environment and orchestrates the collaborative underwater target search. The DRL-based motion control module with the two training stabilizers is also shown to be efficient and robust under stochastic disturbances compared with several traditional control methods.
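
The abstract does not specify how interest points are scored or how bids are formed, so the following is only a minimal sketch of the upper-layer idea under stated assumptions, not the authors' method. It assumes an upper-confidence-bound style score (belief mean plus a weighted standard deviation) as one possible way to balance exploitation and exploration, a greedy sequential auction in which the highest bid wins each round, and a fixed bid penalty for points that lie within a minimum separation of an already assigned point as a crude stand-in for acoustic interference. All function names, parameters (`kappa`, `travel_weight`, `min_sep`, `penalty`), and numbers are hypothetical.

```python
import math


def select_interest_points(cells, k, kappa=1.0):
    """Pick the k cells with the highest mean + kappa * std (assumed UCB-style score).

    cells: list of (x, y, mean, std) tuples taken from a probabilistic map.
    """
    ranked = sorted(cells, key=lambda c: c[2] + kappa * c[3], reverse=True)
    return ranked[:k]


def sequential_auction(uuvs, points, kappa=1.0, travel_weight=0.1,
                       min_sep=5.0, penalty=1.0):
    """Greedy sequential auction: one (UUV, point) pair is awarded per round.

    uuvs: dict mapping a UUV name to its (x, y) position.
    points: list of (x, y, mean, std) interest points.
    Returns a dict mapping each assigned UUV name to the (x, y) of its point.
    """
    assignment = {}
    free_uuvs = dict(uuvs)
    free_points = list(points)
    while free_uuvs and free_points:
        best = None  # (bid, uuv_name, point)
        for name, (ux, uy) in free_uuvs.items():
            for p in free_points:
                px, py, mean, std = p
                # Assumed bid: information score minus a travel cost.
                bid = mean + kappa * std - travel_weight * math.hypot(px - ux, py - uy)
                # Hypothetical interference term: discourage targets that are
                # too close to a point another UUV has already won.
                if any(math.hypot(px - ax, py - ay) < min_sep
                       for ax, ay in assignment.values()):
                    bid -= penalty
                if best is None or bid > best[0]:
                    best = (bid, name, p)
        _, winner, won = best
        assignment[winner] = (won[0], won[1])
        del free_uuvs[winner]
        free_points.remove(won)
    return assignment


# Illustrative data only: (x, y, belief mean, belief std) per map cell.
cells = [(0, 0, 0.9, 0.05), (10, 0, 0.2, 0.6), (50, 50, 0.1, 0.1), (1, 0, 0.85, 0.05)]
uuvs = {"uuv_a": (0, 0), "uuv_b": (10, 0)}

interest_points = select_interest_points(cells, k=3)
assignment = sequential_auction(uuvs, interest_points)
print(assignment)
```

In this toy setup the uncertain cell at (10, 0) is selected alongside the two high-belief cells, and the separation penalty steers the second vehicle away from the point adjacent to the first vehicle's target. A real system would replace the illustrative cell values with the SGP posterior mean and standard deviation.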