Abstract:
Objective This study aims to develop a deep reinforcement learning-based search algorithm for unmanned surface vehicles (USVs) in submarine detection tasks.
Method The study considers the scenario of submarines infiltrating key maritime areas, for which a search environment and a USV kinematic model are constructed. A sonar detection probability model incorporating the effects of both distance and angle is developed, with well-defined criteria for determining detection success. On this basis, the search task is formulated as a Markov decision process (MDP) and solved with the deep Q-network (DQN) algorithm: the state space explicitly includes the detection probability, and a multi-objective reward function integrates detection probability, distance, and angle. To improve learning efficiency, a temporal-difference DQN with probabilistic sensing (TS-DQN) algorithm is proposed, combining a double-dueling network architecture with prioritized experience replay. In addition, a probabilistic-perception-based ε-greedy exploration strategy dynamically adjusts exploration behavior according to real-time detection states, thereby significantly improving policy learning efficiency.
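The detection-aware ε-greedy idea described above can be sketched minimally as follows. This is an illustrative assumption, not the paper's actual implementation: the function names (`adaptive_epsilon`, `select_action`) and the linear mapping from detection probability to ε are hypothetical, chosen only to show how exploration could be reduced as the sonar detection probability rises.

```python
import random

def adaptive_epsilon(p_detect, eps_min=0.05, eps_max=0.9):
    """Hypothetical probabilistic-perception epsilon schedule:
    explore more when the current sonar detection probability is
    low, exploit more as detection becomes likely. The linear
    interpolation here is an assumed form, not the paper's."""
    return eps_max - (eps_max - eps_min) * p_detect

def select_action(q_values, p_detect, rng=random):
    """Epsilon-greedy action selection with detection-aware epsilon."""
    eps = adaptive_epsilon(p_detect)
    if rng.random() < eps:
        return rng.randrange(len(q_values))  # explore: random action
    # exploit: greedy action with the highest Q-value
    return max(range(len(q_values)), key=q_values.__getitem__)
```

With `p_detect = 0` the agent explores with probability 0.9; with `p_detect = 1` exploration drops to 0.05, so the policy exploits its learned Q-values once the target is being reliably sensed.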
Results Extensive simulation experiments demonstrate that the proposed method achieves a detection success rate of 38.85%, which is 18 times higher than that of the second-best method, Dueling DQN. The approach also reduces the average path length to 334.36 steps, shortening the search trajectory by more than 9.5% compared to other algorithms.
Conclusion The proposed algorithm exhibits significant advantages in detection efficiency and effectiveness, providing an innovative solution for advancing autonomous USV-based search and detection technologies.