LIU T, WANG H, ZHENG K, et al. Probabilistic perception-based TS-DQN decision-making for autonomous USV submarine search[J]. Chinese Journal of Ship Research, 2026, 21(X): 1–14 (in Chinese). DOI: 10.19693/j.issn.1673-3185.04440

Probabilistic perception-based TS-DQN decision-making for autonomous USV submarine search

  • Objective This study aims to develop a deep reinforcement learning-based search algorithm for unmanned surface vehicles (USVs) in submarine detection tasks.
    Method For a scenario in which submarines infiltrate key maritime areas, a search environment and a USV kinematic model are constructed. A sonar detection probability model is developed that accounts for both distance and angle and defines explicit criteria for detection success. On this basis, the search task is formulated as a Markov decision process (MDP) and solved with the deep Q-network (DQN) algorithm. The state space explicitly includes detection probability, and a multi-objective reward function combines detection probability, distance, and angle. To enhance learning efficiency, a temporal difference DQN with probabilistic sensing (TS-DQN) algorithm is proposed, combining a double-dueling network architecture with prioritized experience replay. In addition, a probabilistic perception-based ε-greedy exploration strategy dynamically adjusts exploration behavior according to the real-time detection state, thereby improving policy learning efficiency.
    Results Simulation experiments show that the proposed method achieves a detection success rate of 38.85%, which is 18 times higher than that of the second-best method, Dueling DQN. It also reduces the average path length to 334.36 steps, shortening the search trajectory by more than 9.5% compared with the other algorithms.
    Conclusion The proposed algorithm exhibits significant advantages in detection efficiency and effectiveness, providing an innovative solution for advancing autonomous USV-based search and detection technologies.
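
The probabilistic perception-based ε-greedy idea described in the Method can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the function names, the linear range term and cosine angle term of the detection model, the parameter values (`max_range`, `beam_half_width_deg`, `eps_min`, `eps_max`), and in particular the choice to lower ε as detection probability rises. It is not the authors' TS-DQN implementation.

```python
import math
import random


def detection_probability(distance, bearing_deg,
                          max_range=10.0, beam_half_width_deg=60.0):
    """Hypothetical sonar model: probability falls with range and off-axis angle."""
    if distance >= max_range or abs(bearing_deg) >= beam_half_width_deg:
        return 0.0
    # Assumed linear decay with distance.
    range_term = 1.0 - distance / max_range
    # Assumed cosine decay that reaches zero at the edge of the beam.
    angle_term = math.cos(math.radians(bearing_deg * 90.0 / beam_half_width_deg))
    return range_term * angle_term


def adaptive_epsilon(p_detect, eps_min=0.05, eps_max=0.9):
    """Assumed schedule: explore widely at low detection probability, exploit at high."""
    p = min(max(p_detect, 0.0), 1.0)
    return eps_max - (eps_max - eps_min) * p


def select_action(q_values, p_detect, rng=random):
    """ε-greedy action choice with ε driven by the current detection probability."""
    eps = adaptive_epsilon(p_detect)
    if rng.random() < eps:
        return rng.randrange(len(q_values))  # explore: uniform random action
    # exploit: index of the largest Q-value
    return max(range(len(q_values)), key=q_values.__getitem__)
```

In use, `p_detect = detection_probability(distance, bearing_deg)` would be computed at each step from the current USV-target geometry and passed to `select_action` together with the Q-values produced by the network. Tying ε to the detection state, rather than to a fixed decay schedule, is one plausible reading of "dynamically adjusts exploration behavior according to the real-time detection state".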