HUI X, WANG N, WANG S. Deep reinforcement learning path planning for unmanned surface vehicle based on local observation[J]. Chinese Journal of Ship Research, 2025, 20(X): 1–11 (in Chinese). DOI: 10.19693/j.issn.1673-3185.04390

Deep reinforcement learning path planning for unmanned surface vehicle based on local observation

  • Objectives Maritime rescue missions demand efficient and reliable path planning for unmanned surface vehicles (USVs). These missions are challenging because USVs have limited sensing capabilities yet must operate in vast, uncertain environments with randomly distributed obstacles. This research addresses the low planning efficiency and poor robustness caused by a limited perception range, proposing a local observation-based path planning approach for USVs in maritime rescue missions.
    Methods The proposed approach integrates three key methodological innovations. First, we employ the Soft Actor-Critic (SAC) algorithm with a reward function designed for local observation scenarios, rewarding efficient goal-reaching behavior while penalizing collisions with obstacles; this design balances exploration and exploitation in uncertain environments. Second, we introduce a Feature Enhanced Soft Actor-Critic (FESAC) algorithm that improves training efficiency and model robustness. This enhancement extracts key environmental features and uses a randomized feature training environment with strategically placed obstacles to increase sampling efficiency: obstacle positions, the USV starting point, and the goal position are randomly reset across episodes, forcing the model to learn generalizable navigation strategies rather than memorizing specific obstacle configurations. Third, we develop an adaptive waypoint planning algorithm based on local perception domains, which coordinates local obstacle avoidance with global goal-reaching behavior. This algorithm dynamically selects waypoints within the USV's perception radius using a weighted objective function that balances proximity to the goal against distance from obstacles, decomposing the global path planning task into a series of manageable local planning problems (minimal sketches of these three components follow this abstract).
    Results Comprehensive simulation experiments demonstrate the effectiveness of the approach. In feature environments with randomly distributed obstacles, the proposed method achieves a success rate above 98%, significantly outperforming traditional approaches. When deployed in simulated maritime rescue missions spanning 1 000 m × 1 000 m areas with 20–50 randomly placed obstacles, the method maintains a completion rate above 93% with appropriate parameter settings. The results also reveal a trade-off between path safety and efficiency: increasing the obstacle-avoidance weight w_2 produces safer but longer paths, while increasing the goal-reaching weight w_1 yields shorter paths at higher collision risk. Suitable trade-offs for different task requirements can therefore be obtained by adjusting these weights. Comparative analysis shows that the FESAC algorithm converges significantly faster than standard SAC in complex environments, demonstrating superior learning efficiency.
    Conclusions The proposed local observation-based path planning method effectively addresses the challenge of limited perception in maritime rescue missions, demonstrating strong robustness and adaptability to uncertain environments. By decomposing complex global planning into manageable local tasks and enhancing feature extraction, the approach offers a practical solution for real-world USV operations in which complete environmental information is unavailable, and provides technical support for applying reinforcement learning algorithms in real engineering scenarios.
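
The reward design described in the Methods can be illustrated with a minimal sketch: a bonus for reaching the goal, a penalty for collisions, and dense shaping toward the goal. The constants R_GOAL, R_COLLISION, and K_PROGRESS and the progress-shaping term are assumptions for illustration; the paper's actual reward terms and values are not given in this abstract.

    # Hypothetical reward constants; the paper's actual values are not stated here.
    R_GOAL = 10.0        # bonus for reaching the goal
    R_COLLISION = -10.0  # penalty for colliding with an obstacle
    K_PROGRESS = 1.0     # shaping weight on progress toward the goal

    def reward(prev_dist_to_goal, dist_to_goal, collided, reached):
        """Reward efficient goal-reaching and penalize collisions."""
        if reached:
            return R_GOAL
        if collided:
            return R_COLLISION
        # Dense shaping: positive when the USV moved closer to the goal this step.
        return K_PROGRESS * (prev_dist_to_goal - dist_to_goal)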
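The randomized feature training environment can be sketched as an episode-reset routine that re-samples obstacle positions, the USV start point, and the goal each episode, so the policy cannot memorize a fixed layout. The function name reset_episode, the minimum-separation check, and all numeric defaults below are hypothetical.

    import numpy as np

    rng = np.random.default_rng()

    def reset_episode(n_obstacles, area=1000.0, min_sep=30.0):
        """Randomly re-place obstacles, start, and goal for a new episode."""
        obstacles = rng.uniform(0.0, area, size=(n_obstacles, 2))
        while True:
            start = rng.uniform(0.0, area, size=2)
            goal = rng.uniform(0.0, area, size=2)
            # Keep the start and goal clear of obstacles and of each other.
            d_start = np.linalg.norm(obstacles - start, axis=1).min()
            d_goal = np.linalg.norm(obstacles - goal, axis=1).min()
            if (d_start > min_sep and d_goal > min_sep
                    and np.linalg.norm(goal - start) > min_sep):
                return start, goal, obstacles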
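The adaptive waypoint planner can be sketched as a search over candidate points within the perception domain, scored by a weighted objective that balances proximity to the goal (weight w_1) against clearance from locally observed obstacles (weight w_2). The candidate-ring construction and the defaults below are assumptions, not the paper's exact formulation.

    import numpy as np

    def select_waypoint(pos, goal, obstacles, r_percept=100.0,
                        w1=1.0, w2=1.0, n_candidates=64):
        """Pick the candidate waypoint minimizing w1*dist_to_goal - w2*clearance."""
        if np.linalg.norm(goal - pos) <= r_percept:
            return goal  # goal already inside the perception domain
        # Candidate waypoints on the boundary of the perception domain.
        angles = np.linspace(0.0, 2.0 * np.pi, n_candidates, endpoint=False)
        candidates = pos + r_percept * np.stack(
            [np.cos(angles), np.sin(angles)], axis=1)
        # Obstacles currently visible within the perception radius.
        local = obstacles[np.linalg.norm(obstacles - pos, axis=1) <= r_percept]
        dist_goal = np.linalg.norm(candidates - goal, axis=1)
        if local.size == 0:
            clearance = np.full(len(candidates), r_percept)
        else:
            diffs = candidates[:, None, :] - local[None, :, :]
            clearance = np.linalg.norm(diffs, axis=2).min(axis=1)
        # Lower cost: nearer the goal, farther from obstacles.
        cost = w1 * dist_goal - w2 * clearance
        return candidates[np.argmin(cost)]

With this cost, raising w2 favors candidates with larger clearance (safer but longer paths), while raising w1 favors candidates nearer the goal (shorter paths at higher collision risk), consistent with the trade-off reported in the Results.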