Autonomous ship collision avoidance decision-making based on an improved Rainbow algorithm
Abstract
Objectives To reduce maritime accidents caused by human error during ship navigation, an autonomous collision avoidance decision-making method for ships based on an improved Rainbow algorithm is proposed. Methods First, a long short-term memory (LSTM) network is introduced into the network structure to improve the convergence speed and generalization ability of the algorithm. Second, adaptive prioritized experience replay (APER) is adopted to improve sample efficiency, which further improves training stability and the quality of the learned policy. In addition, the ship motion mathematical model, the ship domain, the collision avoidance responsibility determination model, and the International Regulations for Preventing Collisions at Sea (COLREGs) are integrated into the algorithm framework to construct a set of reward functions that balance safety, compliance, and economy. Finally, digital simulation and real-environment simulation experiments are conducted to verify the effectiveness of the method. Results The simulation results show that, during training, the improved Rainbow algorithm increases convergence speed by 37.5% compared with the traditional algorithm; its convergence curve is smoother and more stable, and the average reward per episode after convergence is significantly higher. The trained model accurately identifies encounter scenarios and takes appropriate collision avoidance measures in accordance with COLREGs. Conclusions The model trained with the improved Rainbow algorithm enables ships to make autonomous collision avoidance decisions while complying with COLREGs.
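To make the replay mechanism named in the Methods concrete, the following is a minimal sketch of standard proportional prioritized experience replay, the non-adaptive baseline that APER presumably builds on. The abstract does not specify the adaptive scheme, so this is not the authors' method: the class name `ProportionalReplay`, the exponents `alpha` and `beta`, and the constant `eps` are illustrative assumptions.

```python
import random


class ProportionalReplay:
    """Proportional prioritized replay (baseline sketch, NOT the paper's APER).

    All names and default values here are assumptions for illustration.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current maximum priority so that each one
        # is likely to be replayed at least once.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias from non-uniform
        # sampling; they are normalized by the largest weight in the batch.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update(self, idxs, td_errors):
        # After a learning step, priorities track the absolute TD error.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In this baseline, `alpha` and `beta` are fixed hyperparameters; an adaptive variant would adjust quantities such as these during training, but how the paper does so is not stated in the abstract.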