Objectives The existing combat deduction simulation system mainly implements decision-making based on operational rules and experience knowledge, and it has certain problems such as limited application scenarios, low decision-making efficiency and poor flexibility. In view of the shortcomings of conventional decision-making methods, an intelligent decision-making model based on deep reinforcement learning (DRL) technology is proposed.
Methods First, the maximum entropy Markov decision process(MDP) of simulation deduction is established, and then the agent training network is constructed on the basis of actor-critic architecture to generate randomization policies that improve the agent's exploration ability. At the same time, the soft policy iterative updating method is used to search for better policies and continuously improve the agent's decision-making level. Finally, the simulation is carried out on the Mozi AI platform to validate the model.
Results The results show that an agent trained with the improved soft actor-critic (SAC) decision-making algorithm can achieve autonomous decision-making. Compared with the deep deterministic policy gradient (DDPG) algorithm, the probability of winning is increased by 24.53%.
Conclusions The design scheme of this decision-making model can provide theoretical references for research on intelligent decision-making technology, giving it some reference significance for warfare simulation and deduction.