With the increasing demand for autonomous ship navigation, ship trajectory tracking has gained significant attention. This paper uses deep reinforcement learning (DRL) to adaptively select Proportional-Integral-Derivative (PID) parameters and develops a trajectory tracking system for an underactuated Unmanned Surface Vessel (USV). Based on the line-of-sight (LOS) algorithm and the control requirements of ships, the ship trajectory tracking control problem is modeled as a Markov decision process with a designed state space, action space, and reward function; the reward reflects how well the agent reaches the desired position along the planned trajectory. The DRL agent is updated with the Proximal Policy Optimization (PPO) algorithm based on the feedback received from the environment. Through interactions with the environment, the agent learns optimal actions that lead to successful trajectory tracking. The results demonstrate that the deep reinforcement learning method can robustly complete the trajectory tracking task.
The maritime transport industry plays a crucial role in the development of the global economy. However, ship accidents pose severe risks to both society and the environment. Academic research has consistently focused on the dual objectives of enhancing maritime traffic safety and optimizing shipping efficiency. Concurrently, autonomous ships have emerged as a promising avenue for future advances in maritime safety. Traditional ship trajectory tracking methods are primarily implemented through classical navigation and control techniques. Common approaches include linear control methods, sliding mode control (SMC), and model predictive control (MPC). Linear control methods, such as PID controllers, adjust the rudder angle and propulsion force to achieve robust control of the system's states. SMC introduces a sliding surface and drives the system state onto it to achieve control. MPC achieves trajectory tracking by optimizing the control inputs subject to the system's dynamic characteristics and constraints. However, traditional methods often require accurate dynamic models and intricate parameter tuning, which can be difficult to achieve in practical maritime transportation, especially under dynamically changing sea conditions.
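To make the parameter tuning burden concrete, the following is a minimal sketch of a fixed-gain PID heading controller of the kind described above. It is an illustration only, not the controller used in this paper: the class name `HeadingPID`, the helper `wrap_angle`, the 35-degree rudder limit, and the sign convention (positive heading error gives a positive rudder command) are all assumptions introduced for the example.

```python
import math
from dataclasses import dataclass


def wrap_angle(angle: float) -> float:
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


@dataclass
class HeadingPID:
    """Fixed-gain PID heading controller (hypothetical illustration)."""
    kp: float
    ki: float
    kd: float
    max_rudder: float = math.radians(35.0)  # assumed rudder saturation limit
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, desired_heading: float, heading: float, dt: float) -> float:
        """Return a saturated rudder command in radians for one time step."""
        # Heading error, wrapped so the vessel turns the short way round.
        error = wrap_angle(desired_heading - heading)
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        command = (self.kp * error
                   + self.ki * self.integral
                   + self.kd * derivative)
        # Clip the command to the assumed physical rudder range.
        return max(-self.max_rudder, min(self.max_rudder, command))


# Example: one control step with hand-picked (assumed) gains.
pid = HeadingPID(kp=1.0, ki=0.0, kd=0.0)
rudder = pid.step(desired_heading=0.1, heading=0.0, dt=0.1)
```

In this sketch the gains `kp`, `ki`, and `kd` stay constant for the whole voyage, so they must be tuned in advance for the expected vessel dynamics and sea state, which is the limitation noted above.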