The conventional control algorithm for a dynamic positioning system (DPS) is a PD controller. However, it uses constant gains, which are inefficient in time-varying environments. To address the lack of self-tuning gains in the conventional PD controller, we propose an adaptive PD controller based on the Deep Deterministic Policy Gradient (DDPG), a Reinforcement Learning (RL) algorithm. The advantage of DDPG is that it requires no prior knowledge of the ship's dynamics or of the environmental disturbances; instead, it learns them by interacting with the given environment. DDPG successfully learned an efficient gain-control strategy for the adaptive PD controller, and its efficiency is demonstrated by comparison with the conventional PD controller.
As the demand for drillships and FPSOs to produce deep-sea resources has grown, more of these vessels have been built with a Dynamic Positioning System (DPS). A DPS maintains a vessel's position and heading angle. It is powered by azimuth thrusters, side thrusters, or a combination of the two, and it usually uses PD (Proportional, Derivative) control. PD control is usually adopted for DPS because of its versatility, standard structure, high reliability, and ease of operation. Moreover, PD control has the prominent advantage that a definite level of control performance can be obtained by choosing appropriate PD gains based on experience. PD controllers are therefore widely used in DPS. However, once the PD gains are set, they cannot be tuned in real time to adapt to changes in the system or the environment. It is difficult to obtain satisfactory control when PD control is applied to time-varying or time-lag environments, such as one with wind, waves, and current. To solve this problem, adaptive PD controller design has received wide attention. The common design idea is to adjust the PD gains to the varying system in order to obtain better control. Many adaptive PID/PD control methods have been proposed, such as the fuzzy adaptive PID control of He, Tan, Xu, and Wang (1993) and the genetic-algorithm-based adaptive PID control of Salami and Cain (1995). Fuzzy adaptive PID control design requires much prior knowledge and suffers from a parameter-optimization problem. Adaptive PID control based on the genetic algorithm requires little prior knowledge, but it cannot achieve real-time optimization because of its slow computing speed.
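The limitation of constant gains described above can be illustrated with a minimal sketch. It assumes a one-degree-of-freedom mass-damper vessel model held at the origin against a constant disturbance force; the model, all numerical values, and the names `PDController`, `simulate`, `fixed_gains`, and `scheduled_gains` are hypothetical and are not taken from this work. In particular, `scheduled_gains` is only a hand-written stand-in that shows where a learned gain policy would plug in; it is not the DDPG agent.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Signature of a gain rule: (time, error, error_rate) -> (kp, kd).
GainFn = Callable[[float, float, float], Tuple[float, float]]


@dataclass
class PDController:
    kp: float  # proportional gain
    kd: float  # derivative gain

    def force(self, error: float, error_rate: float) -> float:
        # PD control law: u = kp * e + kd * de/dt
        return self.kp * error + self.kd * error_rate


def simulate(gain_fn: GainFn, disturbance: float, steps: int = 2000,
             dt: float = 0.05, mass: float = 1.0, damping: float = 0.5,
             x0: float = 1.0) -> float:
    """Return the final position of a 1-DOF vessel held at the origin.

    Assumed toy model: mass * x'' = u + disturbance - damping * x'.
    """
    x, v = x0, 0.0
    ctrl = PDController(kp=0.0, kd=0.0)
    for k in range(steps):
        error, error_rate = -x, -v  # target position is 0
        # The gains are refreshed every step; a constant rule gives plain PD.
        ctrl.kp, ctrl.kd = gain_fn(k * dt, error, error_rate)
        u = ctrl.force(error, error_rate)
        a = (u + disturbance - damping * v) / mass
        v += a * dt  # semi-implicit Euler integration
        x += v * dt
    return x


def fixed_gains(t: float, e: float, de: float) -> Tuple[float, float]:
    # Conventional PD: gains chosen once and never changed.
    return 4.0, 2.0


def scheduled_gains(t: float, e: float, de: float) -> Tuple[float, float]:
    # Hypothetical adaptive rule: stiffen kp as the position error grows.
    return 4.0 + 4.0 * abs(e), 2.0


if __name__ == "__main__":
    print("fixed gains, offset:    ", simulate(fixed_gains, disturbance=1.0))
    print("scheduled gains, offset:", simulate(scheduled_gains, disturbance=1.0))
```

Under these assumptions the fixed-gain controller settles at an offset of disturbance / kp, while the rule that raises kp with the error settles closer to the target. Replacing `scheduled_gains` with a policy learned from interaction is the kind of adaptation an adaptive PD controller aims for.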