This paper presents a state-of-the-art artificial intelligence algorithm, dual heuristic dynamic programming (DHP), and applies it to optimization and control problems in the petroleum industry. Fast self-learning control based on DHP is illustrated for level trajectory tracking on a quadruple tank system (QTS), which consists of four tanks and two electrical pumps with two pressure control valves. The DHP approach is built from two artificial neural networks: the critic network, which provides the critique (evaluation) signals, and the actor network, or controller, which provides the control signals. The DHP controller learns without human intervention through repeated interaction between the equipment and its environment (the process). In other words, the equipment receives the system states of the process via sensors, and the algorithm maximizes the reward by selecting the optimal action (control signal) to feed to the equipment. Simulation results are presented for DHP applied to the QTS as a benchmark test problem in MATLAB. The QTS is chosen for two reasons. First, it is widely used in most petroleum exploration and production fields, either as a complete system or as parts of one. Second, its model is difficult to control because it is stable only within a limited range of operating parameters. The multi-input multi-output (MIMO) model of the QTS is also similar to the models of most MIMO devices in the oil and gas field. The overall performance of the learning control system is tested in MATLAB and compared with heuristic dynamic programming (HDP) and with a well-known industrial controller, the proportional-integral-derivative (PID) controller. In simulation, DHP improves on the PID approach by 98.9002 %. Furthermore, DHP is faster than HDP: DHP needs 6 iterations, whereas HDP requires 652 iterations, to stabilize the system at minimum error.
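To make the benchmark concrete, the following is a minimal Python sketch of a quadruple tank model, not the MATLAB model used in this paper. It assumes the commonly cited textbook form of the process, in which each pump flow is split between a lower tank and the diagonally opposite upper tank by a valve fraction. All numerical parameters, the flow-splitting interpretation of the valves, and the function names `qts_derivatives` and `simulate` are illustrative assumptions.

```python
import math

# Illustrative parameters only (assumed textbook-style values, NOT from this paper).
G = 981.0                           # gravitational acceleration, cm/s^2
A = (28.0, 32.0, 28.0, 32.0)        # tank cross-sections, cm^2
a = (0.071, 0.057, 0.071, 0.057)    # outlet hole cross-sections, cm^2
K = (3.33, 3.35)                    # pump gains, cm^3/(V s)
GAMMA = (0.70, 0.60)                # assumed valve split fractions (0..1)


def qts_derivatives(h, v):
    """Return dh/dt (cm/s) for levels h = (h1, h2, h3, h4) in cm
    and pump voltages v = (v1, v2) in V."""
    # Gravity-driven outflow of each tank (Torricelli's law).
    out = [a_i * math.sqrt(2.0 * G * max(h_i, 0.0)) for a_i, h_i in zip(a, h)]
    # Lower tanks 1 and 2 receive the outflow of upper tanks 3 and 4.
    dh1 = (-out[0] + out[2] + GAMMA[0] * K[0] * v[0]) / A[0]
    dh2 = (-out[1] + out[3] + GAMMA[1] * K[1] * v[1]) / A[1]
    # Upper tanks receive the remaining share of the opposite pump.
    dh3 = (-out[2] + (1.0 - GAMMA[1]) * K[1] * v[1]) / A[2]
    dh4 = (-out[3] + (1.0 - GAMMA[0]) * K[0] * v[0]) / A[3]
    return (dh1, dh2, dh3, dh4)


def simulate(h0, v, dt=0.1, steps=20000):
    """Integrate the levels with forward Euler under constant pump voltages."""
    h = list(h0)
    for _ in range(steps):
        dh = qts_derivatives(h, v)
        # Levels cannot become negative.
        h = [max(h_i + dt * d, 0.0) for h_i, d in zip(h, dh)]
    return tuple(h)


if __name__ == "__main__":
    levels = simulate((5.0, 5.0, 1.0, 1.0), (3.0, 3.0))
    print("approximate steady-state levels (cm):", levels)
```

In a learning-control setting, a function such as `qts_derivatives` would play the role of the process: the controller supplies the two pump voltages and observes the four levels. The coupling between the two inputs and the two lower-tank levels is what makes the system a MIMO problem.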
Because most equipment in the oil and gas industry has a programmable logic controller (PLC), a neural network block is already available in the toolbox of the PLC program. Therefore, this project can be applied in practice by installing, on any equipment, a PLC with a DHP toolbox connected to the sensors and actuators. Initially, the DHP toolbox in the PLC learns by itself to build a suitable robust controller. The DHP controller is then used during normal operation; if a severe event occurs that the PID controller cannot handle, the DHP toolbox starts learning from scratch again to cope with the new situation.