This work presents the modeling and development of a methodology based on Model Predictive Control (MPC) that uses a Reinforcement Learning model to search for the optimal control policy and a neural network as a proxy model of the nonlinear plant. The neural network was developed to predict the average reservoir pressure and, for the production well, the daily production of oil, gas, and water and the water cut, each for three consecutive values, as required for predictive control. The methodology is applied as a strategy to control oil production in a reservoir with existing producer and injector wells. The experiments were carried out on a synthetic reservoir model consisting of three layers with different permeabilities, one producer well, and one injector well, both completed in all three layers. Three valves are located in the injector well, one for each completion; these are the manipulated variables of the model. The oil production of the producer well is the controlled variable. The experiments considered various set points as well as the impact of disturbances on the production well. The results indicate that the proposed model is capable of controlling oil production for different reference values, even with disturbances in the producing well, while coping with characteristic features of petroleum reservoir systems such as strong nonlinearity, long delay in the system response, and multivariable behavior.
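The receding-horizon structure described above can be illustrated with a minimal sketch. It is not the method of this work: the neural-network proxy is replaced by a hypothetical first-order toy model (`toy_proxy`), the Reinforcement Learning policy search is replaced by plain random search over valve openings (`choose_valves`), and all gains, rates, and set points are invented for illustration. Only the three-valve input, the single controlled oil rate, and the three-value prediction horizon are taken from the text.

```python
import numpy as np

HORIZON = 3    # three consecutive predicted values, as stated in the abstract
N_VALVES = 3   # one injector valve per completed layer


def toy_proxy(oil_rate, valves, horizon=HORIZON):
    """Hypothetical stand-in for the neural-network proxy (assumption).

    Predicts the oil rate for the next `horizon` steps while the valve
    openings (each in [0, 1]) are held constant.
    """
    gains = np.array([40.0, 25.0, 10.0])   # made-up per-layer gains
    target = float(gains @ valves)         # made-up steady-state oil rate
    preds = []
    rate = oil_rate
    for _ in range(horizon):
        rate = rate + 0.3 * (target - rate)  # first-order lag toward target
        preds.append(rate)
    return np.array(preds)


def choose_valves(oil_rate, set_point, rng, n_candidates=500):
    """Random-search stand-in for the policy search (the work uses RL).

    Samples candidate valve openings and keeps the one whose predicted
    trajectory has the smallest squared tracking error over the horizon.
    """
    candidates = rng.uniform(0.0, 1.0, size=(n_candidates, N_VALVES))
    costs = [np.sum((toy_proxy(oil_rate, v) - set_point) ** 2)
             for v in candidates]
    return candidates[int(np.argmin(costs))]


def run_loop(set_point=50.0, steps=30, seed=0):
    """Receding-horizon loop: plan over the horizon, apply one step, repeat."""
    rng = np.random.default_rng(seed)
    rate = 20.0                      # arbitrary initial oil rate
    history = []
    for _ in range(steps):
        valves = choose_valves(rate, set_point, rng)
        # In this toy the "plant" is the proxy itself, advanced one step.
        rate = toy_proxy(rate, valves, horizon=1)[0]
        history.append(rate)
    return np.array(history)
```

Calling `run_loop()` returns the simulated oil-rate trajectory for a set point of 50 (arbitrary units). In a realistic setting the plant would be a reservoir simulator distinct from the proxy, and the mismatch between the two, together with disturbances, is what the feedback loop has to absorb.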