This paper applies several supervised machine learning algorithms to build models that predict rate of penetration. The models are trained on real-time drilling parameters and geological log data from three distinct wells in the South Caspian basin. The techniques cover linear and non-linear machine learning methods as well as deep artificial neural networks. Root Mean Square Error is the evaluation metric during training, while R-squared is used to compare the performance of the regressions on the data.
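The two metrics named above can be sketched with their standard textbook definitions. This is a minimal illustration, not the paper's code: the function names `rmse` and `r_squared` are hypothetical, and the inputs are assumed to be equal-length sequences of measured and predicted ROP values.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot
```

RMSE is expressed in the units of ROP itself, which makes it a natural training loss, whereas R-squared is dimensionless and therefore easier to compare across wells with different ROP ranges.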

Rate of penetration (ROP) is the speed at which the drill bit penetrates the formation, that is, the rate at which the borehole deepens. Its value depends on drilling parameters such as weight on bit, applied torque, mud flow rate, rotations per minute, and others. The mechanical strength of the rock formation also plays a major role, and well log data is used to estimate this value at each point. That is why these features in the training datasets have high variability.

Of the techniques compared, Random Forest gives the best model in terms of accuracy and computational cost, with an average R-squared of 0.90. Although RNN and LSTM models can fit the given test data nearly as well, they take considerably longer to train because of their complexity and show relatively lower accuracy on test data, so they are not a reasonable choice. In addition, another deep learning model is deployed to generate well logs for the following sections, which supports optimizing ROP and drilling performance.
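As a rough illustration of the Random Forest workflow, the sketch below fits scikit-learn's `RandomForestRegressor` to synthetic data and scores it with R-squared. Everything specific here is an assumption made for the example: the feature ranges, the formula that generates the synthetic ROP, the hyperparameters, and the random train/test split. None of it comes from the paper's wells or reproduces its results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples = 500

# Synthetic stand-ins for drilling parameters and a log-derived rock strength.
# Ranges and units are illustrative assumptions, not field data.
weight_on_bit = rng.uniform(5.0, 25.0, n_samples)
rotary_speed = rng.uniform(60.0, 180.0, n_samples)
mud_flow_rate = rng.uniform(1500.0, 3000.0, n_samples)
rock_strength = rng.uniform(20.0, 120.0, n_samples)

# Made-up non-linear relationship plus noise, used only to have a target to fit.
rop = (
    0.02 * weight_on_bit * rotary_speed / np.sqrt(rock_strength)
    + 0.002 * mud_flow_rate
    + rng.normal(0.0, 0.5, n_samples)
)

features = np.column_stack(
    [weight_on_bit, rotary_speed, mud_flow_rate, rock_strength]
)
X_train, X_test, y_train, y_test = train_test_split(
    features, rop, test_size=0.25, random_state=0
)

# Hyperparameters are arbitrary defaults chosen for the sketch.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

r2 = r2_score(y_test, model.predict(X_test))
print(f"R-squared on held-out synthetic data: {r2:.3f}")
```

With real well data, holding out an entire well rather than random rows would give a more honest estimate of how the model generalizes to a new well.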
