A random forests rate of penetration (ROP) model, together with heat maps, was used to challenge and optimize the drilling parameters for new wells, based on surface drilling data acquired from previous wells. The goal was to identify trends in the surface drilling parameters that increase bit life and reduce bit wear, thereby maximizing ROP and minimizing mechanical specific energy (MSE).

The four key variables investigated were weight on bit (WOB), surface RPM, mud flow rate and the formation being drilled. Surface drilling data were taken from wells within a 20-mile radius in which the same bit and motor drilled the entire vertical interval to total depth (TD). Heat maps and ROP models (built using support vector regression, random forests and boosted trees) were used for the analysis. The data were cleaned by applying cutoffs (based on the minimum and maximum values expected by the drilling engineer) and by plotting data distributions. K-fold cross-validation was applied when generating the ROP models. The study focused on optimizing drilling parameters using surface data only, because subsurface data were not available. With the methodology developed, drilling parameters can be optimized to extend bit life and reduce bit trips by maximizing ROP and minimizing MSE.
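The workflow described above (cutoff-based cleaning, three candidate regressors, k-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration assuming scikit-learn and pandas, not the authors' implementation: the data are synthetic, and the column names, cutoff values, hyperparameters, integer-coded formation label and the use of mean absolute percentage error as the metric are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for surface drilling data (NOT data from the study).
rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "wob_klbf": rng.uniform(0, 60, n),
    "rpm": rng.uniform(20, 140, n),
    "flow_gpm": rng.uniform(300, 900, n),
    "formation": rng.integers(0, 3, n),  # hypothetical integer-coded formation
})
# Made-up response: ROP rises with WOB and then flattens; lower in deeper formations.
df["rop_ft_hr"] = (
    (150 - 40 * df["formation"]) * (1 - np.exp(-df["wob_klbf"] / 20))
    + 0.3 * df["rpm"]
    + 0.02 * df["flow_gpm"]
    + rng.normal(0, 5, n)
)

# Hypothetical min/max values of the kind a drilling engineer might supply.
CUTOFFS = {
    "wob_klbf": (5, 55),
    "rpm": (30, 130),
    "flow_gpm": (350, 850),
    "rop_ft_hr": (1, 400),
}

def apply_cutoffs(frame, cutoffs):
    """Keep only rows whose values fall inside every [low, high] cutoff."""
    mask = pd.Series(True, index=frame.index)
    for col, (low, high) in cutoffs.items():
        mask &= frame[col].between(low, high)
    return frame[mask]

clean = apply_cutoffs(df, CUTOFFS)
X = clean[["wob_klbf", "rpm", "flow_gpm", "formation"]]
y = clean["rop_ft_hr"]

# The three model families named in the text; settings are illustrative only.
models = {
    "svr": make_pipeline(StandardScaler(), SVR(C=100.0)),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "boosted_trees": GradientBoostingRegressor(random_state=0),
}

# 5-fold cross-validation: every row is predicted by a model that never saw it.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
mape = {}
for name, model in models.items():
    pred = cross_val_predict(model, X, y, cv=cv)
    mape[name] = float(np.mean(np.abs(pred - y) / y))

for name, err in sorted(mape.items(), key=lambda kv: kv[1]):
    print(f"{name}: {err:.1%} mean absolute percentage error")
```

Treating formation as an integer code is a simplification that suits tree-based models; for support vector regression, one-hot encoding the formation label would usually be the safer choice.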

The random forests ROP model performed best, with a mean absolute error of 12%. The error could have been reduced further by adding variables that capture changes in formation mechanical properties, downhole parameters and vibrations; this paper, however, focuses only on learnings from surface drilling data. Beyond a certain threshold (which differed among the formations encountered), an increase in WOB did not produce a corresponding increase in ROP. Moreover, most of the ROP gains were observed in the shallower formations. In the deeper formations, it was more beneficial to reduce MSE, as ROP remained relatively low regardless of the parameters used.
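To make the WOB threshold and MSE observations concrete, the sketch below computes MSE using Teale's widely used definition in field units. The abstract does not state which MSE formulation the authors used, so this choice is an assumption, as are the function name, the surface torque input (not one of the four variables listed above) and all numeric values.

```python
import math

def mse_psi(wob_lbf, rpm, torque_ft_lbf, rop_ft_hr, bit_diameter_in):
    """Mechanical specific energy (psi) per Teale's definition, field units.

    MSE = WOB / A + (120 * pi * RPM * T) / (A * ROP)
    with A the bit area in square inches.
    """
    area_in2 = math.pi * bit_diameter_in ** 2 / 4
    thrust_term = wob_lbf / area_in2
    rotary_term = 120 * math.pi * rpm * torque_ft_lbf / (area_in2 * rop_ft_hr)
    return thrust_term + rotary_term

# Hypothetical operating points for an 8.75 in bit (illustrative values only).
before = mse_psi(wob_lbf=30_000, rpm=100, torque_ft_lbf=5_000,
                 rop_ft_hr=80, bit_diameter_in=8.75)
# More WOB and torque past the threshold, but no gain in ROP.
after = mse_psi(wob_lbf=40_000, rpm=100, torque_ft_lbf=6_500,
                rop_ft_hr=80, bit_diameter_in=8.75)

print(f"MSE before: {before:,.0f} psi")
print(f"MSE after:  {after:,.0f} psi")
```

When added WOB and torque yield no extra ROP, both terms grow and MSE rises, which is consistent with the finding that beyond the threshold the extra energy input is not converted into drilling progress.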

This study used random forests, support vector regression and boosted trees, rather than neural networks, to generate the ROP models. Although neural networks are the most extensively used, random forests are generally faster and were the most accurate of the three methods used here. The lower time and computational requirements compared with neural networks made random forests an attractive option for such a study.
