A fundamental component of a real-time drilling analytics system is automatic rig state detection. High-frequency time-series data (typically one data point per second) from multiple sensors on a drilling rig are processed and labeled with drilling states, including slide drilling, rotate drilling, pick up, in slips, and others. With labeled time-series data, the real-time system can derive operational key performance indicators (KPIs) at very high temporal resolution, e.g., a statistical summary of rotary versus slide drilling time that the rig supervisor and drilling engineer can use to analyze efficiency. Later, such information can be leveraged to develop algorithms that detect abnormal drilling events and drive closed-loop control.
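As a minimal sketch of how such a KPI could be derived from labeled data, the snippet below summarizes the time spent in each state from a sequence of per-sample labels. The function name `state_time_summary`, the one-second sample period default, and the example labels are illustrative assumptions, not part of the deployed system.

```python
from collections import Counter


def state_time_summary(labels, sample_period_s=1.0):
    """Return seconds and fraction of total time spent in each labeled rig state.

    Assumes one label per sample and a fixed sample period (hypothetical default: 1 s).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        state: {"seconds": n * sample_period_s, "fraction": n / total}
        for state, n in counts.items()
    }


# Hypothetical ten seconds of labeled data.
labels = ["rotate drilling"] * 6 + ["slide drilling"] * 3 + ["in slips"] * 1
summary = state_time_summary(labels)

for state, stats in summary.items():
    print(f"{state}: {stats['seconds']:.0f} s ({stats['fraction']:.0%})")
```

Because the summary is computed directly from per-sample labels, the same function could be applied to any time window (a stand, a shift, a whole well) without changing the logic.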
A workflow was developed to clean the data and fill in missing values. A rules-based model was then applied to classify the data into seventeen rig states. For the state “drilling”, a sub-classification was made to distinguish rotate drilling from slide drilling. However, it is difficult to identify “slide drilling” based solely on surface RPM, because top drive oscillation can produce nonzero surface RPM during sliding. To achieve acceptable accuracy, three machine learning models for classifying “rotate drilling” and “slide drilling” were evaluated: Random Forest, Convolutional Neural Network (CNN), and a hybrid Convolutional Neural Network / Recurrent Neural Network (CNN/RNN).
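The difficulty with a surface-RPM rule can be illustrated with a small sketch. The gap-filling method (forward fill), the threshold of 5 RPM, and the sample values are all assumptions made for illustration; the abstract does not specify how missing data were filled or which rules the legacy classifier used.

```python
def forward_fill(values):
    """Replace None entries with the most recent valid reading (leading gaps stay None)."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def naive_drilling_mode(surface_rpm, rpm_threshold=5.0):
    """Label one sample from surface RPM alone; the threshold is a hypothetical value."""
    if surface_rpm is not None and surface_rpm > rpm_threshold:
        return "rotate drilling"
    return "slide drilling"


# Hypothetical 1 Hz surface RPM during a slide with top drive oscillation;
# None marks a dropped sample.
surface_rpm = [0.0, 0.0, None, 12.0, 0.0, 11.0, 0.0, 0.0]

filled = forward_fill(surface_rpm)
labels = [naive_drilling_mode(rpm) for rpm in filled]
rotate_count = labels.count("rotate drilling")

print(filled)
print(labels)
```

In this made-up slide interval, the per-sample threshold labels the two oscillation peaks as rotate drilling, which is the kind of misclassification that motivates models that look at a window of samples rather than one value at a time.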
Machine learning models were built for two basins, one model per basin, to accommodate different drilling styles. For the Delaware Basin, 10 wells with 9 million rows of data were chosen, and for the DJ Basin, 12 wells with 2 million rows of data were chosen. A legacy rules-based algorithm was applied to label each row as rotate or slide drilling, and misclassified records were then corrected manually. The machine learning models proved far superior to the rules-based models: for the wells in the training set, the rules-based models achieved accuracies of 70% and 90% for the two basins, respectively, while the machine learning models exceeded 99%. The CNN proved to be the best model, combining high accuracy, short computation time, and scalability to big data applications.
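For reference, accuracy here can be read as the fraction of rows whose predicted label matches the manually corrected label. The sketch below shows that calculation on made-up labels; the function name and the example data are assumptions and do not reproduce the reported results.

```python
def accuracy(predicted, reference):
    """Fraction of samples where the predicted label equals the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on empty input")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical manually corrected labels and rules-based predictions.
reference = ["rotate", "rotate", "slide", "slide", "slide"]
rules_based = ["rotate", "rotate", "rotate", "slide", "rotate"]

rules_accuracy = accuracy(rules_based, reference)
print(f"rules-based accuracy: {rules_accuracy:.0%}")
```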
The data cleaning, preprocessing, and machine learning algorithms have been deployed in Anadarko's Real-Time Drilling (RTD) ecosystem (Cao et al., 2018, 2019), which consists of four layers: data source, analytics, data storage, and UI. KPIs, directional statistics, and engineering models are calculated in real time and visualized through a web-based UI. The system can be accessed by any member of the drilling operations team and is regularly used to evaluate, compare, and optimize well performance. Future plans include pushing analytical models to the rig site with edge computing to facilitate drilling guidance and higher levels of automation. To our knowledge, this is the first time that a deep learning model has been used to analyze drilling time-series data in a production real-time system.