Unconventional reservoirs such as shale oil and gas are expected to play a major role in many unexplored regions worldwide. Shale resource evaluation involves estimating Total Organic Carbon (TOC), which correlates with a formation's potential to generate and retain hydrocarbons. Direct measurement of TOC through geochemical analysis is often not feasible, so researchers have focused on indirect methods that estimate TOC using analytical and statistical techniques. Accordingly, this work proposes applying artificial intelligence (AI) techniques to routinely available well logs for the prediction of TOC. Multiple algorithms are developed and compared, and the best-performing solution is identified through an efficiency analysis.
Support Vector Regression (SVR), Random Forest (RF), and XGBoost algorithms are used to analyze the well-log data and develop intelligent models for shale TOC. A process-based approach is followed, starting with systematic data analysis, which includes selection of the most relevant input parameters, data cleaning, filtering, and data dressing, to ensure optimized inputs to the AI models. The data used in this work come from major shale basins in Asia and North America. The AI models are then used to develop a TOC predictor as a function of fundamental open-hole logs, including sonic, gamma-ray, resistivity, and density. Furthermore, to strengthen the AI input-output correlation mapping, a k-fold cross-validation methodology integrated with an exhaustive grid search is adopted. This ensures that optimized hyperparameters are selected for the intelligent algorithms developed in this work. Finally, the predictions of the developed models are compared with geochemically derived TOC using a comprehensive error analysis schema.
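As an illustration of the tuning step described above, the following minimal sketch combines k-fold cross-validation with an exhaustive grid search for a Random Forest regressor using scikit-learn. It is not the authors' implementation: the synthetic data, the response function, the 5-fold split, and the hyperparameter grid are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in for well-log data (NOT the data used in the study).
rng = np.random.default_rng(0)
n_samples = 200
# Columns play the role of sonic, gamma-ray, resistivity, density (scaled units).
X = rng.normal(size=(n_samples, 4))
# Hypothetical non-linear "TOC" response plus noise, for illustration only.
y = (
    2.0
    + 0.8 * X[:, 1]
    - 0.5 * X[:, 3]
    + 0.3 * X[:, 0] * X[:, 2]
    + rng.normal(scale=0.1, size=n_samples)
)

# Assumed hyperparameter grid; every combination is evaluated (exhaustive search).
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

# Assumed 5-fold split; each candidate is scored on every held-out fold.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=cv,
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)

best_params = search.best_params_
best_cv_rmse = -search.best_score_  # scorer is negated RMSE, so flip the sign
print(best_params, best_cv_rmse)
```

The same pattern would apply to SVR or XGBoost by swapping the estimator and its grid; the grid values here are placeholders and would need to be chosen for the actual data.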
The proposed models are tested for veracity by applying them to a blind dataset. An error metrics schema composed of root-mean-squared error and the coefficient of determination is developed. This analysis ranks the respective AI models by highest performance efficiency and lowest prediction error. It is concluded that the XGBoost- and SVR-based TOC predictions are inaccurate, yielding high deviations from the measured values in predictive mode. In contrast, the Random Forest TOC predictor optimized using k-fold validation produces high R² values of more than 0.85 and reasonably low errors when compared with true values. The RF method outperforms the other models by mapping complex non-linear interactions between TOC and the various well logs.
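The two error metrics named above can be sketched in a few lines. This is a plain-Python illustration using the standard definitions of root-mean-squared error and the coefficient of determination; the function names and the sample values are hypothetical and are not results from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-squared error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. predicted TOC values, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print("RMSE:", rmse(measured, predicted))
print("R2:", r2_score(measured, predicted))
```

Evaluated on a blind dataset, a lower RMSE and a higher R² would place a model higher in the ranking described above.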