With the advent of increasingly large databases, growing computational resources, and recent algorithmic advances, data-driven and machine learning techniques are considered potential game changers in the traditional oil and gas industry. Unconventional oil and gas formations, including basin-centered gas/oil, shale gas/oil, tight gas/oil, and coalbed methane formations, are abundant; they have become an increasingly important part of the global energy supply and have attracted growing attention from the industry. In unconventional hydrocarbon development, high well placement density generates more data and creates the conditions for using data-driven methods to evaluate the effects of engineering parameters on well production, which cannot be easily captured by traditional simulation methods.
The objective of this study is to optimize completion parameters, which are essential for the development of the Montney Formation, using data mining and ensemble machine learning methodologies. First, data comprising more than 80 variables from the Wapiti-Montney tight gas formation in Canada were collected and used to determine the most important engineering parameters through a sensitivity test. In addition, time series analysis is used to identify the turning time at which stimulation-dominated effects disappear during the production period. Based on the sensitivity test and data mining results, multiple key parameters were identified and used as independent variables for the machine learning analysis, which included linear regression, support vector machines, neural networks, Gaussian regression, and other methods. The assumptions underlying each learning method are analyzed, benchmarked, and discussed in this paper. In addition, a stacking model that ensembles the three most accurate machine learning models is built to improve production forecast accuracy in the Montney Shale formation. During model training, several feature engineering methods are used to make it easier for the models to learn from the large dataset.
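To illustrate the idea of screening engineering parameters by sensitivity, the following minimal sketch ranks candidate variables by the absolute value of their Pearson correlation with production. This is only one possible sensitivity measure and is an assumption made here for illustration; the specific sensitivity test used in the study is not stated in this abstract. The column names, the synthetic well data, and the helper functions `pearson` and `rank_by_sensitivity` are all hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_by_sensitivity(columns, target):
    """Return (name, |r|) pairs sorted from most to least correlated with target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic wells (illustrative values only, not field data).
rng = random.Random(0)
n_wells = 200
columns = {
    "stimulated_length": [rng.uniform(1000, 3000) for _ in range(n_wells)],
    "proppant_per_length": [rng.uniform(0.5, 3.0) for _ in range(n_wells)],
    "unrelated_noise": [rng.uniform(0, 1) for _ in range(n_wells)],
}
# Hypothetical response: production driven by length and proppant, plus noise.
production = [
    0.002 * sl + 1.5 * ppl + rng.gauss(0, 0.5)
    for sl, ppl in zip(columns["stimulated_length"], columns["proppant_per_length"])
]

ranking = rank_by_sensitivity(columns, production)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A linear correlation screen of this kind misses nonlinear and interaction effects, so in practice it would likely be complemented by a model-based measure such as permutation importance.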
Based on the sensitivity analysis results, the following metrics are recognized as the most important and sensitive independent variables for production prediction in the Wapiti-Montney tight gas formation: Stimulated Length (SL), Total Stage Count (TSC), Pumped Proppant per Length (PPL), Pumped Fluid per Length (PFL), and Injection Rate (IR). The final ensemble model is established by stacking the three best individual machine learning algorithms of this study: random forest, XGBoost, and LightGBM. The prediction accuracy of the ensemble model reaches as high as 90%, which is considerably higher than that of the individual models before stacking.
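The stacking procedure can be sketched as follows. This is a minimal, dependency-free illustration and not the implementation used in the study: the base learners here are simple stand-ins (a linear model and two k-nearest-neighbour regressors) rather than random forest, XGBoost, and LightGBM, the meta-learner is assumed to be a linear combination fitted by least squares, and the data are synthetic. All function names are hypothetical. The key step is that the meta-learner is trained on out-of-fold predictions, so it never sees a base prediction made on data that the base model was trained on.

```python
import random


def solve_ridge(rows, targets, ridge=1e-6):
    """Least-squares weights via normal equations with a tiny ridge term."""
    p = len(rows[0])
    # Build the augmented matrix [A^T A + ridge*I | A^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def fit_linear(X, y):
    """Linear regression with an intercept; returns a predict function."""
    w = solve_ridge([[1.0] + list(x) for x in X], y)
    return lambda x: w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def fit_knn(k):
    """Factory for a k-nearest-neighbour regressor fitter."""
    def fit(X, y):
        def predict(x):
            nearest = sorted(
                (sum((a - b) ** 2 for a, b in zip(x, xi)), yi) for xi, yi in zip(X, y)
            )[:k]
            return sum(yi for _, yi in nearest) / len(nearest)
        return predict
    return fit


def fit_stack(base_fitters, X, y, n_folds=5):
    """Fit a stacked ensemble: out-of-fold base predictions train the meta-learner."""
    n = len(X)
    meta_rows = [[0.0] * len(base_fitters) for _ in range(n)]
    for fold in range(n_folds):
        held_out = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        for j, fitter in enumerate(base_fitters):
            model = fitter([X[i] for i in train], [y[i] for i in train])
            for i in held_out:
                meta_rows[i][j] = model(X[i])
    meta_weights = solve_ridge(meta_rows, y)
    # Refit every base learner on all training data for use at prediction time.
    final_models = [fitter(X, y) for fitter in base_fitters]
    return lambda x: sum(w * m(x) for w, m in zip(meta_weights, final_models))


def r_squared(model, X, y):
    """Coefficient of determination of a predict function on (X, y)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - model(xi)) ** 2 for xi, yi in zip(X, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data (illustrative only): two scaled features, nonlinear response.
rng = random.Random(42)
X = [[rng.random(), rng.random()] for _ in range(300)]
y = [2.0 * a + 4.0 * (b - 0.5) ** 2 + rng.gauss(0, 0.1) for a, b in X]
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]

base_fitters = [fit_linear, fit_knn(3), fit_knn(10)]
stacked = fit_stack(base_fitters, X_train, y_train)
print(f"stacked R^2 on held-out samples: {r_squared(stacked, X_test, y_test):.3f}")
for name, fitter in zip(["linear", "knn-3", "knn-10"], base_fitters):
    print(f"{name} R^2: {r_squared(fitter(X_train, y_train), X_test, y_test):.3f}")
```

Comparing the stacked score against each base learner on held-out samples, as in the last lines, mirrors the before-and-after-stacking comparison described above, although the numbers from this synthetic example say nothing about the field results.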
The application results were encouraging. Three Wapiti horizontal gas wells were optimized using the proposed data-driven workflow, and their cumulative production improved by 20% around the turning time point. This rapid evaluation approach based on an ensemble machine learning model can improve prediction accuracy and provide simple rules of engagement for well completion design optimization and decision-making throughout the development of the Montney tight formation in the Wapiti field.