Variability in production performance between wells can be related to geological properties, or to differences in well and completion design.
Like many other plays, the Utica shale play exhibits a significant variability in petrophysical and fluid properties, primarily due to differential burial. These trends are the primary factors that explain the variability of performance of the wells across the structure.
The primary objective of this paper is to show that deconvolving this dominant geological trend from the measured well performance dataset leads to models with significantly higher prediction accuracy (Figure 6). Most of this gain comes from better estimation of the influence of the controllable factors, that is, the factors under the operator's direct control, such as completion design, stimulation protocol, and well management.
The second objective of this paper is to propose solutions to two of the challenging characteristics of the unconventional oil and gas datasets:
The limited number of wells (~600 in this study) compared to the high dimensionality of the problem (45 input variables considered)
The uncertainties associated with the measurement of individual well performance indicators, especially when we are confronted with well-to-well interference
Figure 1: Global workflow
Figure 1 presents a workflow that is common to most machine learning studies, and that also serves to illustrate the organization of this paper:
The first section of this paper will introduce and support the need for the decomposition of the measured individual well performance indicator into a geological trend component (called g) and a residue (e).
The second and third sections of this paper will focus on the challenges associated with machine learning workflows in the specific context of unconventional oil and gas datasets: feature selection/hyperparameter tuning (see glossary of terms), and analysis of the model performance.
The last section will present the way the models were used to identify the influence of multiple variables on well performance.
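As an illustration of the decomposition introduced in the first section, the sketch below splits a measured performance indicator into a trend component g and a residue e, so that measured = g + e. It is a minimal sketch under stated assumptions, not the method used in this paper: the trend is assumed to be a straight line in a single geological proxy (here called depth), the function names fit_linear_trend and decompose are hypothetical, and the numbers are synthetic.

```python
from statistics import mean


def fit_linear_trend(depth, perf):
    """Ordinary least squares fit of perf = intercept + slope * depth.

    Assumption: a straight line in one geological proxy stands in for
    the geological trend; the paper's actual trend model may differ.
    """
    d_bar, p_bar = mean(depth), mean(perf)
    sxx = sum((d - d_bar) ** 2 for d in depth)
    sxy = sum((d - d_bar) * (p - p_bar) for d, p in zip(depth, perf))
    slope = sxy / sxx
    intercept = p_bar - slope * d_bar
    return intercept, slope


def decompose(depth, perf):
    """Split each measured value into trend g and residue e = perf - g."""
    intercept, slope = fit_linear_trend(depth, perf)
    g = [intercept + slope * d for d in depth]
    e = [p - g_i for p, g_i in zip(perf, g)]
    return g, e


# Synthetic, illustrative values only (not data from the study).
depth = [1.0, 2.0, 3.0, 4.0, 5.0]
perf = [2.1, 3.9, 6.2, 7.8, 10.0]

g, e = decompose(depth, perf)
```

In such a setup, the residue e rather than the raw indicator would be the quantity related to the controllable factors, since the part of the variability explained by the trend has already been removed.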