Machine learning techniques are increasingly being used to extract data-driven insights for understanding and optimizing the performance of unconventional reservoirs. It is also generally recognized that multiple machine learning models can provide very similar fits to training or test data, although their performance with respect to future predictions or identification of variable importance can be quite different. To that end, the goal of this study is to promote the concept of ensemble learning for aggregating the results of multiple competing models to provide robust predictions and model understanding.

We compare three approaches for model aggregation. M1 uses a simple unweighted average of all model predictions. M2 uses a weighted average of the model predictions, with the weights based on an RMSE-based likelihood metric. In M3, starting with an ensemble of likely models that have been fitted to the data, we combine their predictions using a process called stacking. Stacking is a newer approach whereby a set of base models is used to predict the response of interest from the raw inputs, and their predictions are then used as predictors in a final model.
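The three aggregation schemes can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this study, and several choices in it are assumptions. The exact form of the RMSE-based likelihood is not given here, so the sketch assumes a Gaussian-type weight, exp(-0.5 (RMSE_i / RMSE_min)^2), normalized to sum to one. The final stacking model is assumed to be a linear regression with an intercept, stabilized by a tiny ridge term. The function names and the six-well toy values are hypothetical and purely synthetic.

```python
import math

def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def simple_average(preds):
    """M1: unweighted mean of the model predictions for each well."""
    return [sum(col) / len(col) for col in zip(*preds)]

def likelihood_weights(preds, obs):
    """M2 weights (assumed form): exp(-0.5 * (RMSE_i / RMSE_min)^2), normalized."""
    errors = [rmse(p, obs) for p in preds]
    scale = max(min(errors), 1e-12)  # guard against a zero-error model
    raw = [math.exp(-0.5 * (e / scale) ** 2) for e in errors]
    total = sum(raw)
    return [r / total for r in raw]

def weighted_average(preds, weights):
    """M2: weighted mean of the model predictions for each well."""
    return [sum(w * p for w, p in zip(weights, col)) for col in zip(*preds)]

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_stacker(preds, obs, ridge=1e-8):
    """M3 (assumed meta-model): least-squares fit of obs on base predictions."""
    rows = [[1.0] + list(col) for col in zip(*preds)]  # leading 1.0 = intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    for i in range(1, k):
        xtx[i][i] += ridge  # tiny ridge term; base predictions are often collinear
    xty = [sum(r[i] * o for r, o in zip(rows, obs)) for i in range(k)]
    return solve_linear(xtx, xty)

def predict_stacker(coef, preds):
    """Apply the fitted meta-model to a set of base-model predictions."""
    return [coef[0] + sum(b * p for b, p in zip(coef[1:], col))
            for col in zip(*preds)]

# Synthetic toy values (not field data): six wells, three base models.
obs = [100.0, 150.0, 200.0, 250.0, 300.0, 180.0]
preds = [
    [110.0, 140.0, 210.0, 240.0, 310.0, 170.0],
    [90.0, 160.0, 190.0, 270.0, 280.0, 200.0],
    [120.0, 170.0, 220.0, 270.0, 320.0, 200.0],
]

m1 = simple_average(preds)
weights = likelihood_weights(preds, obs)
m2 = weighted_average(preds, weights)
coef = fit_stacker(preds, obs)
m3 = predict_stacker(coef, preds)

for name, p in (("M1", m1), ("M2", m2), ("M3", m3)):
    print(name, "RMSE:", round(rmse(p, obs), 3))
```

In this sketch the weights and the stacking coefficients are fitted and scored on the same six wells, so the printed errors are in-sample and say nothing about predictive skill. In practice the final model in stacking is usually trained on out-of-fold predictions from the base models, and all three schemes are compared on held-out wells.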

The relative performance of these approaches is demonstrated using a production dataset from Wolfcamp shale oil wells, to which data-driven models are fitted to predict cumulative production over the first twelve months as a function of well attributes and completion parameters. The ensemble predictions from M1, M2 and M3 are compared with actual observations of first-year cumulative production. M3 is found to provide the best match with the observed values, followed by M2 and then M1. These aggregation strategies are also found to be more robust than individual model predictions.


There is growing interest in the application of data-driven models for describing reservoir behavior in unconventional shale oil and gas wells [1]. This has been largely motivated by the recognition that robust mechanistic models of flow from nano-pores through a network of natural and induced fractures into a multi-stage hydraulically fractured horizontal well are not only computationally challenging but also still under active development. In the interim, the number of machine learning applications for such problems as reservoir characterization, production analysis, and predictive maintenance has been increasing steadily. However, a random survey of recent machine learning related articles in the OnePetro database reveals that a majority of studies appear to prefer only one model building technique among a large portfolio of choices. For example, collectively across Refs. [2-11], a wide variety of techniques such as artificial neural networks, high performance random forest, linear logistic regression, support vector regression, multivariate adaptive regression splines, gradient boosting machine, deep learning and recurrent neural networks have been used, although only Refs. [5] and [9] used more than one technique in their studies. A similar observation was also made in Ref. [12] (see Table 1 therein). This is the case even though many models may end up with similar goodness-of-fit statistics and there is no a priori way to choose the best technique for the problem at hand, as demonstrated in our previous work [13, 14]. In particular, the value of using multiple models for obtaining a robust understanding of variable importance was shown in Ref. [14].
