Publicly available oil and gas production data in Texas is widely regarded as an obstacle to evaluating individual well performance. Many engineers and geologists in the oil and gas industry disregard this data source, believing it to be dishonest or unreliable because of the lease allocation process and weak reporting regulations. In this paper, however, we present a methodology for responsibly exploiting this vast public data source for strategic purposes, using outlier identification, probabilistic forecasting tools, and Bayesian calibration to refine our analysis of multi-fractured horizontal well performance in the Midland Basin.
A significant amount of operator performance analysis is currently undertaken by the investment banking community and other third-party, non-operating companies. Our internal work to understand operator well performance in the Midland Basin across space and time is likely to offer a distinct perspective, because we have access to higher-quality data with which to better estimate and calibrate the public production data set. The work includes an analysis of data quality issues, such as lease production allocation, and of the heuristics developed for accepting data for type curve generation. A machine learning algorithm performs probabilistic decline curve analysis to provide an objective forecast evaluation of horizontal well production. We demonstrate partial calibration of the data set to higher-resolution daily data, and Bayesian updating of every forecast in the analysis as additional data are incorporated each month. Repeated tracking of forecast accuracy, together with vintaging to show the stability and predictability of the forecasts, increases confidence in the methodology and the data set.
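To make the idea of monthly Bayesian updating of a probabilistic decline forecast concrete, the following is a minimal sketch only. It is not the machine learning algorithm used in this work: it substitutes simple importance sampling over an Arps hyperbolic decline model, and the prior ranges, the lognormal error level (`sigma`), the 360-month horizon, and the synthetic 12-month history are all hypothetical values chosen for illustration.

```python
import numpy as np

def arps_rate(t, qi, di, b):
    """Arps hyperbolic decline: rate at time t (months) for initial rate qi,
    initial nominal decline di (1/month), and b-factor b (> 0)."""
    return qi / (1.0 + b * di * t) ** (1.0 / b)

def draw_prior(n, rng):
    """Draw n parameter sets from a deliberately broad, hypothetical prior."""
    qi = rng.lognormal(mean=np.log(600.0), sigma=0.6, size=n)  # bbl/d
    di = rng.uniform(0.02, 0.50, size=n)                        # 1/month
    b = rng.uniform(0.3, 1.5, size=n)
    return qi, di, b

def log_likelihood(t, q, params, sigma=0.2):
    """Log-likelihood of one monthly rate, assuming lognormal rate error."""
    resid = np.log(q) - np.log(arps_rate(t, *params))
    return -0.5 * (resid / sigma) ** 2

def normalize(log_w):
    """Convert unnormalized log-weights into weights that sum to one."""
    w = np.exp(log_w - log_w.max())
    return w / w.sum()

def weighted_quantile(values, weights, q):
    """Quantiles of `values` under the discrete distribution `weights`."""
    order = np.argsort(values)
    cdf = np.cumsum(weights[order])
    return np.interp(q, cdf, values[order])

def cumulative(params, months=360):
    """Approximate cumulative volume: sum of monthly rates times 30.4 days."""
    t = np.arange(months)[:, None]
    return 30.4 * arps_rate(t, *params).sum(axis=0)

rng = np.random.default_rng(0)
params = draw_prior(5000, rng)
eur = cumulative(params)  # one long-term recovery estimate per sample

# Synthetic, noise-free 12-month history from assumed "true" parameters.
t_obs = np.arange(1, 13)
q_obs = arps_rate(t_obs, 800.0, 0.15, 0.9)

log_w = np.zeros(eur.size)  # prior: every sample equally weighted
for t, q in zip(t_obs, q_obs):
    # Bayesian update: posterior weight is prior weight times likelihood.
    log_w += log_likelihood(t, q, params)
    w = normalize(log_w)
    low, mid, high = weighted_quantile(eur, w, [0.1, 0.5, 0.9])
    print(f"month {t:2d}: 10%={low:,.0f}  50%={mid:,.0f}  90%={high:,.0f} bbl")
```

In this sketch each new month of production multiplies the existing sample weights by that month's likelihood, so the forecast distribution tightens as history accumulates. The quantiles are reported as plain cumulative percentiles; they are not mapped to the industry P10/P90 convention, in which P10 denotes the high case.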
Our work reveals that recently drilled wells (c. 2015–2016) are forecast to recover significantly more reserves than early asset developments, nearly twice as much in some areas. Is this due to improved operator practices or to higher-quality resources? The ability to answer this question is where we find the true value of public data.