Summary
Recent advances in predictive analytics and soft computing methods have enhanced the exploration of new hydrocarbon reserves, and machine learning (ML) has shown a promising role in oil and gas exploration in recent years. Among its applications, determining proper locations for injection and production wells, along with their optimum operating conditions, is a complex problem. This research aims to develop a unified process using surrogate proxy models to address this issue. Five robust ML models, (i) extreme gradient boosting (XGBoost), (ii) light gradient boosting machine (LightGBM), (iii) gradient boosting with categorical features support (CatBoost), (iv) support vector regression (SVR), and (v) multilayer perceptron (MLP), are implemented to create surrogate proxy models for estimating the net present value (NPV) of an oil reservoir. A systematic two-stage approach is used to find the best-fit hyperparameters for these models. First, a random cross-validation search refines a broad set of hyperparameters. Then, a grid cross-validation search investigates the narrowed space at finer intervals. Four reservoir scenarios are considered: (i) production from a single well in a homogeneous reservoir, (ii) production from a single well in a heterogeneous channelized reservoir, (iii) production from multiple wells in a heterogeneous reservoir, and (iv) waterflooding into a heterogeneous reservoir. A reservoir simulator is implemented to create a data set of reservoir realizations with various input parameters (i.e., well location, number of wells, production-injection well distance, and interwell angles) over a broad range of operating conditions. The predictions of the gradient boosting and MLP models showed a better fit to the simulated data, with an R-squared (R2) above 95% in the first three scenarios and 75% in the fourth scenario.
The results indicate that the implemented proxies are promising approaches for efficiently estimating the NPV of the reservoir models during both primary and secondary recovery scenarios.
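
The two-stage hyperparameter search described above (a random cross-validation search over a broad space, followed by a grid cross-validation search over the narrowed space) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the workflow used in this study: a toy distance-weighted k-nearest-neighbor regressor and synthetic one-dimensional data stand in for the five proxy models and the simulator output, so that the example runs with the Python standard library alone. All function names, the hyperparameters `k` and `power`, and their ranges are hypothetical.

```python
import itertools
import math
import random
import statistics


def knn_predict(train, x, k, power):
    """Distance-weighted k-nearest-neighbor prediction for a 1-D input (toy stand-in model)."""
    nearest = sorted(train, key=lambda pt: abs(pt[0] - x))[:k]
    weights = [1.0 / (abs(px - x) + 1e-9) ** power for px, _ in nearest]
    return sum(w * py for w, (_, py) in zip(weights, nearest)) / sum(weights)


def cv_score(data, params, n_folds=5):
    """Mean negative MSE over interleaved folds (higher is better)."""
    fold_scores = []
    for f in range(n_folds):
        test = data[f::n_folds]
        train = [pt for i, pt in enumerate(data) if i % n_folds != f]
        errs = [(knn_predict(train, x, params["k"], params["power"]) - y) ** 2
                for x, y in test]
        fold_scores.append(-statistics.fmean(errs))
    return statistics.fmean(fold_scores)


def random_search(data, bounds, n_iter, rng):
    """Stage 1: sample the broad space at random; return trials sorted best first."""
    trials = []
    for _ in range(n_iter):
        params = {"k": rng.randint(*bounds["k"]),
                  "power": rng.uniform(*bounds["power"])}
        trials.append((cv_score(data, params), params))
    trials.sort(key=lambda t: t[0], reverse=True)
    return trials


def grid_search(data, top_trials, n_power_steps=5):
    """Stage 2: exhaustive grid over the region spanned by the best stage-1 trials."""
    ks = [p["k"] for _, p in top_trials]
    powers = [p["power"] for _, p in top_trials]
    lo, hi = min(powers), max(powers)
    k_grid = range(min(ks), max(ks) + 1)
    power_grid = [lo + i * (hi - lo) / (n_power_steps - 1)
                  for i in range(n_power_steps)]
    candidates = [{"k": k, "power": p}
                  for k, p in itertools.product(k_grid, power_grid)]
    # Keep the stage-1 winner as a candidate so refinement can never score worse.
    candidates.append(top_trials[0][1])
    scored = [(cv_score(data, c), c) for c in candidates]
    return max(scored, key=lambda t: t[0])


# Synthetic data (assumption): noisy sine curve instead of simulated NPV values.
rng = random.Random(0)
data = [(x, math.sin(x) + rng.gauss(0, 0.1))
        for x in (rng.uniform(0, 6) for _ in range(60))]

broad_bounds = {"k": (1, 15), "power": (0.0, 4.0)}  # hypothetical ranges
coarse = random_search(data, broad_bounds, n_iter=30, rng=rng)
best_score, best_params = grid_search(data, coarse[:5])

print("best random-search score:", coarse[0][0])
print("best grid-search score:  ", best_score, best_params)
```

Keeping the best random-search candidate in the grid is a design choice of this sketch: it guarantees that the second stage returns a cross-validation score at least as good as the first.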