The formation cementation factor, denoted by m in Archie's equation, is conventionally obtained from special core analysis, a process that is costly, labor-intensive, and time-consuming. As alternatives, statistical relationships and empirical correlations have been derived to estimate m. Such derivations are usually based only on porosity and resistivity logs, thereby underutilizing the possible contribution of other conventional logs, and they assume that the relationship between the logs and m is linear. Other methods, including those based on machine learning, have used a full suite of logs to estimate m. We postulate that, since not all logs are sensitive to variations in m, it is necessary to determine an optimal subset of logs that estimates m more accurately. This paper presents a comparative study of linear and nonlinear methods of feature selection. The study uses an integrated machine learning workflow that combines the nonlinear feature selection capability of functional networks (FNs) with the traditional feed-forward back-propagation algorithm of artificial neural networks (ANNs) to build an FN-ANN hybrid model. In this hybrid, the FN algorithm extracts the best subset of input logs based on their nonlinear mapping to m, using the least-squares fitting criterion, and the extracted feature subset is then used to predict m for uncored wells. Datasets collected from various publications were split into training and validation subsets: the training subset was used to build and optimize the models, and the validation subset was used to test them. The results of the proposed model were compared with those of multivariate linear regression (MLR), an ANN fed with all input variables, and an ANN fed with a subset of input variables selected based on linear correlation (LC-ANN). The results show that the FN-ANN hybrid model outperforms the other models.
When the models were evaluated on two test cases, the FN-ANN model had the lowest root mean square error. The results indicate that the relationship between the input logs and m is nonlinear, which supports the effectiveness of the nonlinear variable selection process based on the FN algorithm and suggests that m cannot be reliably estimated with linear approaches such as MLR and LC-ANN. In addition to improved accuracy, the FN-ANN hybrid model provides a simpler and more computationally compact model for the prediction of m. This approach can contribute to the petroleum industry's digital transformation agenda, as it addresses the computational challenges associated with big data and multidimensional feature spaces.
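
The contrast between linear and nonlinear feature selection can be illustrated with a minimal sketch. It is not the FN algorithm used in this work: a polynomial least-squares fit stands in for the nonlinear mapping, and the log names (`log_a`, `log_b`), the synthetic values, and the helper functions are hypothetical and chosen only for illustration. Each candidate log is scored by the root mean square error of a least-squares fit to m, first with a linear basis and then with a quadratic one.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def basis_rmse(x, y, degree):
    """RMSE of a least-squares polynomial fit of y on x (degree 1 is linear)."""
    k = degree + 1
    # Normal equations for the basis 1, x, ..., x**degree.
    A = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    b = [sum((xi ** i) * yi for xi, yi in zip(x, y)) for i in range(k)]
    coef = solve(A, b)
    resid = [yi - sum(c * xi ** p for p, c in enumerate(coef))
             for xi, yi in zip(x, y)]
    return math.sqrt(sum(r * r for r in resid) / len(y))

def rank_features(features, target, degree):
    """Order feature names from lowest to highest single-feature fit RMSE."""
    scores = {name: basis_rmse(col, target, degree)
              for name, col in features.items()}
    return sorted(scores, key=scores.get)

# Synthetic, hypothetical data: m depends on log_a only through log_a**2,
# so its linear correlation with log_a is essentially zero.
t = [(i - 10) / 10 for i in range(21)]
logs = {
    "log_a": t,
    "log_b": [ti ** 2 + 0.2 * math.sin(7 * i) for i, ti in enumerate(t)],
}
m = [1.8 + 0.5 * ti ** 2 for ti in t]

linear_rank = rank_features(logs, m, degree=1)
nonlinear_rank = rank_features(logs, m, degree=2)
print("linear ranking:   ", linear_rank)
print("nonlinear ranking:", nonlinear_rank)
```

In this toy setting the linear criterion ranks `log_a` last although it determines m exactly, whereas the quadratic criterion ranks it first. In a workflow of the kind described above, the selected subset would then be passed to the ANN.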
