Qualification of good and bad log data is essential for both single-well and multi-well interpretation and workflows. We use a combination of machine learning methods, each taking one or more input curves, to identify sections or samples of bad data.
Rule-based and statistical methods can be powerful for automatically identifying samples or sections with bad data, but they are not sufficient on their own. The human brain can recognize patterns in curves, or in relationships between curves, that are difficult and perhaps impossible to capture with rules alone. Machine learning can mimic these human strategies in a way that can potentially match and even exceed the human brain in accuracy, while retaining the efficiency of computer processing.
In this work we have focused on identifying badlogs in three types of well logs: bulk density, compressional slowness, and shear slowness. These logs were selected because they usually undergo detailed manual quality control over their full coverage when continuous logs are generated for geophysical studies. Our objective has been to let the machine learn from and mimic this manual quality control in order to assist petrophysicists in the process.
What counts as good-enough data may vary between companies, measurements, workflows, petrophysicists, and lithostratigraphic units. The result of quality control can also be applied in different ways: one can remove the sections with bad data, mark them with a BadHoleFlag, or flag the individual logs with a BadLogFlag while leaving the data untouched. We find the BadLogFlag far superior to the other options: it is non-destructive, easily modifiable, and transparent; it creates little extra data; and it lets one choose different strategies for handling the flagged data in the next step.
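As a minimal sketch of the non-destructive flagging idea, the example below adds a per-log boolean flag column to a well-log table instead of deleting samples. The column names (DEPT, RHOB, DTC) and the threshold limits are illustrative assumptions, not values from this work:

```python
import pandas as pd
import numpy as np

# Hypothetical well-log table; column names and values are assumptions.
logs = pd.DataFrame({
    "DEPT": [1000.0, 1000.5, 1001.0, 1001.5],   # depth, m
    "RHOB": [2.45, 1.10, 2.50, 2.48],           # bulk density, g/cm3
    "DTC":  [85.0, 84.5, 300.0, 86.0],          # compressional slowness, us/ft
})

# Non-destructive flagging: one flag per log, original values untouched.
# Limits here are illustrative plausibility ranges, not the paper's rules.
logs["RHOB_BadLogFlag"] = ~logs["RHOB"].between(1.5, 3.0)
logs["DTC_BadLogFlag"] = ~logs["DTC"].between(40.0, 240.0)

# Downstream workflows pick their own strategy, e.g. masking flagged samples:
rhob_clean = logs["RHOB"].mask(logs["RHOB_BadLogFlag"], np.nan)
```

Because the flags live alongside the data, they are easy to inspect, revise, or ignore, which is what makes this strategy non-destructive and transparent.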
Our badlog flagging solution combines several unsupervised machine learning methods, each addressing a different reason why a sample may be anomalous. The methods fall into two groups: heuristic methods and anomaly detection methods. The former reproduce what a human interpreter would do by applying thresholds to combinations of curves. The latter apply several machine learning outlier detection techniques to different combinations of input curves. The results are then combined through a voting process, where each method contributes a vote scoring how anomalous a sample is, and the accumulated votes determine whether the sample is flagged.
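The voting scheme can be sketched as follows. This is an illustrative implementation under assumed ingredients: a single heuristic threshold rule stands in for the heuristic methods, and scikit-learn's IsolationForest and LocalOutlierFactor stand in for the outlier detection techniques; the actual methods, curves, and thresholds used in this work are not specified here.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
# Synthetic (density, slowness) samples with a few injected bad samples.
X = rng.normal(loc=[2.4, 85.0], scale=[0.05, 3.0], size=(200, 2))
X[:5] = rng.normal(loc=[1.0, 300.0], scale=[0.05, 3.0], size=(5, 2))

# Heuristic vote: a threshold rule on one curve (limits are illustrative).
heuristic_vote = ((X[:, 0] < 1.5) | (X[:, 0] > 3.0)).astype(int)

# Anomaly-detection votes: each detector predicts -1 for outliers.
iso_vote = (IsolationForest(random_state=0).fit_predict(X) == -1).astype(int)
lof_vote = (LocalOutlierFactor(n_neighbors=20).fit_predict(X) == -1).astype(int)

# Voting: a sample is flagged when a majority of the methods agree.
votes = heuristic_vote + iso_vote + lof_vote
badlog_flag = votes >= 2
```

Requiring agreement between independent methods makes the combined flag more robust than any single detector: a sample that only one method finds suspicious is not flagged.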