Hydraulic fracturing pumping data are recorded and mapped in the field at one-second intervals. Designating the stage start and end times is important because these boundaries govern summary calculations such as pressure, rate, and concentrations. Manual selection of staging flags is often time consuming and prone to inaccuracies because the industry lacks uniform selection and interpretation methods. The purpose of this study is to demonstrate an automated process that uses machine learning algorithms to identify accurate and consistent stage start and end times in a high-frequency treating plot.
This study is based on the analysis of metered high-frequency treatment data coupled with supervised machine learning algorithms. The pumping dataset includes treating pressure, slurry rate, and clean volume for 179 stages, for a total of 1,530,445 rows of data per variable. Sixty-six percent of the data were used to train the model, eight percent were used to validate the model, and the remaining twenty-six percent were used to test it. Subject matter expertise, taking into account user-defined start/end time flags, was used to train the algorithm.
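The 66/8/26 partition described above can be sketched as follows. This is a minimal illustration, not the study's procedure: it assumes the split is made by whole stages (the abstract reports percentages of the data, not the unit of splitting), and the function name `split_stages`, the stage numbering, and the random seed are hypothetical.

```python
import random

def split_stages(stage_ids, train_frac=0.66, val_frac=0.08, seed=0):
    """Shuffle stage IDs and split them into train/validation/test groups.

    Assumption: splitting by whole stage keeps all rows of one stage in a
    single partition. The source does not state how the split was made.
    """
    ids = list(stage_ids)
    random.Random(seed).shuffle(ids)  # seeded for repeatability
    n_train = round(train_frac * len(ids))
    n_val = round(val_frac * len(ids))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # remainder, roughly 26 percent
    return train, val, test

# 179 stages, numbered 1..179 here purely for illustration
train_ids, val_ids, test_ids = split_stages(range(1, 180))
print(len(train_ids), len(val_ids), len(test_ids))  # 118 14 47
```

Splitting by stage rather than by row avoids placing adjacent one-second samples from the same stage in both the training and test sets, which would tend to overstate test accuracy.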
Pumping data behave very differently from traditional time-series data such as weather, stock prices, or population growth. The features examined are driven not by time but by physical events, so the correlation or dependency between features can affect accurate pattern recognition. To allow the algorithm to run leaner, the dataset was pre-processed using loss functions, smoothing techniques, and the rate of change of the main data channels. To understand how the data may impair predictions and to compare model performance, we tested two classification algorithms: logistic regression and support vector machine.
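Two of the pre-processing steps named above, smoothing and the rate of change of a channel, might look like the sketch below. The abstract does not specify the smoothing technique or window, so the trailing moving average, the three-sample window, and the sample slurry-rate values are all assumptions made for illustration.

```python
def moving_average(values, window=5):
    """Trailing moving average.

    Early samples are averaged over the points available so far, so the
    output has the same length as the input.
    """
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the sample leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed

def rate_of_change(values, dt=1.0):
    """First difference divided by the sampling interval (one second here).

    The first sample has no predecessor, so its rate is set to 0.
    """
    return [0.0] + [(b - a) / dt for a, b in zip(values, values[1:])]

# Hypothetical slurry-rate trace (one sample per second) ramping up at stage start
slurry_rate = [0, 0, 0, 10, 20, 30, 30, 30]
smoothed_rate = moving_average(slurry_rate, window=3)
d_rate = rate_of_change(smoothed_rate)
```

Derived channels such as `d_rate` could then be supplied, alongside the raw pressure, rate, and volume channels, as input features to a classifier such as logistic regression or a support vector machine.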
Classification techniques were used to generate an accurate suggestion of where the pumping of a hydraulic fracturing stage starts and ends in a high-frequency treating plot. Results show that flag predictions have a training and validation accuracy of approximately 90 percent using logistic regression. The predicted flags were within 10 seconds of the manually selected flags. A limitation of this method is that it requires periodic retraining with new field data to improve prediction robustness and to maintain high accuracy.
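One way to check the 10-second agreement between predicted and manual flags is sketched below. It assumes the classifier emits one binary label per one-second sample (1 for pumping, 0 otherwise) and that the start and end flags are taken as the first and last samples labelled 1. The abstract does not describe how flags are derived from the classifier output, so these rules, the helper names, and the example values are hypothetical.

```python
def flags_from_labels(labels):
    """Return (start, end) sample indices of the pumping interval.

    Assumption: start is the first sample labelled 1 and end is the last.
    Returns None if no sample is labelled 1.
    """
    ones = [i for i, label in enumerate(labels) if label == 1]
    if not ones:
        return None
    return ones[0], ones[-1]

def within_tolerance(predicted_s, manual_s, tolerance_s=10):
    """True if the predicted flag is within tolerance_s seconds of the manual flag."""
    return abs(predicted_s - manual_s) <= tolerance_s

# Hypothetical per-second predictions: 5 s idle, 20 s pumping, 5 s idle
predicted_labels = [0] * 5 + [1] * 20 + [0] * 5
pred_start, pred_end = flags_from_labels(predicted_labels)

# Hypothetical manually selected flags, in seconds from the start of the record
manual_start, manual_end = 8, 30

start_ok = within_tolerance(pred_start, manual_start)
end_ok = within_tolerance(pred_end, manual_end)
```

With data sampled at one-second intervals, a sample index equals elapsed seconds, so the comparison needs no unit conversion.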
Accurate start and end time selections not only make it viable to process large volumes of fracture treatment data but also reduce the time spent reviewing field data for quality control. Petroleum engineers need to continue to focus on optimizing their systems for the greatest possible efficiency. Leveraging common analytical methods combined with the large, structured datasets that are readily available provides impressive results without extensive programming knowledge.