A frequent problem throughout industry is missing or poor-quality data in data historians. While this can have many causes, the end result is that the data required for analyses aimed at improving facility operations may be unavailable. This generally causes delays and wastes valuable time, as the data analyst must manually "clean up" the data before using it; worse, using the data as is, without corrections, can lead to erroneous conclusions.

This work has developed a dynamic principal component analysis (DPCA) model-based method to detect the presence of erroneous data, identify which sensor is at fault, and reconstruct corrected values for that sensor to be stored in the historian. However, the DPCA model-based method is not appropriate for all sensors, so a second method for detecting errors in data from a single sensor and calculating corrected values has also been developed. Both methods work on streaming data and thus make corrections continuously in near real time. The DPCA model-based method has been successfully tested in the field by injecting errors such as missing values, bias, spikes, drift, and frozen readings into real streaming operating data from a Chevron facility. The single-sensor data cleansing method has not yet been field tested, but has been tested offline using operating data into which errors such as drift, spikes, frozen readings, and missing values were introduced. Use of these methods can ensure that good-quality data is available in the data historian for the analyses that need it, thereby saving analyst time and preventing erroneous conclusions from being drawn from faulty data.
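To illustrate the kind of pipeline the abstract describes, the sketch below shows a generic DPCA workflow: augment multivariate sensor data with time lags, fit a PCA model on normal operation, flag faults when the squared prediction error (SPE) exceeds a control limit, and reconstruct a faulty sensor's value by minimizing the SPE. This is a minimal illustration on synthetic data, not the authors' implementation; the percentile-based limit, the 95% variance cutoff, and the closed-form reconstruction are common DPCA practice assumed here for the example.

```python
import numpy as np

def lag_matrix(X, lags):
    # Stack each sample with its `lags` predecessors: the "dynamic" in DPCA.
    n, _ = X.shape
    return np.hstack([X[lags - k : n - k] for k in range(lags + 1)])

# Synthetic "normal operation" data: three correlated sensors plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 20, 500)
base = np.sin(t)
X_train = np.column_stack([base, 0.8 * base, base ** 2])
X_train += 0.01 * rng.standard_normal(X_train.shape)

lags = 2
Xa = lag_matrix(X_train, lags)
mu, sd = Xa.mean(axis=0), Xa.std(axis=0)
Z = (Xa - mu) / sd

# PCA via SVD; keep enough components to explain ~95% of the variance.
_, s, Vt = np.linalg.svd(Z, full_matrices=False)
var = s ** 2
k = int(np.searchsorted(np.cumsum(var) / var.sum(), 0.95)) + 1
P = Vt[:k].T  # loading matrix

def spe(z):
    # Squared prediction error: residual after projecting onto the PCA subspace.
    r = z - P @ (P.T @ z)
    return float(r @ r)

# Simple control limit from the training residuals (99.5th percentile).
limit = np.percentile([spe(z) for z in Z], 99.5)

def reconstruct(z, idx):
    # Closed-form value of coordinate idx that minimizes the SPE,
    # i.e. the reconstructed reading for a single faulty sensor.
    C = np.eye(P.shape[0]) - P @ P.T  # residual-space projector
    z = z.copy()
    z[idx] -= (C[idx] @ z) / C[idx, idx]
    return z

# Inject a spike into sensor 1 of the most recent sample and detect it.
X_test = X_train.copy()
X_test[-1, 1] += 5.0
z_bad = (lag_matrix(X_test, lags)[-1] - mu) / sd
# spe(z_bad) now exceeds `limit`; reconstruct(z_bad, 1) lowers it.
```

In a streaming deployment, each new sample would be lagged, scaled, and scored the same way, with the reconstructed value written back to the historian whenever the SPE limit is violated; fault isolation (which sensor to reconstruct) is typically done with per-variable residual contributions, omitted here for brevity.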
