Journal of Process Control, Vol.79, 41-55, 2019
A heterogeneous benchmark dataset for data analytics: Multiphase flow facility case study
Improvements in sensing, connectivity and computing technologies mean that industrial processes now generate data from a variety of disparate sources. Data may take a number of forms, from time-domain signals, sampled at various rates using a variety of sensors, to alarm and event logs. Novel techniques need to be developed to tackle the challenges of heterogeneous data. Testing such algorithms requires benchmark datasets that allow direct comparison of the performance of the methods. This work presents the PRONTO heterogeneous benchmark dataset. Experiments were conducted on a multiphase flow facility under various operational conditions, with and without induced faults. Data were collected from heterogeneous sources, including process measurements, alarm records, and high-frequency ultrasonic flow and pressure measurements. The dataset is suitable for developing and validating algorithms for fault detection and diagnosis and for data fusion. Three algorithms are tested on the dataset to illustrate its applicability. © 2019 The Authors. Published by Elsevier Ltd.