Gaussian feature learning based on variational autoencoder for improving nonlinear process monitoring
Introduction
Modern industrial systems have grown steadily in scale and complexity. Timely detection of faults in these systems is critical to ensuring the safety of people and property and to improving product quality [1], [2], [3]. Traditional methods based on mechanistic process models are not always available, or the models may be difficult to construct, because of the high complexity of the system [4]. With the wide application of distributed control systems (DCS) and advances in computer technology, large amounts of industrial data are now collected and stored. As a result, data-driven monitoring methods have received a great deal of attention, especially those based on multivariate statistical process monitoring (MSPM). MSPM mines the correlations among the system variables in historical normal data, generates key feature sets, and then builds the monitoring model [5], [6], [7], [8], [9].
Among the many MSPM methods, principal component analysis (PCA) serves as a basic monitoring method [4], [10], [11]. PCA decomposes the data into two parts: a principal component space, in which the correlated variables are converted into a set of orthogonal variables (the principal components), and a residual space, which holds the remaining information. The principal components retain as much of the data variability as possible, and Hotelling's T² statistic is typically constructed in this space. The Q statistic, also known as the squared prediction error (SPE), is constructed in the residual space to compensate for the T² statistic's oversensitivity to inaccuracies. Because of its simplicity and effectiveness, PCA has been used in many industrial processes. However, conventional PCA-based monitoring methods assume that the process variables are linearly related and Gaussian distributed. In a real, complicated industrial process, the collected data have very complex characteristics: the variables are highly nonlinearly related and do not follow a Gaussian distribution. In this case, PCA cannot fully characterize the data and therefore monitors the process poorly.
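As a rough illustration of the two PCA statistics described above, the following Python sketch fits PCA on synthetic normal data and computes T² in the principal component space and SPE in the residual space for a new sample. It assumes NumPy is available. The function names (fit_pca, t2_and_spe), the synthetic data, and the choice of one retained component are illustrative assumptions and are not taken from the paper; control limits are omitted.

```python
import numpy as np


def fit_pca(X, n_components):
    """Fit PCA on normal operating data (rows are samples)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std                      # standardize each variable
    cov = np.cov(Z, rowvar=False)             # correlation matrix of the data
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return mean, std, eigvecs[:, :n_components], eigvals[:n_components]


def t2_and_spe(x, mean, std, P, lam):
    """Return (T2, SPE) for one sample x."""
    z = (x - mean) / std
    t = P.T @ z                               # scores in the principal space
    t2 = float(np.sum(t ** 2 / lam))          # Hotelling's T2
    residual = z - P @ t                      # part not explained by the model
    spe = float(residual @ residual)          # squared prediction error (Q)
    return t2, spe


# Synthetic normal data: three variables driven by one latent source (assumed).
rng = np.random.default_rng(0)
s = rng.normal(size=(500, 1))
X = np.hstack([s, 2 * s, -s]) + 0.1 * rng.normal(size=(500, 3))
model = fit_pca(X, n_components=1)

# A sample that respects the learned correlation, and one that breaks it.
t2_ok, spe_ok = t2_and_spe(np.array([1.0, 2.0, -1.0]), *model)
t2_bad, spe_bad = t2_and_spe(np.array([1.0, -2.0, -1.0]), *model)
```

In this toy setup the second sample violates the correlation structure, so its SPE is much larger than that of the first sample even though each variable stays within its normal range.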
To address the nonlinearity problem described above, several types of nonlinear methods have been proposed. The first representative class is based on early neural network (NN) approaches. For example, Kramer [12] proposed the autoassociative neural network as a nonlinear PCA method, Dong and McAvoy [13] combined principal curves with a neural network to tackle the nonlinear problem, and Geng and Zhu [14] proposed the NLPCA method, which is also based on a neural network. These traditional NN-based methods require the model to be developed offline and trained with an optimization method. At the time, limited computing power, small datasets, and the limitations of traditional neural network techniques caused interest in NN-based methods to decline. In contrast, kernel learning methods, especially those based on kernel PCA (KPCA), have attracted the attention of many researchers [15], [16], [17], [18], [19], [20]. By introducing a nonlinear kernel function, KPCA first maps the original data to a high-dimensional feature space and then applies the PCA algorithm in this space to build the process monitoring model. As in PCA, two monitoring statistics are constructed separately to monitor the main information part and the noise part. The KPCA model is easy to implement and can represent various types of nonlinearity, so it is used in many process monitoring applications. However, choosing a suitable kernel function is very difficult, and an inappropriate kernel function will not correctly reflect the characteristics of the process data [21], [22].
Another type of nonlinear method is the linear approximation approach (LAA) [23], [24]. In this method, several local linear models are used to approximate the entire nonlinear space. Once the local linear models are established, Bayesian inference is used to integrate their results and perform fault detection for the entire process. LAA is easier to implement than NN and KPCA, but it has clear disadvantages. First, the number of local models is difficult to determine; second, the local models may not fully reflect the nonlinearity of the data.
More recently, thanks to advances in computer technology, the development of deep neural network algorithms, and the growth of collected and stored data, a powerful family of machine learning algorithms called deep learning (DL) has achieved great success in many applications [25], [26], [27], [28], [29], [30]. Deep learning abstracts the original data through multiple levels of feature representation and uses the learned features to capture the intricate structure of the data. Several researchers have already applied deep learning to nonlinear process monitoring and achieved good results [31]. Yan et al. [32] proposed a nonlinear monitoring method based on variant autoencoders. Lv et al. [33] used stacked sparse autoencoders to perform nonlinear fault detection and diagnosis. Jiang et al. [34] proposed a semi-supervised fault classification method based on sparse autoencoders to tackle the dynamic nonlinear problem. Among the various DL-based nonlinear process monitoring methods, the autoencoder (AE) plays a central role. Its main advantages are:
- (1) Its model is trained entirely on process data and learns nonlinear features from the data automatically, which is very helpful for discovering intricate information inside high-dimensional nonlinear data.
- (2) It offers a straightforward way to build deeper neural networks by stacking.
- (3) It is suitable for mining big data: generally speaking, the more data available, the better the generalization ability of the model.
When performing fault detection, the T² statistic, also called H² [32], is constructed in the AE's feature space. However, the control limit of the T² statistic is estimated under the assumption that the features follow a Gaussian distribution. In an AE, because of the nonlinear transformation, it is difficult to ensure that the extracted features follow a Gaussian distribution. Therefore, the control limit of the T² statistic cannot reliably be determined from the known F distribution.
To address the problem stated above, this paper proposes a novel nonlinear process monitoring method based on the variational autoencoder (VAE) [35]. The VAE is a powerful generative model that has been used successfully in many applications [35], [36], [37], [38]. It is rooted in Bayesian inference and aims to model the underlying probability distribution of the data so that new samples can be generated from this distribution. A Gaussian distribution constraint is added to the hidden layer of the VAE, so the features it learns follow a Gaussian distribution regardless of the distribution of the raw data. As a result, the control limit of the newly constructed statistic H² in the feature space can be determined easily from a χ² distribution. In this paper, the VAE model is first trained on the normal dataset to extract key Gaussian features. Next, a new monitoring statistic H² is constructed from the Gaussian feature representations, with the corresponding control limit determined by a χ² distribution.
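The two ideas in this paragraph, a Gaussian constraint on the hidden layer and a χ²-based control limit, can be sketched in Python using only the standard library. The exact form of H² used in the paper is not shown in this excerpt, so the sketch assumes H² is the sum of squared features and that the features are independent with zero mean and unit variance, in which case H² follows a χ² distribution with as many degrees of freedom as there are features. The K-L term is the standard closed form for a diagonal Gaussian against a standard normal, which is commonly used in VAE training. All function names are hypothetical, and the Wilson-Hilferty formula only approximates the χ² quantile; an exact quantile function such as scipy.stats.chi2.ppf could be used instead.

```python
import math
from statistics import NormalDist


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def h2_statistic(features):
    """Assumed form of H2: sum of squared (standard normal) features."""
    return sum(f * f for f in features)


def chi2_control_limit(dof, alpha=0.01):
    """Approximate (1 - alpha) chi-square quantile (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * math.sqrt(c)) ** 3


def is_faulty(features, alpha=0.01):
    """Flag a sample whose H2 exceeds the chi-square control limit."""
    return h2_statistic(features) > chi2_control_limit(len(features), alpha)


# Hypothetical 5-dimensional feature vectors from an already trained encoder.
normal_features = [0.1, -0.3, 0.2, 0.0, -0.1]
fault_features = [3.0, 3.0, 3.0, 3.0, 3.0]
flags = (is_faulty(normal_features), is_faulty(fault_features))
```

The K-L term is zero only when every feature has zero mean and unit variance, which is why penalizing it during training pushes the learned features toward the distribution that the χ² limit presumes.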
The remainder of this paper is organized as follows. First, a brief review of the AE is given in Section 2. Section 3 details the proposed method for nonlinear process monitoring, including the development of the VAE model, the construction and control limit estimation of the H² statistic, and the overall VAE-based monitoring strategy. In Section 4, a nonlinear numerical system and the TE benchmark process are used to demonstrate the effectiveness of the proposed method. An in-depth discussion of the proposed method is then given in Section 5. Finally, conclusions are drawn.
Section snippets
Preliminaries
This section provides an overview of the basic autoencoder.
VAE-based nonlinear process monitoring
This section introduces the details of the proposed VAE-based process monitoring method. The original VAE technique is first described from the perspective of a generative model. Then, to illustrate the essential difference between the VAE used in this study and the original VAE, the VAE is reintroduced from the perspective of feature extraction. The construction of the monitoring statistic for the VAE is then given, followed by the estimation of its control limit. Finally, the
Case studies
In this section, two case studies are used to verify the monitoring performance of the proposed method. The first is a nonlinear numerical system originally suggested by Ge et al. [24]. The second is the Tennessee Eastman (TE) benchmark process, which has been widely used as an experimental platform for process monitoring.
Discussion
This section discusses the results of the simulation experiments for the proposed method in more detail. First, we verify that the features extracted by the VAE follow a Gaussian distribution. We then explore the relationships among the features themselves and between the features and the original variables. Finally, a preliminary study on VAE-based fault identification is given.
Conclusion
In this paper, a novel nonlinear process monitoring method based on the VAE is proposed to address the Gaussian assumption problem. Unlike the traditional AE, which cannot guarantee that its features follow a Gaussian distribution, the VAE makes the feature representations follow a Gaussian distribution by adding a K-L divergence term to the hidden-layer output. Therefore, in the proposed method, the VAE is adapted and trained on normal data to extract key Gaussian features. Based on these features, a new
Acknowledgement
This work was supported by the National Natural Science Foundation of China (51777122).
References (48)
Survey on data-driven industrial process monitoring and diagnosis. Annu. Rev. Control (2012)
et al. A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process. J. Process Control (2012)
Review on data-driven modeling and monitoring for plant-wide industrial processes. Chemom. Intell. Lab. Syst. (2017)
et al. Nonlinear principal component analysis-based on principal curves and neural networks. Comput. Chem. Eng. (1996)
et al. Nonlinear process monitoring using kernel principal component analysis. Chem. Eng. Sci. (2004)
et al. Improved kernel PCA-based monitoring approach for nonlinear processes. Chem. Eng. Sci. (2009)
et al. Weighted kernel principal component analysis based on probability density estimation and moving window and its application in nonlinear chemical process monitoring. Chemom. Intell. Lab. Syst. (2013)
et al. Non-linear generalization of principal component analysis: from a global to a local approach. J. Sound Vib. (2002)
et al. Nonlinear process monitoring based on linear subspace and Bayesian inference. J. Process Control (2010)
et al. A deep learning-based multi-model ensemble method for cancer prediction. Comput. Methods Programs Biomed. (2018)
Automated feature learning for nonlinear process monitoring – an approach using stacked denoising autoencoder and k-nearest neighbor rule. J. Process Control
Nonlinear and robust statistical process monitoring based on variant autoencoders. Chemom. Intell. Lab. Syst.
Semi-supervised fault classification based on dynamic sparse stacked auto-encoders model. Chemom. Intell. Lab. Syst.
Improved principal component monitoring using the local approach. Automatica
A plant-wide industrial process control problem. Comput. Chem. Eng.
Plant-wide control of the Tennessee Eastman problem. Comput. Chem. Eng.
Analysis and generalization of fault diagnosis methods for process monitoring. J. Process Control
Review of recent research on data-based process monitoring. Ind. Eng. Chem. Res.
Statistical process monitoring: basics and beyond. J. Chemom.
Data-driven Methods for Fault Detection and Diagnosis in Chemical Processes
A review on basic data-driven approaches for industrial process monitoring. IEEE Trans. Ind. Electron.
Data mining and analytics in the process industry: the role of machine learning. IEEE Access
Overview of process fault diagnosis
Unsupervised Process Monitoring and Fault Diagnosis with Machine Learning Methods
Gaussian and non-Gaussian double subspace statistical process monitoring based on principal component analysis and independent component analysis. Ind. Eng. Chem. Res.