Journal of Process Control

Volume 75, March 2019, Pages 136-155

Gaussian feature learning based on variational autoencoder for improving nonlinear process monitoring

https://doi.org/10.1016/j.jprocont.2019.01.008

Highlights

  • Gaussian feature learning based on variational autoencoder for improving nonlinear process monitoring.

  • VAE is used to automatically learn the patterns inherent in the nonlinear process and extract Gaussian features.

  • A new monitoring statistic, H2, is constructed, whose control limit can be easily determined by a χ2 distribution.

  • The effectiveness of the proposed method is verified by two case studies: a nonlinear numerical example and the TE benchmark process.

Abstract

Deep learning algorithms, especially autoencoders, have recently been applied to nonlinear process monitoring. However, the features extracted by autoencoders can hardly follow a Gaussian distribution; consequently, the control limit of the corresponding monitoring statistic cannot be determined from an F or χ2 distribution. Recent advances in unsupervised deep learning offer an opportunity to avoid this problem. In this paper, a novel nonlinear process monitoring method based on the variational autoencoder (VAE) is proposed to tackle the Gaussian assumption problem. Owing to the Gaussian distribution constraint imposed on the hidden layer of the VAE, it not only automatically learns the key features of the nonlinear system but also ensures that the learned features follow a Gaussian distribution. The Gaussian feature representations obtained from the VAE are then used to construct a new statistic H2, whose control limit can be easily determined by a χ2 distribution. A nonlinear numerical study and the TE benchmark process verify the effectiveness of the proposed method.

Introduction

Modern industrial systems have become increasingly large and complex. Timely detection of faults in these systems is critical to ensuring the safety of people and property and to improving product quality [1], [2], [3]. Traditional methods based on process mechanisms or first-principles models are not always available and may be difficult to construct due to the high complexity of the system [4]. With the wide application of distributed control systems (DCS) and advances in computer technology, a large amount of industrial data is collected and stored. As a result, alternative data-based monitoring methods have received a great deal of attention, especially methods based on multivariate statistical process monitoring (MSPM). MSPM mines the correlations among the system variables in historical normal data, generates key feature sets, and eventually builds the monitoring model [5], [6], [7], [8], [9].

Among the many MSPM methods, principal component analysis (PCA) serves as a basic monitoring method [4], [10], [11]. In general, PCA decomposes the data into two parts: one converts the correlated variables into a series of orthogonal variables (principal components), and the other stores the residual information (residual space). The principal components retain as much of the data variability as possible, and Hotelling's T2 statistic is typically constructed in this space. The Q statistic, also known as the squared prediction error (SPE), is computed in the residual space to compensate for the T2 statistic's sensitivity to modeling inaccuracies. Because of its simplicity and effectiveness, PCA has been used in many industrial processes. However, conventional PCA-based monitoring methods rest on the assumption that process variables are linearly correlated and Gaussian distributed. In a real complicated industrial process, the characteristics of the collected data are very complex, and the variables are highly nonlinear and do not satisfy the Gaussian distribution. In this case, PCA cannot fully characterize the data and thus yields poor monitoring performance.
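
To make the PCA-based monitoring procedure concrete, the sketch below fits a PCA model on normal data and computes the T2 and SPE statistics with their classical control limits. It is a minimal illustration: the function names, the fixed number of retained components, and the confidence level are assumptions, not settings from the paper.

```python
# A minimal sketch of PCA-based monitoring with the T2 and SPE (Q) statistics,
# assuming standardized data; function names, the retained-component count, and
# the confidence level are illustrative choices, not the paper's settings.
import numpy as np
from scipy import stats

def fit_pca_monitor(X_normal, n_components, alpha=0.99):
    """Fit a PCA monitoring model on normal operating data (rows = samples)."""
    mean, std = X_normal.mean(axis=0), X_normal.std(axis=0)
    Xs = (X_normal - mean) / std                      # standardize variables
    n = Xs.shape[0]
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xs, rowvar=False))
    order = np.argsort(eigvals)[::-1]                 # sort by descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    P, lam = eigvecs[:, :n_components], eigvals[:n_components]

    # T2 control limit from the F distribution (classical PCA monitoring result)
    f = stats.f.ppf(alpha, n_components, n - n_components)
    t2_lim = n_components * (n - 1) * (n + 1) / (n * (n - n_components)) * f

    # SPE (Q) control limit via the Jackson-Mudholkar approximation
    th1, th2, th3 = (np.sum(eigvals[n_components:] ** k) for k in (1, 2, 3))
    h0 = 1 - 2 * th1 * th3 / (3 * th2 ** 2)
    ca = stats.norm.ppf(alpha)
    q_lim = th1 * (ca * np.sqrt(2 * th2 * h0 ** 2) / th1
                   + 1 + th2 * h0 * (h0 - 1) / th1 ** 2) ** (1 / h0)
    return dict(mean=mean, std=std, P=P, lam=lam, t2_lim=t2_lim, q_lim=q_lim)

def monitor(model, x):
    """Return the (T2, SPE) pair for a new sample x."""
    xs = (x - model["mean"]) / model["std"]
    t = model["P"].T @ xs                             # scores in the principal subspace
    t2 = np.sum(t ** 2 / model["lam"])                # Hotelling's T2
    r = xs - model["P"] @ t                           # residual part
    return t2, r @ r                                  # SPE is the squared residual norm
```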

To address the nonlinear problem mentioned above, different types of nonlinear methods have been proposed. The first representative class of nonlinear methods is based on earlier neural network (NN) approaches. For example, Kramer [12] proposed the autoassociative neural network as a nonlinear PCA method, Dong and McAvoy [13] combined the principal curve and the neural network to tackle the nonlinear problem, and Geng and Zhu [14] proposed the NLPCA method, which is also based on the neural network. These traditional NN-based methods need to develop the model offline and train it with optimization methods. At that time, due to limited computing power, the small amount of available data, and the limitations of traditional neural network technology, fewer and fewer researchers paid attention to NN-based methods. In contrast, the kernel learning method, especially the method based on kernel PCA (KPCA), has attracted the attention of many researchers [15], [16], [17], [18], [19], [20]. By introducing a nonlinear kernel function, KPCA first maps the original data to a high-dimensional feature space and then applies the PCA algorithm to set up the process monitoring model in that space. Similar to PCA, two monitoring statistics are constructed separately for monitoring the principal information part and the noise part. The KPCA model is easy to implement and can represent various types of nonlinearity, so it has been used in many process monitoring applications. However, it is very difficult to choose a suitable kernel function, and an inappropriate kernel function will not correctly reflect the characteristics of the process data [21], [22]. Another type of nonlinear method is the linear approximation approach (LAA) [23], [24]. In this method, several local linear models are used to approximate the entire nonlinear space. Once the local linear models are established, Bayesian inference is used to integrate their results and perform fault detection for the entire process. LAA is easier to implement than NN and KPCA methods, but its disadvantages are obvious: first, the number of local models is difficult to determine; second, it may not be able to fully capture the data's nonlinearity.
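
As a rough illustration of the kernel mapping step described above, the following sketch extracts KPCA features with an RBF kernel. The kernel width `gamma` and the helper names are assumptions made for illustration; choosing `gamma` well is precisely the difficulty noted in the text.

```python
# A rough sketch of KPCA feature extraction with an RBF kernel; the kernel
# width `gamma` and the helper names are assumptions for illustration only.
import numpy as np

def rbf_kernel(A, B, gamma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    return np.exp(-gamma * d2)

def fit_kpca(X, n_components, gamma=0.1):
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H                                         # center the kernel in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    alphas = eigvecs[:, idx] / np.sqrt(eigvals[idx])       # scale the leading eigenvectors
    return dict(X=X, gamma=gamma, col_mean=K.mean(axis=0),
                all_mean=K.mean(), alphas=alphas)

def kpca_features(model, x_new):
    k = rbf_kernel(x_new[None, :], model["X"], model["gamma"]).ravel()
    kc = k - model["col_mean"] - k.mean() + model["all_mean"]   # center the test kernel vector
    return model["alphas"].T @ kc                               # nonlinear feature scores
```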

More recently, owing to advances in computer technology, the development of deep neural network algorithms, and the growth of collected and stored data, a powerful new machine learning paradigm called deep learning (DL) has achieved great success in many applications [25], [26], [27], [28], [29], [30]. Deep learning abstracts the original data into feature representations through multiple levels of representation and uses these features to mine the intricacies of the data. To date, some scholars have already applied deep learning to the field of nonlinear process monitoring and achieved good results [31]. Yan et al. [32] proposed a nonlinear monitoring method based on variant autoencoders. Lv et al. [33] used stacked sparse autoencoders to perform nonlinear fault detection and diagnosis. Jiang et al. [34] proposed a semi-supervised fault classification method based on sparse autoencoders to tackle the dynamic nonlinear problem. Among the various DL-based nonlinear process monitoring methods, the autoencoder (AE) plays a central role. Its main advantages are:

  • (1) Its model is trained entirely on process data and is able to learn nonlinear features from the data automatically, which is very helpful for discovering intricate information inside high-dimensional nonlinear data.

  • (2) It provides a natural way to build deeper neural network models by stacking multiple autoencoders.

  • (3) It is suitable for mining big data; generally speaking, the more data available, the better the generalization ability of the model.

When performing fault detection, the T2 statistic, also called H2 [32], is constructed in the AE's feature space. However, the control limit of the T2 statistic is estimated under the assumption that the features follow a Gaussian distribution. In an AE, because of the nonlinear transformations, it is difficult to ensure that the extracted features follow a Gaussian distribution. Therefore, the control limit of the T2 statistic cannot be determined from the known F distribution.
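
The sketch below shows how such a T2-type (H2) statistic would be computed in an AE's feature space. Here `encode` is a placeholder standing in for any trained encoder, an assumption rather than a function from the cited work, and the closing comment states why the F or χ2 control limit is problematic in this setting.

```python
# A minimal sketch of the H2 (T2-type) statistic in an autoencoder's feature
# space; `encode` is a placeholder for any trained encoder and is an assumption,
# not a specific function from the cited work.
import numpy as np

def fit_h2(encode, X_normal):
    """Estimate the feature mean and inverse covariance on normal data."""
    H = np.array([encode(x) for x in X_normal])
    mu = H.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(H, rowvar=False))
    return mu, S_inv

def h2_statistic(encode, x, mu, S_inv):
    h = encode(x) - mu
    return h @ S_inv @ h

# If the features were Gaussian, the control limit could be taken from an F (or
# chi-square) distribution; with a plain AE this assumption generally fails, so
# the limit has to be estimated empirically (e.g., by kernel density estimation).
```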

To address the problem stated above, this paper proposes a novel nonlinear process monitoring method based on the variational autoencoder (VAE) [35]. The VAE is a powerful generative model that has been successfully used in many applications [35], [36], [37], [38]. It originates from variational Bayesian inference and aims to model the underlying probability distribution of the data so that new samples can be generated from this distribution. A Gaussian distribution restriction is imposed on the hidden layer of the VAE so that the features learned by the VAE follow the Gaussian distribution. Therefore, regardless of the distribution of the raw data, the features extracted by the VAE follow the Gaussian distribution. Moreover, the control limit of the newly constructed statistic H2 in the feature space can be easily determined by a χ2 distribution. In this paper, the VAE model is first trained on the normal dataset to extract key Gaussian features. Next, the new monitoring statistic H2 is constructed based on the Gaussian feature representations, with the corresponding control limit determined by a χ2 distribution.
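
A minimal sketch of this idea is given below, written in PyTorch. The layer sizes, latent dimension, and confidence level are illustrative assumptions, not the architecture or training settings used in the paper; the sketch only shows how the KL term constrains the latent features and why a χ2 control limit then applies to H2.

```python
# A minimal sketch of the VAE-based monitoring idea, written in PyTorch; the
# layer sizes, latent dimension, and confidence level below are illustrative
# assumptions rather than the architecture and settings used in the paper.
import torch
import torch.nn as nn
from scipy import stats

class VAE(nn.Module):
    def __init__(self, n_inputs, n_hidden=20, n_latent=5):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_inputs, n_hidden), nn.ReLU())
        self.mu = nn.Linear(n_hidden, n_latent)        # Gaussian feature means
        self.logvar = nn.Linear(n_hidden, n_latent)    # Gaussian feature log-variances
        self.dec = nn.Sequential(nn.Linear(n_latent, n_hidden), nn.ReLU(),
                                 nn.Linear(n_hidden, n_inputs))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization trick
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    recon = ((x - x_hat) ** 2).sum(dim=1)              # reconstruction error
    # KL divergence pulling q(z|x) toward the standard normal prior N(0, I)
    kl = 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(dim=1)
    return (recon + kl).mean()

def h2_statistic(model, x):
    """H2 for a batch x: sum of squared latent means, since the prior is N(0, I)."""
    with torch.no_grad():
        _, mu, _ = model(x)
    return (mu ** 2).sum(dim=1)

# Because the latent features are regularized toward N(0, I), H2 behaves like a
# sum of squared standard-normal variables, so a chi-square control limit applies.
h2_limit = stats.chi2.ppf(0.99, df=5)                  # 99% limit for a 5-D latent space
```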

The remainder of this paper is organized as follows. First, a brief review of the AE is given in Section 2. Section 3 details the proposed method for nonlinear process monitoring, including the development of the VAE model, the construction and control limit estimation of the H2 statistic, and the entire VAE-based monitoring strategy. In Section 4, a nonlinear numerical system and the TE benchmark process are provided to demonstrate the effectiveness of the proposed method. Some in-depth discussion of the proposed method is then given in Section 5. Finally, some conclusions are drawn.

Section snippets

Preliminaries

This section provides an overview of the basic autoencoder.

VAE-based nonlinear process monitoring

This section introduces the details of the proposed VAE-based process monitoring method. The original VAE technique is first described from the perspective of a generative model. Then, in order to illustrate the essential difference between the VAE used in this study and the original VAE, the VAE is introduced from the perspective of feature extraction. The construction of the monitoring statistic for the VAE is then given, followed by the estimation of its control limit. Finally, the entire VAE-based monitoring strategy is presented.

Case studies

In this section, two case studies are employed to verify the monitoring performance of the proposed method. The first is a nonlinear numerical system originally suggested by Ge et al. [24]. The second is the TE benchmark process, which has been widely used as an experimental platform for process monitoring.

Discussion

This section discusses the results of the simulation experiments for the proposed method in more detail. First, the Gaussian distribution of the features extracted by the VAE is verified. We then explore the relationships among the features themselves and between the features and the original variables. Finally, a preliminary study on fault identification based on the VAE is given.

Conclusion

In this paper, a novel nonlinear process monitoring method based on the VAE is proposed to address the Gaussian assumption problem. Unlike the traditional AE, which cannot guarantee that the features satisfy a Gaussian distribution, the VAE makes the feature representations follow the Gaussian distribution by adding a K-L divergence term to the hidden layer output. Therefore, in the proposed method the VAE is adapted and trained on normal data to extract key Gaussian features. Based on these features, a new monitoring statistic H2 is constructed, whose control limit is determined by a χ2 distribution.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (51777122).

References (48)

  • Z. Zhang et al., Automated feature learning for nonlinear process monitoring – an approach using stacked denoising autoencoder and k-nearest neighbor rule, J. Process Control (2018)

  • W. Yan et al., Nonlinear and robust statistical process monitoring based on variant autoencoders, Chemom. Intell. Lab. Syst. (2016)

  • L. Jiang et al., Semi-supervised fault classification based on dynamic sparse stacked auto-encoders model, Chemom. Intell. Lab. Syst. (2017)

  • U. Kruger et al., Improved principal component monitoring using the local approach, Automatica (2007)

  • J.J. Downs et al., A plant-wide industrial process control problem, Comput. Chem. Eng. (1993)

  • P.R. Lyman et al., Plant-wide control of the Tennessee Eastman problem, Comput. Chem. Eng. (1995)

  • C.F. Alcala et al., Analysis and generalization of fault diagnosis methods for process monitoring, J. Process Control (2011)

  • Z. Ge et al., Review of recent research on data-based process monitoring, Ind. Eng. Chem. Res. (2013)

  • S. Joe Qin, Statistical process monitoring: basics and beyond, J. Chemom. (2003)

  • E.L. Russell et al., Data-driven Methods for Fault Detection and Diagnosis in Chemical Processes (2000)

  • S. Yin et al., A review on basic data-driven approaches for industrial process monitoring, IEEE Trans. Ind. Electron. (2014)

  • Z. Ge et al., Data mining and analytics in the process industry: the role of machine learning, IEEE Access (2017)

  • C. Aldrich et al., Overview of process fault diagnosis, Unsupervised Process Monitoring and Fault Diagnosis with Machine Learning Methods (2013)

  • J. Huang et al., Gaussian and non-Gaussian double subspace statistical process monitoring based on principal component analysis and independent component analysis, Ind. Eng. Chem. Res. (2015)