Abstract
In this paper, we propose a statistical learning procedure that integrates process knowledge for the Dow data challenge problem presented in Braun et al. (2020). The task is to build an accurate inferential sensor model that predicts the impurity in the product stream in the presence of apparent drifts. The proposed method consists of i) exploratory analysis of the process data, ii) a method for variable selection, iii) a method for modeling non-negative physical properties using a softplus function, and iv) a method for online bias updating based on known data. We make use of process operation knowledge in all steps of the data analytics, including exploratory analysis and feature selection. We report the detection of equipment-switching operations in the data and of interpolations in the impurity data. Partial least squares (PLS) and least angle regression (LARS) are adopted to model the data, which exhibit strong collinearity. The pros and cons of LARS and PLS are discussed along with their practical implications.
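Steps iii) and iv) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear model whose output is passed through a softplus function so that the predicted impurity stays non-negative, and an exponentially filtered bias that is corrected whenever a known impurity measurement arrives. The names `predict` and `update_bias`, the filter factor `alpha`, and the choice to update the bias before the softplus are all assumptions made for illustration.

```python
import numpy as np

def softplus(z):
    """Numerically stable softplus, log(1 + exp(z)); never negative."""
    return np.logaddexp(0.0, z)

def inverse_softplus(y):
    """Inverse of softplus for y > 0, i.e. log(exp(y) - 1)."""
    y = np.asarray(y, dtype=float)
    return y + np.log(-np.expm1(-y))

def predict(X, w, b, bias=0.0):
    """Non-negative prediction: softplus of a linear model plus a bias term."""
    return softplus(X @ w + b + bias)

def update_bias(bias, x_k, y_k, w, b, alpha=0.5):
    """Hypothetical online bias update (an assumption, not the paper's rule).

    The known measurement y_k is mapped back through the inverse softplus,
    and the bias is moved towards the residual of the uncorrected model.
    """
    z_target = inverse_softplus(y_k)   # where the linear part should have been
    z_model = x_k @ w + b              # linear part without bias correction
    return (1.0 - alpha) * bias + alpha * (z_target - z_model)

# Toy usage with made-up numbers (not data from the paper).
w = np.array([0.8, -0.3])
b = -1.0
x_k = np.array([0.5, 2.0])
y_k = 0.9                              # a known impurity value, must be > 0

bias = 0.0
before = predict(x_k, w, b, bias)
bias = update_bias(bias, x_k, y_k, w, b, alpha=1.0)
after = predict(x_k, w, b, bias)
print(before, after)
```

With `alpha=1.0` the corrected prediction reproduces the latest known value exactly; smaller values of `alpha` trade responsiveness to drift against sensitivity to measurement noise.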
Original language | English |
---|---|
Article number | 107451 |
Journal | Computers and Chemical Engineering |
Volume | 153 |
Publication status | Published - Oct 2021 |
Externally published | Yes |
Funding
The first author acknowledges the financial support for this work from the City University of Hong Kong under Project 9380123: Bridging between Systems Theory and Dynamic Data Learning towards Industrial Intelligence and Industry 4.0, and from the NSF China Grant U20A201398, Big data-driven abnormal situation intelligent diagnosis and self-healing control for process industries.
Keywords
- Least angle regression
- Partial least squares
- Process knowledge
- Statistical machine learning
- Variable selection