Abstract
We propose a stabilization strategy that allows the lasso to use cross-validation (CV) for structure learning. Cross-validation is known to often prefer a very small λ, which selects an excessively large number of variables and also lies in a less stable region of λ. To address this, we reduce the heterogeneity of the model structures during the CV step. We first build a series of models on all data over a grid of λ values. The models for each CV fold then use a revised lasso objective that penalizes deviations from the structure of the corresponding all-data model. Further, we propose a stable selection criterion that uses CV prediction errors jointly with a stability measure to select the most stable model with near-minimum CV error. The proposed strategy is demonstrated on data from an industrial boiler process to predict NOx emissions.
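The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the stability measure or what counts as "near minimum", so everything concrete here is an assumption made for illustration: stability is taken as the mean Jaccard similarity between each fold's selected variables and the all-data selection, "near minimum" means within a relative tolerance `tol` of the smallest CV error, and the function names `jaccard`, `stability`, and `select_stable_lambda` are hypothetical. The revised lasso objective itself is not reproduced.

```python
from statistics import mean


def jaccard(a, b):
    """Jaccard similarity of two sets of variable indices (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def stability(full_support, fold_supports):
    """ASSUMED stability measure: mean agreement between each CV-fold
    support and the support of the model fitted on all data."""
    return mean(jaccard(full_support, s) for s in fold_supports)


def select_stable_lambda(lambdas, cv_errors, stabilities, tol=0.05):
    """Among lambdas whose CV error is within (1 + tol) of the minimum,
    return the most stable one. Ties go to the larger lambda (sparser model).
    The tolerance rule is an illustrative assumption."""
    best = min(cv_errors)
    candidates = [i for i, e in enumerate(cv_errors) if e <= (1 + tol) * best]
    i_star = max(candidates, key=lambda i: (stabilities[i], lambdas[i]))
    return lambdas[i_star]


# Toy, made-up numbers purely to exercise the functions (not results from the paper).
lambdas = [0.01, 0.1, 1.0]
cv_errors = [1.00, 1.03, 1.50]
full_supports = [{0, 1, 2, 3}, {0, 1}, {0}]
fold_supports = [
    [{0, 1, 2}, {0, 1, 3, 4}, {0, 1, 2, 3}],  # small lambda: folds disagree
    [{0, 1}, {0, 1}, {0, 1}],                 # moderate lambda: folds agree
    [{0}, {0}, {0}],                          # large lambda: stable but inaccurate
]
stabilities = [stability(f, s) for f, s in zip(full_supports, fold_supports)]
chosen = select_stable_lambda(lambdas, cv_errors, stabilities)
```

In this toy setting the smallest λ has the lowest CV error but an unstable structure across folds, so the rule prefers the moderate λ, whose error is nearly as low and whose structure is the same in every fold.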
Original language | English |
---|---|
Pages (from-to) | 228-233 |
Number of pages | 6 |
Journal | IFAC-PapersOnLine |
Volume | 54 |
Issue number | 7 |
Early online date | 15 Sept 2021 |
Publication status | Published - 2021 |
Externally published | Yes |
Event | 19th IFAC Symposium on System Identification (SYSID 2021), Padova, Italy. Duration: 13 Jul 2021 → 16 Jul 2021 |
Bibliographical note
Financial support for this work from the City University of Hong Kong under Project 9380123: Bridging between Systems Theory and Dynamic Data Learning towards Industrial Intelligence and Industry 4.0 and an NSF-China Regional Joint Key Project for Innovations and Development (U20A20189) is gratefully acknowledged.
Keywords
- Inferential sensors
- Stable cross-validation
- Stable lasso
- Statistical machine learning