We propose a stabilization strategy that makes cross-validation (CV) usable for structure learning with the lasso. CV is known to often prefer a very small λ, which selects an excessively large number of variables and lies in a less stable region of λ. To counter this, we reduce the heterogeneity of the model structures during the CV step. We first fit a series of models on all data over a grid of λ values. The models for the individual CV folds are then fitted with a revised lasso objective that penalizes deviations from the model structure obtained with all data. Further, we propose a stable selection criterion that combines CV prediction errors with a stability measure to select the most stable model whose CV error is near the minimum. The strategy is demonstrated on data from an industrial boiler process, where the task is to predict NOx emissions.
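The three steps described above can be illustrated with a small sketch. This is not the authors' formulation: the abstract does not state the revised objective, the stability measure, or the near-minimum threshold, so the sketch makes its own assumptions. The revised objective is taken to be a weighted lasso in which variables outside the all-data support are penalized `gamma` times more, stability is taken to be the mean Jaccard similarity between fold supports and the all-data support, and "near minimum" is taken to mean within a relative tolerance `tol` of the smallest CV error. The names `weighted_lasso`, `jaccard`, `stable_cv_lasso`, `gamma`, and `tol` are hypothetical, and the data are assumed to be centered (no intercept is fitted).

```python
import numpy as np

def weighted_lasso(X, y, lam, weights, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j weights[j]*|b_j|."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = np.asarray(y, dtype=float).copy()  # residual y - X b (b starts at zero)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            # Soft-thresholding with a per-variable threshold.
            new = np.sign(rho) * max(abs(rho) - lam * weights[j], 0.0) / col_sq[j]
            r += X[:, j] * (b[j] - new)
            b[j] = new
    return b

def jaccard(a, b):
    """Jaccard similarity of two boolean support vectors."""
    union = np.logical_or(a, b).sum()
    return 1.0 if union == 0 else np.logical_and(a, b).sum() / union

def stable_cv_lasso(X, y, lambdas, n_folds=5, gamma=2.0, tol=0.05, seed=0):
    n, p = X.shape
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    cv_err, stability = [], []
    for lam in lambdas:
        # Step 1: reference structure from all data (plain lasso, unit weights).
        ref_support = weighted_lasso(X, y, lam, np.ones(p)) != 0
        # Step 2 (assumed form): leaving the reference support costs gamma times more.
        w = np.where(ref_support, 1.0, gamma)
        errs, sims = [], []
        for test in folds:
            train = np.setdiff1d(np.arange(n), test)
            b = weighted_lasso(X[train], y[train], lam, w)
            errs.append(np.mean((y[test] - X[test] @ b) ** 2))
            sims.append(jaccard(b != 0, ref_support))
        cv_err.append(np.mean(errs))
        stability.append(np.mean(sims))
    cv_err, stability = np.array(cv_err), np.array(stability)
    # Step 3 (assumed form): most stable lambda among those near the minimum CV error.
    near_min = cv_err <= (1 + tol) * cv_err.min()
    best = np.flatnonzero(near_min)[np.argmax(stability[near_min])]
    return lambdas[best], cv_err, stability

# Usage on synthetic data (illustrative only, not the boiler data set).
rng = np.random.default_rng(1)
X = rng.standard_normal((120, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.5 * rng.standard_normal(120)
lambdas = np.logspace(-3, 0, 10)
best_lam, cv_err, stability = stable_cv_lasso(X, y, lambdas)
```

With `gamma = 1` the fold models reduce to ordinary lasso fits, so the weight factor is the single knob that controls how strongly the folds are pulled toward the all-data structure in this sketch.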
Number of pages: 6
Early online date: 15 Sept 2021
Publication status: Published - 2021
Event: 19th IFAC Symposium on System Identification (SYSID 2021) - Padova, Italy
Duration: 13 Jul 2021 → 16 Jul 2021
Bibliographical note: Financial support for this work from the City University of Hong Kong under Project 9380123 (Bridging between Systems Theory and Dynamic Data Learning towards Industrial Intelligence and Industry 4.0) and from an NSF-China Regional Joint Key Project for Innovations and Development (U20A20189) is gratefully acknowledged.
Keywords:
- Inferential sensors
- Stable cross-validation
- Stable lasso
- Statistical machine learning