A Hierarchical Taxonomy For Deep State Space Models

Shiqin TANG, Pengxing FENG, Shujian YU, Yining DONG, S. Joe QIN

Research output: Book Chapters | Papers in Conference ProceedingsConference paper (refereed)Referred Conference Paperpeer-review

Abstract

Modeling nonlinear dynamical systems is a challenging task in fields such as speech processing, music generation, and video prediction. This paper introduces a hierarchical framework for Deep State Space Models (DSSMs), categorizing them by their conditional independence properties and Markov assumptions and positioning existing models within this framework, including the Stochastic Recurrent Neural Network (SRNN), Variational Recurrent Neural Network (VRNN), and Recurrent State Space Model (RSSM). We discuss different options for the inference networks and demonstrate how integrating normalizing flows can enhance model flexibility by capturing complex distributions. Our work not only clarifies the relationships among existing models but also paves the way for the development of new, more effective approaches for modeling nonlinear dynamics. In particular, we propose the Autoregressive State Space Model (ArSSM) and evaluate its effectiveness in speech and polyphonic music modeling tasks.
Original languageEnglish
Title of host publicationProceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
PublisherIEEE
ISBN (Electronic)9798350368741
DOIs
Publication statusE-pub ahead of print - 7 Mar 2025

Fingerprint

Dive into the research topics of 'A Hierarchical Taxonomy For Deep State Space Models'. Together they form a unique fingerprint.

Cite this