Statistical learning methods have long been studied and applied for inferential modeling. In recent years, deep learning methods have also been applied to inferential sensor modeling. As a popular deep learning model, the long short-term memory (LSTM) network can handle nonlinearity and dynamics in data and has therefore been applied to dynamic inferential modeling. In this paper, we analyze and compare LSTM with statistical learning methods for dynamic NOx emission prediction of a 660 MW industrial boiler. Support vector regression (SVR), partial least squares (PLS), and the least absolute shrinkage and selection operator (Lasso), each with embedded dynamics, are compared with LSTM for dynamic inferential modeling. The experimental results indicate that SVR, PLS, and Lasso outperform LSTM. When the LSTM gates are disabled to realize a simple memory structure, the LSTM performance improves significantly. The main goal of the paper is to demonstrate that a deep neural network that is effective in other domains requires close scrutiny and detailed study to establish its superiority in process applications.
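
One common way to give a static regressor such as SVR, PLS, or Lasso "embedded dynamics" is to augment each sample with lagged copies of the process inputs. The sketch below illustrates that construction only; it is an assumption about what the embedding may look like, not the procedure used in this paper. The function name `embed_lags` and the parameter `n_lags` are hypothetical.

```python
from typing import List, Sequence, Tuple


def embed_lags(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    n_lags: int,
) -> Tuple[List[List[float]], List[float]]:
    """Build lag-augmented regressors (hypothetical helper, for illustration).

    Row t of the output holds [x_t, x_{t-1}, ..., x_{t-n_lags}] concatenated,
    paired with the target y_t. The first n_lags samples are dropped because
    their full history is not available.
    """
    if n_lags < 0:
        raise ValueError("n_lags must be non-negative")
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of samples")

    rows: List[List[float]] = []
    targets: List[float] = []
    for t in range(n_lags, len(X)):
        row: List[float] = []
        for k in range(n_lags + 1):
            # Append the input vector observed k steps before time t.
            row.extend(X[t - k])
        rows.append(row)
        targets.append(y[t])
    return rows, targets


if __name__ == "__main__":
    # Toy data: 4 samples, 2 process inputs each (values are made up).
    X_demo = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
    y_demo = [0.1, 0.2, 0.3, 0.4]
    X_aug, y_aug = embed_lags(X_demo, y_demo, n_lags=2)
    for row, target in zip(X_aug, y_aug):
        print(row, "->", target)
```

The augmented matrix can then be passed to any static regressor, so the dynamics are carried by the regressors themselves rather than by a recurrent state. The choice of `n_lags` would in practice be tuned on validation data.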