Abstract
Sequence Series Data (SSD) refers to multi-dimensional data involving measurements over sequences, which can be ordered. This type of data is frequently encountered in genomic data sets and text sentiment analysis data sets, but collecting it can be time-consuming and labour-intensive. These factors result in low-resolution data sets. Therefore, we employed six machine learning regression methods to perform SSD super-resolution, i.e. to recover high-resolution data sets using self-similarity in low-resolution data sets. Furthermore, we propose a novel Long Short-Term Memory (LSTM) network, the Interaction Encoded LSTM (IELSTM) network, which is capable of handling multiple distant interactions among sequences. On four genomic data sets, the IELSTM network generally shows better overall reconstruction quality than ridge regression, LASSO regression, orthogonal matching pursuit regression, multilayer perceptron regression, and random forest regression.
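The idea of regression-based super-resolution from self-similarity can be illustrated with a toy sketch. This is not the paper's implementation and does not show IELSTM; it assumes a one-dimensional sequence, an upsampling factor of 2, and closed-form ridge regression. The function names (`make_pairs`, `fit_ridge`, `super_resolve`) and the `window` and `alpha` parameters are hypothetical, chosen only for illustration. The regressor is trained on the low-resolution sequence itself by hiding every other sample, then applied to predict the unseen values between neighbouring samples, which assumes the local relationship is similar at both scales.

```python
import numpy as np

def make_pairs(seq, window=2):
    """Build training pairs by hiding every other sample of `seq`.

    Features: `window` retained samples on each side of a hidden sample.
    Target: the hidden sample itself.
    """
    coarse = seq[::2]   # samples kept after 2x downsampling
    hidden = seq[1::2]  # samples dropped by the downsampling
    X, y = [], []
    for i in range(window - 1, len(coarse) - window):
        # hidden[i] lies between coarse[i] and coarse[i + 1]
        X.append(coarse[i - window + 1 : i + window + 1])
        y.append(hidden[i])
    return np.array(X), np.array(y)

def fit_ridge(X, y, alpha=1e-3):
    """Closed-form ridge regression; the bias term is also penalised."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append bias column
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def super_resolve(low, window=2, alpha=1e-3):
    """Double the resolution of `low` using a model trained on `low` itself."""
    X, y = make_pairs(low, window)
    w = fit_ridge(X, y, alpha)
    # Edge-pad so every gap between neighbours has a full feature window.
    padded = np.pad(low, (window - 1, window - 1), mode="edge")
    n_gaps = len(low) - 1
    feats = np.array([padded[j : j + 2 * window] for j in range(n_gaps)])
    mids = np.hstack([feats, np.ones((n_gaps, 1))]) @ w
    high = np.empty(2 * len(low) - 1)
    high[0::2] = low   # original samples are kept unchanged
    high[1::2] = mids  # predicted in-between samples
    return high

# Toy usage on a synthetic signal (assumed data, not from the paper).
t = np.linspace(0, 4 * np.pi, 64)
low = np.sin(t)
high = super_resolve(low)
print(high.shape)  # (127,)
```

Other regressors named in the abstract, such as LASSO or random forest regression, could be substituted for `fit_ridge` in a sketch of this kind; how the paper itself constructs features and targets is not stated here.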
Original language | English |
---|---|
Title of host publication | 2017 IEEE Symposium Series on Computational Intelligence, SSCI 2017 - Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 1-8 |
Number of pages | 8 |
Volume | 2018-January |
ISBN (Electronic) | 9781538627259 |
Publication status | Published - 2 Feb 2018 |
Event | 2017 IEEE Symposium Series on Computational Intelligence, SSCI 2017 - Honolulu, United States. Duration: 27 Nov 2017 → 1 Dec 2017 |
Conference
Conference | 2017 IEEE Symposium Series on Computational Intelligence, SSCI 2017 |
---|---|
Country/Territory | United States |
City | Honolulu |
Period | 27/11/17 → 1/12/17 |
Other | IEEE |
Funding
This research is supported by the General Research Fund (LU310111 and 414413) from the Research Grants Council of the Hong Kong Special Administrative Region and the Lingnan University Direct Grant (DR16A7).