Profiling of RNAs improves understanding of cellular mechanisms, which can be essential for curing various diseases. Fully characterizing the three-dimensional structure of the roughly 200,000 RNAs in humans with the mutate-and-map strategy is estimated to take years. To speed up the profiling process, we propose a solution based on super-resolution. We applied five machine learning regression methods to perform RNA structure profiling super-resolution, i.e., to recover the full data sets by exploiting self-similarity in low-resolution (undersampled) data sets. In particular, our novel Interaction Encoded Long-Short Term Memory (IELSTM) network can handle multiple distant interactions in the RNA sequences. Compared with ridge regression, LASSO regression, multilayer perceptron regression, and random forest regression, the IELSTM network reduces the mean squared error and the median absolute error by at least 33% and 31%, respectively, on three RNA structure profiling data sets.
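The two error measures named in the abstract, and the relative reduction used to compare one regressor against another, can be sketched as follows. This is a minimal illustration rather than the authors' evaluation code: the function names and the toy numbers are assumptions made for the example and are not taken from the paper.

```python
from statistics import mean, median


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def median_absolute_error(y_true, y_pred):
    """Median of the absolute differences; less sensitive to outliers than MSE."""
    return median(abs(t - p) for t, p in zip(y_true, y_pred))


def relative_reduction(baseline_error, model_error):
    """Fraction by which model_error is lower than baseline_error (0.33 means 33%)."""
    return (baseline_error - model_error) / baseline_error


# Toy values for illustration only (hypothetical, not data from the paper).
y_true = [0.0, 1.0, 2.0, 3.0]         # held-out (unmeasured) profiling values
baseline_pred = [1.0, 1.0, 1.0, 1.0]  # predictions of a hypothetical baseline regressor
model_pred = [0.0, 1.0, 2.0, 2.0]     # predictions of a hypothetical improved model

mse_gain = relative_reduction(
    mean_squared_error(y_true, baseline_pred),
    mean_squared_error(y_true, model_pred),
)
medae_gain = relative_reduction(
    median_absolute_error(y_true, baseline_pred),
    median_absolute_error(y_true, model_pred),
)
print(f"MSE reduction: {mse_gain:.0%}, median absolute error reduction: {medae_gain:.0%}")
```

In a super-resolution setting, `y_true` would hold the entries that were left out of the undersampled data set, so both measures describe how well the missing measurements are recovered.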
|Lecture Notes in Computer Science (LNCS) book series
|6th International Conference on Theory and Practice of Natural Computing
|18/12/17 → 20/12/17
|Institute of Computer Science. Czech Academy of Sciences
- Long-short term memory
- Machine learning regression methods
- RNA structure