PathEmb: Random Walk based Document Embedding for Global Pathway Similarity Search

Jiao ZHANG, Sam KWONG, Guangming LIU, Qiuzhen LIN, Ka-Chun WONG

Research output: Journal Publications, Journal Article (refereed), peer-review

8 Citations (Scopus)

Abstract

Pathway analysis is a cornerstone of systems biology. In particular, pathway similarity search plays a key role in establishing structural, functional, and evolutionary relationships between different biological entities. Given a query pathway and a database, a pathway similarity search aims to identify novel pathways that are homologous to the query pathway. Unfortunately, pathway similarity search is computationally inefficient due to the NP-complete graph isomorphism problem. In this study, we introduce a novel algorithmic framework for pathway similarity search, named PathEmb (Pathway Embedding), which is analogous to the Skip-gram model in that each pathway is represented as a "document". PathEmb exploits a second-order random walk strategy to explore diverse pathway patterns. All signaling paths traversed by the random walks are regarded as "sentences", which are then assembled into a "document". The "document" pattern for each individual pathway is then mapped into a low-dimensional feature space for downstream tasks. Furthermore, PathEmb is a topology-free pathway similarity search algorithm, so it can handle pathways of arbitrary structure. We have extensively evaluated PathEmb and other cutting-edge methods on three pathway datasets. The experimental results demonstrate that PathEmb outperforms the existing methods in terms of computational efficiency and search accuracy. The source code of PathEmb is freely available online at https://github.com/zhangjiaobxy/PathEmb.
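
The walk-to-document step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes a node2vec-style second-order bias with hypothetical return and in-out parameters `p` and `q`, an adjacency-list graph, and a toy four-node pathway with placeholder node names; the function names are likewise illustrative.

```python
import random


def second_order_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Return one biased walk (a "sentence") of at most `length` nodes.

    The bias depends on the previous node as well as the current one,
    which is what makes the walk second order. The p/q weighting is an
    assumption borrowed from node2vec, not taken from the abstract.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbours = graph.get(cur, [])
        if not neighbours:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # No previous node yet, so the first step is uniform.
            walk.append(rng.choice(neighbours))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbours:
            if nxt == prev:
                weights.append(1.0 / p)   # step back to the previous node
            elif nxt in graph.get(prev, []):
                weights.append(1.0)       # stay close to the previous node
            else:
                weights.append(1.0 / q)   # move further away
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk


def pathway_to_document(graph, walks_per_node=2, length=4, p=1.0, q=1.0, seed=0):
    """Collect walks from every node; the list of walks is the "document"."""
    rng = random.Random(seed)
    document = []
    for node in sorted(graph):
        for _ in range(walks_per_node):
            document.append(second_order_walk(graph, node, length, p, q, rng))
    return document


# Toy undirected chain A - B - C - D (placeholder names, not real pathway data).
toy_pathway = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}
document = pathway_to_document(toy_pathway)
```

Each inner list plays the role of a "sentence" and the outer list the role of the pathway's "document". A document embedding model could then map that list to a fixed-length vector; that step is omitted here because the abstract does not specify it.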
Original language: English
Pages (from-to): 1329-1335
Journal: IEEE Journal of Biomedical and Health Informatics
Volume: 23
Issue number: 3
Early online date: 27 Apr 2018
Publication status: Published - May 2019
Externally published: Yes

Funding

The work was supported by the Research Grants Council of the Hong Kong Special Administrative Region under Grants CityU 21200816 and CityU 11203217.

Keywords

  • AdaBoost regression
  • document embedding
  • feature learning
  • global pathway search
  • random walks
