DNA sequencing technologies: Sequencing data protocols and bioinformatics tools

Ka-Chun WONG, Jiao ZHANG, Shankai YAN, Xiangtao LI, Qiuzhen LIN, Sam KWONG, Cheng LIANG

Research output: Journal Publications, Journal Article (refereed), peer-review

10 Citations (Scopus)

Abstract

Recent advances in DNA sequencing technology, from first-generation sequencing (FGS) to third-generation sequencing (TGS), have continually transformed the genome research landscape. The data throughput of current sequencing technology is unprecedented, severalfold that of past technologies. DNA sequencing technologies generate sequencing data that are big, sparse, and heterogeneous, which has driven the rapid development of various data protocols and bioinformatics tools for handling sequencing data. In this review, we take a historical snapshot of DNA sequencing with an emphasis on data manipulation and tools. The technological history of DNA sequencing is described and reviewed in detail. Different data protocols for manipulating the generated sequencing data are then introduced and reviewed. In particular, data compression methods are highlighted and discussed to give readers a practical perspective on real-world settings. A large variety of bioinformatics tools are also reviewed to help readers extract the most from their sequencing data in different aspects, such as sequencing quality control, genomic visualization, single-nucleotide variant calling, INDEL calling, structural variation calling, and integrative analysis. Toward the end of the article, we critically discuss the pitfalls of existing DNA sequencing technologies and their potential solutions.
Original language: English
Article number: 98
Journal: ACM Computing Surveys
Volume: 52
Issue number: 5
Early online date: 13 Sept 2019
DOIs
Publication status: Published - Oct 2019
Externally published: Yes

Bibliographical note

The work described in this paper was substantially supported by three grants from the Research Grants Council of the Hong Kong Special Administrative Region [CityU 21200816], [CityU 11203217], and [CityU 11200218]. We acknowledge the donation support of the Titan Xp GPU from the NVIDIA Corporation.

Keywords

  • Bioinformatics
  • Computational biology
  • Data protocols
  • DNA sequencing
  • History
  • Software
  • Technology
  • Third-generation sequencing (TGS)
  • Tools
