A compression algorithm for DNA sequences: Using approximate matching for better compression ratio to reveal the true characteristics of DNA

Xin CHEN, Sam KWONG*, Ming LI

*Corresponding author for this work

Research output: Journal Publications > Comment / Debate > Research

75 Citations (Scopus)

Abstract

We present GenCompress, a DNA compression algorithm based on approximate matching that gives the best compression results on standard benchmark DNA sequences. We explain its design rationale, discuss details of the algorithm, provide experimental results, and compare them with those of the two most effective compression algorithms for DNA sequences (Biocompress-2 and Cfact).
Original language: English
Pages (from-to): 61-66
Number of pages: 6
Journal: IEEE Engineering in Medicine and Biology Magazine
Volume: 20
Issue number: 4
Publication status: Published - 2001
Externally published: Yes

Bibliographical note

Funding Information:
This work was supported in part by the City University of Hong Kong Grant No. 7000875, NSERC Research Grant OGP0046506, CITO, a CGAT grant, and the Steacie Fellowship. Chen Xin’s work was fully supported by the City University Research Grant 7000875, and Ming Li’s work was partly done while he was visiting City University of Hong Kong. We wish to thank Professor Qing-Yun Shi for her generous support; Fariza Tahi for providing the Biocompress-2 program; and Jonathan Badger, Kevin Lanctot, and Huaichun Wang for providing some data and very helpful comments (Jonathan provided references).
