K-mer natural vector and its application to the phylogenetic analysis of genetic sequences

Jia WEN, Raymond H.F. CHAN, Shek Chung YAU, Rong L. HE, Stephen S.T. YAU*

*Corresponding author for this work

Research output: Journal Publications › Journal Article (refereed) › peer-review

46 Citations (Scopus)

Abstract

Building on the well-known k-mer model, we propose a k-mer natural vector model that represents a genetic sequence by the numbers and distributions of the k-mers in the sequence. We show that there exists a one-to-one correspondence between a genetic sequence and its associated k-mer natural vector. The k-mer natural vector method can be used easily and quickly to perform phylogenetic analysis of genetic sequences without requiring evolutionary models or human intervention. Whole or partial genomes can be handled more effectively with our proposed method. We apply it to the phylogenetic analysis of genetic sequences, and the results obtained demonstrate that the k-mer natural vector method is a very powerful tool, in terms of both accuracy and efficiency, for analysing and annotating genetic sequences and determining evolutionary relationships.
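
As a rough illustration of the idea described in the abstract, the sketch below builds a vector from the count, mean position, and a normalized second central moment of the positions of every k-mer, then compares two sequences by Euclidean distance. The abstract does not give the formulas, so the function names (`kmer_natural_vector`, `euclidean`), the 1-based positions, the choice of moments, and the normalization by the number of windows are all assumptions made for illustration rather than the authors' definition.

```python
from collections import defaultdict
from itertools import product
import math


def kmer_natural_vector(seq, k, alphabet="ACGT"):
    """Illustrative k-mer vector: counts, mean positions, second moments.

    Assumption: positions are 1-based start indices of overlapping k-mers,
    and the second central moment is divided by (count * number of windows).
    Windows containing characters outside `alphabet` are ignored.
    """
    seq = seq.upper()
    total = len(seq) - k + 1  # number of overlapping k-mer windows
    if k < 1 or total < 1:
        raise ValueError("k must satisfy 1 <= k <= len(seq)")

    # Record the start position of every k-mer occurrence.
    positions = defaultdict(list)
    for i in range(total):
        positions[seq[i:i + k]].append(i + 1)

    counts, means, moments = [], [], []
    # Enumerate all possible k-mers in a fixed (lexicographic) order so that
    # vectors from different sequences are comparable component by component.
    for kmer in map("".join, product(alphabet, repeat=k)):
        pos = positions.get(kmer, [])
        n = len(pos)
        mu = sum(pos) / n if n else 0.0
        d2 = sum((p - mu) ** 2 for p in pos) / (n * total) if n else 0.0
        counts.append(float(n))
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    a = kmer_natural_vector("ACGTACGTAC", 2)
    b = kmer_natural_vector("ACGTTCGTAC", 2)
    print(len(a), euclidean(a, b))
```

Under these assumptions, a pairwise distance matrix computed this way could be passed to any standard distance-based tree-building method; the abstract does not say which one the authors use.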

Original language: English
Pages (from-to): 25-34
Number of pages: 10
Journal: Gene
Volume: 546
Issue number: 1
Publication status: Published - 1 Aug 2014
Externally published: Yes

Funding

We thank Dr. Max Benson for critically reading and editing our manuscript. This work is supported by Youth Funding of Suihua University (KQ1202004, KQ1202002), Scientific Research Funding of Heilongjiang Education Department (12513097), U.S. NSF grant (DMS-1120824, 1119612), NIH grant (5 SC3 GM098180-04), China NSF grant (31271408), Tsinghua University start-up funding, and Tsinghua University independent research project grant.

Keywords

  • K-mer model
  • Natural vector
  • Phylogenetic analysis
