Abstract
Motivation: Traditional sequence distances require an alignment and are therefore not directly applicable to whole-genome phylogeny, where events such as rearrangements make full-length alignments impossible. We present a sequence distance that works on unaligned sequences, based on the information-theoretic concept of Kolmogorov complexity, and a program to estimate this distance.

Results: We establish the mathematical foundations of our distance and illustrate its use by constructing a phylogeny of the Eutherian orders from complete unaligned mitochondrial genomes. This phylogeny is consistent with the commonly accepted one for the Eutherians. A second, larger mammalian dataset is also analyzed, yielding a phylogeny generally consistent with the commonly accepted one for the mammals.
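Kolmogorov complexity is not computable, so any alignment-free distance of this kind has to be estimated in practice, typically with a compressor. The sketch below illustrates the general idea only. It is not the authors' program, and it does not reproduce the distance defined in the paper: it assumes `zlib` as a stand-in compressor and uses the normalized compression distance, a related formula swapped in here for illustration. The function names are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, used as a rough proxy for complexity.

    Assumption: zlib stands in for a DNA-specific compressor; it is a
    general-purpose tool and only a crude estimator on genomic data.
    """
    return len(zlib.compress(data, 9))


def compression_distance(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two unaligned sequences.

    Assumption: this formula is an illustrative substitute, not the
    distance defined in the paper.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # cost of describing both sequences together
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Sequences that share much of their content compress well together and get a small distance, while unrelated sequences get a distance close to 1. A matrix of such pairwise distances could then be passed to a standard distance-based tree-building method.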
Original language | English |
---|---|
Pages (from-to) | 149-154 |
Journal | Bioinformatics |
Volume | 17 |
Issue number | 2 |
DOIs | |
Publication status | Published - 2001 |
Externally published | Yes |
Funding
J.H.B. was supported by a CITO grant. X.C., S.K., and M.L. were supported in part by CityU research grant 7000 875. P.K. was supported by NSERC Research Grant 160321 and a CITO grant. M.L. was also supported in part by NSERC Research Grant OGP0046506, a CITO grant, and an NSERC Steacie Fellowship. H.Z. was supported by NSERC Research Grants OGP0046506 and 160321.