Evaluating phylogenetic informativeness and data-type usage for new protein-coding genes across Vertebrata

Jonathan J. FONG, Matthew K. FUJITA

Research output: Journal Publications; Journal Article (refereed); peer-review

30 Citations (Scopus)


As a resource for vertebrate phylogenetics, we developed 75 new protein-coding genes using a combination of expressed sequence tags (ESTs) available in GenBank and targeted amplification of complementary DNA (cDNA). We then performed three analyses to assess the utility of our approach. First, we profiled the phylogenetic informativeness of these new markers using the online program PhyDesign. Next, we compared the utility of four different data-types used in phylogenetics: nucleotides (NUCL), amino acids (AA), 1st and 2nd codon positions only (N12), and modified sequences to account for codon degeneracy (DEGEN1; Regier et al., 2010). Lastly, we used these new markers to construct a vertebrate phylogeny and address the uncertain relationship between higher-level mammal groups: monotremes, marsupials, and placentals. Our results show that phylogenetic informativeness of the 75 new markers varies, both in the amount of phylogenetic signal and optimal timescale. When comparing the four data-types, we find that the NUCL data-type, due to the high level of phylogenetic signal, performs the best across all divergence times. The remaining three data-types (AA, N12, DEGEN1) are less subject to homoplasy, but have greatly reduced levels of phylogenetic signal relative to NUCL. Our phylogenetic inference supports the Theria hypothesis of mammalian relationships, with marsupials and placentals being sister groups.
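As an illustration of how two of the data-types named in the abstract relate to the raw nucleotide data, the following minimal sketch derives N12 and AA sequences from an in-frame coding sequence (NUCL). It is not the authors' pipeline: the function names `to_n12` and `to_aa` are hypothetical, the standard nuclear genetic code is assumed, and DEGEN1 is left out because its recoding rules are defined in Regier et al. (2010) and are not reproduced here.

```python
# Minimal sketch (assumptions: in-frame sequence, standard nuclear genetic code).
# Function names are illustrative and do not come from the paper.

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES),
        AMINO_ACIDS,
    )
}


def _check_frame(seq: str) -> str:
    """Upper-case the sequence and require a whole number of codons."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return seq


def to_n12(seq: str) -> str:
    """Keep only 1st and 2nd codon positions (the N12 data-type)."""
    seq = _check_frame(seq)
    return "".join(seq[i:i + 2] for i in range(0, len(seq), 3))


def to_aa(seq: str) -> str:
    """Translate codons to amino acids (the AA data-type).

    An all-gap codon becomes '-'; any other unrecognised codon becomes 'X'.
    """
    seq = _check_frame(seq)
    residues = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        if codon == "---":
            residues.append("-")
        else:
            residues.append(CODON_TABLE.get(codon, "X"))
    return "".join(residues)


if __name__ == "__main__":
    nucl = "ATGGCCTTA"
    print(to_n12(nucl))
    print(to_aa(nucl))
```

Both derived sequences discard information relative to NUCL: N12 drops every third position, and AA collapses synonymous codons into a single residue. This is consistent with the abstract's observation that these data-types carry less phylogenetic signal.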
Original language: English
Pages (from-to): 300-307
Number of pages: 8
Journal: Molecular Phylogenetics and Evolution
Issue number: 2
Publication status: Published - 1 Nov 2011
Externally published: Yes

