Evaluating phylogenetic informativeness and data-type usage for new protein-coding genes across Vertebrata

Jonathan J. FONG, Matthew K. FUJITA

Research output: Journal Publications; Journal Article (refereed); Research; peer-review

26 Citations (Scopus)

Abstract

As a resource for vertebrate phylogenetics, we developed 75 new protein-coding genes using a combination of expressed sequence tags (ESTs) available in GenBank, and targeted amplification of complementary DNA (cDNA). In addition, we performed three additional analyses in order to assess the utility of our approach. First, we profiled the phylogenetic informativeness of these new markers using the online program PhyDesign. Next, we compared the utility of four different data-types used in phylogenetics: nucleotides (NUCL), amino acids (AA), 1st and 2nd codon positions only (N12), and modified sequences to account for codon degeneracy (DEGEN1; Regier et al., 2010). Lastly, we use these new markers to construct a vertebrate phylogeny and address the uncertain relationship between higher-level mammal groups: monotremes, marsupials, and placentals. Our results show that phylogenetic informativeness of the 75 new markers varies, both in the amount of phylogenetic signal and optimal timescale. When comparing the four data-types, we find that the NUCL data-type, due to the high level of phylogenetic signal, performs the best across all divergence times. The remaining three data-types (AA, N12, DEGEN1) are less subject to homoplasy, but have greatly reduced levels of phylogenetic signal relative to NUCL. Our phylogenetic inference supports the Theria hypothesis of mammalian relationships, with marsupials and placentals being sister groups.
Original language: English
Pages (from-to): 300-307
Number of pages: 8
Journal: Molecular Phylogenetics and Evolution
Volume: 61
Issue number: 2
DOI: 10.1016/j.ympev.2011.06.016
Publication status: Published - 1 Nov 2011
Externally published: Yes

Fingerprint

Marsupialia
Nucleotides
Vertebrates
Phylogenetics
Codon
Proteins
Genes
Phylogeny
Amino Acids
Expressed Sequence Tags
Nucleic Acid Databases
Mammals
Complementary DNA

Cite this

@article{79f8fd9a69604112a2239e926550d780,
title = "Evaluating phylogenetic informativeness and data-type usage for new protein-coding genes across Vertebrata",
abstract = "As a resource for vertebrate phylogenetics, we developed 75 new protein-coding genes using a combination of expressed sequence tags (ESTs) available in GenBank, and targeted amplification of complementary DNA (cDNA). In addition, we performed three additional analyses in order to assess the utility of our approach. First, we profiled the phylogenetic informativeness of these new markers using the online program PhyDesign. Next, we compared the utility of four different data-types used in phylogenetics: nucleotides (NUCL), amino acids (AA), 1st and 2nd codon positions only (N12), and modified sequences to account for codon degeneracy (DEGEN1; Regier et al., 2010). Lastly, we use these new markers to construct a vertebrate phylogeny and address the uncertain relationship between higher-level mammal groups: monotremes, marsupials, and placentals. Our results show that phylogenetic informativeness of the 75 new markers varies, both in the amount of phylogenetic signal and optimal timescale. When comparing the four data-types, we find that the NUCL data-type, due to the high level of phylogenetic signal, performs the best across all divergence times. The remaining three data-types (AA, N12, DEGEN1) are less subject to homoplasy, but have greatly reduced levels of phylogenetic signal relative to NUCL. Our phylogenetic inference supports the Theria hypothesis of mammalian relationships, with marsupials and placentals being sister groups.",
author = "FONG, {Jonathan J.} and FUJITA, {Matthew K.}",
year = "2011",
month = "11",
day = "1",
doi = "10.1016/j.ympev.2011.06.016",
language = "English",
volume = "61",
pages = "300--307",
journal = "Molecular Phylogenetics and Evolution",
issn = "1055-7903",
publisher = "Academic Press Inc.",
number = "2",

}

Evaluating phylogenetic informativeness and data-type usage for new protein-coding genes across Vertebrata. / FONG, Jonathan J.; FUJITA, Matthew K.

In: Molecular Phylogenetics and Evolution, Vol. 61, No. 2, 01.11.2011, p. 300-307.

TY - JOUR

T1 - Evaluating phylogenetic informativeness and data-type usage for new protein-coding genes across Vertebrata

AU - FONG, Jonathan J.

AU - FUJITA, Matthew K.

PY - 2011/11/1

Y1 - 2011/11/1

N2 - As a resource for vertebrate phylogenetics, we developed 75 new protein-coding genes using a combination of expressed sequence tags (ESTs) available in GenBank, and targeted amplification of complementary DNA (cDNA). In addition, we performed three additional analyses in order to assess the utility of our approach. First, we profiled the phylogenetic informativeness of these new markers using the online program PhyDesign. Next, we compared the utility of four different data-types used in phylogenetics: nucleotides (NUCL), amino acids (AA), 1st and 2nd codon positions only (N12), and modified sequences to account for codon degeneracy (DEGEN1; Regier et al., 2010). Lastly, we use these new markers to construct a vertebrate phylogeny and address the uncertain relationship between higher-level mammal groups: monotremes, marsupials, and placentals. Our results show that phylogenetic informativeness of the 75 new markers varies, both in the amount of phylogenetic signal and optimal timescale. When comparing the four data-types, we find that the NUCL data-type, due to the high level of phylogenetic signal, performs the best across all divergence times. The remaining three data-types (AA, N12, DEGEN1) are less subject to homoplasy, but have greatly reduced levels of phylogenetic signal relative to NUCL. Our phylogenetic inference supports the Theria hypothesis of mammalian relationships, with marsupials and placentals being sister groups.

AB - As a resource for vertebrate phylogenetics, we developed 75 new protein-coding genes using a combination of expressed sequence tags (ESTs) available in GenBank, and targeted amplification of complementary DNA (cDNA). In addition, we performed three additional analyses in order to assess the utility of our approach. First, we profiled the phylogenetic informativeness of these new markers using the online program PhyDesign. Next, we compared the utility of four different data-types used in phylogenetics: nucleotides (NUCL), amino acids (AA), 1st and 2nd codon positions only (N12), and modified sequences to account for codon degeneracy (DEGEN1; Regier et al., 2010). Lastly, we use these new markers to construct a vertebrate phylogeny and address the uncertain relationship between higher-level mammal groups: monotremes, marsupials, and placentals. Our results show that phylogenetic informativeness of the 75 new markers varies, both in the amount of phylogenetic signal and optimal timescale. When comparing the four data-types, we find that the NUCL data-type, due to the high level of phylogenetic signal, performs the best across all divergence times. The remaining three data-types (AA, N12, DEGEN1) are less subject to homoplasy, but have greatly reduced levels of phylogenetic signal relative to NUCL. Our phylogenetic inference supports the Theria hypothesis of mammalian relationships, with marsupials and placentals being sister groups.

UR - http://commons.ln.edu.hk/sw_master/5253

U2 - 10.1016/j.ympev.2011.06.016

DO - 10.1016/j.ympev.2011.06.016

M3 - Journal Article (refereed)

VL - 61

SP - 300

EP - 307

JO - Molecular Phylogenetics and Evolution

JF - Molecular Phylogenetics and Evolution

SN - 1055-7903

IS - 2

ER -