Distinguishing coding from non-coding sequences in a prokaryote complete genome based on the global descriptor

Guo Sheng HAN, Zu Guo YU*, Vo ANH, Raymond H. CHAN

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings > Conference paper (refereed) > Research > peer-review

4 Citations (Scopus)

Abstract

Recognition of coding sequences in a complete genome is an important problem in DNA sequence analysis. Rapid and accurate recognition of these sequences supports a range of related research and applications. In this paper, we aim to distinguish the coding sequences from the non-coding sequences in a prokaryote complete genome. We select a data set of 51 available bacterial genomes. We then apply the global descriptor method to the coding/non-coding primary sequences and obtain 36 parameters for each sequence. These parameters define feature spaces whose points represent the coding/non-coding sequences in our selected data set. To evaluate the method, we apply Fisher's linear discriminant algorithm to these points and obtain relatively satisfactory discriminant accuracies. The average accuracies of the global descriptor method (36 parameters) for the training and test sets are 97.81% and 97.49%, respectively. Finally, we compare the method with Z curve methods on the same data set. When we combine our method with the Z curve method, higher accuracies are obtained. This good performance indicates that the global descriptor method of this paper may complement existing methods for the gene finding problem.
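
The evaluation step described in the abstract can be illustrated with a minimal sketch of a two-class Fisher linear discriminant. This is not the authors' implementation: the 36-dimensional feature vectors below are synthetic Gaussian data standing in for global descriptor parameters (the abstract does not say how the 36 parameters are computed), and the names `fit_fisher`, `predict`, `make_class`, and `accuracy`, as well as the midpoint threshold, are assumptions made for illustration.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def scatter_matrix(rows, mean):
    # Sum of outer products of the centered samples of one class.
    d = len(mean)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - m for x, m in zip(r, mean)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s

def solve(a, b):
    # Solve a x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_fisher(class0, class1):
    # Fisher direction: w = Sw^{-1} (m1 - m0), Sw = within-class scatter.
    m0, m1 = mean_vector(class0), mean_vector(class1)
    s0, s1 = scatter_matrix(class0, m0), scatter_matrix(class1, m1)
    d = len(m0)
    sw = [[s0[i][j] + s1[i][j] for j in range(d)] for i in range(d)]
    w = solve(sw, [a - b for a, b in zip(m1, m0)])
    # Assumed decision threshold: midpoint of the projected class means.
    c = 0.5 * (dot(w, m0) + dot(w, m1))
    return w, c

def predict(w, c, x):
    # Label 1 (say, "coding") if the projection exceeds the threshold.
    return 1 if dot(w, x) > c else 0

def make_class(rng, n, dim, shift):
    # Synthetic stand-in for descriptor vectors; not real genome data.
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]

def accuracy(w, c, class0, class1):
    correct = sum(predict(w, c, x) == 0 for x in class0)
    correct += sum(predict(w, c, x) == 1 for x in class1)
    return correct / (len(class0) + len(class1))

DIM = 36  # mirrors the 36 parameters mentioned in the abstract
rng = random.Random(0)
train0, train1 = make_class(rng, 200, DIM, 0.0), make_class(rng, 200, DIM, 1.0)
test0, test1 = make_class(rng, 100, DIM, 0.0), make_class(rng, 100, DIM, 1.0)

w, c = fit_fisher(train0, train1)
train_acc = accuracy(w, c, train0, train1)
test_acc = accuracy(w, c, test0, test1)
print(f"training accuracy: {train_acc:.4f}, test accuracy: {test_acc:.4f}")
```

The accuracies printed here reflect only the synthetic data and say nothing about the results reported in the paper. Combining two feature sets, as the abstract describes for the global descriptor and Z curve methods, would in a sketch like this amount to concatenating the two feature vectors for each sequence before calling `fit_fisher`.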

Original language: English
Title of host publication: 6th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2009
Editors: Yixin CHEN, Hepu DENG, Degan ZHANG, Yingyuan XIAO
Publisher: IEEE
Pages: 42-46
Number of pages: 5
ISBN (Print): 9780769537351
Publication status: Published - Aug 2009
Externally published: Yes
Event: 6th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2009 - Tianjin, China
Duration: 14 Aug 2009 – 16 Aug 2009

Conference

Conference: 6th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2009
Country/Territory: China
City: Tianjin
Period: 14/08/09 – 16/08/09

Keywords

  • coding/noncoding DNA
  • prokaryote genome
  • global descriptor
