Genetic algorithm for dimer-led and error-restricted spaced motif discovery

Tak Ming CHAN, Leung Yau LO, Man Leung WONG, Yong LIANG, Kwong Sak LEUNG

Research output: Book Chapters | Papers in Conference Proceedings > Conference paper (refereed) > Research > peer-review

Abstract

DNA motif discovery is an important problem for deciphering protein-DNA binding in gene regulation. To discover generic spaced motifs, which have multiple conserved patterns separated by wild-cards called spacers, the genetic algorithm (GA) based GASMEN has been proposed and shown to outperform related methods. However, its over-generic modeling of any number of spacers increases the optimization difficulty in practice. In protein-DNA binding case studies, complicated spaced motifs are rare, while dimers with a single spacer are the more common spaced motifs. Moreover, errors (mismatches) in a conserved pattern are not arbitrarily distributed, as certain highly conserved nucleotides are essential to maintain binding. Motivated by better optimization in real applications, we have developed a new method, GA for Dimer-led and Error-restricted Spaced Motifs (GADESM). It pays special attention to common spaced motifs by using dimer-led initialization of the population. Results on real datasets show that the dimer-led initialization in GADESM achieves better fitness than GASMEN with statistical significance. With additional error-restricted motif occurrence retrieval, GADESM performs better than GASMEN on both comprehensive simulation data and a real ChIP-seq case study.
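
As a rough illustration of the two ideas named in the abstract (a dimer with a single spacer, and mismatches that are restricted rather than arbitrary), the sketch below scans a sequence for a two-part motif. It is not the GADESM procedure, whose details the abstract does not give; the function name, the parameters, and the `essential` set of mismatch-forbidden positions are all hypothetical choices made for this example.

```python
def find_dimer_occurrences(seq, left, right, spacer_len, max_errors,
                           essential=frozenset()):
    """Return start indices where a single-spacer dimer matches seq.

    Hypothetical illustration only: `essential` holds indices into
    left + right at which a mismatch is never tolerated.
    """
    pattern = left + right
    width = len(left) + spacer_len + len(right)
    hits = []
    for start in range(len(seq) - width + 1):
        # Join the two conserved windows; the spacer is a wild-card and is skipped.
        right_start = start + len(left) + spacer_len
        window = seq[start:start + len(left)] + seq[right_start:start + width]
        mismatches = [i for i, (p, s) in enumerate(zip(pattern, window)) if p != s]
        # Accept only if the error budget holds and no essential position is hit.
        if len(mismatches) <= max_errors and not any(i in essential for i in mismatches):
            hits.append(start)
    return hits


exact = find_dimer_occurrences("CCTGACAAAGTCAGG", "TGAC", "GTCA", 3, 0)
relaxed = find_dimer_occurrences("CCTGTCAAAGTCAGG", "TGAC", "GTCA", 3, 1)
restricted = find_dimer_occurrences("CCTGTCAAAGTCAGG", "TGAC", "GTCA", 3, 1,
                                    essential={2})
```

Here `relaxed` tolerates one mismatch anywhere in the conserved halves, while `restricted` rejects the same site because the mismatch falls on a position marked essential.
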
Original language: English
Title of host publication: Proceedings of the IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 198-205
Number of pages: 8
ISBN (Print): 9781467358750
DOI: https://doi.org/10.1109/CIBCB.2013.6595409
Publication status: Published - 12 Sep 2013

Bibliographical note

Paper presented at the 10th Annual IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), Apr 16-19, 2013, Singapore.

Cite this

CHAN, T. M., LO, L. Y., WONG, M. L., LIANG, Y., & LEUNG, K. S. (2013). Genetic algorithm for dimer-led and error-restricted spaced motif discovery. In Proceedings of the IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013 (pp. 198-205). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CIBCB.2013.6595409