GP-Pi : using genetic programming with penalization and initialization on genome-wide association study

Ho Yin SZE-TO*, Kwan-Yeung LEE, Kai-Yuen TSO, Man-Hon WONG, Kin-Hong LEE, Nelson L.S. TANG*, Kwong-Sak LEUNG*

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings › Conference paper (refereed) › Research › peer-review

1 Citation (Scopus)

Abstract

The advancement of chip-based technology has enabled the measurement of millions of DNA sequence variations across the human genome. Experiments revealed that high-order, but not individual, interactions of single nucleotide polymorphisms (SNPs) are responsible for complex diseases such as cancer. The challenge of genome-wide association studies (GWASs) is to sift through high-dimensional datasets to find particular combinations of SNPs that are predictive of these diseases. Genetic Programming (GP) has been widely applied in GWASs. It serves two purposes: attribute selection and/or discriminative modeling. One advantage of discriminative modeling over attribute selection lies in interpretability. However, existing discriminative modeling algorithms do not scale up well with the increase in the SNP dimension. Here, we have developed GP-Pi. We have introduced a penalizing term in the fitness function to penalize trees with common SNPs and an initializer which utilizes expert knowledge to seed the population with good attributes. Experimental results on simulated data suggested that GP-Pi outperforms GPAS with statistical significance. GP-Pi was further evaluated on a real GWAS dataset of Rheumatoid Arthritis, obtained from the North American Rheumatoid Arthritis Consortium. Our results, with potential new discoveries, are found to be consistent with the literature.
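The two ideas named in the abstract, a penalizing term in the fitness function and a knowledge-driven initializer, can be illustrated with a minimal sketch. This is not the GP-Pi implementation. Everything here is an assumption made for illustration: a tree is reduced to the set of SNP identifiers it uses, "common SNPs" is read as SNPs shared by many trees in the population, the penalty is linear with a hypothetical `weight`, and expert knowledge is modeled as a hypothetical `prior_scores` mapping. The SNP names are placeholders.

```python
import random
from collections import Counter


def snp_frequencies(population):
    """Fraction of trees in which each SNP appears.

    Assumption: each tree is represented only by the set of SNP ids it uses.
    """
    counts = Counter(snp for tree in population for snp in set(tree))
    n = len(population)
    return {snp: c / n for snp, c in counts.items()}


def penalized_fitness(tree, raw_score, freqs, weight=0.5):
    """Raw score minus a penalty that grows with how common the tree's SNPs are.

    Assumption: the penalty is weight * (mean population frequency of the
    tree's SNPs). The form used in the paper may differ.
    """
    snps = set(tree)
    if not snps:
        return raw_score
    commonness = sum(freqs.get(s, 0.0) for s in snps) / len(snps)
    return raw_score - weight * commonness


def seeded_population(prior_scores, pop_size, tree_size, rng):
    """Build an initial population, sampling SNPs in proportion to a prior score.

    Assumption: prior_scores stands in for expert knowledge (for example a
    per-SNP relevance score). SNPs with a non-positive score are never drawn.
    """
    snps = [s for s, w in prior_scores.items() if w > 0]
    weights = [prior_scores[s] for s in snps]
    size = min(tree_size, len(snps))
    population = []
    for _ in range(pop_size):
        tree = set()
        while len(tree) < size:
            tree.add(rng.choices(snps, weights=weights, k=1)[0])
        population.append(frozenset(tree))
    return population
```

With a population in which one SNP appears in every tree, a tree built on that SNP scores lower than an equally accurate tree built on a rarer one, which is the diversity pressure the penalty is meant to illustrate.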

Original language: English
Title of host publication: Artificial Intelligence and Soft Computing: 12th International Conference, ICAISC 2013, Zakopane, Poland, June 9-13, 2013, Proceedings, Part II
Editors: Leszek RUTKOWSKI, Marcin KORYTKOWSKI, Rafał SCHERER, Ryszard TADEUSIEWICZ, Lotfi A. ZADEH, Jacek M. ZURADA
Publisher: Springer
Pages: 330-341
Number of pages: 12
ISBN (Print): 9783642386091
DOIs
Publication status: Published - 2013
Externally published: Yes
Event: 12th International Conference on Artificial Intelligence and Soft Computing, ICAISC 2013 - Zakopane, Poland
Duration: 9 Jun 2013 to 13 Jun 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 7895 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 12th International Conference on Artificial Intelligence and Soft Computing, ICAISC 2013
Country/Territory: Poland
City: Zakopane
Period: 9/06/13 to 13/06/13

Keywords

  • Genetic Programming
  • Genome-Wide Association Study
  • Initialization
  • Penalization
  • Rheumatoid Arthritis
