In this paper, we aim to discover genetic factors of psoriasis by exhaustively searching for statistically significant SNP-SNP interactions in two real psoriasis genome-wide association study datasets (phs000019.v1.p1 and phs000982.v1.p1) downloaded from the database of Genotypes and Phenotypes. To deal with the enormous search space, our search algorithm is accelerated with eight biologically plausible interaction patterns and a pre-computed look-up table. Our search discovered several SNPs that have a stronger association with psoriasis when combined with another SNP, and these combinations may be non-linear interactions. Among the top 20 SNP-SNP interactions found in terms of pairwise p-value and improvement metric value, we discovered 27 novel potential psoriasis-associated SNPs, most of which are reported to be eQTLs of a number of known psoriasis-associated genes. In addition, we inferred a gene network from the top 10000 SNP-SNP interactions in terms of improvement metric value and discovered a potentially novel long-distance interaction between XXbac-BPG154L12.4 and RNU6-283P that is not a long-distance haplotype. Finally, our experiments with synthetic datasets show that our pre-computed look-up table technique can significantly speed up the search process.
Bibliographical note
Datasets phs000019.v1.p1 and phs000982.v1.p1 used for the analyses described in this paper were obtained from the database of Genotypes and Phenotypes (dbGaP). Dataset phs000019.v1.p1 was collected by Dr. James T. Elder (University of Michigan, Ann Arbor, MI), Gerald G. Krueger (University of Utah, Salt Lake City, UT), Anne Bowcock (Washington University, St. Louis, MO) and Gonçalo R. Abecasis (University of Michigan, Ann Arbor, MI). Data collection was funded by the National Institutes of Health, the Foundation for the National Institutes of Health, and the National Psoriasis Foundation. Support for genotyping of samples was provided through the Genetic Association Information Network (GAIN). For a description of the dataset, phenotypes, genotype data and quality control procedures, see Nair et al. (2009) Nature Genetics 41:200-204. Dataset phs000982.v1.p1 was collected by James T. Elder, University of Michigan, with collaborators Dr. Dafna Gladman, University of Toronto and Dr. Proton Rahman, Memorial University of Newfoundland, providing samples. Data collection was supported by grants from the National Institutes of Health, the Canadian Institute for Health Research, and the Krembil Foundation. Additional support was provided by the Babcock Memorial Trust and by the Barbara and Neal Henschel Charitable Foundation. JTE is supported by the Ann Arbor Veterans Affairs Hospital.
Kwan-Yeung Lee conducted the project and developed the algorithms used in this study under the supervision of Man-Hon Wong and Kwong-Sak Leung, and interpreted the biological meaning of the experimental results under the guidance of Nelson L.S. Tang, who pioneered the interaction models. Kwan-Yeung Lee also ran the experiments and wrote the main manuscript text. All authors discussed the results and reviewed the manuscript thoroughly.