ICN: a normalization method for gene expression data considering the over-expression of informative genes

Lixin CHENG*, Xuan WANG, Pak-Kan WONG, Kwan-Yeung LEE, Le LI, Bin XU, Dong WANG*, Kwong-Sak LEUNG*

*Corresponding author for this work

Research output: Journal Publications, Journal Article (refereed), peer-review

25 Citations (Scopus)

Abstract

A global increase in gene expression has frequently been reported in cancer microarray studies. However, although DNA microarrays measure genes massively in parallel, many genes may not deliver informative signals in a given experiment because of insufficient expression or non-expression. Hence the informative gene set, rather than the whole genome, is a more reasonable basis for representing the genome expression level. We observed that the trend of over-expression is more obvious for informative genes in human cancers, and that this trend is to some extent masked when the whole genome is used without any filtering. Accordingly, we proposed a novel normalization method, Informative CrossNorm (ICN), which performs cross normalization (CrossNorm) on the expression matrix containing only the informative genes. ICN outperforms other methods, with consistently high precision, F-score, and Matthews correlation coefficient as well as acceptable recall, on three available spike-in datasets with ground truth. In addition, nine potential therapeutic target genes for esophageal squamous cell carcinoma (ESCC) were identified using ICN integrated with a protein-protein interaction network, which provides biological support for the performance of ICN. Consequently, it is expected that ICN could be applied routinely in cancer microarray studies.
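
The evaluation measures named in the abstract (precision, recall, F-score, and Matthews correlation coefficient) can be illustrated with a minimal sketch. It assumes a spike-in style setting in which the truly differentially expressed genes are known; the function names and the gene identifiers are hypothetical and are not taken from the paper, and the toy numbers are not results from it.

```python
from math import sqrt


def confusion_counts(all_genes, truly_changed, called_changed):
    """Count TP, FP, TN, FN for a set of genes called differentially expressed.

    all_genes, truly_changed, and called_changed are sets of gene identifiers
    (hypothetical names used only for this illustration).
    """
    tp = len(called_changed & truly_changed)
    fp = len(called_changed - truly_changed)
    fn = len(truly_changed - called_changed)
    tn = len(all_genes) - tp - fp - fn
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Return precision, recall, F-score (F1), and MCC from confusion counts.

    Ratios with a zero denominator are reported as 0.0 by convention here.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f_score = 2 * precision * recall / pr_sum if pr_sum else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall,
            "f_score": f_score, "mcc": mcc}


# Toy example: 100 genes, 13 truly changed, 10 called (8 of them correct).
all_genes = {f"g{i}" for i in range(100)}
truly_changed = {f"g{i}" for i in range(13)}
called_changed = {f"g{i}" for i in range(8)} | {"g50", "g51"}

counts = confusion_counts(all_genes, truly_changed, called_changed)
metrics = classification_metrics(*counts)
```

In this toy case the method makes few false calls (high precision) while missing some true positives (moderate recall), which is the kind of trade-off the abstract describes.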

Original language: English
Pages (from-to): 3057-3066
Number of pages: 10
Journal: Molecular BioSystems
Volume: 12
Issue number: 10
Early online date: 18 Jul 2016
Publication status: Published - 2016
Externally published: Yes
