Profiling MS proteomics data using smoothed non-linear energy operator and Bayesian additive regression trees

Shan HE, Xiaoli LI, Mark R. VIANT, Xin YAO

Research output: Journal Publications, Journal Article (refereed), peer-review

6 Citations (Scopus)


This paper proposes a profiling method for SELDI-TOF and MALDI-TOF MS data that integrates a novel peak detection method based on a modified smoothed non-linear energy operator, correlation-based peak selection, and Bayesian additive regression trees. The peak detection and classification performance of the proposed approach is validated on two publicly available MS data sets: MALDI-TOF simulation data and high-resolution SELDI-TOF ovarian cancer data. The results compare favorably with three state-of-the-art peak detection algorithms and four machine-learning algorithms. For the high-resolution ovarian cancer data set, our method found seven biomarkers (m/z windows) that achieved 97.30% and 99.10% accuracy at the 25th and 75th percentiles, respectively, over 50 independent cross-validation samples, significantly better than other profiling and dimensionality reduction methods. The results show that the method is capable of finding parsimonious sets of biologically meaningful biomarkers with better accuracy than existing methods. Supporting Information material and MATLAB/R scripts to implement the methods described in the article are available at:
© 2009 Wiley-VCH Verlag GmbH & Co. KGaA.
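As a rough illustration of the peak detection idea named in the abstract, the sketch below implements the standard smoothed non-linear energy operator (the Teager-Kaiser energy x[n]^2 - x[n-1]*x[n+1], convolved with a Bartlett window) followed by a simple threshold on local maxima. It is not the modified operator the paper proposes, whose details are not given here. The function names `sneo` and `detect_peaks`, the window length, and the mean-based threshold with its `scale` factor are assumptions chosen for illustration only.

```python
import numpy as np


def sneo(signal, window_len=5):
    """Standard smoothed non-linear energy operator (not the paper's modified variant)."""
    x = np.asarray(signal, dtype=float)
    # Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]; endpoints are left at 0.
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    # Bartlett (triangular) window with its zero endpoints dropped, normalised to sum to 1.
    w = np.bartlett(window_len + 2)[1:-1]
    w = w / w.sum()
    return np.convolve(psi, w, mode="same")


def detect_peaks(signal, window_len=5, scale=3.0):
    """Indices of local maxima of the SNEO output above scale * mean (assumed threshold rule)."""
    e = sneo(signal, window_len)
    threshold = scale * e.mean()
    # A point is a local maximum if it exceeds its left neighbour and is not below its right one.
    is_max = (e[1:-1] > e[:-2]) & (e[1:-1] >= e[2:])
    return np.where(is_max & (e[1:-1] > threshold))[0] + 1
```

On a real spectrum the intensities would typically be baseline-corrected first, and the returned indices mapped back to m/z values; both steps are outside this sketch.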
Original language: English
Pages (from-to): 4176-4191
Number of pages: 16
Issue number: 17
Early online date: 31 Aug 2009
Publication status: Published - Sept 2009
Externally published: Yes


  • Bioinformatics
  • Cancer diagnosis
  • Machine learning
  • MS
  • Peak detection
