Sparse approximation through boosting for learning large scale kernel machines

Ping SUN, Xin YAO

Research output: Journal Publications › Journal Article (refereed) › peer-review

35 Citations (Scopus)


Recently, sparse approximation has become a preferred method for learning large scale kernel machines. This technique attempts to represent the solution with only a subset of the original data points, also known as basis vectors, which are usually chosen one by one with a forward selection procedure based on some selection criterion. The computational complexity of several resultant algorithms scales as O(NM²) in time and O(NM) in memory, where N is the number of training points and M is the number of basis vectors as well as the number of forward selection steps. For some large scale data sets, to obtain a better solution, we are sometimes required to include more basis vectors, which means that M is not trivial in this situation. However, limited computational resources (e.g., memory) prevent us from including too many vectors. To handle this dilemma, we propose to add an ensemble of basis vectors, instead of only one, at each forward step. The proposed method, closely related to gradient boosting, can decrease the required number M of forward steps significantly and thus saves a large fraction of the computational cost. Numerical experiments on three large scale regression tasks and a classification problem demonstrate the effectiveness of the proposed approach. © 2006 IEEE.
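The idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it uses a simple correlation-with-residual selection criterion and a ridge refit, and the function names (`rbf`, `sparse_fit`) and parameters (`n_steps`, `k`, `lam`, `gamma`) are assumptions for the sketch. The only point it demonstrates is the paper's core move: adding k basis vectors per forward step instead of one, so fewer steps are needed.

```python
import numpy as np

def rbf(X, Z, gamma=1.0):
    # Gaussian RBF kernel between the rows of X and the rows of Z
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def sparse_fit(X, y, n_steps=3, k=5, lam=1e-3, gamma=1.0):
    """Greedy forward selection that adds an ensemble of k basis
    vectors per step, then refits the weights by regularized
    least squares on the enlarged basis."""
    K_all = rbf(X, X, gamma)   # N x N; illustration only -- genuinely
                               # large-scale code would avoid forming this
    basis, residual, w = [], y.copy(), None
    for _ in range(n_steps):
        # score each candidate column by |correlation| with the residual
        scores = np.abs(K_all @ residual)
        if basis:
            scores[basis] = -np.inf        # exclude chosen points
        new = np.argsort(scores)[-k:]      # take the k best at once
        basis.extend(new.tolist())
        # refit weights on all selected basis vectors (ridge solve)
        K = K_all[:, basis]
        w = np.linalg.solve(K.T @ K + lam * np.eye(len(basis)), K.T @ y)
        residual = y - K @ w
    return np.array(basis), w
```

With one basis vector per step (k=1), reaching M basis vectors costs M forward-selection sweeps; with k per step it costs M/k sweeps for the same basis size, which is the saving the abstract describes.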
Original language: English
Article number: 5451128
Pages (from-to): 883-894
Number of pages: 12
Journal: IEEE Transactions on Neural Networks
Issue number: 6
Early online date: 20 Apr 2010
Publication status: Published - Jun 2010
Externally published: Yes

Bibliographical note

The work of P. Sun was supported by an ORS Award. The work of X. Yao was supported by the EPSRC grant GR/T10671/01.


Keywords

  • Boosting
  • Forward selection
  • Kernel machines
  • Large scale data mining
  • Large scale problems
  • Sparsification


