Segment Based Decision Tree Induction with Continuous Valued Attributes

Ran WANG, Sam KWONG, Xi-Zhao WANG, Qingshan JIANG

Research output: Journal Publications / Journal Article (refereed) / peer-review

58 Citations (Scopus)

Abstract

A key issue in decision tree (DT) induction with continuous valued attributes is to design an effective strategy for splitting nodes. The traditional approach is to adopt the candidate cut point (CCP) with the highest discriminative ability, as evaluated by frequency based heuristic measures. However, such methods ignore the class permutation of examples in the node and cannot distinguish CCPs with the same or similar frequency information, and thus may fail to induce a better and smaller tree. In this paper, a new concept, the segment of examples, is proposed to differentiate CCPs with the same frequency information. A new hybrid scheme that combines the two heuristic measures, frequency and segment, is then developed for splitting DT nodes. The relationship between frequency and the expected number of segments, which is regarded as a random variable, is also given. Experimental comparisons demonstrate that the proposed scheme not only improves generalization capability but also reduces the size of the tree.
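The idea of separating CCPs that share the same frequency information can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not define a segment or the hybrid scheme precisely, so the sketch assumes that a segment is a maximal run of consecutive same-class examples after sorting by the attribute, uses information gain as the frequency based measure, and breaks ties in favour of fewer segments. The names `score_cut` and `pick_cut`, the toy attributes `x1` and `x2`, and the cut value 4.5 are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def count_segments(labels):
    """Number of maximal runs of identical consecutive labels (assumed definition)."""
    return sum(1 for i, y in enumerate(labels) if i == 0 or y != labels[i - 1])


def score_cut(values, labels, cut):
    """Return (information gain, total segments in both children) for one cut."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    left = [y for v, y in pairs if v <= cut]
    right = [y for v, y in pairs if v > cut]
    n = len(pairs)
    children = sum(len(s) / n * entropy(s) for s in (left, right) if s)
    gain = entropy([y for _, y in pairs]) - children
    return gain, count_segments(left) + count_segments(right)


def pick_cut(candidates):
    """Pick the (name, gain, segments) tuple with the highest gain,
    breaking ties by the smaller segment count (assumed tie-break rule)."""
    return min(candidates, key=lambda c: (-round(c[1], 9), c[2]))


# Toy node: eight examples, two hypothetical continuous attributes.
y = ["A", "A", "A", "B", "A", "B", "B", "B"]
x1 = [1, 2, 3, 4, 5, 6, 7, 8]   # sorted by x1 the labels read A A A B | A B B B
x2 = [1, 3, 4, 2, 6, 5, 7, 8]   # sorted by x2 the labels read A B A A | B A B B

# Both cuts produce children with class counts (3A, 1B) and (1A, 3B),
# so a purely frequency based measure cannot tell them apart.
g1, s1 = score_cut(x1, y, 4.5)
g2, s2 = score_cut(x2, y, 4.5)
chosen = pick_cut([("x1", g1, s1), ("x2", g2, s2)])
```

In this toy node the two cuts have equal gain, but the children under `x1` contain fewer class runs than those under `x2`, so the tie-break selects `x1`. How the published scheme actually weighs frequency against segments is described in the paper itself, not here.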

Original language: English
Article number: 6912950
Pages (from-to): 1262-1275
Number of pages: 14
Journal: IEEE Transactions on Cybernetics
Volume: 45
Issue number: 7
Early online date: 29 Sept 2014
Publication status: Published - Jul 2015
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2013 IEEE.

Funding

This work was supported by the National Natural Science Foundation of China under Grant 61272289, Grant 61175123, and Grant 61170040. This paper was recommended by Associate Editor J. Basak.

Keywords

  • Classification
  • continuous valued attributes
  • decision tree (DT) induction
  • segment
