Discretization of continuous-valued attributes in decision tree generation

Wen-Liang LI, Rui-Hua YU, Xi-Zhao WANG

Research output: Book Chapters | Papers in Conference Proceedings; Conference paper (refereed); peer-reviewed

7 Citations (Scopus)

Abstract

The decision tree is one of the most popular and widely used classification models in machine learning. The discretization of continuous-valued attributes plays an important role in decision tree generation. In this paper, we improve Fayyad's discretization method, which uses the average class entropy of candidate partitions to select boundaries for discretization. Our method further reduces the number of candidate boundaries. We also propose a generalized splitting criterion for cut point selection and prove that the cut points always lie on boundaries when this criterion is used. Along with the formal proof, we present empirical results showing that the decision trees generated using such criteria are similar on several datasets from the UCI Machine Learning Repository.
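As an illustration of the baseline idea the abstract refers to, the following is a minimal sketch of entropy-based cut point selection restricted to boundary points. It is not the authors' improved method or their generalized splitting criterion, neither of which is specified in the abstract. The function names (`class_entropy`, `average_class_entropy`, `boundary_cut_points`, `best_cut`) and the toy data are assumptions made for this example, and a boundary is taken here to be the midpoint between two adjacent distinct attribute values that are not both associated with one and the same single class.

```python
from collections import Counter
from math import log2


def class_entropy(labels):
    """Shannon entropy (in bits) of the class distribution in `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def average_class_entropy(left, right):
    """Size-weighted average of the class entropies of the two partitions."""
    n = len(left) + len(right)
    return (len(left) / n) * class_entropy(left) + (len(right) / n) * class_entropy(right)


def boundary_cut_points(values, labels):
    """Midpoints between adjacent distinct values where the class can change.

    Assumption for this sketch: a midpoint is skipped only when both
    neighbouring values carry exactly one class and it is the same class.
    """
    classes_at = {}
    for v, y in zip(values, labels):
        classes_at.setdefault(v, set()).add(y)
    distinct = sorted(classes_at)
    cuts = []
    for a, b in zip(distinct, distinct[1:]):
        same_pure_class = (
            len(classes_at[a]) == 1 and classes_at[a] == classes_at[b]
        )
        if not same_pure_class:
            cuts.append((a + b) / 2)
    return cuts


def best_cut(values, labels):
    """Return (cut point, average class entropy) minimizing the entropy,
    or None when there is no boundary to cut on."""
    best = None
    for t in boundary_cut_points(values, labels):
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        e = average_class_entropy(left, right)
        if best is None or e < best[1]:
            best = (t, e)
    return best


# Hypothetical toy data: one continuous attribute and a binary class.
values = [1, 2, 3, 4, 5, 6]
labels = ["a", "a", "a", "b", "b", "a"]
cuts = boundary_cut_points(values, labels)
best = best_cut(values, labels)
```

On this toy data only two of the five midpoints are boundaries, so only two candidate partitions need to be scored instead of five; this is the kind of saving that restricting the search to boundaries provides.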

Original language: English
Title of host publication: Proceedings: 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
Publisher: IEEE
Pages: 194-198
Number of pages: 5
ISBN (Electronic): 9781424465279
ISBN (Print): 9781424465262
DOIs
Publication status: Published - 2010
Externally published: Yes
Event: 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010 - Qingdao, China
Duration: 11 Jul 2010 to 14 Jul 2010

Publication series

Name: International Conference on Machine Learning and Cybernetics (ICMLC)
Publisher: IEEE
ISSN (Print): 2160-133X
ISSN (Electronic): 2160-1348

Conference

Conference: 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
Country/Territory: China
City: Qingdao
Period: 11/07/10 to 14/07/10

Bibliographical note

This research is supported by the National Natural Science Foundation of China (60903088, 60903089), by the Natural Science Foundation of Hebei Province (F2009000227, F2008000635), and by the Key Project Foundation of Applied Fundamental Research of Hebei Province (08963522D).

Keywords

  • Continuous-valued
  • Decision tree
  • Discretization
  • Splitting criterion
