Clustering is concerned with discovering interesting groupings of records in a database. Many algorithms have been developed to tackle clustering problems in a variety of application domains, and much of this effort has gone into effective algorithms for handling spatial data. These algorithms were originally developed to handle continuous-valued attributes, and distance functions such as the Euclidean distance are often used to measure the pairwise similarity or distance between records in order to determine their cluster memberships. Because such distance functions cannot be validly defined in non-Euclidean spaces, these algorithms cannot be used to handle databases that contain discrete-valued data. Since data in real-life databases are always described by a set of descriptive attributes, many of which are neither numerical nor inherently ordered, it is important to develop a clustering algorithm that can handle data mining tasks involving such attributes. In this paper, we propose an effective evolutionary clustering algorithm for this problem. For performance evaluation, we tested the proposed algorithm on several real data sets. Experimental results show that it outperforms existing algorithms commonly used for clustering discrete-valued data, and that its performance on mixed continuous- and discrete-valued data is also promising. © 2008 IEEE.
|Title of host publication|2008 IEEE Congress on Evolutionary Computation, CEC 2008|
|Published|Jun 2008|