Abstract
Data clustering is a good benchmark problem for testing the performance of many combinatorial optimization methods. However, little work has been done on applying estimation of distribution algorithms to data clustering. The purpose of this paper is to demonstrate the effectiveness of estimation of distribution algorithms for data clustering. In particular, a novel encoding strategy termed the Similarity Matrix Encoding strategy (SME) and a Virtual Population Based Incremental Learning algorithm using SME (VPBIL-SME) are proposed for clustering a set of unlabeled instances into groups. The effectiveness of VPBIL-SME is confirmed by experimental results on several real data sets.
Original language | English |
---|---|
Title of host publication | GECCO'08: Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation 2008 |
Pages | 471-472 |
Publication status | Published - 2008 |
Externally published | Yes |
Keywords
- Data clustering
- Similarity matrix encoding strategy
- Virtual population based incremental learning algorithm