Abstract
Semi-supervised clustering with instance-level constraints is one of the most active research topics in pattern recognition, machine learning and data mining. Several recent studies have shown that instance-level constraints can significantly increase the accuracy of a variety of clustering algorithms. However, instance-level constraints may split the search space of the optimal clustering solution into pieces, thereby significantly compounding the difficulty of the search task. This paper explores a genetic approach to the problem of semi-supervised clustering with instance-level constraints. In particular, a novel algorithm, termed the hybrid genetic-guided semi-supervised clustering algorithm with instance-level constraints (Cop-HGA), is proposed. Cop-HGA uses a hybrid genetic algorithm to search for a high-quality clustering solution that strikes a good balance between a predefined clustering criterion and the available instance-level background knowledge. The effectiveness of Cop-HGA is confirmed by experimental results on several real data sets with artificial instance-level constraints. Copyright 2008 ACM.
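The abstract does not state how a candidate clustering is scored. As an illustration only, one common way to balance a clustering criterion against instance-level constraints is a penalized cost that a genetic search could minimize. Everything in the sketch below is an assumption rather than the paper's method: the must-link/cannot-link form of the constraints, the within-cluster sum of squared errors as the criterion, the `penalty` weight, and all function names.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[int, int]
Point = Sequence[float]


def count_violations(labels: Sequence[int],
                     must_link: Sequence[Pair],
                     cannot_link: Sequence[Pair]) -> int:
    """Count constraints broken by a label assignment (hypothetical helper)."""
    # A must-link pair is violated when its two points get different labels.
    broken_ml = sum(1 for i, j in must_link if labels[i] != labels[j])
    # A cannot-link pair is violated when its two points share a label.
    broken_cl = sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return broken_ml + broken_cl


def within_cluster_sse(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Sum of squared distances from each point to its cluster centroid."""
    clusters: Dict[int, List[Point]] = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    sse = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        sse += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return sse


def penalized_cost(points: Sequence[Point],
                   labels: Sequence[int],
                   must_link: Sequence[Pair],
                   cannot_link: Sequence[Pair],
                   penalty: float = 1.0) -> float:
    """Assumed fitness: clustering criterion plus weighted constraint violations."""
    return (within_cluster_sse(points, labels)
            + penalty * count_violations(labels, must_link, cannot_link))


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
must_link = [(0, 1)]      # points 0 and 1 should share a cluster
cannot_link = [(0, 2)]    # points 0 and 2 should be separated
good = penalized_cost(points, [0, 0, 1, 1], must_link, cannot_link, penalty=10.0)
bad = penalized_cost(points, [0, 1, 0, 1], must_link, cannot_link, penalty=10.0)
```

In this toy example the assignment that respects both constraints also has the lower criterion value, so `good` is smaller than `bad`. A larger `penalty` pushes the search toward satisfying constraints; a smaller one favors the clustering criterion.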
| Original language | English |
|---|---|
| Title of host publication | GECCO'08: Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation 2008 |
| Editors | Maarten KEIJZER |
| Publisher | Association for Computing Machinery (ACM) |
| Pages | 1381-1388 |
| Number of pages | 8 |
| ISBN (Print) | 9781605581309 |
| Publication status | Published - Jul 2008 |
| Externally published | Yes |
| Event | 10th Annual Genetic and Evolutionary Computation Conference - Atlanta, United States. Duration: 12 Jul 2008 → 16 Jul 2008 |
Conference
| Conference | 10th Annual Genetic and Evolutionary Computation Conference |
|---|---|
| Abbreviated title | GECCO 2008 |
| Country/Territory | United States |
| City | Atlanta |
| Period | 12/07/08 → 16/07/08 |
Keywords
- Genetic algorithms
- Semi-supervised clustering