Probabilistic and Graphical Model based Genetic Algorithm Driven Clustering with Instance-level Constraints

Yi HONG, Sam KWONG, Hanli WANG, Qingsheng REN, Yuchou CHANG

Research output: Book Chapters | Papers in Conference Proceedings; Conference paper (refereed); Research; peer-review

8 Citations (Scopus)

Abstract

Clustering is traditionally viewed as an unsupervised method for data analysis. However, several recent studies have shown that a limited amount of prior instance-level knowledge can significantly improve the performance of clustering algorithms. This paper proposes a semi-supervised clustering algorithm termed Probabilistic and Graphical Model based Genetic Algorithm Driven Clustering with Instance-level Constraints (Cop-CGA). In Cop-CGA, all prior knowledge about pairs of instances that should or should not be placed in the same group is represented as a graph, and candidate clustering solutions are sampled from this graph by assigning instances to a fixed number of groups in different orders. We illustrate how to design Cop-CGA so that all candidate solutions are guaranteed to satisfy the given constraints, and we demonstrate the usefulness of background knowledge for genetic algorithm driven clustering through experiments on several real data sets with artificial hard constraints. One advantage of Cop-CGA is that both positive and negative instance-level constraints can be easily incorporated. Moreover, the performance of Cop-CGA is not sensitive to the order in which instances are assigned to groups. © 2008 IEEE.
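The idea of sampling candidate solutions that always respect must-link and cannot-link constraints can be illustrated with a small sketch. The code below is not the Cop-CGA procedure from the paper, whose details the abstract does not give; it is a simplified stand-in that merges must-linked instances with union-find and then assigns the merged components to groups in a random order, skipping groups blocked by a cannot-link. All function names (`must_link_components`, `sample_assignment`, `satisfies`) and the toy constraints are hypothetical.

```python
import random


def must_link_components(n, must_link):
    """Merge must-linked instances with union-find; return a component id per instance."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in must_link:
        parent[find(a)] = find(b)
    return [find(i) for i in range(n)]


def sample_assignment(n, k, must_link, cannot_link, rng):
    """Sample one labeling into k groups that satisfies every constraint.

    Returns None when the random order runs into a dead end, so the caller
    can simply resample. This greedy scheme is an assumption of the sketch,
    not the sampling rule used in the paper.
    """
    comp = must_link_components(n, must_link)
    conflicts = {c: set() for c in sorted(set(comp))}
    for a, b in cannot_link:
        ca, cb = comp[a], comp[b]
        if ca == cb:
            raise ValueError("constraints are contradictory")
        conflicts[ca].add(cb)
        conflicts[cb].add(ca)

    order = list(conflicts)
    rng.shuffle(order)  # a different assignment order per sample
    group_of = {}
    for c in order:
        blocked = {group_of[d] for d in conflicts[c] if d in group_of}
        allowed = [g for g in range(k) if g not in blocked]
        if not allowed:
            return None
        group_of[c] = rng.choice(allowed)
    return [group_of[comp[i]] for i in range(n)]


def satisfies(labels, must_link, cannot_link):
    """Check a labeling against both kinds of instance-level constraints."""
    return (all(labels[a] == labels[b] for a, b in must_link)
            and all(labels[a] != labels[b] for a, b in cannot_link))


if __name__ == "__main__":
    ml = [(0, 1), (2, 3)]   # toy must-link pairs
    cl = [(1, 2), (4, 5)]   # toy cannot-link pairs
    labels = sample_assignment(6, 3, ml, cl, random.Random(0))
    print(labels, satisfies(labels, ml, cl))
```

In a genetic algorithm setting, a sampler of this kind could supply the initial population or new offspring, so that every individual is feasible by construction and no repair step is needed.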
Original language: English
Title of host publication: 2008 IEEE Congress on Evolutionary Computation, CEC 2008
Pages: 322-329
Publication status: Published - 2008
Externally published: Yes
