Projects per year
Abstract
K-Means algorithm is a popular clustering method. However, it has two limitations: 1) it gets stuck easily in spurious local minima, and 2) the number of clusters k has to be given a priori. To solve these two issues, a multi-prototypes convex merging based K-Means clustering algorithm (MCKM) is presented. First, based on the structure of the spurious local minima of the K-Means problem, a multi-prototypes sampling (MPS) is designed to select the appropriate number of multi-prototypes for data with arbitrary shapes. Then, a merging technique, called convex merging (CM), merges the multi-prototypes to get a better local minima without k being given a priori. Specifically, CM can obtain the optimal merging and estimate the correct k. By integrating these two techniques with K-Means algorithm, the proposed MCKM is an efficient and explainable clustering algorithm for escaping the undesirable local minima of K-Means problem without given k first. Two theoretical proofs are given to guarantee that the cost of MCKM (MPS+CM) can achieve a constant factor approximation to the optimal cost of the K-Means problem. Experimental results performed on synthetic and real-world data sets have verified the effectiveness of the proposed algorithm.
Original language | English |
---|---|
Pages (from-to) | 6653-6666 |
Number of pages | 14 |
Journal | IEEE Transactions on Knowledge and Data Engineering |
Volume | 36 |
Issue number | 11 |
Early online date | 13 Dec 2023 |
DOIs | |
Publication status | Published - Nov 2024 |
Externally published | Yes |
Bibliographical note
Publisher Copyright:© 2024 IEEE.
Funding
This work was supported by the National Natural Science Foundation of China under Grants No. 61772020; HKRGC Grants Nos. CUHK14301718, CityU11301120, and C1013-21GF; and CityU Grant 9380101.
Keywords
- Convex merging
- K-means
- multi-prototypes
- multi-prototypes sampling
Fingerprint
Dive into the research topics of 'Multi-Prototypes Convex Merging Based K-Means Clustering Algorithm'. Together they form a unique fingerprint.Projects
- 1 Finished
-
Novel Computational Methods for Three-dimensional Point Source Localization based on Point Spread Function Analytics (基於點擴散函數的三維點光源定位新型算法)
CHAN, R. (PI), WANG, C. (CoI), PRASAD, S. (CoI) & PLEMMONS, R. J. (CoI)
Research Grants Council (HKSAR)
1/01/21 → 30/06/24
Project: Grant Research