Samir Brahim Belhaouari

Associate Professor, Hamad Bin Khalifa University

sbelhaouari [AT] hbku.edu.qa

Divide Well to Merge Better: A Novel Clustering Algorithm

An Enhanced Non-Parametric Approach for Optimized Clustering

Abstract. In this paper, we propose a novel non-parametric clustering algorithm based on a divide-and-merge strategy. The Division phase optimizes the number of sub-clusters using an enhanced K-means algorithm, calculating variance reductions to determine optimal splits: \(D_{c_j}(k) = \left\{ \frac{|V_{c_j}(k+1) - V_{c_j}(k)|}{V_{c_j}(k)} : k = 1, 2, \ldots, l - 1 \right\}\) and \(O_j = \arg\max_{j} \left\{ k : D_{c_j}(k) \ge T_{\text{init}} \right\}\). The Merging phase leverages Gaussian Mixture Models to estimate joint probability densities of projected data points, \(f_{C_J}(x) = \sum_{i=1}^{G} \frac{w_i}{\sigma_i \sqrt{2\pi}} e^{-\frac{(x - \mu_i)^2}{2\sigma_i^2}}\) with \(\sum_{i=1}^{G} w_i = 1\), and calculates the overlap region for merging decisions. Extensive evaluation on 20 benchmark datasets, including synthetic and real data, demonstrates the algorithm’s superiority in discovering clusters of varying shapes and densities, outperforming state-of-the-art methods.
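The two quantities in the abstract can be illustrated with a minimal, standard-library sketch. It is not the paper's implementation, and it rests on several assumptions: the variances \(V_{c_j}(k)\) are supplied as a precomputed list, \(O_j\) is read as the largest \(k\) whose relative variance change reaches \(T_{\text{init}}\), and the overlap region is measured as the integral of the pointwise minimum of two one-dimensional mixture densities. The function names `relative_variance_changes`, `choose_split`, `mixture_pdf`, and `overlap` are hypothetical.

```python
import math


def relative_variance_changes(variances):
    """Return D(k) = |V(k+1) - V(k)| / V(k) for k = 1..l-1.

    variances[0] holds V(1), variances[1] holds V(2), and so on.
    """
    return [abs(v_next - v) / v for v, v_next in zip(variances, variances[1:])]


def choose_split(variances, t_init):
    """Largest k with D(k) >= t_init (assumed reading of O_j); 1 if none."""
    changes = relative_variance_changes(variances)
    qualifying = [k for k, d in enumerate(changes, start=1) if d >= t_init]
    return max(qualifying) if qualifying else 1


def mixture_pdf(x, components):
    """Density of a 1-D Gaussian mixture given as (w_i, mu_i, sigma_i) triples.

    The weights are expected to sum to 1, as in the abstract.
    """
    return sum(
        w / (s * math.sqrt(2 * math.pi)) * math.exp(-((x - m) ** 2) / (2 * s * s))
        for w, m, s in components
    )


def overlap(mix_a, mix_b, lo, hi, steps=2000):
    """Midpoint-rule estimate of the integral of min(f_a, f_b) over [lo, hi].

    Assumption: the overlap region is summarized by this single area,
    which is 1 for identical densities and near 0 for well-separated ones.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += min(mixture_pdf(x, mix_a), mixture_pdf(x, mix_b))
    return total * h


if __name__ == "__main__":
    # Hypothetical variances V(1)..V(4) for one cluster.
    v = [10.0, 4.0, 3.5, 3.4]
    print(relative_variance_changes(v))
    print(choose_split(v, t_init=0.1))

    # Two single-component mixtures projected onto one axis.
    a = [(1.0, 0.0, 1.0)]
    b = [(1.0, 2.0, 1.0)]
    print(overlap(a, b, lo=-10.0, hi=12.0))
```

Under these assumptions, a merging rule could compare the returned overlap with a user-chosen threshold. The abstract does not specify the threshold or the exact overlap measure, so both are left open here.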
    

Illustration of the proposed experimental design.