Minimum Sum-of-Squares Clustering
Minimum Sum-of-Squares Clustering (MSSC), also known as the k-means problem, partitions a set of data points into k clusters so as to minimize the sum of squared Euclidean distances between each point and the center (centroid) of its assigned cluster. Current research focuses on improving the scalability and solution quality of MSSC methods, particularly for large datasets ("big data") and for semi-supervised settings that incorporate prior knowledge such as must-link and cannot-link constraints. Approaches include novel parallel algorithms, semidefinite programming, memetic differential evolution, and hybrid schemes that combine several optimization strategies to improve both computational efficiency and solution quality. Because clustering is a core step in many data analysis and machine learning pipelines, advances in MSSC carry over to the efficiency and accuracy of those applications.
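To make the objective concrete, here is a minimal pure-Python sketch of the MSSC cost together with Lloyd's classic alternating heuristic (assign each point to its nearest center, then move each center to the mean of its points). This is only an illustration of the problem, not one of the parallel, semidefinite programming, or memetic methods mentioned above; the function names, the toy data, and the choice to leave an empty cluster's center in place are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points: Sequence[Point], centers: Sequence[Point]) -> List[int]:
    """Index of the nearest center for each point."""
    return [
        min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
        for p in points
    ]


def mssc_objective(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """MSSC cost: sum over points of the squared distance to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def update_centers(points, labels, centers) -> List[Point]:
    """Move each center to the mean of the points assigned to it."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            # Assumption for this sketch: an empty cluster keeps its old center.
            new_centers.append(old)
            continue
        new_centers.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centers


def lloyd(points, init_centers, max_iter: int = 100):
    """Alternate assignment and center updates until the centers stop changing."""
    centers = list(init_centers)
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = update_centers(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


# Toy data: two well-separated pairs of points in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, labels = lloyd(points, [(0.0, 0.0), (10.0, 10.0)])
cost = mssc_objective(points, centers)
```

Lloyd's heuristic only reaches a local minimum that depends on the initial centers, which is one reason global and hybrid optimization strategies are studied for MSSC.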