Paper ID: 2412.18760
Data clustering: an essential technique in data science
Wong Hauchi, Daniil Lisik, Tai Dinh
This paper provides a comprehensive exploration of data clustering, emphasizing its methodologies and applications across different fields. Traditional techniques, including partitional and hierarchical clustering, are discussed alongside other approaches such as data stream, subspace and network clustering, highlighting their role in addressing complex, high-dimensional datasets. The paper also reviews the foundational principles of clustering, introduces common tools and methods, and examines its diverse applications in data science. Finally, the discussion concludes with insights into future directions, underscoring the centrality of clustering in driving innovation and enabling data-driven decision making.
Submitted: Dec 25, 2024