Clustering Analysis, Part II: K-Means Clustering
Monday, December 21st, 2009In this continuing series, we explore the NMath Stats functions for performing cluster analysis in .NET. (For previous posts, see here.) The sample data set we’re using classifies 89 single malt scotch whiskies on a five-point scale (0-4) for 12 flavor characteristics. To visualize the data set and clusterings, we make use of the free Microsoft Chart Controls for .NET, which provide a basic set of charts.
In this post, we’ll cluster the scotches based on “similarity” in the original 12-dimensional flavor space using k-means clustering. The k-means clustering method assigns data points into k groups such that the sum of squares from points to the computed cluster centers is minimized. In NMath Stats, class KMeansClustering performs k-means clustering.
(more…)
