K-means clustering is a method for identifying subgroups (i.e. clusters) and their centers in a set of unlabelled data.

The K-means algorithm works as follows:

Suppose we are given unlabelled data and a number of clusters K.

- Randomly select K data points as the initial cluster centers.
- For each data point, identify its cluster, i.e. assign the point to the center it is closest to.
- Recompute each centroid as the average of all the points assigned to its cluster, then repeat the assignment and update steps until the clusters no longer change.
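
The steps above can be sketched from scratch with NumPy. This is only an illustrative sketch, not the SciPy routine used below: the function name `kmeans_sketch` and its `max_iter` and `seed` parameters are assumptions made for this example, and the choice to leave a center in place when no points are assigned to it is one of several possible conventions.

```python
import numpy as np

def kmeans_sketch(points, k, max_iter=100, seed=0):
    """Plain K-means (hypothetical helper): returns (centroids, labels)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick K distinct data points as the initial centers.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest center (Euclidean distance).
        diffs = points[:, None, :] - centroids[None, :, :]
        new_labels = np.linalg.norm(diffs, axis=2).argmin(axis=1)
        # Stop once the assignments no longer change.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 3: move each center to the mean of its assigned points
        # (assumption: a center with no points keeps its previous position).
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids, labels

# Two small, well-separated groups of points.
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
centroids, labels = kmeans_sketch(points, k=2)
print(centroids)
print(labels)
```

Which group receives label 0 depends on the random initial centers, so only the grouping itself is meaningful, not the label numbers.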

Implementation:

```
from numpy import vstack, array
from numpy.random import rand
from scipy.cluster.vq import kmeans, vq, whiten

# Generate 200 points with three features: two groups of 100, one shifted by 0.5
data = vstack((rand(100, 3) + array([.5, .5, .5]), rand(100, 3)))
# Whiten the data: scale each feature to unit variance
data1 = whiten(data)
# Compute K-means with K = 3 clusters; the second return value (distortion) is discarded
centroids, _ = kmeans(data1, 3)
print(centroids)
```

Output (the exact values vary between runs, since the data is random):

```
[[1.19175347 0.98152245 1.89185956]
 [2.70493124 2.64803832 2.62024853]
 [1.51076247 1.50606114 0.61546322]]
```
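
The `vq` function is imported above but not used. As a hedged follow-up sketch, it can assign each observation to its nearest centroid once the centroids are known. The variable names `labels` and `distances` and the seeded generator `rng` are assumptions introduced here for illustration; the data has the same shape as in the example above.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq, whiten

rng = np.random.default_rng(0)
# Two groups of 100 points with three features, one group shifted by 0.5.
data = np.vstack((rng.random((100, 3)) + 0.5, rng.random((100, 3))))
data1 = whiten(data)

# kmeans returns the centroids and the mean distance to the nearest centroid.
centroids, distortion = kmeans(data1, 3)

# vq returns, for each observation, the index of its nearest centroid
# and the distance to that centroid.
labels, distances = vq(data1, centroids)

# Number of observations assigned to each centroid.
print(np.bincount(labels, minlength=len(centroids)))
```

The cluster sizes printed here will differ between runs, because `kmeans` chooses its starting centroids at random.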