- In “supervised learning” each observation in the data set already has a known label (the target), and the model learns to predict that label from the features.
- In “unsupervised learning” there are no predetermined labels. The goal is to discover structure, such as groups, that exists in the data.
- Clustering is an unsupervised task in which observations are grouped into clusters so that data points in the same cluster have similar feature values.
- K-means is the clustering algorithm used here to group the data points: it assigns each point to the cluster whose centroid (mean) is nearest.
We will use the built-in iris dataset as a running example throughout this presentation.
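To make the idea concrete, here is a minimal sketch of k-means in plain Python. It rests on several assumptions that are not part of the presentation: nine rows of iris (petal length and petal width only) are hard-coded instead of loading the built-in dataset, the function name `kmeans` is hypothetical, and the starting centroids are picked by hand rather than at random. It is meant to illustrate the assign-then-update loop, not to replace a library implementation.

```python
import math

# Assumption: nine rows of iris, (petal length, petal width) in cm,
# hard-coded here instead of loading the full built-in dataset.
points = [
    (1.4, 0.2), (1.4, 0.2), (1.3, 0.2),  # setosa
    (4.7, 1.4), (4.5, 1.5), (4.9, 1.5),  # versicolor
    (6.0, 2.5), (5.1, 1.9), (5.9, 2.1),  # virginica
]


def kmeans(points, initial_centroids, max_iter=100):
    """Hypothetical helper: basic k-means with fixed starting centroids."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Assumption: one hand-picked starting centroid per species, for reproducibility.
labels, centroids = kmeans(points, [points[0], points[3], points[6]])
print(labels)
print(centroids)
```

Note that the species names are never passed to `kmeans`; the grouping comes from the measurements alone, which is what makes the task unsupervised. The resulting clusters therefore need not match the species exactly.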