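Wine data is clustered using k-means, an unsupervised learning technique. The observations are grouped into 3 clusters and displayed on a scatter plot. The analysis uses columns V2 and V3 of the file; assuming the standard UCI wine data layout (class label in V1), these are alcohol content and malic acid level, which is consistent with the ranges of the cluster centers printed below.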
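# Load the wine data; the file has no header row, so columns are named V1, V2, ...
data <- read.csv("wine.csv", header = FALSE)
# Keep only the two columns to cluster on
df <- data.frame(data$V2, data$V3)
# k-means with 3 clusters; initial centers are random, so cluster numbering can vary between runs
cl <- kmeans(df, 3)
cl$centers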
##    data.V2  data.V3
## 1 12.21349 1.653175
## 2 13.71538 1.799692
## 3 13.06320 3.894800
table(cl$cluster)
##
##  1  2  3
## 63 65 50
plot(df$data.V2, df$data.V3, col = cl$cluster,
     xlab = "Alcohol", ylab = "Malic Acid", main = "Malic Acid vs Alcohol Clustering")
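Because `kmeans` starts from random centers, the cluster numbering and sizes above can change from run to run. The following is a minimal sketch of one way to make the result repeatable, not part of the original analysis: it fixes the seed with `set.seed` and uses the `nstart` argument of `kmeans` to keep the best of several random starts. The helper name `cluster_pair`, the seed value, and the simulated stand-in data are all assumptions made so the example runs without `wine.csv`; with the real file, `data$V2` and `data$V3` would be passed instead.

```r
# Hypothetical helper (not in the original analysis): k-means on two numeric
# columns with a fixed seed and several random starts.
cluster_pair <- function(x, y, k = 3, seed = 42, nstart = 25) {
  set.seed(seed)                              # fix the random initial centers
  df <- data.frame(x = x, y = y)
  kmeans(df, centers = k, nstart = nstart)    # keep the best of nstart runs
}

# Simulated stand-in data (NOT the wine data), three loose groups of 20 points.
set.seed(1)
x <- c(rnorm(20, 12, 0.3), rnorm(20, 13, 0.3), rnorm(20, 14, 0.3))
y <- c(rnorm(20, 1.5, 0.2), rnorm(20, 3.5, 0.2), rnorm(20, 2.0, 0.2))

cl_a <- cluster_pair(x, y)
cl_b <- cluster_pair(x, y)

# With the same seed and data, both runs give the same assignment.
same_result <- identical(cl_a$cluster, cl_b$cluster)

# Scatter plot colored by cluster, with the fitted centers marked by stars.
plot(x, y, col = cl_a$cluster, xlab = "x", ylab = "y")
points(cl_a$centers, pch = 8, cex = 2)
```

Marking the centers on the plot makes it easier to relate the printed `cl$centers` table to the colored groups.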