Part 1

For the “iris” dataset within R:

  1. Perform k-means clustering on the scaled dataset, using the elbow method to find the appropriate number of clusters.

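# data1 is not defined in this excerpt; scaled numeric iris columns are assumed
data1 <- scale(iris[, 1:4])
# Total within-cluster sum of squares for k = 2, ..., k.max
k.max <- 15
wss1 <- sapply(2:k.max,
               function(k) kmeans(data1, k, nstart = 50, iter.max = 15)$tot.withinss)
wss1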
##  [1] 220.87929 138.88836 113.33162  90.20190  79.46523  70.18758  62.06894
##  [8]  54.68037  46.92474  42.42621  39.95521  36.27638  34.17968  32.70308
plot(2:k.max, wss1,
     type = "b", pch = 19, frame = TRUE,
     xlab = "Number of clusters K",
     ylab = "Total within-cluster sum of squares")
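
The elbow can also be read numerically from the values printed above. The following is a minimal sketch, not part of the original analysis: it copies the printed values and computes how much the total within-cluster sum of squares falls each time one more cluster is added. The names `wss_printed` and `drops` are illustrative.

```r
# WSS values copied from the printed output (k = 2, ..., 15)
wss_printed <- c(220.87929, 138.88836, 113.33162, 90.20190, 79.46523,
                 70.18758, 62.06894, 54.68037, 46.92474, 42.42621,
                 39.95521, 36.27638, 34.17968, 32.70308)
k <- 2:15

# Reduction in WSS gained by moving from k - 1 to k clusters
drops <- data.frame(k = k[-1], decrease = -diff(wss_printed))
print(drops)
```

Within the range computed here, the decrease from k = 2 to k = 3 is about 82, while every later decrease is about 26 or smaller, so the curve flattens after three clusters.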

  2. Plot the clusters obtained from the k-means clustering algorithm.

set.seed(1234)

kc1 <- kmeans(data1, centers = 3, nstart = 30)
table(kc1$cluster)
## 
##  1  2  3 
## 50 53 47
library(ggfortify)  # provides autoplot() for kmeans objects
autoplot(kc1, data = data1, frame = TRUE)
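
For readers without ggfortify, a rough base-R alternative is sketched here. It assumes that `data1` is the scaled numeric part of `iris` and that a plot on the first two principal components is an acceptable stand-in for the `autoplot()` figure. The name `pc` and the cross-tabulation against species are illustrative additions.

```r
# Assumption: data1 is the scaled numeric part of iris
data1 <- scale(iris[, 1:4])

set.seed(1234)
kc1 <- kmeans(data1, centers = 3, nstart = 30)

# Project onto the first two principal components and colour by cluster
pc <- prcomp(data1)
plot(pc$x[, 1:2], col = kc1$cluster, pch = 19,
     xlab = "PC1", ylab = "PC2")

# Cross-tabulate clusters against the known species labels
table(cluster = kc1$cluster, species = iris$Species)
```

The cross-tabulation gives a quick check of how closely the three clusters follow the three species.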

  3. Perform hierarchical clustering on the scaled dataset using either average or ward.D2 linkage, and plot the dendrogram. Also make a heatmap annotated with the clustering results.

# Hierarchical clustering with average linkage
d.iris <- dist(scale(data1))  # Euclidean distances on the scaled data
hc_avg1 <- hclust(d.iris, method = "average")
plot(hc_avg1)

# Heatmap
library(pheatmap)
pheatmap(data1, scale = "row")
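
The task also asks for the heatmap to be annotated with the clustering results. One possible way to do that, sketched here as an assumption rather than as the method used in this write-up, is to cut the dendrogram into groups with `cutree()` and pass the labels to `pheatmap()` through `annotation_row`. The choice of three groups and the names `hc_groups` and `row_annot` are illustrative.

```r
# Assumption: data1 is the scaled numeric part of iris
data1 <- scale(iris[, 1:4])
# pheatmap matches annotations to rows by name, so give the rows names
rownames(data1) <- as.character(seq_len(nrow(data1)))

hc_avg1 <- hclust(dist(data1), method = "average")

# Cut the tree into 3 groups (an assumed choice, matching the k-means step)
hc_groups <- cutree(hc_avg1, k = 3)
row_annot <- data.frame(cluster = factor(hc_groups),
                        row.names = rownames(data1))

# Draw the heatmap only if pheatmap is installed
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(data1,
                     cluster_rows = hc_avg1,      # reuse the dendrogram
                     annotation_row = row_annot,  # colour bar of cluster labels
                     show_rownames = FALSE)
}
```

Passing the `hclust` object as `cluster_rows` keeps the row order of the heatmap consistent with the dendrogram plotted earlier.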