Assignment 12

Part 1

For the “iris” dataset within R:

Perform k-means clustering on the scaled dataset, using the elbow method to find the appropriate number of clusters.
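
The code in this assignment refers to an object named data1 that is never defined in the text. A minimal sketch of one way to build it, assuming data1 holds the four numeric columns of iris standardized with scale() (an assumption, although it appears consistent with the printed output):

```r
# Assumption: data1 is the scaled numeric part of iris (Species dropped).
data1 <- scale(iris[, 1:4])

# After scaling, each column has mean 0 and standard deviation 1.
round(colMeans(data1), 10)
apply(data1, 2, sd)
```

Scaling matters here because k-means uses Euclidean distance, so a variable measured on a larger scale would otherwise dominate the clustering.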

# Total within-cluster sum of squares for k = 2, ..., k.max
k.max <- 15
wss1 <- sapply(2:k.max,
               function(k) kmeans(data1, k, nstart = 50, iter.max = 15)$tot.withinss)
wss1

##  [1] 220.87929 138.88836 113.33162  90.20190  79.46523  70.18758  62.06894
##  [8]  54.68037  46.92474  42.42621  39.95521  36.27638  34.17968  32.70308

plot(2:k.max, wss1,
     type = "b", pch = 19, frame = TRUE,
     xlab = "Number of clusters K",
     ylab = "Total within-clusters sum of squares")
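
The plot above starts at k = 2, so the drop from one cluster to two is not visible. A hedged sketch that also includes k = 1 and lists the reduction gained by each extra cluster, again assuming data1 is scale(iris[, 1:4]); the names wss_full and drops are illustrative:

```r
# Assumption: data1 is the scaled numeric part of iris.
data1 <- scale(iris[, 1:4])

set.seed(1234)
k.max <- 15
# Total within-cluster sum of squares for k = 1, ..., k.max.
# For k = 1 on scaled data this equals (n - 1) * p = 149 * 4 = 596.
wss_full <- sapply(1:k.max, function(k) {
  kmeans(data1, centers = k, nstart = 50, iter.max = 15)$tot.withinss
})

# Reduction in WSS gained by each additional cluster; the elbow is where
# these reductions become small relative to the earlier ones.
drops <- -diff(wss_full)
data.frame(k = 2:k.max, drop = round(drops, 2))
```

Reading the elbow from the size of successive drops is a heuristic, so it is best used alongside the plot and not as a replacement for it.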

Plot the clusters obtained from the k-means clustering algorithm.

set.seed(1234)  # fix the random starts so the cluster labels are reproducible

kc1 <- kmeans(data1, centers = 3, nstart = 30)
table(kc1$cluster)

## 
##  1  2  3 
## 50 53 47

library(ggfortify)  # assumed source of autoplot() for kmeans objects
autoplot(kc1, data = data1, frame = TRUE)
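
For readers without the package that supplies autoplot() for kmeans objects (assumed here to be ggfortify), a rough base-R equivalent is to project the data onto the first two principal components and colour the points by cluster. This is a sketch under the assumption that data1 is scale(iris[, 1:4]), not the assignment's own plotting code; scores and centers_pc are illustrative names:

```r
# Assumption: data1 is the scaled numeric part of iris.
data1 <- scale(iris[, 1:4])
set.seed(1234)
kc1 <- kmeans(data1, centers = 3, nstart = 30)

# Project onto the first two principal components (data already scaled).
pca <- prcomp(data1)
scores <- pca$x[, 1:2]

plot(scores, col = kc1$cluster, pch = 19,
     xlab = "PC1", ylab = "PC2",
     main = "k-means clusters (k = 3) on the first two PCs")

# Mark the cluster centers in the same PC coordinates.
centers_pc <- predict(pca, kc1$centers)[, 1:2]
points(centers_pc, pch = 4, cex = 2, lwd = 2)
```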

Perform hierarchical clustering on the scaled dataset using either average or ward.D2 linkage and plot the dendrogram. Also, make a heatmap annotated with the clustering results.

# Hierarchical clustering with average linkage on Euclidean distances
d.iris <- dist(scale(data1))
hc_avg1 <- hclust(d.iris, method = "average")
plot(hc_avg1)
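
To turn the dendrogram into cluster labels, the tree can be cut into a chosen number of groups. A minimal sketch with cutree() and rect.hclust(), assuming data1 is scale(iris[, 1:4]) and choosing k = 3 to match the k-means step (the choice of 3 and the name hc_groups are illustrative):

```r
# Assumption: data1 is the scaled numeric part of iris.
data1 <- scale(iris[, 1:4])
d.iris <- dist(data1)
hc_avg1 <- hclust(d.iris, method = "average")

# Cut the tree into 3 groups and count the members of each.
hc_groups <- cutree(hc_avg1, k = 3)
table(hc_groups)

# Redraw the dendrogram without leaf labels and outline the 3 groups.
plot(hc_avg1, labels = FALSE, hang = -1)
rect.hclust(hc_avg1, k = 3, border = 2:4)
```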

# Heatmap
library(pheatmap)
pheatmap(data1, scale = "row")
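
The task also asks for the heatmap to be annotated with the clustering results, which the call above does not do. One possible sketch uses pheatmap's annotation_row argument with labels from cutree(). It assumes data1 is scale(iris[, 1:4]); the row names, k = 3, and the names ann and hc_cluster are illustrative choices. Because the columns are already scaled, no additional row scaling is applied here, which differs from the call above.

```r
# Assumption: data1 is the scaled numeric part of iris.
data1 <- scale(iris[, 1:4])
# pheatmap matches annotations to rows by name, so give the rows names.
rownames(data1) <- paste0("obs", seq_len(nrow(data1)))

hc_avg1 <- hclust(dist(data1), method = "average")
ann <- data.frame(
  hc_cluster = factor(cutree(hc_avg1, k = 3)),
  row.names = rownames(data1)
)

# Guarded so the sketch still runs where pheatmap is not installed.
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(data1,
                     clustering_method = "average",
                     annotation_row = ann,
                     show_rownames = FALSE)
}
```

Setting clustering_method = "average" keeps the row dendrogram drawn by pheatmap in line with the average-linkage tree used for the annotation.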

Eliana Almazan

2023-04-05