---
title: "K-means clustering"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
    social: menu
    source_code: embed
    self_contained: true
    vertical_layout: fill
    theme: flatly
---
```{r setup, include=FALSE}
library(flexdashboard)
library(factoextra)
library(cluster)
library(readxl)
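# Data preparation: read the sheet, drop incomplete rows,
# and keep columns 2 to 9 as the clustering variables
water <- read_excel("water.xlsx", range = "b1:j16")
water <- na.omit(water)
# Convert to a plain data frame, since setting row names on a tibble is deprecated
water_data <- as.data.frame(water[2:9])
rownames(water_data) <- water$Code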
```
# Sidebar {.sidebar}
**Region**

- `KCN`: Kachin
- `KYH`: Kayah
- `KYN`: Kayin
- `CHN`: Chin
- `SGG`: Sagaing
- `TNI`: Tanintharyi
- `BGO`: Bago
- `MGY`: Magway
- `MDY`: Mandalay
- `MON`: Mon
- `RKE`: Rakhine
- `YGN`: Yangon
- `SHN`: Shan
- `AYY`: Ayeyarwady
- `NPW`: Nay Pyi Taw
# Descriptive Statistics {.small}
## Row
### Average Silhouette Method
```{r}
fviz_nbclust(water_data, FUN = hcut, method = 'silhouette') +
  theme_minimal() +
  theme(
    panel.background = element_rect(fill = "#f0f8ff", color = NA),
    plot.background = element_rect(fill = "#f0f8ff", color = NA)
  )
```
### Gap Statistic Method
```{r}
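# The gap statistic relies on bootstrap resampling, so fix the seed for a reproducible plot
set.seed(123)
fviz_nbclust(water_data, FUN = hcut, method = 'gap_stat') +
  theme_minimal() +
  theme(
    panel.background = element_rect(fill = "#f0f8ff", color = NA),
    plot.background = element_rect(fill = "#f0f8ff", color = NA)
  )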
```
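### Average Silhouette Width by Hand

As a rough cross-check of the silhouette plot, the average silhouette width for a given number of clusters can be computed directly with `silhouette()` from `cluster`. This is a minimal sketch: `avg_sil_width()` is a hypothetical helper, and the Euclidean distance with `ward.D2` linkage is an assumption meant to mirror the default settings of `hcut()`, so the values may differ from the plot if those settings differ.

```{r}
library(cluster)

# Hypothetical helper: average silhouette width of a k-cluster cut of a
# hierarchical tree (Euclidean distance and Ward linkage are assumed here).
avg_sil_width <- function(data, k) {
  d <- dist(data, method = "euclidean")
  tree <- hclust(d, method = "ward.D2")
  sil <- silhouette(cutree(tree, k = k), d)
  mean(sil[, "sil_width"])
}

# Only runs when the setup chunk has created water_data
if (exists("water_data")) {
  ks <- 2:5
  widths <- sapply(ks, function(k) avg_sil_width(water_data, k))
  print(round(setNames(widths, paste0("k=", ks)), 3))
}
```

The number of clusters with the largest average width is the one the silhouette method would suggest.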
# Hierarchical Clustering
### Agglomerative Hierarchical Clustering with "euclidean distance" {data-width="500"}
#### Agglomerative Coefficient (Single Linkage) {.tiny}
```{r}
hc2_single_e <- agnes(water_data, metric = 'euclidean', method = 'single')
hc2_single_e$ac
```
#### Agglomerative Coefficient (Complete Linkage) {.tiny}
```{r}
hc2_complete_e <- agnes(water_data, metric = 'euclidean', method = 'complete')
hc2_complete_e$ac
```
#### Agglomerative Coefficient (Average Linkage) {.tiny}
```{r}
hc2_average_e <- agnes(water_data, metric = 'euclidean', method = 'average')
hc2_average_e$ac
```
#### Agglomerative Coefficient (Ward Linkage) {.tiny}
```{r}
hc2_ward_e <- agnes(water_data, metric = 'euclidean', method = 'ward')
hc2_ward_e$ac
```
### Agglomerative Hierarchical Clustering with "manhattan distance" {data-width="520"}
#### Agglomerative Coefficient (Single Linkage) {.tiny}
```{r}
hc2_single_m <- agnes(water_data, metric = 'manhattan', method = 'single')
hc2_single_m$ac
```
#### Agglomerative Coefficient (Complete Linkage) {.tiny}
```{r}
hc2_complete_m <- agnes(water_data, metric = 'manhattan', method = 'complete')
hc2_complete_m$ac
```
#### Agglomerative Coefficient (Average Linkage) {.tiny}
```{r}
hc2_average_m <- agnes(water_data, metric = 'manhattan', method = 'average')
hc2_average_m$ac
```
#### Agglomerative Coefficient (Ward Linkage) {.tiny}
```{r}
hc2_ward_m <- agnes(water_data, metric = 'manhattan', method = 'ward')
hc2_ward_m$ac
```
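### Agglomerative Coefficients Compared

The eight coefficients above are easier to compare side by side. The chunk below is a minimal sketch that assumes the `water_data` object from the setup chunk; `linkage_ac()` is a hypothetical helper, not a function from `cluster`. Values closer to 1 indicate a stronger clustering structure.

```{r}
library(cluster)

# Hypothetical helper: agglomerative coefficient for each linkage method
# under one distance metric.
linkage_ac <- function(data, metric = "euclidean",
                       methods = c("single", "complete", "average", "ward")) {
  vapply(
    methods,
    function(m) agnes(data, metric = metric, method = m)$ac,
    numeric(1)
  )
}

# Only runs when the setup chunk has created water_data
if (exists("water_data")) {
  ac_table <- rbind(
    euclidean = linkage_ac(water_data, "euclidean"),
    manhattan = linkage_ac(water_data, "manhattan")
  )
  print(round(ac_table, 3))
}
```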
# Dendrogram of agnes
## Row
```{r}
hc3 <- agnes(water_data, metric = 'euclidean', method = 'ward')
pltree(hc3, cex = 0.6, hang = -1, main = 'Dendrogram of agnes')
```
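### Cluster Membership from the Dendrogram

The dendrogram can be cut into a fixed number of groups to see which regions fall together. A minimal sketch, assuming the `hc3` object from the chunk above and `k = 2` as used elsewhere in this dashboard; `agnes_groups()` is a hypothetical helper that converts the `agnes` tree with `as.hclust()` before calling `cutree()`.

```{r}
library(cluster)

# Hypothetical helper: cut an agnes tree into k groups.
agnes_groups <- function(tree, k) {
  cutree(as.hclust(tree), k = k)
}

# Only runs when the dendrogram chunk has created hc3
if (exists("hc3")) {
  groups <- agnes_groups(hc3, k = 2)
  print(table(groups))  # number of regions in each group
}
```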
# Visualize (Hierarchical)
## Row
### Visualization (convex) of Hierarchical Clustering
```{r}
hc.cut <- hcut(
  water_data,
  k = 2,
  hc_func = 'hclust',
  hc_method = 'complete',
  hc_metric = 'euclidean'
)
fviz_cluster(hc.cut, ellipse.type = 'convex', repel = TRUE)
```
### Visualization (ellipse) of Hierarchical Clustering
```{r}
fviz_cluster(hc.cut, ellipse.type = 'norm', repel = TRUE)
```
# Visualize (K-means)
## Row
### Visualization (convex) of K-means Clustering
```{r}
# k-means uses random starting centers, so fix the seed for reproducible clusters
set.seed(123)
k2 <- kmeans(water_data, centers = 2, nstart = 25)
fviz_cluster(k2, data = water_data, repel = TRUE)
```
### Visualization (ellipse) of K-means Clustering
```{r}
clusplot(water_data, k2$cluster, color = TRUE, shade = TRUE, labels = 2, lines = 1)
```
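### K-means Fit Summary

The plots show the cluster shapes but not how well two clusters summarize the data. A minimal sketch, assuming the `k2` object fitted above; `kmeans_summary()` is a hypothetical helper that reports the cluster sizes and the share of the total sum of squares that lies between clusters.

```{r}
# Hypothetical helper: cluster sizes and between-cluster share of the
# total sum of squares for a kmeans fit.
kmeans_summary <- function(fit) {
  list(
    sizes = fit$size,
    explained = fit$betweenss / fit$totss
  )
}

# Only runs when the K-means chunk has created k2
if (exists("k2")) {
  print(kmeans_summary(k2))
}
```

A share close to 1 means most of the variation is between the clusters rather than within them.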