2022-03-31

Project goal

The main goal of the project is to build an example web application that lets the user visualize the k-means algorithm. The example dataset used for clustering is USArrests, which is available in R. The factoextra library is used to visualize the clustering.
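As a minimal sketch, the data could be prepared as shown here. USArrests ships with base R in the datasets package. Standardising the columns with scale() is an assumption made for this example, because the four variables are measured on different scales; the original text does not say whether the app does this.

```r
# USArrests ships with base R in the 'datasets' package.
data("USArrests")

# Assumption: drop incomplete rows and standardise each column so that
# no single variable dominates the Euclidean distances used by kmeans.
arrests_scaled <- scale(na.omit(USArrests))

dim(arrests_scaled)       # 50 states, 4 variables
colnames(arrests_scaled)  # "Murder" "Assault" "UrbanPop" "Rape"
```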

The first chart visualizes the clusters and the grouping of points in the transformed space. It identifies the centroid of each cluster and displays it on the chart. Changing the number of clusters redraws the chart.
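The "transformed space" can be illustrated with base R alone. As far as I understand it, fviz_cluster draws the observations on the first two principal components when the data has more than two variables. The sketch below reproduces that idea by hand; the seed, the choice of 4 clusters and the variable names are hypothetical.

```r
data("USArrests")
arrests_scaled <- scale(USArrests)   # assumption: standardised input

set.seed(123)                        # hypothetical seed, for reproducibility
fit <- kmeans(arrests_scaled, centers = 4, nstart = 25)

# Project the four variables onto the first two principal components.
pca    <- prcomp(arrests_scaled)
coords <- pca$x[, 1:2]

# Centroid of each cluster in the projected space.
centroids <- aggregate(coords, by = list(cluster = fit$cluster), FUN = mean)

# Points coloured by cluster, centroids drawn as stars.
plot(coords, col = fit$cluster, pch = 19)
points(centroids$PC1, centroids$PC2, pch = 8, cex = 2)
```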

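The second chart visualizes the cluster-number diagnostic plot in R (named silhouette analysis in the code, although the function shown later uses method = "wss", i.e. the elbow method). A dashed vertical line marks the number of centroids chosen in the Shiny web app.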
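The quantity behind a "wss" plot is the total within-cluster sum of squares for each candidate number of clusters. A rough base R sketch, assuming standardised USArrests data and a hypothetical chosen value of 4 clusters, might look like this:

```r
data("USArrests")
arrests_scaled <- scale(USArrests)   # assumption: standardised input

set.seed(123)                        # hypothetical seed
k_values <- 1:10

# Total within-cluster sum of squares for each number of clusters.
wss <- sapply(k_values, function(k) {
  kmeans(arrests_scaled, centers = k, nstart = 25)$tot.withinss
})

chosen_k <- 4                        # hypothetical slider value

plot(k_values, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
abline(v = chosen_k, lty = 2)        # dashed line at the chosen k
```

The "elbow" is the value of k after which adding more clusters no longer reduces the sum of squares by much.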

Parameters of Shiny app

  • number of centroids
  • nstart parameter

The user can set the number of clusters, and the charts are generated from this parameter. Both charts are redrawn simultaneously.
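In Shiny, this simultaneous redraw follows from both plots reading the same input. The server function below is only a sketch: the input ids centers_number and nstart come from the sliders in ui.R, while the output ids cluster_plot and elbow_plot and the scaled data object are hypothetical. Namespaced calls are used so the snippet does not depend on which packages are attached.

```r
data("USArrests")
arrests_scaled <- scale(USArrests)   # assumption: standardised input

server <- function(input, output) {
  # Both render blocks read input$centers_number, so Shiny invalidates
  # and redraws both plots whenever that slider moves.
  output$cluster_plot <- shiny::renderPlot({
    fit <- kmeans(arrests_scaled,
                  centers = input$centers_number,
                  nstart  = input$nstart)
    factoextra::fviz_cluster(fit, data = arrests_scaled, geom = "point")
  })

  output$elbow_plot <- shiny::renderPlot({
    factoextra::fviz_nbclust(arrests_scaled, kmeans, method = "wss") +
      ggplot2::geom_vline(xintercept = input$centers_number, linetype = 2) +
      ggplot2::labs(subtitle = "Elbow method")
  })
}
```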

The nstart parameter can also be adjusted. It makes kmeans try multiple random initial configurations and keep the best one.
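The effect of nstart can be seen by comparing a single random start with many. In this sketch the seed, the number of clusters and the data scaling are assumptions. kmeans keeps the run with the lowest total within-cluster sum of squares, so the multi-start result is usually at least as good as the single-start one.

```r
data("USArrests")
arrests_scaled <- scale(USArrests)   # assumption: standardised input

# One random initial configuration.
set.seed(42)                         # hypothetical seed
single_start <- kmeans(arrests_scaled, centers = 5, nstart = 1)

# Fifty random initial configurations; the best run is returned.
set.seed(42)
multi_start <- kmeans(arrests_scaled, centers = 5, nstart = 50)

# Lower total within-cluster sum of squares means tighter clusters.
c(single = single_start$tot.withinss,
  multi  = multi_start$tot.withinss)
```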

Basic functions for displaying charts

The following two basic functions perform the clustering and create the charts:

  apply_number_cluster <- function(nclusters, nstarts, dataframe) {
    out_res <- kmeans(dataframe, nclusters, nstart = nstarts)
    fviz_cluster(out_res, data = dataframe, nstart = nstarts, geom = c("point"))
  }

  apply_silhouette_analysis <- function(nclusters, nstarts, dataframe) {
    fviz_nbclust(dataframe, kmeans, method = "wss") +
      geom_vline(xintercept = nclusters, linetype = 2) +
      labs(subtitle = "Elbow method")
  }

UI part of application: sliders

The following code defines the two sliders in ui.R:

  sliderInput(inputId = "centers_number",
              label = "Number of centers:",
              min = 2,
              max = 10,
              value = 5),
  
  sliderInput(inputId = "nstart",
              label = "n start parameter:",
              min = 10,
              max = 100,
              value = 50)
  
)