26/07/2020

Coursera Reproducible Pitch

Overview

K-means clustering is one of the simplest and most popular unsupervised machine learning algorithms. Unsupervised algorithms make inferences from datasets using only input vectors, without referring to known (labelled) outcomes. We work with the mtcars data. The Shiny application shows, for the selected number of centroids, the clusters found by the kmeans method.

head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
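
To make the clustering step concrete before looking at the app, here is a minimal sketch of k-means run directly on this data, outside Shiny. It uses the same six numeric columns and the same `nstart = 25` as the server code; the fixed seed and the choice of 3 clusters are assumptions made only for illustration.

```r
# Minimal sketch: k-means on the standardized numeric columns of mtcars.
set.seed(42)  # assumption: fixed seed so the random starts are repeatable

# Same variables the app uses: mpg plus disp through qsec
mtcars_num <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]

# Standardize each column to mean 0 and standard deviation 1
mtcars_num_sc <- scale(mtcars_num)

# centers = 3 is an illustrative choice; the app takes it from the slider
km <- kmeans(mtcars_num_sc, centers = 3, nstart = 25)

table(km$cluster)  # number of cars assigned to each cluster
km$centers         # centroids, in standardized units
```

Each car receives one cluster label in `km$cluster`, and `km$centers` holds one row per centroid.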

UI Code

library(shiny)
# Define the UI for an application that plots k-means clusters
shinyUI(fluidPage(
    # Application title
    titlePanel("K Means applied to mtcars data"),
    # Sidebar with a slider input for the number of clusters
    sidebarLayout(
        sidebarPanel(
            sliderInput("sliderKM", "Choose the number of clusters:", min = 2, max = 10, value = 2)
        ),
        # Show a plot of the clusters
        mainPanel(
            plotOutput("plot1", brush = brushOpts(id="brush1"))
        )
    )
))

Server Code

library(shiny)
library(factoextra)  # provides fviz_cluster()
library(dplyr)       # provides %>% and select()
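# Define server logic that clusters mtcars and plots the result
shinyServer(function(input, output) {
    # Keep the continuous variables: mpg, disp, hp, drat, wt, qsec
    mtcars_num <- mtcars %>%
        dplyr::select(mpg, disp:qsec)
    # Standardize so that no single variable dominates the distances
    mtcars_num_sc <- scale(mtcars_num)
    # Re-run k-means whenever the slider value changes
    km.out <- reactive({
        kmeans(mtcars_num_sc, centers = input$sliderKM, nstart = 25)
    })
    # Draw the clusters with normal-theory ellipses
    output$plot1 <- renderPlot({
        fviz_cluster(km.out(), data = mtcars_num_sc, ellipse.type = "norm")
    })
})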
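
The slider lets you pick between 2 and 10 clusters, but the app itself does not suggest a value. One common heuristic, not part of the original app, is to compare the total within-cluster sum of squares across that range and look for the point where the improvement levels off. The sketch below assumes the same scaled data as the server code; the seed is an illustrative assumption.

```r
# Sketch: total within-cluster sum of squares for each slider value.
set.seed(42)  # assumption: fixed seed for repeatable random starts

mtcars_num_sc <- scale(mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")])

ks <- 2:10  # same range as the slider in the UI
wss <- sapply(ks, function(k) {
    kmeans(mtcars_num_sc, centers = k, nstart = 25)$tot.withinss
})

# One row per candidate k; smaller values mean tighter clusters
data.frame(k = ks, tot_withinss = round(wss, 2))
```

The sum of squares generally shrinks as k grows, so the useful signal is where the curve flattens rather than where it is lowest.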