Kira Plastinina is a Russian brand that was sold through a now-defunct chain of retail stores in Russia, Ukraine, Kazakhstan, Belarus, China, the Philippines, and Armenia. The brand’s Sales and Marketing team would like to understand their customers’ behavior from data collected over the past year. More specifically, they would like to learn the characteristics of customer groups.
Perform clustering and state the insights drawn from your analysis and visualizations. Then compare the approaches learned this week, i.e. K-Means clustering and hierarchical clustering, highlighting the strengths and limitations of each in the context of your analysis.
The analysis will be considered a success once every record has been assigned to a cluster.
The dataset can be found here.
## Loading the required packages
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.6 v dplyr 1.0.8
## v tidyr 1.2.0 v stringr 1.4.0
## v readr 2.1.2 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(ggplot2)
library(caret)
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
##
## lift
library(caretEnsemble)
##
## Attaching package: 'caretEnsemble'
## The following object is masked from 'package:ggplot2':
##
## autoplot
library(psych)
##
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
library(GGally)
## Registered S3 method overwritten by 'GGally':
## method from
## +.gg ggplot2
library(rpart)
library(randomForest)
## randomForest 4.7-1
## Type rfNews() to see new features/changes/bug fixes.
##
## Attaching package: 'randomForest'
## The following object is masked from 'package:psych':
##
## outlier
## The following object is masked from 'package:dplyr':
##
## combine
## The following object is masked from 'package:ggplot2':
##
## margin
library(superml) # for label encoding
## Loading required package: R6
library(e1071) # Holds the Naive Bayes function.
library(grid)
library(gridExtra)
##
## Attaching package: 'gridExtra'
## The following object is masked from 'package:randomForest':
##
## combine
## The following object is masked from 'package:dplyr':
##
## combine
library(heatmaply)
## Loading required package: plotly
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
## Loading required package: viridis
## Loading required package: viridisLite
## Registered S3 methods overwritten by 'registry':
## method from
## print.registry_field proxy
## print.registry_entry proxy
library(ggcorrplot)
library(cluster)
library(purrr)
library(CatEncoders, warn.conflicts = FALSE)
library(devtools)
## Loading required package: usethis
library(magrittr)
##
## Attaching package: 'magrittr'
## The following object is masked from 'package:purrr':
##
## set_names
## The following object is masked from 'package:tidyr':
##
## extract
library(factoextra)
library(NbClust)
data <- read.csv("http://bit.ly/EcommerceCustomersDataset")
## 7.1 Previewing the head of the dataset
head(data)
## Administrative Administrative_Duration Informational Informational_Duration
## 1 0 0 0 0
## 2 0 0 0 0
## 3 0 -1 0 -1
## 4 0 0 0 0
## 5 0 0 0 0
## 6 0 0 0 0
## ProductRelated ProductRelated_Duration BounceRates ExitRates PageValues
## 1 1 0.000000 0.20000000 0.2000000 0
## 2 2 64.000000 0.00000000 0.1000000 0
## 3 1 -1.000000 0.20000000 0.2000000 0
## 4 2 2.666667 0.05000000 0.1400000 0
## 5 10 627.500000 0.02000000 0.0500000 0
## 6 19 154.216667 0.01578947 0.0245614 0
## SpecialDay Month OperatingSystems Browser Region TrafficType
## 1 0 Feb 1 1 1 1
## 2 0 Feb 2 2 1 2
## 3 0 Feb 4 1 9 3
## 4 0 Feb 3 2 2 4
## 5 0 Feb 3 3 1 4
## 6 0 Feb 2 2 1 3
## VisitorType Weekend Revenue
## 1 Returning_Visitor FALSE FALSE
## 2 Returning_Visitor FALSE FALSE
## 3 Returning_Visitor FALSE FALSE
## 4 Returning_Visitor FALSE FALSE
## 5 Returning_Visitor TRUE FALSE
## 6 Returning_Visitor FALSE FALSE
## 7.2 Previewing the tail of the dataset
tail(data)
## Administrative Administrative_Duration Informational
## 12325 0 0 1
## 12326 3 145 0
## 12327 0 0 0
## 12328 0 0 0
## 12329 4 75 0
## 12330 0 0 0
## Informational_Duration ProductRelated ProductRelated_Duration BounceRates
## 12325 0 16 503.000 0.000000000
## 12326 0 53 1783.792 0.007142857
## 12327 0 5 465.750 0.000000000
## 12328 0 6 184.250 0.083333333
## 12329 0 15 346.000 0.000000000
## 12330 0 3 21.250 0.000000000
## ExitRates PageValues SpecialDay Month OperatingSystems Browser Region
## 12325 0.03764706 0.00000 0 Nov 2 2 1
## 12326 0.02903061 12.24172 0 Dec 4 6 1
## 12327 0.02133333 0.00000 0 Nov 3 2 1
## 12328 0.08666667 0.00000 0 Nov 3 2 1
## 12329 0.02105263 0.00000 0 Nov 2 2 3
## 12330 0.06666667 0.00000 0 Nov 3 2 1
## TrafficType VisitorType Weekend Revenue
## 12325 1 Returning_Visitor FALSE FALSE
## 12326 1 Returning_Visitor TRUE FALSE
## 12327 8 Returning_Visitor TRUE FALSE
## 12328 13 Returning_Visitor TRUE FALSE
## 12329 11 Returning_Visitor FALSE FALSE
## 12330 2 New_Visitor TRUE FALSE
dim(data)
## [1] 12330 18
## 7.3 Checking the data types of the variables
str(data)
## 'data.frame': 12330 obs. of 18 variables:
## $ Administrative : int 0 0 0 0 0 0 0 1 0 0 ...
## $ Administrative_Duration: num 0 0 -1 0 0 0 -1 -1 0 0 ...
## $ Informational : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Informational_Duration : num 0 0 -1 0 0 0 -1 -1 0 0 ...
## $ ProductRelated : int 1 2 1 2 10 19 1 1 2 3 ...
## $ ProductRelated_Duration: num 0 64 -1 2.67 627.5 ...
## $ BounceRates : num 0.2 0 0.2 0.05 0.02 ...
## $ ExitRates : num 0.2 0.1 0.2 0.14 0.05 ...
## $ PageValues : num 0 0 0 0 0 0 0 0 0 0 ...
## $ SpecialDay : num 0 0 0 0 0 0 0.4 0 0.8 0.4 ...
## $ Month : chr "Feb" "Feb" "Feb" "Feb" ...
## $ OperatingSystems : int 1 2 4 3 3 2 2 1 2 2 ...
## $ Browser : int 1 2 1 2 3 2 4 2 2 4 ...
## $ Region : int 1 1 9 2 1 1 3 1 2 1 ...
## $ TrafficType : int 1 2 3 4 4 3 3 5 3 2 ...
## $ VisitorType : chr "Returning_Visitor" "Returning_Visitor" "Returning_Visitor" "Returning_Visitor" ...
## $ Weekend : logi FALSE FALSE FALSE FALSE TRUE FALSE ...
## $ Revenue : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
summary(data)
## Administrative Administrative_Duration Informational
## Min. : 0.000 Min. : -1.00 Min. : 0.000
## 1st Qu.: 0.000 1st Qu.: 0.00 1st Qu.: 0.000
## Median : 1.000 Median : 8.00 Median : 0.000
## Mean : 2.318 Mean : 80.91 Mean : 0.504
## 3rd Qu.: 4.000 3rd Qu.: 93.50 3rd Qu.: 0.000
## Max. :27.000 Max. :3398.75 Max. :24.000
## NA's :14 NA's :14 NA's :14
## Informational_Duration ProductRelated ProductRelated_Duration
## Min. : -1.00 Min. : 0.00 Min. : -1.0
## 1st Qu.: 0.00 1st Qu.: 7.00 1st Qu.: 185.0
## Median : 0.00 Median : 18.00 Median : 599.8
## Mean : 34.51 Mean : 31.76 Mean : 1196.0
## 3rd Qu.: 0.00 3rd Qu.: 38.00 3rd Qu.: 1466.5
## Max. :2549.38 Max. :705.00 Max. :63973.5
## NA's :14 NA's :14 NA's :14
## BounceRates ExitRates PageValues SpecialDay
## Min. :0.000000 Min. :0.00000 Min. : 0.000 Min. :0.00000
## 1st Qu.:0.000000 1st Qu.:0.01429 1st Qu.: 0.000 1st Qu.:0.00000
## Median :0.003119 Median :0.02512 Median : 0.000 Median :0.00000
## Mean :0.022152 Mean :0.04300 Mean : 5.889 Mean :0.06143
## 3rd Qu.:0.016684 3rd Qu.:0.05000 3rd Qu.: 0.000 3rd Qu.:0.00000
## Max. :0.200000 Max. :0.20000 Max. :361.764 Max. :1.00000
## NA's :14 NA's :14
## Month OperatingSystems Browser Region
## Length:12330 Min. :1.000 Min. : 1.000 Min. :1.000
## Class :character 1st Qu.:2.000 1st Qu.: 2.000 1st Qu.:1.000
## Mode :character Median :2.000 Median : 2.000 Median :3.000
## Mean :2.124 Mean : 2.357 Mean :3.147
## 3rd Qu.:3.000 3rd Qu.: 2.000 3rd Qu.:4.000
## Max. :8.000 Max. :13.000 Max. :9.000
##
## TrafficType VisitorType Weekend Revenue
## Min. : 1.00 Length:12330 Mode :logical Mode :logical
## 1st Qu.: 2.00 Class :character FALSE:9462 FALSE:10422
## Median : 2.00 Mode :character TRUE :2868 TRUE :1908
## Mean : 4.07
## 3rd Qu.: 4.00
## Max. :20.00
##
duplicated_rows <- data[duplicated(data),]
count(duplicated_rows)
## n
## 1 119
The dataset has 119 duplicated rows.
# Dealing with duplicates by dropping them.
new_data <- data[!duplicated(data),]
# Let's confirm the changes made
sum(duplicated(new_data))
## [1] 0
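As a small illustration of what `duplicated()` flags, the sketch below uses a made-up data frame (`toy` is hypothetical and not part of the dataset). Only the second and later copies of a row are marked, so the first copy survives the filter.

```r
# Hypothetical toy data frame; not part of the e-commerce dataset
toy <- data.frame(a = c(1, 1, 2, 2), b = c("x", "x", "y", "z"))

# duplicated() is TRUE for rows that repeat an earlier row
dup_flags <- duplicated(toy)

# Keep the first copy of every row
toy_unique <- toy[!dup_flags, ]

# Number of rows that would be dropped
n_dropped <- sum(dup_flags)
print(n_dropped)
```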
# Checking for missing values
sum(is.na(data))
## [1] 112
# Dropping our missing values
clean_data <- new_data[complete.cases(new_data),]
# Confirm changes made
colSums(is.na(clean_data))
## Administrative Administrative_Duration Informational
## 0 0 0
## Informational_Duration ProductRelated ProductRelated_Duration
## 0 0 0
## BounceRates ExitRates PageValues
## 0 0 0
## SpecialDay Month OperatingSystems
## 0 0 0
## Browser Region TrafficType
## 0 0 0
## VisitorType Weekend Revenue
## 0 0 0
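Note that `sum(is.na(...))` counts missing cells, not rows: the `summary()` output shows 14 `NA`s in each of 8 columns, which gives the 112 reported above. The sketch below, on a made-up data frame (`toy` is hypothetical), shows the difference between counting cells, counting per column, and counting incomplete rows.

```r
# Hypothetical toy data frame with missing values in two columns
toy <- data.frame(x = c(1, NA, 3, NA), y = c(NA, NA, 3, 4))

na_cells     <- sum(is.na(toy))            # total missing cells
na_per_col   <- colSums(is.na(toy))        # missing cells per column
rows_with_na <- sum(!complete.cases(toy))  # rows with at least one NA

# Keep only the rows without any missing value
toy_complete <- toy[complete.cases(toy), ]

print(c(cells = na_cells, rows = rows_with_na))
```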
# changing column names to lowercase
colnames(clean_data) = tolower(colnames(clean_data))
print(colnames(clean_data))
## [1] "administrative" "administrative_duration"
## [3] "informational" "informational_duration"
## [5] "productrelated" "productrelated_duration"
## [7] "bouncerates" "exitrates"
## [9] "pagevalues" "specialday"
## [11] "month" "operatingsystems"
## [13] "browser" "region"
## [15] "traffictype" "visitortype"
## [17] "weekend" "revenue"
# Changing the datatypes of some of the columns into factors
# Making a list of the columns
fact_cols = c('month', 'operatingsystems', 'browser', 'region', 'traffictype', 'visitortype')
print(fact_cols)
## [1] "month" "operatingsystems" "browser" "region"
## [5] "traffictype" "visitortype"
#Changing columns to factors
clean_data[ ,fact_cols] %<>% lapply(function(x) as.factor(as.character(x)))
# Checking whether the data types have changed
str(clean_data)
## 'data.frame': 12199 obs. of 18 variables:
## $ administrative : int 0 0 0 0 0 0 0 1 0 0 ...
## $ administrative_duration: num 0 0 -1 0 0 0 -1 -1 0 0 ...
## $ informational : int 0 0 0 0 0 0 0 0 0 0 ...
## $ informational_duration : num 0 0 -1 0 0 0 -1 -1 0 0 ...
## $ productrelated : int 1 2 1 2 10 19 1 1 2 3 ...
## $ productrelated_duration: num 0 64 -1 2.67 627.5 ...
## $ bouncerates : num 0.2 0 0.2 0.05 0.02 ...
## $ exitrates : num 0.2 0.1 0.2 0.14 0.05 ...
## $ pagevalues : num 0 0 0 0 0 0 0 0 0 0 ...
## $ specialday : num 0 0 0 0 0 0 0.4 0 0.8 0.4 ...
## $ month : Factor w/ 10 levels "Aug","Dec","Feb",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ operatingsystems : Factor w/ 8 levels "1","2","3","4",..: 1 2 4 3 3 2 2 1 2 2 ...
## $ browser : Factor w/ 13 levels "1","10","11",..: 1 6 1 6 7 6 8 6 6 8 ...
## $ region : Factor w/ 9 levels "1","2","3","4",..: 1 1 9 2 1 1 3 1 2 1 ...
## $ traffictype : Factor w/ 20 levels "1","10","11",..: 1 12 14 15 15 14 14 16 14 12 ...
## $ visitortype : Factor w/ 3 levels "New_Visitor",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ weekend : logi FALSE FALSE FALSE FALSE TRUE FALSE ...
## $ revenue : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
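The `str()` output lists the levels of `browser` and `traffictype` as "1", "10", "11", ... and the months alphabetically, because converting through `as.character()` sorts levels as text. This does not matter for the one-hot encoding used later, but it affects the order of bars in plots. A minimal sketch of setting the level order explicitly, using made-up vectors (`x` and `months_seen` are hypothetical; the month order is taken from the ten months present in the data):

```r
# Hypothetical integer codes, similar in spirit to the browser column
x <- c(1, 2, 10, 3, 2)

# Converting through character sorts the levels as text
f_default <- as.factor(as.character(x))
print(levels(f_default))

# Supplying the levels keeps the numeric order
f_numeric <- factor(x, levels = sort(unique(x)))
print(levels(f_numeric))

# Months in calendar order instead of alphabetical order
month_order <- c("Feb", "Mar", "May", "June", "Jul",
                 "Aug", "Sep", "Oct", "Nov", "Dec")
months_seen <- c("Nov", "Feb", "May", "Feb")
f_month <- factor(months_seen, levels = month_order)
print(levels(f_month))
```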
# Using describe() from the psych package gives more summary statistics, including mean, standard deviation, median, min, max, range, skew and kurtosis.
describe(clean_data)
## Warning in FUN(newX[, i], ...): no non-missing arguments to min; returning Inf
## Warning in FUN(newX[, i], ...): no non-missing arguments to min; returning Inf
## Warning in FUN(newX[, i], ...): no non-missing arguments to max; returning -Inf
## Warning in FUN(newX[, i], ...): no non-missing arguments to max; returning -Inf
## vars n mean sd median trimmed mad min
## administrative 1 12199 2.34 3.33 1.00 1.66 1.48 0
## administrative_duration 2 12199 81.68 177.53 9.00 42.87 13.34 -1
## informational 3 12199 0.51 1.28 0.00 0.18 0.00 0
## informational_duration 4 12199 34.84 141.46 0.00 3.73 0.00 -1
## productrelated 5 12199 32.06 44.60 18.00 23.06 19.27 0
## productrelated_duration 6 12199 1207.51 1919.93 609.54 832.36 745.12 -1
## bouncerates 7 12199 0.02 0.05 0.00 0.01 0.00 0
## exitrates 8 12199 0.04 0.05 0.03 0.03 0.02 0
## pagevalues 9 12199 5.95 18.66 0.00 1.33 0.00 0
## specialday 10 12199 0.06 0.20 0.00 0.00 0.00 0
## month* 11 12199 6.17 2.37 7.00 6.36 1.48 1
## operatingsystems* 12 12199 2.12 0.91 2.00 2.06 0.00 1
## browser* 13 12199 5.33 2.46 6.00 5.38 0.00 1
## region* 14 12199 3.15 2.40 3.00 2.79 2.97 1
## traffictype* 15 12199 9.98 5.69 12.00 10.18 2.97 1
## visitortype* 16 12199 2.72 0.69 3.00 2.89 0.00 1
## weekend 17 12199 NaN NA NA NaN NA Inf
## revenue 18 12199 NaN NA NA NaN NA Inf
## max range skew kurtosis se
## administrative 27.00 27.00 1.95 4.63 0.03
## administrative_duration 3398.75 3399.75 5.59 50.09 1.61
## informational 24.00 24.00 4.01 26.64 0.01
## informational_duration 2549.38 2550.38 7.54 75.45 1.28
## productrelated 705.00 705.00 4.33 31.04 0.40
## productrelated_duration 63973.52 63974.52 7.25 136.57 17.38
## bouncerates 0.20 0.20 3.15 9.25 0.00
## exitrates 0.20 0.20 2.23 4.62 0.00
## pagevalues 361.76 361.76 6.35 64.93 0.17
## specialday 1.00 1.00 3.28 9.78 0.00
## month* 10.00 9.00 -0.83 -0.37 0.02
## operatingsystems* 8.00 7.00 2.03 10.27 0.01
## browser* 13.00 12.00 -0.53 0.11 0.02
## region* 9.00 8.00 0.98 -0.16 0.02
## traffictype* 20.00 19.00 -0.58 -1.13 0.05
## visitortype* 3.00 2.00 -2.05 2.23 0.01
## weekend -Inf -Inf NA NA NA
## revenue -Inf -Inf NA NA NA
# Getting the modes
# Creating a function to get the modes
getmode <- function(v) {
uniqv <- unique(v)
uniqv[which.max(tabulate(match(v, uniqv)))]
}
month.mode <- getmode(clean_data$month)
month.mode
## [1] May
## Levels: Aug Dec Feb Jul June Mar May Nov Oct Sep
operatingsystems.mode <- getmode(clean_data$operatingsystems)
operatingsystems.mode
## [1] 2
## Levels: 1 2 3 4 5 6 7 8
browser.mode <- getmode(clean_data$browser)
browser.mode
## [1] 2
## Levels: 1 10 11 12 13 2 3 4 5 6 7 8 9
region.mode <- getmode(clean_data$region)
region.mode
## [1] 1
## Levels: 1 2 3 4 5 6 7 8 9
traffictype.mode <- getmode(clean_data$traffictype)
traffictype.mode
## [1] 2
## Levels: 1 10 11 12 13 14 15 16 17 18 19 2 20 3 4 5 6 7 8 9
visitortype.mode <- getmode(clean_data$visitortype)
visitortype.mode
## [1] Returning_Visitor
## Levels: New_Visitor Other Returning_Visitor
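The per-column calls above could also be written as a single `sapply()` over the factor columns. The following is only a sketch on a made-up data frame (`toy` is hypothetical); `getmode` is repeated so the block runs on its own.

```r
# Same mode helper as above, repeated so this block is self-contained
getmode <- function(v) {
  uniqv <- unique(v)
  uniqv[which.max(tabulate(match(v, uniqv)))]
}

# Hypothetical data frame with two categorical columns
toy <- data.frame(colour = factor(c("red", "blue", "red")),
                  size   = factor(c("S", "S", "M")))

# Apply the helper to every column and return the modes as text
modes <- sapply(toy, function(col) as.character(getmode(col)))
print(modes)
```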
options(repr.plot.width = 7, repr.plot.height = 5)
clean_data %>%
ggplot(aes(visitortype, productrelated, col = revenue)) +
geom_boxplot() +
labs(x = 'Visitor Type', y = 'Product Related', title = 'Box plot of product related feature per visitor type') +
scale_color_brewer(palette = 'Set1') +
theme(legend.position = 'top')
# Plotting bar charts of the categorical columns
fac_cols = c('month', 'operatingsystems', 'browser', 'region')
columns = colnames(select(clean_data, all_of(fac_cols)))
p = list()
options(repr.plot.width = 10, repr.plot.height = 6)
for (i in 1:4){
p[[i]] = clean_data %>%
ggplot(aes_string(columns[i])) +
geom_bar(color = 'blue') +
labs(y = 'Frequency', x = '', title = toupper(columns[i])) +
theme(plot.title = element_text(size = 10), axis.title.y = element_text(size = 10))
}
do.call(grid.arrange,p)
# Plotting a correlogram to check for correlations
options(repr.plot.width = 6, repr.plot.height = 5)
corr = round(cor(select_if(clean_data, is.numeric)), 2)
ggcorrplot(corr, hc.order = T, ggtheme = ggplot2::theme_gray,
colors = c("red", "white", "blue"), lab = F)
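A correlogram is easier to act on when the strongest pairs are also listed as numbers. The sketch below shows one way to extract pairs above a threshold from a correlation matrix; the data frame `toy`, the column names, and the 0.8 cut-off are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical data: b is built from a, c is unrelated noise
toy <- data.frame(a = rnorm(50))
toy$b <- toy$a * 2 + rnorm(50, sd = 0.1)
toy$c <- rnorm(50)

corr <- cor(toy)

# Blank the lower triangle and diagonal so each pair appears once
corr[lower.tri(corr, diag = TRUE)] <- NA

# Reshape the matrix into one row per pair of variables
pair_tbl <- as.data.frame(as.table(corr))
names(pair_tbl) <- c("var1", "var2", "r")

# Keep the pairs whose absolute correlation exceeds the cut-off
strong <- pair_tbl[!is.na(pair_tbl$r) & abs(pair_tbl$r) > 0.8, ]
print(strong)
```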
# Plotting scatter plots to check for correlations
options(repr.plot.width = 11, repr.plot.height = 5)
sc1 = ggplot(clean_data, aes(productrelated, productrelated_duration, col = revenue)) +
geom_point() + theme(legend.position = 'none') +
labs(x='Product related', y ='Product related duration')
sc2 = ggplot(clean_data, aes(administrative, administrative_duration, col = revenue)) +
geom_point() + theme(legend.position = 'none') +
labs(x = 'Administrative', y = 'Administrative duration')
sc3 = ggplot(clean_data, aes(informational, informational_duration, col = revenue)) +
geom_point() + theme(legend.position = 'none') +
labs(x = 'Informational', y = 'Informational duration')
sc4 = ggplot(clean_data, aes(pagevalues, specialday , col = revenue)) +
geom_point() + theme(legend.position = 'none') +
labs(x = 'Page values', y = 'Special day')
sc5 = ggplot(clean_data, aes(exitrates, bouncerates)) +
geom_point(aes( col = weekend)) + theme(legend.position = 'none') +
labs(x = 'Exit Rates', y = 'Bounce Rates')
grid.arrange(sc1, sc2, sc3, sc4, sc5, ncol = 3, nrow = 2,
top = textGrob("Scatter plots",gp=gpar(fontsize=14,font=3)))
Encoding categorical columns
# Creating a copy of the cleaned dataframe
original_cleandata = data.table::copy(clean_data)
# One-hot encoding the categorical and logical columns
month = data.frame(model.matrix(~0+clean_data$month))
os = data.frame(model.matrix(~0+clean_data$operatingsystems))
brws = data.frame(model.matrix(~0+clean_data$browser))
rgn = data.frame(model.matrix(~0+clean_data$region))
traf = data.frame(model.matrix(~0+clean_data$traffictype))
vt = data.frame(model.matrix(~0+clean_data$visitortype))
wknd = data.frame(model.matrix(~0+clean_data$weekend))
rev = data.frame(model.matrix(~0+clean_data$revenue))
# Dropping the original columns, which have now been encoded
drop_cols = c('month', 'operatingsystems', 'browser', 'region', 'traffictype', 'visitortype', 'weekend', 'revenue')
clean_data = select(data.frame(cbind(clean_data, month, os, brws, rgn, traf, vt, wknd, rev)), -all_of(drop_cols))
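With `~ 0 +` in the formula, `model.matrix()` drops the intercept and returns one 0/1 indicator column per factor level, which is one-hot encoding rather than label encoding. A minimal sketch on a made-up factor (`toy` and `visitor` are hypothetical); passing `data =` keeps the column names short instead of the `clean_data.` prefix seen in the output further down.

```r
# Hypothetical factor with three levels
toy <- data.frame(visitor = factor(c("New", "Returning", "Other", "Returning")))

# One indicator column per level, no intercept
onehot <- data.frame(model.matrix(~ 0 + visitor, data = toy))
print(onehot)

# Every row has exactly one indicator switched on
print(rowSums(onehot))
```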
# Normalising the data
clean_data = as.data.frame(apply(clean_data, 2, function(x) (x - min(x))/max(x) - min(x)))
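R evaluates `/` before binary `-`, so `(x - min(x))/max(x) - min(x)` is read as `((x - min(x)) / max(x)) - min(x)` rather than the usual min-max formula. This is probably why the duration columns, whose minimum is -1, show cluster centres just above 1 in the `kmeans_res$centers` output. A minimal sketch of conventional min-max scaling on made-up values (`min_max` and `toy` are hypothetical names, not from the original analysis):

```r
# Min-max scaling: maps each column onto the interval [0, 1]
min_max <- function(x) {
  rng <- max(x) - min(x)
  if (rng == 0) return(rep(0, length(x)))  # constant column: avoid 0/0
  (x - min(x)) / rng
}

# Hypothetical columns; duration has a minimum of -1 like the real data
toy <- data.frame(duration = c(-1, 0, 49, 99), pages = c(0, 5, 10, 20))

scaled <- as.data.frame(lapply(toy, min_max))
print(sapply(scaled, range))

# The expression as written in the analysis, for comparison
as_written <- (toy$duration - min(toy$duration)) / max(toy$duration) - min(toy$duration)
print(range(as_written))
```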
# Using the elbow method to find the optimal number of clusters
fviz_nbclust(x = clean_data,FUNcluster = kmeans, method = 'wss' )
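The elbow plot drawn by `fviz_nbclust(..., method = 'wss')` shows the total within-cluster sum of squares for each candidate k; the "elbow" is where adding another cluster stops reducing it by much. The same curve can be computed by hand, sketched here on three made-up, well-separated blobs (`toy` is hypothetical, and `nstart = 10` is an arbitrary choice):

```r
set.seed(42)

# Hypothetical data: three blobs of 50 points each in two dimensions
toy <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 6, sd = 0.3), ncol = 2))

# Total within-cluster sum of squares for k = 1..6
wss <- sapply(1:6, function(k) {
  kmeans(toy, centers = k, nstart = 10)$tot.withinss
})

# The drop flattens after the true number of blobs
print(round(wss, 1))
```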
# Performing clustering with the optimal number of clusters
kmeans_res = kmeans(clean_data, 4)
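`kmeans()` starts from randomly chosen centres, so without a seed the cluster numbering and sometimes the clusters themselves can change between runs. A sketch of one common way to make a fit repeatable, on made-up data (`toy`, the seed, and `nstart = 25` are assumptions for illustration, not settings from the original analysis):

```r
# Hypothetical data: two blobs of 30 points each
set.seed(123)
toy <- rbind(matrix(rnorm(60, mean = 0, sd = 0.5), ncol = 2),
             matrix(rnorm(60, mean = 4, sd = 0.5), ncol = 2))

# Fixing the seed before each call gives the same assignment twice;
# nstart tries several random starts and keeps the best one
set.seed(123)
fit1 <- kmeans(toy, centers = 2, nstart = 25)
set.seed(123)
fit2 <- kmeans(toy, centers = 2, nstart = 25)

same <- identical(fit1$cluster, fit2$cluster)
print(same)

# Cluster sizes
print(table(fit1$cluster))
```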
# Checking the cluster centers of each variable
kmeans_res$centers
## administrative administrative_duration informational informational_duration
## 1 0.07270233 1.020702 0.01725440 1.011265
## 2 0.07995737 1.024129 0.01959335 1.011295
## 3 0.08290721 1.022533 0.01901496 1.012762
## 4 0.11212134 1.030820 0.02994840 1.021359
## productrelated productrelated_duration bouncerates exitrates pagevalues
## 1 0.04010250 1.016415 0.12579521 0.2401570 0.011538823
## 2 0.03412530 1.013790 0.12227867 0.2269629 0.014650105
## 3 0.04540390 1.019068 0.10500293 0.2177918 0.005709942
## 4 0.06122086 1.025689 0.05694488 0.1419438 0.040207990
## specialday clean_data.monthAug clean_data.monthDec clean_data.monthFeb
## 1 0.230102443 0.00000000 0.0000000 0.000000000
## 2 0.052739456 0.03744580 0.1474182 0.026409145
## 3 0.004715484 0.04806166 0.2106098 0.021990478
## 4 0.006413564 0.04644305 0.1485440 0.006634722
## clean_data.monthJul clean_data.monthJune clean_data.monthMar
## 1 0.00000000 0.00000000 0.0000000
## 2 0.03547497 0.02286165 0.1454474
## 3 0.04896849 0.04012696 0.2326003
## 4 0.04644305 0.01842978 0.1688168
## clean_data.monthMay clean_data.monthNov clean_data.monthOct
## 1 1.00000000 0.0000000 0.00000000
## 2 0.24477730 0.2510840 0.05873078
## 3 0.00000000 0.2885967 0.05486284
## 4 0.06229266 0.3955031 0.05823811
## clean_data.monthSep clean_data.operatingsystems1 clean_data.operatingsystems2
## 1 0.00000000 0.02127660 0.6970055
## 2 0.03035081 0.88884509 0.0000000
## 3 0.05418273 0.02947178 0.6880526
## 4 0.04865463 0.04017693 0.6384077
## clean_data.operatingsystems3 clean_data.operatingsystems4
## 1 0.2592593 0.01930654
## 2 0.0000000 0.10642491
## 3 0.2482430 0.02131036
## 4 0.2863988 0.02395872
## clean_data.operatingsystems5 clean_data.operatingsystems6
## 1 0.0000000000 0.002758077
## 2 0.0000000000 0.000000000
## 3 0.0009068238 0.001813648
## 4 0.0007371913 0.001474383
## clean_data.operatingsystems7 clean_data.operatingsystems8 clean_data.browser1
## 1 0.0000000000 0.000394011 0.0011820331
## 2 0.0023649980 0.002364998 0.9456050453
## 3 0.0000000000 0.010201768 0.0004534119
## 4 0.0003685957 0.008477700 0.0081091043
## clean_data.browser10 clean_data.browser11 clean_data.browser12
## 1 0.019306541 0.0000000000 0.000394011
## 2 0.001182499 0.0000000000 0.000000000
## 3 0.015416005 0.0009068238 0.001133530
## 4 0.015849613 0.0007371913 0.001474383
## clean_data.browser13 clean_data.browser2 clean_data.browser3
## 1 0.0000000000 0.8006304 0.01339638
## 2 0.0003941663 0.0000000 0.00000000
## 3 0.0086148266 0.8213557 0.00952165
## 4 0.0062661261 0.8193881 0.01068927
## clean_data.browser4 clean_data.browser5 clean_data.browser6
## 1 0.0839243499 0.051615445 0.023640662
## 2 0.0007883327 0.001970832 0.001182499
## 3 0.0766266153 0.043754251 0.017456359
## 4 0.0652414302 0.050497604 0.012532252
## clean_data.browser7 clean_data.browser8 clean_data.browser9
## 1 0.005516154 0.000394011 0.000000000
## 2 0.000000000 0.048876626 0.000000000
## 3 0.004534119 0.000000000 0.000226706
## 4 0.005528935 0.003685957 0.000000000
## clean_data.region1 clean_data.region2 clean_data.region3 clean_data.region4
## 1 0.3451537 0.10874704 0.2001576 0.10598897
## 2 0.4174222 0.08356326 0.2104848 0.10642491
## 3 0.3840399 0.08864203 0.1897529 0.08682838
## 4 0.3988205 0.09141172 0.1854036 0.09067453
## clean_data.region5 clean_data.region6 clean_data.region7 clean_data.region8
## 1 0.03230890 0.07919622 0.06422380 0.03703704
## 2 0.01497832 0.06109578 0.03862830 0.03941663
## 3 0.02697801 0.06461120 0.07073226 0.03377919
## 4 0.02875046 0.05860671 0.06819020 0.03243642
## clean_data.region9 clean_data.traffictype1 clean_data.traffictype10
## 1 0.02718676 0.1256895 0.00000000
## 2 0.02798581 0.1545132 0.04454080
## 3 0.05463614 0.2666062 0.04828837
## 4 0.04570586 0.1828234 0.04570586
## clean_data.traffictype11 clean_data.traffictype12 clean_data.traffictype13
## 1 0.02482270 0.000000000 0.118991332
## 2 0.01773749 0.000000000 0.001182499
## 3 0.01314895 0.000226706 0.070958966
## 4 0.02985625 0.000000000 0.040545522
## clean_data.traffictype14 clean_data.traffictype15 clean_data.traffictype16
## 1 0.0031520883 0.0035460993 0.0007880221
## 2 0.0000000000 0.0094599921 0.0000000000
## 3 0.0004534119 0.0002267060 0.0000000000
## 4 0.0011057870 0.0007371913 0.0003685957
## clean_data.traffictype17 clean_data.traffictype18 clean_data.traffictype19
## 1 0.0000000000 0.00394011 0.0047281324
## 2 0.0003941663 0.00000000 0.0011824990
## 3 0.0000000000 0.00000000 0.0000000000
## 4 0.0000000000 0.00000000 0.0007371913
## clean_data.traffictype2 clean_data.traffictype20 clean_data.traffictype3
## 1 0.1863672 0.002758077 0.19897557
## 2 0.3370122 0.010248325 0.24674813
## 3 0.3205622 0.023577420 0.14169123
## 4 0.4294139 0.020641356 0.09620346
## clean_data.traffictype4 clean_data.traffictype5 clean_data.traffictype6
## 1 0.21197794 0.02482270 0.08707644
## 2 0.09026409 0.02404415 0.01694915
## 3 0.03922013 0.01790977 0.02448424
## 4 0.04644305 0.02100995 0.02617029
## clean_data.traffictype7 clean_data.traffictype8 clean_data.traffictype9
## 1 0.002364066 0.00000000 0.000000000
## 2 0.001970832 0.03586914 0.007883327
## 3 0.002947178 0.02947178 0.000226706
## 4 0.005897530 0.04496867 0.007371913
## clean_data.visitortypeNew_Visitor clean_data.visitortypeOther
## 1 0.07052797 0.000000000
## 2 0.15727237 0.002364998
## 3 0.12446157 0.012015416
## 4 0.20862514 0.008109104
## clean_data.visitortypeReturning_Visitor clean_data.weekendFALSE
## 1 0.9294720 0.8494878
## 2 0.8403626 0.7185652
## 3 0.8635230 1.0000000
## 4 0.7832658 0.3512717
## clean_data.weekendTRUE clean_data.revenueFALSE clean_data.revenueTRUE
## 1 0.1505122 0.9208038 0.07919622
## 2 0.2814348 0.8588885 0.14111155
## 3 0.0000000 1.0000000 0.00000000
## 4 0.6487283 0.5027645 0.49723553
# Visualising the clusters of the whole dataset
options(repr.plot.width = 11, repr.plot.height = 6)
fviz_cluster(kmeans_res, clean_data)
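According to the factoextra documentation, `fviz_cluster()` plots the observations on principal components when the data has more than two columns, so the picture shows only the share of variation captured by the first two components. A rough sketch of the same idea with base R, on made-up data (`toy` is hypothetical, and this is not claimed to reproduce the exact preprocessing inside `fviz_cluster()`):

```r
set.seed(7)

# Hypothetical data with four numeric columns
toy <- as.data.frame(matrix(rnorm(200), ncol = 4))

fit <- kmeans(toy, centers = 3, nstart = 10)

# Project onto principal components and keep the first two
pca <- prcomp(toy)
scores <- as.data.frame(pca$x[, 1:2])
scores$cluster <- factor(fit$cluster)

# Share of total variance carried by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained[1:2], 2))
print(head(scores))
```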
# determining k using the silhouette method
fviz_nbclust(x = clean_data,FUNcluster = kmeans, method = 'silhouette' )
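The silhouette method favours the k with the largest average silhouette width. A minimal sketch of that calculation with `cluster::silhouette()`, on three made-up blobs (`toy` and the range 2 to 6 are assumptions for illustration):

```r
library(cluster)

set.seed(42)

# Hypothetical data: three well-separated blobs of 30 points each
toy <- rbind(matrix(rnorm(60, mean = 0, sd = 0.4), ncol = 2),
             matrix(rnorm(60, mean = 3, sd = 0.4), ncol = 2),
             matrix(rnorm(60, mean = 6, sd = 0.4), ncol = 2))

d <- dist(toy)  # pairwise Euclidean distances

# Average silhouette width for k = 2..6
avg_sil <- sapply(2:6, function(k) {
  fit <- kmeans(toy, centers = k, nstart = 10)
  mean(silhouette(fit$cluster, d)[, "sil_width"])
})

# The k with the widest average silhouette
best_k <- (2:6)[which.max(avg_sil)]
print(round(avg_sil, 2))
print(best_k)
```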
# using gap statistic
set.seed(42)
clust_gap <- clusGap(x = clean_data, FUN = kmeans, K.max = 15, nstart = 25,
B = 5)
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 609950)
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 609950)
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 609950)
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 609950)
## Warning: did not converge in 10 iterations
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 609950)
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 609950)
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 609950)
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 609950)
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 609950)
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 609950)
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 609950)
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: Quick-TRANSfer stage steps exceeded maximum (= 609950)
fviz_gap_stat(clust_gap)
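The `did not converge in 10 iterations` and `Quick-TRANSfer stage steps exceeded maximum` warnings above come from `kmeans()` being run with its default `iter.max = 10` and the default Hartigan-Wong algorithm. A minimal sketch of one way to quiet them is shown below, assuming the gap statistic was computed with `cluster::clusGap()` and `kmeans` as the clustering function; the original call is not shown in this section, so the `toy` matrix, `K.max`, `B`, `nstart`, and `iter.max` values here are illustrative assumptions rather than the author's settings.

```r
library(cluster)  # provides clusGap() and maxSE()

set.seed(123)
# Hypothetical stand-in for clean_data: three well-separated 2-D blobs
toy <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 6, sd = 0.3), ncol = 2)
)

# Arguments that clusGap() does not use itself are forwarded to kmeans():
#   iter.max  raises the iteration cap above the default of 10
#   nstart    tries several random starts and keeps the best one
#   algorithm switches away from Hartigan-Wong, the only algorithm that
#             emits the Quick-TRANSfer warning
gap_toy <- clusGap(toy, FUNcluster = kmeans, K.max = 6, B = 20,
                   nstart = 10, iter.max = 50, algorithm = "MacQueen",
                   verbose = FALSE)

# Number of clusters suggested by the "first SE max" rule
k_hat <- maxSE(gap_toy$Tab[, "gap"], gap_toy$Tab[, "SE.sim"],
               method = "firstSEmax")
print(k_hat)
```

Raising `iter.max` increases run time, so on a dataset of this size it may be worth lowering `B` or computing the gap statistic on a random sample of rows first.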
d <- dist(clean_data, method="euclidean")
# Clustering algorithm deployment
model <- hclust(d, method="ward.D2")
# viewing the dendrogram
plot(model, cex=0.6, hang=-1)
# Ward's method
hc <- hclust(d, method="ward.D2")
# cut the tree into 4 groups
sub_grp <- cutree(hc, k=4)
table(sub_grp)
## sub_grp
##    1    2    3    4
## 2899 4100 2844 2356
plot(hc, cex=2, hang=-1 )
rect.hclust(hc, k=4, border=2:5)
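Cluster sizes alone do not describe the customer groups. One possible next step, sketched here on made-up data, is to average each feature within each cluster returned by `cutree()`. The `toy` data frame and its `pages` and `duration` columns are hypothetical stand-ins and are not taken from the dataset.

```r
set.seed(42)
# Hypothetical stand-in for clean_data: two obvious groups of sessions
toy <- data.frame(
  pages    = c(rnorm(20, mean = 5,  sd = 1),  rnorm(20, mean = 30,  sd = 3)),
  duration = c(rnorm(20, mean = 60, sd = 10), rnorm(20, mean = 900, sd = 50))
)

# Scale first so that no single feature dominates the Euclidean distance
toy_scaled <- scale(toy)

# Same pipeline as above: distance matrix, Ward linkage, cut into k groups
hc_toy <- hclust(dist(toy_scaled, method = "euclidean"), method = "ward.D2")
grp <- cutree(hc_toy, k = 2)

# Per-cluster means on the original (unscaled) units for readability
profile <- aggregate(toy, by = list(cluster = grp), FUN = mean)
print(profile)
```

Applied to the real data, the same `aggregate()` call with `sub_grp` as the grouping vector would give one row of feature means per cluster.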
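In this analysis, K-means produced more clearly defined clusters than hierarchical clustering, so it is the recommended approach for segmenting these customers. K-means also scales better to a dataset of this size, because hierarchical clustering has to hold the full pairwise distance matrix in memory.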
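A comparison like this can be backed with numbers. The sketch below, which runs on synthetic data rather than the real dataset, cross-tabulates the two sets of labels and compares average silhouette widths using `cluster::silhouette()`. All object names here are illustrative assumptions.

```r
library(cluster)  # provides silhouette()

set.seed(1)
# Hypothetical stand-in for clean_data: two blobs of 30 points each
toy <- scale(rbind(
  matrix(rnorm(60, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(60, mean = 4, sd = 0.5), ncol = 2)
))
d_toy <- dist(toy, method = "euclidean")

# Fit both methods with the same number of clusters
km <- kmeans(toy, centers = 2, nstart = 25)
hc_grp <- cutree(hclust(d_toy, method = "ward.D2"), k = 2)

# Cross-tabulation: how often the two methods put a point in matching groups
agreement <- table(kmeans = km$cluster, hclust = hc_grp)
print(agreement)

# Average silhouette width: closer to 1 means tighter, better-separated clusters
sil_km <- mean(silhouette(km$cluster, d_toy)[, "sil_width"])
sil_hc <- mean(silhouette(hc_grp, d_toy)[, "sil_width"])
print(c(kmeans = sil_km, hclust = sil_hc))
```

Cluster labels are arbitrary, so agreement shows up as large counts concentrated in one cell per row of the table, not necessarily on the diagonal.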