In this project, I classify people's personality types. By the end, I hope to show that the respondents in this dataset fall into distinct groups. The analysis uses K-means clustering.
personality <- read.csv("Personality classification Data.csv")
anyNA(personality)
## [1] TRUE
colSums(is.na(personality))
##             Worry.about.things.            Make.friends.easily.
##                               5                               5
##       Have.a.vivid.imagination.                    Trust.others
##                               2                               0
##               Get.angry.easily.             Love.large.parties.
##                               2                               0
## Would.never.cheat.to.get.ahead.                     Like.order.
##                               5                               0
##             Often.feel.unhappy.                    Take.charge.
##                               2                               0
##       Make.people.feel.welcome.        Try.to.follow.the.rules.
##                               1                               0
##                 Am.always.busy.             Am.easy.to.satisfy.
##                               3                               0
##       Go.straight.for.the.goal.              Often.overindulge.
##                               0                               3
##                   Panic.easily.                 Avoid.mistakes.
##                               7                               0
##      Warm.up.quickly.to.others.           Get.irritated.easily.
##                               4                               6
##             Stick.to.the.rules.             Try.to.lead.others.
##                               6                               0
##               Keep.my.promises.           Feel.others..emotions
##                               0                               2
##       Like.to.visit.new.places.                      Work.hard.
##                               0                              13
##   Become.overwhelmed.by.events.      Choose.my.words.with.care.
##                               0                               0
##           I.have.little.to.say.    I.am.quiet.around.strangers.
##                               6                               9
Now let’s drop the rows that contain missing values.
personality_clean <-
  personality |>
  na.exclude()
anyNA(personality_clean)
## [1] FALSE
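To make this step concrete, here is a minimal sketch on a made-up data frame (`toy` is hypothetical, not the survey data). It uses `na.omit()`, which drops the same rows as `na.exclude()`, and shows that the result matches keeping only the complete cases.

```r
# Toy data frame (hypothetical, not the survey data) with a few missing values
toy <- data.frame(
  a = c(1, NA, 0, 2),
  b = c(0, 1, NA, 1),
  c = c(2, 2, 1, 0)
)

# Number of missing values per column
colSums(is.na(toy))

# Drop every row that has at least one NA
toy_clean <- na.omit(toy)

# Rows kept versus rows dropped
nrow(toy_clean)
nrow(toy) - nrow(toy_clean)
```

Dropping rows is the simplest option, but it can cost a lot of data when the missing values are spread over many rows, so it is worth comparing the row counts before and after.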
library(dplyr)
glimpse(personality_clean)
## Rows: 68
## Columns: 30
## $ Worry.about.things. <int> -1, 0, -1, -1, -1, 2, 0, -2, 2, -1, -2…
## $ Make.friends.easily. <int> 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1,…
## $ Have.a.vivid.imagination. <int> 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1,…
## $ Trust.others <int> 1, 2, 1, 1, 2, 0, 0, 0, 2, 1, 0, 0, 1,…
## $ Get.angry.easily. <int> -1, 0, -1, -1, -1, -2, -2, -2, 0, -1, …
## $ Love.large.parties. <int> 0, 1, 1, 1, 0, 1, 2, 2, 1, 1, 2, 2, 1,…
## $ Would.never.cheat.to.get.ahead. <int> -1, -2, -1, -2, -2, -1, -2, -1, -2, -1…
## $ Like.order. <int> 1, 2, 1, 1, 2, 0, 0, 0, 2, 1, 0, 0, 1,…
## $ Often.feel.unhappy. <int> 0, -1, 0, 1, -1, 1, 1, 0, -1, 0, 0, 1,…
## $ Take.charge. <int> 1, 2, 1, 1, 2, 0, 0, 0, 2, 1, 0, 0, 1,…
## $ Make.people.feel.welcome. <int> 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1,…
## $ Try.to.follow.the.rules. <int> 1, 2, 1, 1, 2, 1, 0, 0, 2, 1, 0, 0, 1,…
## $ Am.always.busy. <int> -1, -2, -1, 2, -1, -2, -1, -1, -2, -1,…
## $ Am.easy.to.satisfy. <int> 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1,…
## $ Go.straight.for.the.goal. <int> 2, 1, 0, 2, 0, 0, 2, 1, 1, 0, 1, 2, 0,…
## $ Often.overindulge. <int> 0, 2, 1, 1, 2, 0, 0, 0, 2, 1, 0, 0, 1,…
## $ Panic.easily. <int> -1, -2, -1, 2, -1, -1, 2, -2, -2, -1, …
## $ Avoid.mistakes. <int> 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1,…
## $ Warm.up.quickly.to.others. <int> 0, 0, 1, 1, 1, 0, -1, 0, 1, 1, 0, -1, …
## $ Get.irritated.easily. <int> 1, -1, 1, 0, -1, 0, 1, 1, -1, 0, 0, 1,…
## $ Stick.to.the.rules. <int> 1, 1, 2, 0, 1, 2, 2, 0, 1, 2, 2, 2, 1,…
## $ Try.to.lead.others. <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0,…
## $ Keep.my.promises. <int> 2, 1, 0, 2, 2, 1, 2, 0, 2, 1, 0, 2, 1,…
## $ Feel.others..emotions <int> 1, 2, 1, 1, 2, 2, 1, 1, 1, 0, 1, 1, 0,…
## $ Like.to.visit.new.places. <int> 2, 2, 2, 2, 1, 2, 1, 1, 1, 0, 1, 1, 0,…
## $ Work.hard. <int> 1, 2, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 2,…
## $ Become.overwhelmed.by.events. <int> 1, 1, 0, 0, -2, 1, 1, -1, -1, 0, -2, 1…
## $ Choose.my.words.with.care. <int> 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1,…
## $ I.have.little.to.say. <int> 1, 0, 1, 0, 1, 0, 1, 0, 1, 2, 0, 1, 0,…
## $ I.am.quiet.around.strangers. <int> 1, -1, 1, 0, -1, 0, -2, -2, 0, -1, 0, …
Take a look at the first rows of the raw data set:
personality |> head(4)
## Worry.about.things. Make.friends.easily. Have.a.vivid.imagination.
## 1 -1 0 1
## 2 0 1 1
## 3 -1 1 1
## 4 -2 1 0
## Trust.others Get.angry.easily. Love.large.parties.
## 1 1 -1 0
## 2 2 0 1
## 3 1 -1 1
## 4 0 -2 2
## Would.never.cheat.to.get.ahead. Like.order. Often.feel.unhappy. Take.charge.
## 1 -1 1 0 1
## 2 -2 2 -1 2
## 3 -1 1 0 1
## 4 -1 0 0 0
## Make.people.feel.welcome. Try.to.follow.the.rules. Am.always.busy.
## 1 0 1 -1
## 2 1 2 -2
## 3 1 1 -1
## 4 NA 0 -1
## Am.easy.to.satisfy. Go.straight.for.the.goal. Often.overindulge.
## 1 0 2 0
## 2 1 1 2
## 3 1 0 1
## 4 1 1 0
## Panic.easily. Avoid.mistakes. Warm.up.quickly.to.others.
## 1 -1 0 0
## 2 -2 0 0
## 3 -1 1 1
## 4 0 1 -1
## Get.irritated.easily. Stick.to.the.rules. Try.to.lead.others.
## 1 1 1 1
## 2 -1 1 1
## 3 1 2 1
## 4 1 1 1
## Keep.my.promises. Feel.others..emotions Like.to.visit.new.places. Work.hard.
## 1 2 1 2 1
## 2 1 2 2 2
## 3 0 1 2 1
## 4 1 0 1 0
## Become.overwhelmed.by.events. Choose.my.words.with.care.
## 1 1 0
## 2 1 1
## 3 0 1
## 4 -1 0
## I.have.little.to.say. I.am.quiet.around.strangers.
## 1 1 1
## 2 0 -1
## 3 1 1
## 4 0 1
Check the data types:
library(dplyr)
glimpse(personality)
## Rows: 122
## Columns: 30
## $ Worry.about.things. <int> -1, 0, -1, -2, -1, -1, 2, 0, -1, -2, -…
## $ Make.friends.easily. <int> 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1,…
## $ Have.a.vivid.imagination. <int> 1, 1, 1, 0, 1, 0, 1, 1, NA, 0, 1, 0, 1…
## $ Trust.others <int> 1, 2, 1, 0, 1, 2, 0, 0, 1, 0, 1, 2, 2,…
## $ Get.angry.easily. <int> -1, 0, -1, -2, -1, -1, -2, -2, -1, -2,…
## $ Love.large.parties. <int> 0, 1, 1, 2, 1, 0, 1, 2, 1, 2, 1, 0, 1,…
## $ Would.never.cheat.to.get.ahead. <int> -1, -2, -1, -1, -2, -2, -1, -2, -1, -1…
## $ Like.order. <int> 1, 2, 1, 0, 1, 2, 0, 0, 1, 0, 1, 2, 2,…
## $ Often.feel.unhappy. <int> 0, -1, 0, 0, 1, -1, 1, 1, 0, 0, 1, -1,…
## $ Take.charge. <int> 1, 2, 1, 0, 1, 2, 0, 0, 1, 0, 1, 2, 2,…
## $ Make.people.feel.welcome. <int> 0, 1, 1, NA, 1, 0, 1, 0, 1, 1, 1, 0, 1…
## $ Try.to.follow.the.rules. <int> 1, 2, 1, 0, 1, 2, 1, 0, 1, 0, 1, 2, 2,…
## $ Am.always.busy. <int> -1, -2, -1, -1, 2, -1, -2, -1, -1, -1,…
## $ Am.easy.to.satisfy. <int> 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1,…
## $ Go.straight.for.the.goal. <int> 2, 1, 0, 1, 2, 0, 0, 2, 0, 1, 2, 0, 1,…
## $ Often.overindulge. <int> 0, 2, 1, 0, 1, 2, 0, 0, 1, 0, 1, 2, 2,…
## $ Panic.easily. <int> -1, -2, -1, 0, 2, -1, -1, 2, -1, -2, -…
## $ Avoid.mistakes. <int> 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1,…
## $ Warm.up.quickly.to.others. <int> 0, 0, 1, -1, 1, 1, 0, -1, 0, 0, 1, -1,…
## $ Get.irritated.easily. <int> 1, -1, 1, 1, 0, -1, 0, 1, -1, 1, 1, 0,…
## $ Stick.to.the.rules. <int> 1, 1, 2, 1, 0, 1, 2, 2, 1, 0, 1, 0, 1,…
## $ Try.to.lead.others. <int> 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1,…
## $ Keep.my.promises. <int> 2, 1, 0, 1, 2, 2, 1, 2, 1, 0, 2, 1, 2,…
## $ Feel.others..emotions <int> 1, 2, 1, 0, 1, 2, 2, 1, 0, 1, 1, NA, 1…
## $ Like.to.visit.new.places. <int> 2, 2, 2, 1, 2, 1, 2, 1, 0, 1, 2, 2, 1,…
## $ Work.hard. <int> 1, 2, 1, 0, 1, 2, 1, 1, 2, 1, 1, 0, 1,…
## $ Become.overwhelmed.by.events. <int> 1, 1, 0, -1, 0, -2, 1, 1, 0, -1, 0, -2…
## $ Choose.my.words.with.care. <int> 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0,…
## $ I.have.little.to.say. <int> 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1,…
## $ I.am.quiet.around.strangers. <int> 1, -1, 1, 1, 0, -1, 0, -2, -2, -2, -1,…
All of the data types look appropriate (every column is an integer), so let’s move on to the next step.
Check the scale between variables
summary(personality_clean)
## Worry.about.things. Make.friends.easily. Have.a.vivid.imagination.
## Min. :-2.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:-1.0000 1st Qu.:0.0000 1st Qu.:0.0000
## Median :-1.0000 Median :1.0000 Median :1.0000
## Mean :-0.3676 Mean :0.6176 Mean :0.7059
## 3rd Qu.: 0.0000 3rd Qu.:1.0000 3rd Qu.:1.0000
## Max. : 2.0000 Max. :1.0000 Max. :1.0000
## Trust.others Get.angry.easily. Love.large.parties.
## Min. :0.0000 Min. :-2.000 Min. :0.000
## 1st Qu.:0.0000 1st Qu.:-2.000 1st Qu.:1.000
## Median :1.0000 Median :-1.000 Median :1.000
## Mean :0.8676 Mean :-1.309 Mean :1.088
## 3rd Qu.:2.0000 3rd Qu.:-1.000 3rd Qu.:2.000
## Max. :2.0000 Max. : 0.000 Max. :2.000
## Would.never.cheat.to.get.ahead. Like.order. Often.feel.unhappy.
## Min. :-2.000 Min. :0.0000 Min. :-1.0000
## 1st Qu.:-2.000 1st Qu.:0.0000 1st Qu.:-1.0000
## Median :-2.000 Median :1.0000 Median : 0.0000
## Mean :-1.529 Mean :0.8676 Mean : 0.1176
## 3rd Qu.:-1.000 3rd Qu.:2.0000 3rd Qu.: 1.0000
## Max. :-1.000 Max. :2.0000 Max. : 1.0000
## Take.charge. Make.people.feel.welcome. Try.to.follow.the.rules.
## Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000
## Median :1.0000 Median :1.0000 Median :1.0000
## Mean :0.8676 Mean :0.6912 Mean :0.9853
## 3rd Qu.:2.0000 3rd Qu.:1.0000 3rd Qu.:2.0000
## Max. :2.0000 Max. :1.0000 Max. :2.0000
## Am.always.busy. Am.easy.to.satisfy. Go.straight.for.the.goal.
## Min. :-2.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:-1.0000 1st Qu.:0.0000 1st Qu.:0.0000
## Median :-1.0000 Median :1.0000 Median :1.0000
## Mean :-0.7647 Mean :0.6912 Mean :0.8088
## 3rd Qu.:-1.0000 3rd Qu.:1.0000 3rd Qu.:2.0000
## Max. : 2.0000 Max. :1.0000 Max. :2.0000
## Often.overindulge. Panic.easily. Avoid.mistakes.
## Min. :0.0000 Min. :-2.0000 Min. :0.00
## 1st Qu.:0.0000 1st Qu.:-1.0000 1st Qu.:0.75
## Median :1.0000 Median :-1.0000 Median :1.00
## Mean :0.8529 Mean :-0.4265 Mean :0.75
## 3rd Qu.:2.0000 3rd Qu.: 0.0000 3rd Qu.:1.00
## Max. :2.0000 Max. : 2.0000 Max. :1.00
## Warm.up.quickly.to.others. Get.irritated.easily. Stick.to.the.rules.
## Min. :-1.0000 Min. :-1.0000 Min. :0.000
## 1st Qu.:-0.2500 1st Qu.:-1.0000 1st Qu.:1.000
## Median : 0.0000 Median : 0.0000 Median :1.000
## Mean : 0.1618 Mean : 0.2059 Mean :1.294
## 3rd Qu.: 1.0000 3rd Qu.: 1.0000 3rd Qu.:2.000
## Max. : 1.0000 Max. : 1.0000 Max. :2.000
## Try.to.lead.others. Keep.my.promises. Feel.others..emotions
## Min. :0.0000 Min. :0.000 Min. :0.0000
## 1st Qu.:1.0000 1st Qu.:1.000 1st Qu.:0.0000
## Median :1.0000 Median :2.000 Median :1.0000
## Mean :0.7647 Mean :1.412 Mean :0.8824
## 3rd Qu.:1.0000 3rd Qu.:2.000 3rd Qu.:1.0000
## Max. :1.0000 Max. :2.000 Max. :2.0000
## Like.to.visit.new.places. Work.hard. Become.overwhelmed.by.events.
## Min. :0.000 Min. :0.00 Min. :-2.00000
## 1st Qu.:1.000 1st Qu.:1.00 1st Qu.: 0.00000
## Median :1.000 Median :1.00 Median : 0.00000
## Mean :1.162 Mean :1.25 Mean : 0.02941
## 3rd Qu.:2.000 3rd Qu.:2.00 3rd Qu.: 1.00000
## Max. :2.000 Max. :2.00 Max. : 1.00000
## Choose.my.words.with.care. I.have.little.to.say. I.am.quiet.around.strangers.
## Min. :0.0000 Min. :0.0000 Min. :-2.000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:-2.000
## Median :1.0000 Median :0.5000 Median :-2.000
## Mean :0.5588 Mean :0.5294 Mean :-1.147
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.: 0.000
## Max. :1.0000 Max. :2.0000 Max. : 1.000
The data do not need to be scaled, because every item is answered on the same five-point scale:
-2 : Strongly Disagree
-1 : Disagree
0 : Neutral
1 : Agree
2 : Strongly Agree
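A quick way to confirm that every column stays on the shared scale is to look at the column ranges. The sketch below does this on randomly generated responses (`toy` is a hypothetical stand-in for the survey data).

```r
# Toy responses (hypothetical) on the same five-point scale, coded -2 to 2
set.seed(1)
toy <- as.data.frame(matrix(sample(-2:2, 40, replace = TRUE), ncol = 4))

# Minimum (row 1) and maximum (row 2) of every column
ranges <- sapply(toy, range)
ranges

# TRUE when no column leaves the shared scale
same_scale <- all(ranges[1, ] >= -2 & ranges[2, ] <= 2)
same_scale
```

If some columns were measured in different units, `scale()` would be the usual step before calling `kmeans()`, since the algorithm relies on Euclidean distance.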
library(factoextra)
## Loading required package: ggplot2
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
fviz_nbclust(
  x = personality_clean,
  FUNcluster = kmeans,
  method = "wss"
)
We will use k = 8: up to that point the WSS curve falls steeply, and beyond it the curve only slopes gently downward.
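For readers who want to see what the elbow plot is built from, here is a rough sketch that computes the total within-cluster sum of squares by hand for several values of k. It runs on simulated data (`toy` is hypothetical), and `nstart = 10` is my own choice to make the runs more stable, not something used in the analysis above.

```r
# Toy data (hypothetical): 60 points around three well-separated centres
set.seed(42)
toy <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 6, sd = 0.3), ncol = 2)
)

# Total within-cluster sum of squares for k = 1..8
wss <- sapply(1:8, function(k) {
  kmeans(toy, centers = k, nstart = 10)$tot.withinss
})

# The curve drops steeply up to the "elbow" and flattens afterwards
plot(1:8, wss, type = "b",
     xlab = "Number of clusters k", ylab = "Total WSS")
```

On data like this the elbow sits clearly at k = 3. On real survey data the bend is usually much softer, which is why the choice of k involves some judgment.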
RNGkind(sample.kind = "Rounding")
set.seed(687)
personality_cluster <- kmeans(x = personality_clean,
                              centers = 8)
personality_cluster$size
## [1]  9  7  6 23  8  4  6  5
personality_cluster$centers
## Worry.about.things. Make.friends.easily. Have.a.vivid.imagination.
## 1 1.5555556 0.4444444 0.7777778
## 2 -0.2857143 0.4285714 0.7142857
## 3 2.0000000 0.1666667 1.0000000
## 4 -1.1739130 0.7826087 0.8260870
## 5 -1.5000000 0.2500000 0.7500000
## 6 -0.5000000 1.0000000 1.0000000
## 7 -0.8333333 0.8333333 0.1666667
## 8 -0.6000000 1.0000000 0.0000000
## Trust.others Get.angry.easily. Love.large.parties.
## 1 0.0000000 -2.0000000 1.5555556
## 2 2.0000000 -0.2857143 0.7142857
## 3 1.0000000 -1.0000000 1.0000000
## 4 0.5652174 -1.4347826 1.2173913
## 5 0.0000000 -2.0000000 2.0000000
## 6 1.0000000 -1.0000000 1.0000000
## 7 2.0000000 -0.8333333 0.1666667
## 8 2.0000000 -1.0000000 0.0000000
## Would.never.cheat.to.get.ahead. Like.order. Often.feel.unhappy. Take.charge.
## 1 -1 0.0000000 0.4444444 0.0000000
## 2 -2 2.0000000 -1.0000000 2.0000000
## 3 -2 1.0000000 1.0000000 1.0000000
## 4 -1 0.5652174 0.1739130 0.5652174
## 5 -2 0.0000000 1.0000000 0.0000000
## 6 -2 1.0000000 1.0000000 1.0000000
## 7 -2 2.0000000 -1.0000000 2.0000000
## 8 -2 2.0000000 -1.0000000 2.0000000
## Make.people.feel.welcome. Try.to.follow.the.rules. Am.always.busy.
## 1 1.0000000 0.4444444 -1.444444
## 2 0.7142857 2.0000000 -1.714286
## 3 1.0000000 1.0000000 2.000000
## 4 0.9565217 0.7391304 -1.173913
## 5 0.0000000 0.0000000 -1.000000
## 6 1.0000000 1.0000000 2.000000
## 7 0.1666667 2.0000000 -1.166667
## 8 0.0000000 2.0000000 -1.000000
## Am.easy.to.satisfy. Go.straight.for.the.goal. Often.overindulge.
## 1 1.0000000 0.5555556 0.0000000
## 2 0.7142857 0.7142857 2.0000000
## 3 1.0000000 2.0000000 1.0000000
## 4 0.9565217 0.3478261 0.5217391
## 5 0.0000000 2.0000000 0.0000000
## 6 1.0000000 2.0000000 1.0000000
## 7 0.1666667 0.1666667 2.0000000
## 8 0.0000000 0.0000000 2.0000000
## Panic.easily. Avoid.mistakes. Warm.up.quickly.to.others.
## 1 -1.5555556 0.4444444 0.6666667
## 2 -1.7142857 0.8571429 0.5714286
## 3 2.0000000 0.6666667 0.1666667
## 4 -0.9130435 0.8695652 0.2608696
## 5 2.0000000 0.8750000 -0.7500000
## 6 0.5000000 0.0000000 0.7500000
## 7 -1.1666667 1.0000000 -0.3333333
## 8 -1.0000000 0.8000000 -0.2000000
## Get.irritated.easily. Stick.to.the.rules. Try.to.lead.others.
## 1 0.5555556 0.8888889 0.6666667
## 2 -0.7142857 1.0000000 0.7142857
## 3 0.5000000 1.5000000 0.5000000
## 4 0.2173913 1.4347826 0.7826087
## 5 0.8750000 1.7500000 1.0000000
## 6 0.5000000 0.0000000 1.0000000
## 7 -1.0000000 1.6666667 0.6666667
## 8 0.6000000 1.4000000 0.8000000
## Keep.my.promises. Feel.others..emotions Like.to.visit.new.places. Work.hard.
## 1 1.333333 0.7777778 1.777778 0.7777778
## 2 1.571429 1.7142857 1.142857 1.4285714
## 3 1.666667 0.6666667 1.333333 1.3333333
## 4 1.086957 0.6956522 1.086957 1.3913043
## 5 1.625000 1.1250000 1.125000 1.0000000
## 6 1.250000 0.7500000 1.750000 0.7500000
## 7 1.666667 0.6666667 0.000000 2.0000000
## 8 2.000000 1.0000000 1.200000 1.0000000
## Become.overwhelmed.by.events. Choose.my.words.with.care.
## 1 -0.44444444 0.7777778
## 2 -1.28571429 0.4285714
## 3 0.83333333 0.5000000
## 4 0.08695652 0.6521739
## 5 0.75000000 0.2500000
## 6 -0.75000000 0.7500000
## 7 0.00000000 0.3333333
## 8 1.00000000 0.6000000
## I.have.little.to.say. I.am.quiet.around.strangers.
## 1 0.5555556 -1.1111111
## 2 0.8571429 -0.1428571
## 3 0.6666667 -1.3333333
## 4 0.3913043 -0.8695652
## 5 0.7500000 -1.7500000
## 6 0.2500000 -1.0000000
## 7 0.0000000 -2.0000000
## 8 1.0000000 -1.8000000
Interpretation based on the centroids:
(+) Strong points, (-) Weak points
We take the extreme points, where a cluster clearly sits at one pole of the scale, based on a threshold of an absolute centroid value greater than 1.5.
Cluster 1:
(+)
- Don’t get angry easily
- Don’t panic easily
- Love large parties
- Like to visit new places
(-)
- Worry about things
Cluster 2:
(+)
- Trust others
- Like order
- Take charge
- Try to follow the rules
- Don’t panic easily
- Feel others’ emotions
- Keep promises
- Don’t cheat to get ahead
(-)
- Not always busy
- Often overindulge
Cluster 3:
(+)
- Always busy
- Go straight for the goal
- Keep promises
- Don’t cheat to get ahead
(-)
- Panic easily
Cluster 4:
(+)
- None
(-)
- None
Cluster 5:
(+)
- Don’t get angry easily
- Love large parties
- Don’t cheat to get ahead
- Go straight for the goal
- Stick to the rules
- Keep promises
- Not quiet around strangers
(-)
- Worry about things
- Panic easily
Cluster 6:
(+)
- Don’t cheat to get ahead
- Always busy
- Go straight for the goal
- Like to visit new places
(-)
- None
Cluster 7:
(+)
- Have a vivid imagination
- Love large parties
- Don’t cheat to get ahead
- Like order
- Take charge
- Try to follow the rules
- Stick to the rules
- Work hard
- Not quiet around strangers
(-)
- Often overindulge
Cluster 8:
(+)
- Trust others
- Don’t cheat to get ahead
- Like order
- Take charge
- Try to follow the rules
- Warm up quickly to others
- Keep promises
- Not quiet around strangers
(-)
- Often overindulge
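The threshold rule behind this interpretation can also be applied in code, which makes it easier to re-check the lists against the centroid table. Below is a minimal sketch on a small excerpt of that table (clusters 1 and 6, four items). The names `centers`, `threshold` and `extreme_items` are my own; on the full model the same function could be run over `personality_cluster$centers`.

```r
# A small excerpt of the centroid table above (clusters 1 and 6, four items)
centers <- rbind(
  "1" = c(Worry.about.things. = 1.5555556, Make.friends.easily. = 0.4444444,
          Get.angry.easily. = -2, Love.large.parties. = 1.5555556),
  "6" = c(Worry.about.things. = -0.5, Make.friends.easily. = 1,
          Get.angry.easily. = -1, Love.large.parties. = 1)
)

threshold <- 1.5

# For each cluster, keep the items whose centroid lies beyond the threshold
# in either direction; the sign shows agreement (+) or disagreement (-)
extreme_items <- lapply(rownames(centers), function(cl) {
  row <- centers[cl, ]
  row[abs(row) > threshold]
})
names(extreme_items) <- rownames(centers)
extreme_items
```

Whether an extreme value counts as a strong or a weak point still depends on how the item is worded, so that last step stays a manual judgment.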
personality_cluster$iter
## [1] 3
personality_cluster$withinss
## [1]  76.22222  56.57143  27.83333 212.43478  31.37500  25.00000  22.33333
## [8]  10.80000
personality_cluster$tot.withinss
## [1] 462.5701
personality_cluster$betweenss
## [1] 793.8711
personality_cluster$totss
## [1] 1256.441
personality_cluster$betweenss/personality_cluster$totss
## [1] 0.631841
Judging the clustering by how much of the spread in the data it explains, the result is not very good: the ratio of the between-cluster sum of squares to the total sum of squares is only about 0.63. We can try to improve it with a different k. Based on the elbow method, k = 9 is another reasonable candidate.
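The ratio used here follows from the fact that the total sum of squares splits into a within-cluster part and a between-cluster part. A short check with the values reported above for the k = 8 model (the variable names are my own):

```r
# Values reported above for the k = 8 model
tot_withinss <- 462.5701
betweenss    <- 793.8711

# Total sum of squares = within-cluster part + between-cluster part
totss <- tot_withinss + betweenss
totss

# Share of the total spread that lies between the clusters
betweenss / totss
```

The closer this ratio is to 1, the more of the variation lies between clusters rather than inside them.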
RNGkind(sample.kind = "Rounding")
set.seed(687)
personality_cluster2 <- kmeans(x = personality_clean,
                               centers = 9)
personality_cluster2$betweenss/personality_cluster2$totss
## [1] 0.6417898
The revised model shows no meaningful improvement (about 0.64 versus 0.63). We could fit a model with any k we like. However, if k is too high the clustering is not very useful, because each cluster then contains only a few observations. For instance, here we use k = 30:
RNGkind(sample.kind = "Rounding")
set.seed(687)
personality_cluster3 <- kmeans(x = personality_clean,
                               centers = 30)
Check how good the clustering is based on the spread of the data:
personality_cluster3$betweenss/personality_cluster3$totss
## [1] 0.9235937
By this measure the model clusters the data well. However, let’s check how many observations each cluster has:
personality_cluster3$size
## [1] 2 2 4 2 1 1 1 4 2 2 2 1 2 2 4 6 2 6 3 1 1 1 3 2 1 2 2 2 3 1
Several clusters contain only one observation, which means the model is not really grouping observations together: such a cluster holds nothing but a single point.
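The same point can be made with a few summary numbers. This sketch simply copies the cluster sizes printed above into a vector (`sizes` is my own name) and counts the single-observation clusters.

```r
# Cluster sizes reported above for k = 30
sizes <- c(2, 2, 4, 2, 1, 1, 1, 4, 2, 2, 2, 1, 2, 2, 4,
           6, 2, 6, 3, 1, 1, 1, 3, 2, 1, 2, 2, 2, 3, 1)

# Total number of observations across all clusters
sum(sizes)

# Number of clusters that hold a single observation
sum(sizes == 1)

# Average number of observations per cluster
mean(sizes)
```

With 68 observations spread over 30 clusters, nine clusters end up holding a single respondent, so the high ratio mostly reflects k being close to the sample size.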
library(tidyr)
personality_clean$cluster <- as.factor(personality_cluster2$cluster)
personality_centroid <- personality_clean |>
  group_by(cluster) |>
  summarise_all(mean)
personality_centroid %>%
  pivot_longer(-cluster) %>%
  ggplot(aes(x = cluster, y = value, fill = cluster)) +
  geom_col() +
  facet_wrap(~name)
library(ggiraphExtra)
ggRadar(data = personality_clean,
        aes(colour = cluster),
        interactive = TRUE,
        legend.position = "right")