When drawing heatmaps of standardized variables, it can be useful to discretize the overall (pooled) distribution of those variables.
library(tidyverse)
## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Conflicts with tidy packages ----------------------------------------------
## filter(): dplyr, stats
## lag(): dplyr, stats
library(gplots) # for drawing heatmaps
##
## Attaching package: 'gplots'
## The following object is masked from 'package:stats':
##
## lowess
library(classInt) # for discretization
mtcars <- mtcars %>%
  select(cyl, gear, carb)
head(mtcars)
## cyl gear carb
## Mazda RX4 6 4 4
## Mazda RX4 Wag 6 4 4
## Datsun 710 4 4 1
## Hornet 4 Drive 6 3 1
## Hornet Sportabout 8 3 2
## Valiant 6 3 1
Create a colour palette (used to display the heatmap).
my_palette <- colorRampPalette(c("#018571", "#f5f5f5", "#a6611a"))(n = 5)
# Center and scale each column (z-scores); TRUE is spelled out because T can be reassigned
mtcars_stand <- mtcars %>%
  scale(center = TRUE, scale = TRUE)
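As a quick sanity check on the standardization step, the sketch below confirms that each scaled column has a mean of about 0 and a standard deviation of 1, and that `scale()` matches the z-score computed by hand. The names `m` and `cyl_manual` are illustrative only and are not part of the original post.

```r
# Standardize the three mtcars columns used in this post
m <- scale(as.matrix(mtcars[, c("cyl", "gear", "carb")]),
           center = TRUE, scale = TRUE)

# Each column should have mean ~0 and standard deviation 1
round(colMeans(m), 10)
apply(m, 2, sd)

# Same result computed by hand for one column: (x - mean) / sd
cyl_manual <- (mtcars$cyl - mean(mtcars$cyl)) / sd(mtcars$cyl)
all.equal(unname(m[, "cyl"]), cyl_manual)
```

Because all three variables end up on the same scale, their values can reasonably be pooled into a single distribution later on.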
heatmap.2(mtcars_stand, Rowv = TRUE, Colv = TRUE,
          dendrogram = "both",
          col = my_palette,
          denscol = "black",
          trace = "none")
We can now look at the shape of the overall distribution of the variables. To do so, the table first has to be reshaped into long format.
FormeDistribution <- mtcars_stand %>%
  as.data.frame() %>%
  gather(key = TypeData, value = ValStandar, cyl:carb)
head(FormeDistribution)
## TypeData ValStandar
## 1 cyl -0.1049878
## 2 cyl -0.1049878
## 3 cyl -1.2248578
## 4 cyl -0.1049878
## 5 cyl 1.0148821
## 6 cyl -0.1049878
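In recent versions of tidyr, `gather()` is superseded by `pivot_longer()`. The sketch below shows what the same reshaping might look like with the newer function, assuming tidyr 1.0.0 or later is installed. The name `long` is illustrative. Note that `pivot_longer()` emits rows in a different order than `gather()`, so the result is sorted by variable here to get a comparable layout.

```r
library(dplyr)
library(tidyr)

long <- mtcars %>%
  select(cyl, gear, carb) %>%
  scale(center = TRUE, scale = TRUE) %>%
  as.data.frame() %>%
  # One row per (variable, standardized value) pair
  pivot_longer(cols = everything(),
               names_to = "TypeData",
               values_to = "ValStandar") %>%
  arrange(TypeData)

head(long)
```

The pooled values are the same either way; only the row order differs, which does not matter for the histogram, the boxplot, or the discretization.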
ggplot(FormeDistribution, aes(ValStandar)) +
  geom_histogram(binwidth = 0.25)
ggplot(FormeDistribution, aes("ValStandar", ValStandar)) +
  geom_boxplot()
# Discretize the pooled standardized values into 5 classes (Jenks natural breaks)
Jenks <- classIntervals(FormeDistribution$ValStandar, n = 5, style = "jenks")
Jenks
## style: jenks
## one of 330 possible partitions of this variable into 5 classes
## [-1.224858,-0.9318192] (-0.9318192,-0.1049878] (-0.1049878,0.4235542]
## 33 17 15
## (0.4235542,1.014882] (1.014882,3.211677]
## 24 7
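To make the link between the Jenks classes and the heatmap colours explicit, here is a minimal sketch that rebuilds the breaks and assigns each value to its class. It relies on two points: `n` classes are delimited by `n + 1` break points, and `heatmap.2()` expects one more break than there are colours. The names `vals`, `jenks`, `palette5` and `classes` are illustrative and not part of the original post.

```r
library(classInt)

# Pooled standardized values of the three variables
vals <- as.vector(scale(as.matrix(mtcars[, c("cyl", "gear", "carb")])))

jenks <- classIntervals(vals, n = 5, style = "jenks")
palette5 <- colorRampPalette(c("#018571", "#f5f5f5", "#a6611a"))(n = 5)

# 5 classes -> 6 break points, i.e. one more than the number of colours
length(jenks$brks) == length(palette5) + 1

# Assign each value to its class; include.lowest keeps the minimum in class 1
classes <- cut(vals, breaks = jenks$brks, include.lowest = TRUE)
table(classes)
```

If the number of classes is changed, the palette has to be regenerated with the same `n`, otherwise the breaks and the colours no longer line up.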
heatmap.2(mtcars_stand, Rowv = TRUE, Colv = TRUE,
          dendrogram = "both",
          col = my_palette,
          breaks = Jenks$brks, # breaks taken from the Jenks discretization
          denscol = "black",
          trace = "none")