Hello to all interested! This is the third episode of the fantastic Viz Quiz series. This time we are going to look at a graph showing the distribution of the index of academic motivation, which reflects the degree of relative autonomy, across two waves of a longitudinal survey.
So, here is the graph:
‘Anything wrong?’
Well, the plot is neat and informative. Yet it lacks something a graph like this needs: a unified y-scale. Without it, the two distributions of the index cannot really be compared. In this visualization the y-scales cover different intervals of values, which makes it quite hard to compare the distributions. Right now it looks as if they do not overlap in the range of their index values, which is not true.
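To check the overlap claim with numbers, here is a small sketch. It assumes the index values I read off the original picture (the same ones used for the data further down in this post), so treat them as approximations rather than the actual survey data. The names `first_wave`, `second_wave`, `overlap` and `common_limits` are made up for this illustration.

```r
# Index values per wave, read off the original plot (approximate)
first_wave  <- -17:5
second_wave <- c(-12:6, 8, 10, 12, 18)

# Range shared by both waves: from the larger minimum to the smaller maximum
overlap <- c(max(min(first_wave), min(second_wave)),
             min(max(first_wave), max(second_wave)))
overlap
# [1] -12   5

# Limits wide enough to hold both waves, usable for a common axis
common_limits <- range(c(first_wave, second_wave))
common_limits
# [1] -17  18
```

So, under these assumptions, the two waves share the whole stretch from -12 to 5, which a unified scale would make visible at a glance.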
Improvements
So my first improvement concerns the y-scale. Another thing I would upgrade is the coloring. This part is not crucial, though; it’s just me and my addiction to colors. I would also place the distributions in the same “window” to make the plot a little more comprehensible. It would not be a violation to leave two windows as in the original picture, though.
So here goes my version:
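If you prefer to keep two windows, a minimal sketch of that option could look like this. The numbers in `toy` are invented for illustration only and are not the survey data; the point is just that `facet_wrap()` with `scales = "fixed"` (its default) gives both panels the same axes.

```r
library(ggplot2)

# Toy values for illustration only (not the survey data)
toy <- data.frame(
  wave  = rep(c("First wave", "Second wave"), each = 5),
  index = c(-4, -2, 0, 2, 4, -2, 0, 2, 4, 6),
  share = c(0.1, 0.2, 0.4, 0.2, 0.1, 0.1, 0.3, 0.4, 0.1, 0.1)
)

# One panel per wave, stacked vertically, with shared x and y scales
p <- ggplot(toy, aes(x = index, y = share, fill = wave)) +
  geom_col(alpha = 0.7) +
  facet_wrap(~ wave, ncol = 1, scales = "fixed") +
  theme_bw()
p
```

Stacking the panels in one column keeps the index axes aligned, so the shift between waves can still be read directly.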
library(ggplot2)
library(dplyr)
# 23 observations per wave, in the same order as the share and index vectors below
wave <- rep(c("First wave", "Second wave"), each = 23) %>% as.factor()
share <- c(0.004, 0.004, 0, 0.004, 0.008, 0.008, 0.004, 0.008, 0.0125, 0.025, 0.033, 0.0375, 0.054, 0.08, 0.11, 0.14, 0.175, 0.19, 0.16, 0.075, 0.051, 0.025, 0.0125, 0.01, 0.01, 0.005, 0.005, 0.008, 0.008, 0.01, 0.01, 0.05, 0.05, 0.29, 0.035, 0.035, 0.005, 0.005, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001) %>% as.numeric()
index <- c(-17, -16, -15, -14, -13, -12, -11,-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 8, 10, 12, 18) %>% as.numeric()
data <- data.frame(wave, share, index)
ggplot(data = data, aes(x = index, y = share, fill = wave)) +
  # shares are already computed, so draw the bars as given;
  # position = "identity" overlays both waves in one panel
  geom_col(position = "identity", alpha = 0.7) +
  #geom_freqpoly(stat = "identity", position = "identity", size = 1) +
  theme_bw() +
  labs(
    x = "Relative Autonomy Index",
    y = "Share of Students",
    title = "Distribution of the Index of Academic Motivation Reflecting the\nDegree of Relative Autonomy",
    fill = "",
    subtitle = "1st and 2nd rounds of the longitudinal survey"
  ) +
  scale_x_continuous(breaks = c(-20, -15, -10, -5, 0, 5, 10, 15, 20)) +
  scale_fill_manual(values = c("red", "darkturquoise"))

Also, I personally read histograms more easily when they are redrawn as lines, much like density plots (strictly speaking, the code below draws a frequency polygon). So here is one further, and final, modification.
ggplot(data = data, aes(x = index, y = share, col = wave)) +
  # one line per wave through the given shares;
  # ggplot2 3.4.0 and later prefer linewidth = 1 over size = 1 for lines
  geom_freqpoly(stat = "identity", position = "identity", size = 1, alpha = 0.9) +
  theme_bw() +
  labs(
    x = "Relative Autonomy Index",
    y = "Share of Students",
    title = "Distribution of the Index of Academic Motivation Reflecting the\nDegree of Relative Autonomy",
    col = "",
    subtitle = "1st and 2nd rounds of the longitudinal survey"
  ) +
  scale_x_continuous(breaks = c(-20, -15, -10, -5, 0, 5, 10, 15, 20)) +
  scale_color_manual(values = c("red", "darkturquoise"))

It looks a bit more jagged and less beautiful, though. Perhaps I should have stopped at the histograms.
That’s it! Hope you enjoyed it.