- Week 2: Principles of data visualisation
- Week 3: Grammar of graphics; aesthetics and attributes
- Week 4: Major visualisation tools
- Week 5: Customising visualisations (scales, themes, and labels)
ggplot2.
Download the exercises from the Week 4 folder on NOW and move them into your R project directory.
Off the top of your head, what types of data visualisations have you seen in journal articles, news, social media, etc.?
Basic Charts
Statistical Visualisations
Multivariate Visualisations
Geospatial Visualisations
Hierarchical & Network Visualisations
Temporal Visualisations
Specialised Visualisations
Decision tree from data-to-viz.com/
Think of it like this:
- Geoms (geom_) control the visual encoding of the aesthetics layer
- Many geom_ functions are part of ggplot2 (below)
- More geoms are available in other packages such as ggdist, ggbeeswarm, and ggridges

[1] abline area bar bin_2d
[5] bin2d blank boxplot col
[9] column contour contour_filled count
[13] crossbar curve density density_2d
[17] density_2d_filled density2d density2d_filled dotplot
[21] errorbar errorbarh freqpoly function
[25] hex histogram hline jitter
[29] label line linerange map
[33] path point pointrange polygon
[37] qq qq_line quantile raster
[41] rect ribbon rug segment
[45] sf sf_label sf_text smooth
[49] spoke step text tile
[53] violin vline
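A listing of geom names like this one can be produced from an attached ggplot2. This is a minimal sketch, assuming ggplot2 is installed; the exact set of names depends on the installed version, and the object names geom_functions and geom_names are made up for illustration.

```r
library(ggplot2)

# All exported objects in ggplot2 whose names start with "geom_"
geom_functions <- ls("package:ggplot2", pattern = "^geom_")

# Drop the prefix to get the bare geometry names
geom_names <- sub("^geom_", "", geom_functions)
geom_names
```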
| Function | Geometry | What it shows |
|---|---|---|
| geom_point() | Scatter plot | Relationship between two variables |
| geom_line() | Line plot | Trends over time or ordered categories |
| geom_bar() | Bar chart | Counts or values for categories |
| geom_histogram() | Histogram | Distribution of a single variable |
| geom_boxplot() | Boxplot | Summary of distribution (median, quartiles, outliers) |
| geom_text() | Text labels | Add labels to points or bars |
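The same data and aesthetic mapping can be drawn with different geometries simply by swapping the geom_ layer. The sketch below uses the mpg data that ships with ggplot2, which is an assumption for illustration and not the course data set.

```r
library(ggplot2)

# Shared data and aesthetics: vehicle class on x, highway mileage on y
base <- ggplot(mpg, aes(x = class, y = hwy))

# Same mapping, three different visual encodings
p_points  <- base + geom_point()                          # one dot per car
p_boxplot <- base + geom_boxplot()                        # one box per class
p_jitter  <- base + geom_jitter(width = 0.2, height = 0)  # dots spread sideways

p_boxplot
```

Printing p_points or p_jitter instead shows the raw observations that the boxplot summarises.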
ggplot(d_spellname, aes(x = rt)) + geom_histogram()
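geom_histogram() falls back to 30 bins unless told otherwise, so it is usually worth setting binwidth or bins explicitly. A minimal sketch with simulated response times standing in for d_spellname; the data, the name d_sim, and the 50 ms bin width are assumptions for illustration.

```r
library(ggplot2)

# Simulated stand-in for the course data: 500 right-skewed response times (ms)
set.seed(1)
d_sim <- data.frame(rt = rlnorm(500, meanlog = 6.5, sdlog = 0.4))

# binwidth is given in the units of x, here milliseconds
p_hist <- ggplot(d_sim, aes(x = rt)) +
  geom_histogram(binwidth = 50)

p_hist
```

Trying a few bin widths is a quick check on whether the apparent shape of the distribution depends on the binning.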
Complete RMarkdown document 1_univariate_viz.Rmd
Any ggplot2 object can easily be transformed into an interactive visualisation using the plotly package.
# Load plotly package
library(plotly)
# Create ggplot plot and save in `plot`
plot <- ggplot(d_spellname, aes(x = dur, y = rt)) +
  geom_point()
# Create an interactive version of `plot` using `ggplotly`
ggplotly(plot)
Complete RMarkdown document 2_bivariate_viz.Rmd
What problems can you think of?
What’s possibly problematic with this visualisation?
Visualisation suggests \(\dots\)
Now watch what happens to the y-axis when we use dots.
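One way to see the y-axis effect: bars are anchored at zero, while dots only need to cover the values that are plotted. This is a minimal sketch with simulated data, assuming the comparison is between a bar chart and a dot plot of group means; the data, the modality labels, and the helper y_limits() are made up for illustration.

```r
library(ggplot2)

# Simulated stand-in: response times (ms) in two modalities
set.seed(1)
d_sim <- data.frame(
  modality = rep(c("speech", "writing"), each = 100),
  rt = c(rnorm(100, mean = 900, sd = 80), rnorm(100, mean = 950, sd = 80))
)

base <- ggplot(d_sim, aes(x = modality, y = rt))

# Bars of group means: the y-axis has to include zero
p_bar <- base + stat_summary(fun = mean, geom = "bar")

# Dots of group means: the y-axis zooms in on the means themselves
p_dot <- base + stat_summary(fun = mean, geom = "point")

# Hypothetical helper: the y limits ggplot2 derived from the drawn layer
y_limits <- function(p) layer_scales(p)$y$get_limits()

y_limits(p_bar)
y_limits(p_dot)
```

The same difference in means looks small next to bars that start at zero and large on an axis that spans only the two means.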
See Tukey (1977)
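Assuming the Tukey reference points to the boxplot, the numbers behind one can be inspected with base R. A minimal sketch on ten made-up values with one extreme observation:

```r
# Ten made-up values with one extreme observation
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)

# Tukey's five-number summary: minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(x)

# What a base R boxplot draws: whisker ends, hinges and median in $stats,
# and values further than 1.5 box lengths from the box in $out
bp <- boxplot.stats(x)
bp$stats
bp$out
```

geom_boxplot() computes its quartiles with quantile() rather than hinges, so its box edges can differ slightly from these values.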
library(ggplot2)
library(ggdist)
ggplot(d_spellname, aes(x = modality, y = rt)) +
  # half violin (the "cloud")
  stat_halfeye(
    adjust = 0.45, # smoothness
    justification = -0.2, # shift left/right
    .width = 0, # no interval bars
    point_colour = NA) +
  # boxplot
  geom_boxplot(
    width = 0.15,
    outlier.shape = NA,
    alpha = 0.5) +
  # jittered points (the "rain")
  geom_jitter(
    width = 0.1,
    alpha = 0.5,
    size = 1) +
  coord_flip() # horizontal orientation
Complete RMarkdown document 3_group_comparisons.Rmd
You’ve already seen these:
# Count number of identical observations
count(data, group)
# Calculate descriptive statistics
summarise(data, mean = mean(rt))
# Remove rows with missing data (i.e. NA)
drop_na(data)
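These three functions can be tried on a small made-up data set. This is a sketch only; the tibble and its values are invented for illustration, and drop_na() comes from tidyr rather than dplyr.

```r
library(dplyr)
library(tidyr)

# Made-up data: response times (ms) in two groups, one value missing
data <- tibble(
  group = c("a", "a", "b", "b", "b"),
  rt    = c(400, 500, 600, NA, 800)
)

# Number of rows per group
n_per_group <- count(data, group)

# Remove rows that contain a missing value in any column
complete <- drop_na(data)

# Mean response time per group, computed on the complete rows
means <- complete %>%
  group_by(group) %>%
  summarise(mean = mean(rt))

means
```

Without drop_na(), the mean for group b would be NA, because mean() returns NA when any value is missing.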
These wrangling tasks can be handled alongside ggplot2 with the tidyverse, more specifically dplyr and tidyr.
dplyr and tidyr have many useful functions for data wrangling.
# Transforms dataframes into a long format (tidyr)
pivot_longer(data, cols)
# Transforms dataframes into a wide format (tidyr)
pivot_wider(data, names_from, values_from)
# Selects and removes variables
select(data, var1, var2)
# Retains and removes observations
filter(data, condition)
# Creates new variables
mutate(data, new_var = old_var)
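The functions above can be chained on a small made-up data set. This is a sketch only; the tibble, the column names, and the 450 ms cut-off are assumptions for illustration.

```r
library(dplyr)
library(tidyr)

# Made-up wide data: one row per participant, one column per modality
wide <- tibble(
  id      = 1:3,
  speech  = c(400, 500, 600),
  writing = c(450, 550, 650)
)

# Wide to long: one row per participant and modality
long <- pivot_longer(wide, cols = c(speech, writing),
                     names_to = "modality", values_to = "rt")

# Keep slower responses, convert to seconds, keep three columns
slow <- long %>%
  filter(rt > 450) %>%
  mutate(rt_sec = rt / 1000) %>%
  select(id, modality, rt_sec)

# Long back to wide
wide_again <- pivot_wider(long, names_from = modality, values_from = rt)
```

ggplot2 usually expects the long format, with one column per aesthetic, so pivot_longer() is often the last wrangling step before plotting.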
Complete RMarkdown document 4_data_wrangling.Rmd
Data visualisation with ggplot2
Data wrangling with dplyr / tidyverse
Andrews, M. (2021). Doing data science in R: An introduction for social scientists. SAGE Publications Ltd.
Tukey, J. W. (1977). Exploratory data analysis. Addison-Wesley.
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer.
Wickham, H., & Grolemund, G. (2016). R for data science: Import, tidy, transform, visualize, and model data. O’Reilly Media, Inc.