- Week 2: Principles of data visualisation
- Week 3: Grammar of graphics; aesthetics and attributes
- Week 4: Major visualisation tools
- Week 5: Customising visualisations (scales, themes, and labels)
ggplot2
Download the exercises from the Week 4 folder on NOW and move them into your R-project directory.
Off the top of your head, what types of data visualisations have you seen in journal articles, news, social media, etc.?
Basic Charts
Statistical Visualisations
Multivariate Visualisations
Geospatial Visualisations
Hierarchical & Network Visualisations
Temporal Visualisations
Specialised Visualisations
Decision tree from data-to-viz.com/
Think of it like this:
- Geometries (geom_) control the visual encoding of the aesthetics layer.
- Many geom_ functions are part of ggplot2 (listed below).
- Further geoms are available in other packages such as ggdist, ggbeeswarm, and ggridges.
 [1] abline            area              bar               bin_2d
 [5] bin2d             blank             boxplot           col
 [9] column            contour           contour_filled    count
[13] crossbar          curve             density           density_2d
[17] density_2d_filled density2d         density2d_filled  dotplot
[21] errorbar          errorbarh         freqpoly          function
[25] hex               histogram         hline             jitter
[29] label             line              linerange         map
[33] path              point             pointrange        polygon
[37] qq                qq_line           quantile          raster
[41] rect              ribbon            rug               segment
[45] sf                sf_label          sf_text           smooth
[49] spoke             step              text              tile
[53] violin            vline
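A listing like the one above can be generated from the package itself. The following is a minimal sketch, assuming ggplot2 is installed; the variable names are illustrative, and the exact set of geoms returned depends on the installed version.

```r
library(ggplot2)

# All objects exported by ggplot2 whose names start with "geom_"
geom_functions <- ls("package:ggplot2", pattern = "^geom_")

# Drop the "geom_" prefix to get the short names
geom_names <- sub("^geom_", "", geom_functions)
geom_names
```

The same pattern should work for extension packages once they are attached, for example `ls("package:ggdist", pattern = "^geom_")`.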
Geometry | Plot type | What it shows |
---|---|---|
geom_point() | Scatter plot | Relationship between two variables |
geom_line() | Line plot | Trends over time or ordered categories |
geom_bar() | Bar chart | Counts or values for categories |
geom_histogram() | Histogram | Distribution of a single variable |
geom_boxplot() | Boxplot | Summary of distribution (median, quartiles, outliers) |
geom_text() | Text labels | Add labels to points or bars |
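Geoms from the table can be stacked as layers on the same plot. The sketch below combines `geom_point()` and `geom_text()`; it uses the built-in `mtcars` data as a stand-in for the course data (an assumption made only so the example is self-contained), and the names `cars` and `p_layers` are illustrative.

```r
library(ggplot2)

# mtcars ships with R; it stands in for the course data here
cars <- mtcars
cars$model <- rownames(mtcars)

p_layers <- ggplot(cars, aes(x = wt, y = mpg)) +
  # Layer 1: one dot per car
  geom_point() +
  # Layer 2: label each dot with the model name, skipping overlapping labels
  geom_text(aes(label = model), vjust = -0.8, size = 2.5, check_overlap = TRUE)

p_layers
```

Each `+` adds one layer, and layers are drawn in the order in which they are added.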
ggplot(d_spellname, aes(x = rt)) + geom_histogram()
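By default `geom_histogram()` uses 30 bins and prints a message suggesting a better value. The sketch below sets the bin width explicitly; the data are simulated (they are not `d_spellname`), and the value of 50 is an arbitrary choice for illustration.

```r
library(ggplot2)
set.seed(1)

# Simulated, right-skewed "reaction times" in ms (hypothetical data)
d_sim <- data.frame(rt = rlnorm(500, meanlog = 6.5, sdlog = 0.4))

p_hist <- ggplot(d_sim, aes(x = rt)) +
  # Each bar spans 50 units on the x-axis
  geom_histogram(binwidth = 50)

p_hist
```

Trying a few different values of `binwidth` is usually worthwhile, because the apparent shape of the distribution can change with the bin size.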
Complete RMarkdown document 1_univariate_viz.Rmd
Most ggplot2 objects can be turned into an interactive visualisation using the plotly package.
# Load plotly package
library(plotly)
# Create ggplot plot and save in `plot`
plot <- ggplot(d_spellname, aes(x = dur, y = rt)) +
  geom_point()
# Create an interactive version of `plot` using `ggplotly`
ggplotly(plot)
Complete RMarkdown document 2_bivariate_viz.Rmd
What problems can you think of?
What’s possibly problematic with this visualisation?
Visualisation suggests \(\dots\)
Now watch what happens to the y-axis when we use dots.
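One way to see the effect on the y-axis is to draw the same data twice, once as bars of group means and once as individual dots. This is a minimal sketch with simulated data; the modality labels and the means and standard deviations are made up for illustration.

```r
library(ggplot2)
set.seed(2)

# Simulated reaction times for two hypothetical modalities
d_sim <- data.frame(
  modality = rep(c("spoken", "written"), each = 100),
  rt = c(rnorm(100, mean = 900, sd = 80), rnorm(100, mean = 950, sd = 80))
)

# Bars of group means: the bars start at zero, so the y-axis does too
p_bar <- ggplot(d_sim, aes(x = modality, y = rt)) +
  stat_summary(fun = mean, geom = "col")

# Individual observations: the y-axis adapts to the range of the data
p_dots <- ggplot(d_sim, aes(x = modality, y = rt)) +
  geom_jitter(width = 0.1, alpha = 0.5)

p_bar
p_dots
```

The bar version shows only two numbers and hides the spread; the dot version shows every observation and how much the groups overlap.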
See Tukey (1977)
library(ggdist)
ggplot(d_spellname, aes(x = modality, y = rt)) +
  # half violin (the "cloud")
  stat_halfeye(
    adjust = 0.45,        # smoothness
    justification = -0.2, # shift left/right
    .width = 0,           # no interval bars
    point_colour = NA) +
  # boxplot
  geom_boxplot(
    width = 0.15,
    outlier.shape = NA,
    alpha = 0.5) +
  # jittered points (the "rain")
  geom_jitter(
    width = 0.1,
    alpha = 0.5,
    size = 1) +
  coord_flip() # horizontal orientation
Complete RMarkdown document 3_group_comparisons.Rmd
You’ve already seen these:
# Count the number of observations in each group
count(data, group)
# Calculate descriptive statistics
summarise(data, mean = mean(rt))
# Remove rows with missing data (i.e. NA)
drop_na(data)
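The three functions above can be tried on a small toy data set. This is a sketch with made-up values; `data`, `group`, and `rt` mirror the placeholder names used above, and `complete` is an illustrative name.

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical): reaction times with one missing value
data <- tibble(
  group = c("a", "a", "b", "b", "b"),
  rt    = c(510, 620, 480, NA, 550)
)

# Rows per group: a = 2, b = 3
count(data, group)

# Drop the row containing NA, leaving 4 rows
complete <- drop_na(data)

# Mean of 510, 620, 480, and 550 is 540
summarise(complete, mean = mean(rt))
```

Without the `drop_na()` step, `mean(rt)` would return `NA`, because a single missing value makes the mean undefined unless `na.rm = TRUE` is set.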
These wrangling tasks can be managed in ggplot2 and the tidyverse, more specifically dplyr (drop_na() comes from tidyr).
dplyr has many useful functions for data wrangling; the two pivot functions come from tidyr, which is also part of the tidyverse.
# Transforms data frames into a long format (tidyr)
pivot_longer(data, cols)
# Transforms data frames into a wide format (tidyr)
pivot_wider(data, names_from, values_from)
# Keeps (or removes) variables
select(data, var1, var2)
# Keeps (or removes) observations
filter(data, condition)
# Creates new variables
mutate(data, new_var = old_var)
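These functions are typically chained into a pipeline. The following is a minimal sketch on made-up data; the column names (`spoken`, `written`, `rt_sec`) and the 0.8 second cut-off are illustrative assumptions, not part of the course data.

```r
library(dplyr)
library(tidyr)

# Toy wide data (hypothetical): one row per participant, one column per modality
wide <- tibble(
  id      = 1:3,
  spoken  = c(500, 640, 580),
  written = c(700, 690, 810)
)

long <- wide %>%
  # 3 rows x 2 modality columns become 6 rows
  pivot_longer(cols = c(spoken, written),
               names_to = "modality", values_to = "rt") %>%
  # New variable: reaction time in seconds
  mutate(rt_sec = rt / 1000) %>%
  # Keep observations faster than 0.8 s (drops the single 810 ms row)
  filter(rt_sec < 0.8) %>%
  # Keep three variables, dropping rt
  select(id, modality, rt_sec)

long
```

A long format with one observation per row is what ggplot2 expects when a variable such as `modality` is mapped to an aesthetic.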
Complete RMarkdown document 4_data_wrangling.Rmd
Data visualisation with ggplot2
Data wrangling with dplyr / tidyverse
Andrews, Mark. 2021. Doing Data Science in R: An Introduction for Social Scientists. SAGE Publications Ltd.
Tukey, John W. 1977. Exploratory Data Analysis. Addison-Wesley.
Wickham, Hadley. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer.
Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.