Outline

  • principles of data visualisation
  • grammar of graphics
  • aesthetics and attributes
  • geometries
  • major tools of data visualisation
  • cosmetics

What is data visualisation?

  • graphical representation of data
  • graphical data analysis (stats): what do we want to know?
  • communication and perception (design): what do we want to communicate?
  • exploratory plots: get to know data (small specialist audience)
  • explanatory plots: inform and persuade (wide audience)
  • think about your audience

Exploring data

library(tidyverse)  # provides read_csv(), glimpse(), mutate() and ggplot2

blomkvist <- read_csv("../data/blomkvist.csv")
glimpse(blomkvist)
Rows: 354
Columns: 11
$ id          <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,…
$ sex         <chr> "male", "female", "female", "female", "male", "male", "fem…
$ age         <dbl> 84, 37, 62, 85, 73, 65, 30, 49, 83, 58, 25, 88, 62, 88, 27…
$ medicine    <dbl> 8, 1, 0, 4, 5, 0, 0, 0, 11, 0, 0, 4, 3, 8, 1, 3, 4, 1, 1, …
$ smoker      <chr> "former", "no", "yes", "former", "former", "no", "no", "fo…
$ pal_work    <dbl> NA, 2, NA, NA, NA, 1, 3, 1, NA, 4, 2, NA, 3, NA, 2, 3, NA,…
$ pal_leisure <dbl> 1, 2, 2, 2, 3, 3, 2, 2, 1, 3, 3, 2, 1, 1, 3, 3, 1, 3, 1, 2…
$ rt_hand_d   <dbl> 701.6667, 470.6667, 638.6667, 708.0000, 607.3333, 541.6667…
$ rt_hand_nd  <dbl> 780.3333, 497.0000, 638.0000, 638.6667, 652.0000, 498.6667…
$ rt_foot_d   <dbl> 1009.0000, 737.6667, 878.0000, 902.3333, 923.0000, 686.666…
$ rt_foot_nd  <dbl> 962.6667, 692.3333, 786.0000, 1373.6667, 805.0000, 599.666…

Building up a plot

ggplot(data = blomkvist, 
       mapping = aes(x = age, 
                     y = rt_hand_d)) 

Building up a plot

ggplot(data = blomkvist, 
       mapping = aes(x = age, 
                     y = rt_hand_d)) +
  geom_point()  

Building up a plot

ggplot(data = blomkvist, 
       mapping = aes(x = age, 
                     y = rt_hand_d)) +
  geom_point() +
  scale_y_log10()

Building up a plot

ggplot(data = blomkvist, 
       mapping = aes(x = age, 
                     y = rt_hand_d)) +
  geom_point() +
  scale_y_log10() +
  stat_smooth(method = "lm") 

Building up a plot

ggplot(data = blomkvist, 
       mapping = aes(x = age, 
                     y = rt_hand_d)) +
  geom_point() +
  scale_y_log10() +
  stat_smooth(method = "lm",
              formula = y ~ x + I(x^2)) 
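
Because scale_y_log10() transforms the data before any statistic is computed, the smooth in this plot is fitted to log10 reaction times rather than raw ones. The sketch below checks this on simulated data: the data frame `sim` and its values are invented for illustration and are not the Blomkvist data. If the reasoning holds, the curve drawn by stat_smooth() should coincide with a quadratic lm() fitted by hand to log10(rt).

```r
library(ggplot2)

# Simulated stand-in for age and reaction time (illustrative values only)
set.seed(1)
sim <- data.frame(age = runif(200, 20, 90))
sim$rt <- exp(6 + 0.0001 * (sim$age - 40)^2 + rnorm(200, sd = 0.1))

p <- ggplot(sim, aes(x = age, y = rt)) +
  geom_point() +
  scale_y_log10() +
  stat_smooth(method = "lm", formula = y ~ x + I(x^2))

# Values computed by the smooth layer (layer 2); y is on the log10 scale
smooth_data <- layer_data(p, 2)

# The same model fitted by hand on log10-transformed reaction times
fit <- lm(log10(rt) ~ age + I(age^2), data = sim)
by_hand <- predict(fit, newdata = data.frame(age = smooth_data$x))

# Largest difference between the two curves
max_diff <- max(abs(by_hand - smooth_data$y))
max_diff
```

A practical consequence: coefficients of such a fit describe change in log10 milliseconds, so they should be back-transformed before being reported in msecs.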

Building up a plot

ggplot(data = blomkvist, 
       mapping = aes(x = age, 
                     y = rt_hand_d,
                     colour = smoker)) +
  geom_point() +
  scale_y_log10() +
  stat_smooth(method = "lm",
              formula = y ~ x + I(x^2)) 

Building up a plot

ggplot(data = blomkvist, 
       mapping = aes(x = age, 
                     y = rt_hand_d, 
                     colour = smoker))  +
  geom_point(alpha = .25) +
  scale_y_log10(labels = scales::comma) +
  stat_smooth(method = "lm", formula = y ~ x + I(x^2), se = FALSE, fullrange = TRUE) +
  ggthemes::theme_clean() +
  ggthemes::scale_color_colorblind() +
  labs(y = "Average reaction time of dominant\nhand (in msecs)", 
       x = "Age (in years)",
       caption = "Data from Blomkvist et al. (2017)",
       colour = "Smoker") +
  theme(legend.position = "top",
        legend.justification = "right",
        axis.title = element_text(hjust = 0))

Explanatory plot

Why data visualisation?

“[data visualization] forces us to notice what we never expected to see.” (Tukey 1977)

  • exploring structures in the data
  • relationship between variables
  • distribution of data
  • develop an understanding of patterns (beyond descriptives)
  • selecting appropriate stats

Anscombe’s quartet

Anscombe (1973) and Tufte (1989)

                  x              y                     y ~ x
Data set    Mean     SD    Mean     SD    Correlation  Intercept  Slope
1              9   3.32     7.5   2.03           0.82          3    0.5
2              9   3.32     7.5   2.03           0.82          3    0.5
3              9   3.32     7.5   2.03           0.82          3    0.5
4              9   3.32     7.5   2.03           0.82          3    0.5
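
The identical summaries in the table can be reproduced from the `anscombe` data frame that ships with base R (columns x1 to x4 and y1 to y4). This is a minimal sketch; the object name `quartet_summary` is chosen here for illustration.

```r
# Built-in data set: one x/y pair of columns per data set
data(anscombe)

quartet_summary <- do.call(rbind, lapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)  # simple linear regression of y on x
  data.frame(
    dataset     = i,
    mean_x      = mean(x),
    sd_x        = sd(x),
    mean_y      = mean(y),
    sd_y        = sd(y),
    correlation = cor(x, y),
    intercept   = unname(coef(fit)[1]),
    slope       = unname(coef(fit)[2])
  )
}))

# Rounded to two decimals, the four rows are indistinguishable
print(round(quartet_summary, 2))
```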

Anscombe’s quartet
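
A plot of the quartet along these lines can be drawn from the built-in `anscombe` data. The reshaping step and the object names (`quartet`, `p`) are illustrative choices, not taken from the original slides.

```r
library(ggplot2)

# Reshape the built-in anscombe data (x1..x4, y1..y4) into long format
quartet <- do.call(rbind, lapply(1:4, function(i) {
  data.frame(
    dataset = paste("Data set", i),
    x = anscombe[[paste0("x", i)]],
    y = anscombe[[paste0("y", i)]]
  )
}))

# One panel per data set, each with its own least-squares line
p <- ggplot(quartet, aes(x = x, y = y)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE, fullrange = TRUE) +
  facet_wrap(~dataset)
p
```

The regression lines are practically the same in every panel, while the point clouds differ markedly in shape.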

The datasaurus dozen

Matejka and Fitzmaurice (2017): see link

Principles of data visualisation

  • no “one size fits all” method
  • some methods are more informative than others
  • maximise what we can learn from data
  • going beyond summary statistics
  • descriptive summary statistics may conceal / obscure important patterns
  • visualisation helps us to understand patterns, structures, relationships
  • prevent wrong conclusions about data / theory

Basic principles

Hartwig and Dearing (1979):

  • skepticism: any visualization might obscure or misrepresent data
  • openness: there might be patterns and structures that we were not expecting

Tufte (1983):

  • above all else show the data
  • avoid distorting what the data have to say
  • present many numbers in a small space
  • encourage the eye to compare different pieces of data
  • reveal data at several levels of detail, from broad overview to fine structures

Exercise 1

creating scatterplots in R

Open script exercises/1_scatterplots.R

Grammar of graphics

Grammar of graphics

  • “gg” in ggplot2 refers to grammar of graphics (Wickham 2016, 2010)
  • framework for data visualisation
  • higher-level plotting system compared to base R functions (e.g. plot(), hist())
  • complex visualisations can be created with a minimal amount of code
  • integration of statistical information
  • base R is great for quick and basic plots but is limited

Grammar of graphics

Wilkinson (1999)

  • property 1: graphics consist of distinct layers of grammatical elements (data, aesthetics, geometries)
  • property 2: graphics are built around mappings that determine how data, aesthetics and geometries are combined.
  • e.g. similar ingredients (1) can be combined following different recipes (2)
  • grammatical elements are organised as layers
  • underlying grammar controls how graphics are combined
  • system of rules for mapping variables to graphical properties

Obligatory grammatical elements

  • data: the data you want to visualise indicated as ggplot(data = ...)
  • aesthetics: mapping of data to graphic properties (axes, size, colour) indicated as mapping = aes()
  • geometries: visual elements encoding the data indicated as geom_...()

ggplot(data = blomkvist, 
       mapping = aes(x = age, 
                     y = rt_hand_d, 
                     colour = smoker))

ggplot(data = blomkvist, 
       mapping = aes(x = age, 
                     y = rt_hand_d, 
                     colour = smoker)) +
  geom_point()

ggplot(data = blomkvist, 
       mapping = aes(x = age, 
                     y = rt_hand_d, 
                     colour = smoker)) +
  geom_quantile()

ggplot(data = blomkvist, 
       mapping = aes(x = age, 
                     y = rt_hand_d, 
                     colour = smoker)) +
  geom_rug()

ggplot(data = blomkvist, 
       mapping = aes(x = age, 
                     y = rt_hand_d, 
                     colour = smoker)) +
  geom_point() +
  geom_quantile() +
  geom_rug()

Optional grammatical elements

  • facets: dividing data into subplots
  • statistics: summarising representations
  • coordinates: plotting space
  • theme: visual properties not related to the data (font, background)

  • data
  • aesthetics
  • geometries

ggplot(blomkvist, aes(x = age, y = rt_hand_d, colour = smoker)) +
  geom_point()

  • data
  • aesthetics
  • geometries
  • facets

ggplot(blomkvist, aes(x = age, y = rt_hand_d, colour = smoker)) +
  geom_point() +
  facet_grid(~sex)

  • data
  • aesthetics
  • geometries
  • facets
  • statistics

ggplot(blomkvist, aes(x = age, y = rt_hand_d, colour = smoker)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE, fullrange = TRUE) 

  • data
  • aesthetics
  • geometries
  • facets
  • statistics
  • coordinates

ggplot(blomkvist, aes(x = age, y = rt_hand_d, colour = smoker)) +
  geom_point() +
  coord_trans(x = "log", y = "reverse")

  • data
  • aesthetics
  • geometries
  • facets
  • statistics
  • coordinates

ggplot(blomkvist, aes(x = age, y = rt_hand_d, colour = smoker)) +
  geom_point() +
  coord_flip()

  • data
  • aesthetics
  • geometries
  • facets
  • statistics
  • coordinates
  • theme

ggplot(blomkvist, aes(x = age, y = rt_hand_d, colour = smoker)) +
  geom_point() +
  theme_dark()

  • data
  • aesthetics
  • geometries
  • facets
  • statistics
  • coordinates
  • theme

ggplot(blomkvist, aes(x = age, y = rt_hand_d, colour = smoker)) +
  geom_point() +
  theme(panel.background = element_blank())

Exercise 2

grammatical elements in action

Open script exercises/2a_grammar_of_graphics.R

Bonus: exercises/2b_grammar_of_graphics.R

Aesthetics and attributes

Aesthetics and attributes

  • appearance of geometries
  • e.g. colour, size, shape
  • attributes take properties
  • aesthetics take variables

ggplot(blomkvist, aes(x = age, y = rt_hand_d)) +
  geom_point(colour = "red")

Aesthetics and attributes

  • appearance of geometries
  • e.g. colour, size, shape
  • attributes take properties
  • aesthetics take variables

ggplot(blomkvist, aes(x = age, y = rt_hand_d)) +
  geom_point(aes(colour = smoker))

Aesthetics and attributes

  • appearance of geometries
  • e.g. colour, size, shape
  • attributes take properties
  • aesthetics take variables

ggplot(blomkvist, aes(x = age, y = rt_hand_d)) +
  geom_point(aes(colour = smoker)) +
  stat_smooth(method = "lm")

Aesthetics and attributes

  • appearance of geometries
  • e.g. colour, size, shape
  • attributes take properties
  • aesthetics take variables

ggplot(blomkvist, aes(x = age, y = rt_hand_d)) +
  geom_point() +
  stat_smooth(aes(colour = smoker), method = "lm")

Aesthetics and attributes

  • appearance of geometries
  • e.g. colour, size, shape
  • attributes take properties
  • aesthetics take variables

ggplot(blomkvist, aes(x = age, y = rt_hand_d)) +
  geom_point(aes(colour = smoker)) +
  stat_smooth(aes(colour = smoker), method = "lm")

Aesthetics and attributes

  • appearance of geometries
  • e.g. colour, size, shape
  • attributes take properties
  • aesthetics take variables

ggplot(blomkvist, aes(x = age, y = rt_hand_d, colour = smoker)) +
  geom_point() +
  stat_smooth(method = "lm")

Aesthetics and attributes

ggplot(blomkvist, aes(x = age, y = rt_hand_d, colour = smoker)) +
  geom_point(size = 3) 

Aesthetics and attributes

ggplot(blomkvist, aes(x = age, y = rt_hand_d, shape = smoker)) +
  geom_point(size = 3)

Aesthetics and attributes

ggplot(blomkvist, aes(x = age, y = rt_hand_d, colour = smoker, shape = smoker))  +
  geom_point(size = 3)

Aesthetics and attributes

ggplot(blomkvist, aes(x = age, y = rt_hand_d, colour = smoker, shape = sex))  +
  geom_point(size = 3)
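
A common slip is to put a fixed value inside aes(). The sketch below contrasts the two cases on simulated data (the data frame `sim` is invented for illustration): outside aes() the value is used as given, inside aes() it is treated as a variable with a single level.

```r
library(ggplot2)

# Simulated data for illustration only
set.seed(2)
sim <- data.frame(age = runif(50, 20, 90), rt = rnorm(50, 600, 80))

# Attribute: a fixed colour, set outside aes()
p_attribute <- ggplot(sim, aes(x = age, y = rt)) +
  geom_point(colour = "red")

# Pitfall: a constant inside aes() is treated as a one-level variable,
# so the points get a colour from the default palette plus a legend
p_pitfall <- ggplot(sim, aes(x = age, y = rt)) +
  geom_point(aes(colour = "red"))

unique(layer_data(p_attribute)$colour)  # the literal colour "red"
unique(layer_data(p_pitfall)$colour)    # a palette colour, not "red"
```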

Aesthetics

typically x, y, colour, fill, size, alpha, linetype, label

  • some are required by geometries; others are optional
  • continuous vs discrete variables:
    • e.g. shape and linetype can only be mapped to discrete (categorical) variables
  • should be chosen to facilitate comprehension
  • scatterplot: geom_point()
x, y, shape, colour, size, fill, alpha, stroke, group
  • barplot: geom_bar()
x, y, colour, fill, size, linetype, alpha, group
  • boxplot: geom_boxplot()
x, y, lower, xlower, upper, xupper, middle, xmiddle, ymin, 
xmin, ymax, xmax, weight, colour, fill, size, alpha, shape, 
linetype, group
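
Lists like these can be looked up from within R. The sketch below relies on the ggproto objects that ggplot2 exports (GeomPoint, GeomBar, GeomBoxplot); their contents may differ between ggplot2 versions, so the output need not match the lists above exactly.

```r
library(ggplot2)

# Aesthetics a geom cannot do without
GeomPoint$required_aes

# All aesthetics a geom understands (required, with defaults, optional)
point_aes   <- GeomPoint$aesthetics()
bar_aes     <- GeomBar$aesthetics()
boxplot_aes <- GeomBoxplot$aesthetics()

sort(point_aes)
```

The help page of each geom (e.g. ?geom_point) lists the same information in its "Aesthetics" section.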

Decoding of continuous variables (e.g. rt)

(Wong 2010, 665)

  • position on a common scale
  • position on the same but nonaligned scales
  • lengths
  • angles, slopes
  • areas
  • volume, monochromatic colour spectrum (saturation, grey scale)
  • pure spectrum colours

Decoding of continuous variables

position on common scale

ggplot(blomkvist, 
       aes(x = smoker, 
           y = rt_hand_d)) +
  geom_jitter() 

Decoding of continuous variables

position on the same but nonaligned scales

ggplot(blomkvist, 
       aes(x = smoker, 
           y = rt_hand_d)) +
  geom_jitter() +
  facet_wrap(~smoker, scales = "free") 

Decoding of continuous variables

area (size)

ggplot(blomkvist, 
       aes(x = smoker, 
           y = 0, 
           size = rt_hand_d)) +
  geom_jitter() 

Decoding of continuous variables

colour spectrum

ggplot(blomkvist, 
       aes(x = smoker, 
           y = 0, 
           colour = rt_hand_d)) +
  geom_jitter() 

Decoding of categorical variables (groups)

(Wong 2010, 665)

  • qualitative colours, labels, line colours
  • sequential colours, shape outlines, line type
  • filled shapes, hatching (shading with lines), line width

Decoding of categorical variables (groups)

ggplot(blomkvist, aes(x = age, y = rt_hand_d, label = sex)) +
  geom_text(size = 3)

  • qualitative colours, labels, line colours
  • sequential colours, shape outlines, line type
  • filled shapes, hatching (shading with lines), line width

Decoding of categorical variables (groups)

ggplot(blomkvist, aes(x = age, y = rt_hand_d, shape = sex)) +
  geom_point(size = 3)

  • qualitative colours, labels, line colours
  • sequential colours, shape outlines, line type
  • filled shapes, hatching (shading with lines), line width

Decoding of categorical variables (groups)

ggplot(blomkvist, aes(x = age, y = rt_hand_d, colour = sex)) +
  geom_point(size = 3)

  • qualitative colours, labels, line colours
  • sequential colours, shape outlines, line type
  • filled shapes, hatching (shading with lines), line width

Decoding of categorical variables (groups)

ggplot(blomkvist, aes(x = age, y = rt_hand_d, colour = sex))  +
  stat_smooth(method = "lm", se = FALSE)

  • qualitative colours, labels, line colours
  • sequential colours, shape outlines, line type
  • filled shapes, hatching (shading with lines), line width

Decoding of categorical variables (groups)

ggplot(blomkvist, aes(x = age, y = rt_hand_d, linetype = sex)) +
  stat_smooth(method = "lm", se = FALSE)

  • qualitative colours, labels, line colours
  • sequential colours, shape outlines, line type
  • filled shapes, hatching (shading with lines), line width

Decoding of categorical variables (groups)

# from ggplot2 3.4.0, line width is mapped with linewidth; size still works but warns
ggplot(blomkvist, aes(x = age, y = rt_hand_d, size = sex)) +
  stat_smooth(method = "lm", se = FALSE)

  • qualitative colours, labels, line colours
  • sequential colours, shape outlines, line type
  • filled shapes, hatching (shading with lines), line width

Exercise 3

practice aesthetics and attributes

Open script exercises/3a_aesthetics_and_attributes.R

If you have time continue with

  • exercises/3b_aesthetics_and_attributes.R
  • exercises/3c_aesthetics_and_attributes.R

Major visualisation tools

Major visualisation tools

  • Geometries (geom_) control visual encoding of aesthetics layer
  • ~50 geometries: geom_... are part of ggplot2
 [1] abline            area              bar               bin_2d           
 [5] bin2d             blank             boxplot           col              
 [9] column            contour           contour_filled    count            
[13] crossbar          curve             density           density_2d       
[17] density_2d_filled density2d         density2d_filled  dotplot          
[21] errorbar          errorbarh         freqpoly          function         
[25] hex               histogram         hline             jitter           
[29] label             line              linerange         map              
[33] path              point             pointrange        polygon          
[37] qq                qq_line           quantile          raster           
[41] rect              ribbon            rug               segment          
[45] sf                sf_label          sf_text           smooth           
[49] spoke             step              text              tile             
[53] violin            vline            
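
A listing of this kind can be generated from the package itself. This is one possible way, not necessarily how the listing above was produced, and the count depends on the installed ggplot2 version.

```r
# All exported ggplot2 functions whose names start with "geom_"
geoms <- sort(grep("^geom_", getNamespaceExports("ggplot2"), value = TRUE))

# Drop the prefix to obtain names such as "point" or "boxplot"
geom_names <- sub("^geom_", "", geoms)

length(geom_names)
head(geom_names)
```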

Major visualisation tools

  • more geoms in other packages such as tidybayes, ggbeeswarm, and ggridges
  • choice depends on visualisation goal (and your subject domain)
  • many can be combined
  • three important groups:
    • bivariate distributions
    • univariate distributions
    • group comparisons

Bivariate distribution

  • function: relationship between two variables
  • variable type: typically continuous
  • examples: scatter plot, time series

Univariate distribution

  • function: distribution of values
  • variable type: continuous or discrete
  • examples: histograms, density plots, bar plots, rug

ggplot(blomkvist, aes(x = rt_hand_d)) +
  geom_histogram() 

Univariate distribution

  • function: distribution of values
  • variable type: continuous or discrete
  • examples: histograms, density plots, bar plots, rug

ggplot(blomkvist, aes(x = rt_hand_d)) +
  geom_density() 

Univariate distribution

  • function: distribution of values
  • variable type: continuous or discrete
  • examples: histograms, density plots, bar plots, rug

ggplot(blomkvist, aes(x = rt_hand_d)) +
  geom_density() +
  geom_rug()

Group comparisons

  • function: distribution of values for two or more groups (often closely tied to statistical descriptions)
  • variable type: continuous
  • examples: (jitter) dots, box plot, violin plot, beeswarm plots, barplot (pie chart), dynamite plots

Group comparisons

dynamite plot and its pitfalls

  • suggest normal distribution?
  • same number of observations in each group?
  • bars suggest data where there are none?
  • are there no values above the errorbar (watch what’s going to happen to the y-axis)?
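
The pitfalls listed above can be made concrete with a small simulation. The data below are invented for illustration (skewed values, unequal group sizes) and the group names are hypothetical; the dynamite plot shows only a mean and a standard error per group, and the last lines count what it hides.

```r
library(ggplot2)

# Simulated right-skewed reaction times in three groups of unequal size
set.seed(3)
sim <- data.frame(
  group = rep(c("A", "B", "C"), times = c(10, 40, 100)),
  rt = rlnorm(150, meanlog = 6.3, sdlog = 0.4)
)

# Dynamite plot: bar at the group mean, error bar of +/- 1 standard error
p_dynamite <- ggplot(sim, aes(x = group, y = rt)) +
  stat_summary(fun = mean, geom = "bar") +
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.25)

# Upper end of each error bar: mean + 1 standard error
upper <- tapply(sim$rt, sim$group, function(v) mean(v) + sd(v) / sqrt(length(v)))

# Observations lying above the error bar of their own group
above <- sim$rt > upper[sim$group]
table(sim$group, above)
```

Neither the skew, nor the unequal group sizes, nor the observations above the error bars are visible in the plot itself.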

Group comparisons

dynamite plots

Group comparisons

dots

Group comparisons

jittered dots

Group comparisons

jittered dots and errorbars
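
One way to draw such a plot is sketched below on simulated data (invented for illustration; group names are hypothetical). Jitter is applied horizontally only, so the reaction times themselves are not altered.

```r
library(ggplot2)

set.seed(4)
sim <- data.frame(
  group = rep(c("A", "B", "C"), each = 50),
  rt = rlnorm(150, meanlog = 6.3, sdlog = 0.4)
)

p_jitter <- ggplot(sim, aes(x = group, y = rt)) +
  # Raw observations, spread horizontally only (height = 0 keeps y exact)
  geom_jitter(width = 0.2, height = 0, alpha = 0.4) +
  # Group mean as a point
  stat_summary(fun = mean, geom = "point", colour = "red", size = 3) +
  # Mean +/- 1 standard error as an error bar
  stat_summary(fun.data = mean_se, geom = "errorbar", colour = "red", width = 0.2)
```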

Group comparisons

box-and-whiskers plot

Group comparisons

box-and-whiskers plot (Tukey 1977)
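
A box-and-whiskers plot can be sketched as follows, again on simulated data invented for illustration. In ggplot2 the box spans the first to the third quartile, the line marks the median, and the whiskers extend to the most extreme observation within 1.5 times the interquartile range of the box.

```r
library(ggplot2)

set.seed(5)
sim <- data.frame(
  group = rep(c("A", "B", "C"), each = 50),
  rt = rlnorm(150, meanlog = 6.3, sdlog = 0.4)
)

p_box <- ggplot(sim, aes(x = group, y = rt)) +
  geom_boxplot(outlier.shape = NA) +  # outlier symbols hidden; jitter shows all points
  geom_jitter(width = 0.15, height = 0, alpha = 0.3)

# Numbers behind the boxes as computed by ggplot2
box_stats <- layer_data(p_box, 1)[, c("ymin", "lower", "middle", "upper", "ymax")]
box_stats
```

Overlaying the raw data is a design choice: the box alone gives no indication of how many observations it summarises.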

Exercise 4

major visualisation tools

Open script exercises/4a_major_viz_tools.R

Continue with exercises/4b_major_viz_tools.R

Cosmetics

Changing text: labs

  • title
  • subtitle
  • caption
  • tag
  • x
  • y
  • colour, shape, linetype, fill

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) +
  geom_point() + 
  labs()

Changing text: labs

  • title
  • subtitle
  • caption
  • tag
  • x
  • y
  • colour, shape, linetype, fill

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) +
  geom_point() + 
  labs(title = "My scatter plot")

Changing text: labs

  • title
  • subtitle
  • caption
  • tag
  • x
  • y
  • colour, shape, linetype, fill

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) +
  geom_point() + 
  labs(title = "My scatter plot", 
       subtitle = "I'm a subtitle")

Changing text: labs

  • title
  • subtitle
  • caption
  • tag
  • x
  • y
  • colour, shape, linetype, fill

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) +
  geom_point() + 
  labs(caption = "Caption for data source")

Changing text: labs

  • title
  • subtitle
  • caption
  • tag
  • x
  • y
  • colour, shape, linetype, fill

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) +
  geom_point() + 
  labs(tag = "A")

Changing text: labs

  • title
  • subtitle
  • caption
  • tag
  • x
  • y
  • colour, shape, linetype, fill

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) +
  geom_point() + 
  labs(x = "Age in years", 
       y = "Reaction time in msecs")

Changing text: labs

  • title
  • subtitle
  • caption
  • tag
  • x
  • y
  • colour, shape, linetype, fill

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) +
  geom_point() + 
  labs(colour = "Legend\ntitle:")

Changing text: legend keys

  • scale_colour_discrete
  • scale_colour_continuous
  • scale_colour_manual
  • or any other aesthetic instead of colour

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) +
  geom_point() +  
  scale_colour_discrete(
    labels = c("ex-smoker", "non-smoker", "smoker")) 

Changing text: legend keys

  • change colour values manually
  • colour names: link
  • ggthemes

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) +
  geom_point() +  
  scale_colour_manual(
    labels = c("ex-smoker", "non-smoker", "smoker"),
    values = c("firebrick", "turquoise2", "cornflowerblue"))

Changing text: legend keys

  • change colour values manually
  • colour names: link
  • ggthemes

# Hex codes of the colours returned by ggthemes::colorblind_pal()
mycolours <- c("#000000", "#E69F00", "#56B4E9", "#009E73", 
               "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

# Display the eight colours of the palette
scales::show_col(ggthemes::colorblind_pal()(8))

Changing text: legend keys

  • change colour values manually
  • colour names: link
  • ggthemes

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) +
  geom_point() +  
  scale_colour_manual(
    labels = c("ex-smoker", "non-smoker", "smoker"),
    values = mycolours[1:3])

Changing text: legend keys

  • change colour values manually
  • colour names: link
  • ggthemes

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) +
  geom_point() +  
  ggthemes::scale_colour_colorblind(
    labels = c("ex-smoker", "non-smoker", "smoker"))

Changing text: strips

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() + 
  facet_grid(~smoker)

Changing text: strips

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() + 
  facet_grid(~smoker, labeller = label_both)
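
Besides label_both, strip text can be set with a lookup vector passed through as_labeller(), which leaves the data untouched. A minimal sketch on simulated data (the data frame `sim` and the vector name `smoker_labels` are invented for illustration; the coding of smoker mirrors the slides):

```r
library(ggplot2)

# Simulated data with the same smoker coding as in the slides
set.seed(6)
sim <- data.frame(
  age = runif(90, 20, 90),
  rt = rnorm(90, 600, 80),
  smoker = rep(c("former", "no", "yes"), each = 30)
)

# Lookup vector: names are values in the data, elements are strip labels
smoker_labels <- c(former = "Ex-smoker", no = "Non-smoker", yes = "Smoker")

p_strips <- ggplot(sim, aes(x = age, y = rt)) +
  geom_point() +
  facet_grid(~smoker, labeller = as_labeller(smoker_labels))
```

This changes the labels only for this plot, whereas changing the variable itself affects every later plot and analysis.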

Changing text: strips

blomkvist <- mutate(blomkvist, 
                    smoker = recode(smoker, 
                    "former" = "Ex-smoker",
                    "no" = "Non-smoker",
                    "yes" = "Smoker"))

Themes

  • specify appearance of non-data related ink
  • can be done manually using theme() or using wrapper functions
  • All ggplot wrappers:
[1] "theme_bw"       "theme_classic"  "theme_dark"     "theme_grey"    
[5] "theme_light"    "theme_linedraw" "theme_minimal"  "theme_void"    
  • e.g. ggthemes for more themes:
 [1] "theme_base"            "theme_calc"            "theme_clean"          
 [4] "theme_economist"       "theme_economist_white" "theme_excel"          
 [7] "theme_excel_new"       "theme_few"             "theme_fivethirtyeight"
[10] "theme_foundation"      "theme_gdocs"           "theme_hc"             
[13] "theme_map"             "theme_pander"          "theme_par"            
[16] "theme_solarized"       "theme_solarized_2"     "theme_solid"          
[19] "theme_stata"           "theme_stata_base"      "theme_stata_colors"   
[22] "theme_tufte"           "theme_wsj"            

Themes (ggplot default)

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) + 
  geom_point() +
  facet_grid(~smoker) +
  theme_grey(base_size = 11)

Themes

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) + 
  geom_point() +
  facet_grid(~smoker) +
  theme_minimal(base_size = 14)

Themes

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) + 
  geom_point() +
  facet_grid(~smoker) +
  theme_light(base_size = 14)

Themes

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) + 
  geom_point() +
  facet_grid(~smoker) +
  theme_dark(base_size = 14)

Themes

  • axis
  • legend
  • panel
  • plot
  • strip

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) + 
  geom_point() +
  theme()

Themes: axis

  • axis.text
    • axis.text.x
    • axis.text.y
  • axis.title
    • axis.title.x
    • axis.title.y

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) + 
  geom_point() +
  theme(axis.text = element_text(face = "italic"))

Themes: axis

  • axis.text
    • axis.text.x
    • axis.text.y
  • axis.title
    • axis.title.x
    • axis.title.y

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) + 
  geom_point() +
  theme(axis.title = element_text(face = "bold"))

Themes: axis

  • axis.text
    • axis.text.x
    • axis.text.y
  • axis.title
    • axis.title.x
    • axis.title.y

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) + 
  geom_point() +
  theme(axis.title.y = element_text(face = "bold"))

Themes: legend

  • legend.background
  • legend.margin
  • legend.spacing
  • legend.key
  • legend.text
  • legend.title
  • legend.position
  • legend.direction
  • legend.justification
  • legend.box

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) + 
  geom_point() +
  theme()

Themes: legend

  • legend.background
  • legend.margin
  • legend.spacing
  • legend.key
  • legend.text
  • legend.title
  • legend.position
  • legend.direction
  • legend.justification
  • legend.box

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) + 
  geom_point() +
  theme(legend.position = "top")

Themes: legend

  • legend.background
  • legend.margin
  • legend.spacing
  • legend.key
  • legend.text
  • legend.title
  • legend.position
  • legend.direction
  • legend.justification
  • legend.box

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) + 
  geom_point() +
  theme(legend.position = "top",
        legend.justification = "right")

Themes: legend

  • legend.background
  • legend.margin
  • legend.spacing
  • legend.key
  • legend.text
  • legend.title
  • legend.position
  • legend.direction
  • legend.justification
  • legend.box

ggplot(blomkvist, aes(y = rt_hand_d, x = age, colour = smoker)) +
  geom_point() + 
  theme(legend.position = c(.15,.8))

Themes: panel

  • panel.background
  • panel.border
  • panel.spacing
  • panel.grid
    • panel.grid.major
    • panel.grid.minor

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() + 
  theme()

Themes: panel

  • panel.background
  • panel.border
  • panel.spacing
  • panel.grid
    • panel.grid.major
    • panel.grid.minor

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() + 
  theme(panel.background = element_blank())

Themes: plot

  • plot.background
  • plot.margin
  • plot.title
  • plot.subtitle
  • plot.caption
  • plot.tag

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() +
  theme()

Themes: plot

  • plot.background
  • plot.margin
  • plot.title
  • plot.subtitle
  • plot.caption
  • plot.tag

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() + 
  theme(plot.background = element_rect(fill = "pink"))

Themes: plot

  • plot.background
  • plot.margin
  • plot.title
  • plot.subtitle
  • plot.caption
  • plot.tag

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() + 
  theme(plot.background = element_rect(fill = "pink"),
        plot.margin = unit(c(2,2,2,2), "cm"))

Themes: plot

  • plot.background
  • plot.margin
  • plot.title
  • plot.subtitle
  • plot.caption
  • plot.tag

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() + 
  labs(title = "I'm a title") +
  theme(plot.title = element_text(colour = "pink"))

Themes: plot

  • plot.background
  • plot.margin
  • plot.title
  • plot.subtitle
  • plot.caption
  • plot.tag

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() + 
  labs(caption = "I'm a caption") +
  theme(plot.caption = element_text(face = "italic"))

Themes: facet strips

  • strip.background
  • strip.placement
  • strip.text

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() + 
  facet_grid(~smoker, labeller = label_both) +
  theme()

Themes: strip.background

  • strip.background
  • strip.placement
  • strip.text

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() + 
  facet_grid(~smoker, labeller = label_both) +
  theme(strip.background = element_blank())

Themes: strip.background

  • strip.background
  • strip.placement
  • strip.text

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() + 
  facet_grid(~smoker, labeller = label_both) +
  theme(strip.background = element_rect(fill = "forestgreen"))

Themes: strip.text

  • strip.background
  • strip.placement
  • strip.text

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() + 
  facet_grid(~smoker, labeller = label_both) +
  theme(strip.background = element_rect(fill = "forestgreen"),
        strip.text = element_text(colour = "white", hjust = 0))

Themes: strip.text

  • strip.background
  • strip.placement
  • strip.text

ggplot(blomkvist, aes(y = rt_hand_d, x = age)) +
  geom_point() + 
  facet_grid(~smoker, labeller = label_both) +
  theme(strip.background = element_rect(fill = "forestgreen"),
        strip.text = element_text(colour = "white", hjust = 0, 
                                  face = "bold", size = 16, 
                                  angle = 180))

Saving your plot

ggsave("name of plot.png", width = 5, height = 5)

  • .eps, .pdf, .svg, .wmf, .png, .jpg, .bmp, .tiff
  • size requires some manual adjustment (width and height are in inches by default)
  • make sure fonts are not too small / large
  • keep the aspect ratio sensible
  • or use the export function in the plots panel
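
A slightly more explicit call is sketched below. It uses the built-in mtcars data and a temporary PDF file purely for illustration; passing the plot object via `plot =` avoids depending on whichever plot happened to be displayed last, and `units` states the size in centimetres instead of the default inches.

```r
library(ggplot2)

# Built-in data, used only to have something to save
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Temporary file so the example does not clutter the working directory
out_file <- tempfile(fileext = ".pdf")

# The file format is taken from the file extension
ggsave(out_file, plot = p, width = 12, height = 9, units = "cm")

file.exists(out_file)
```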

Exercise 5

bringing everything together

Open script exercises/5a_bringing_everything_together.R

Continue with exercises/5b_bringing_everything_together.R

Useful resources

References

Anscombe, Francis J. 1973. “Graphs in Statistical Analysis.” The American Statistician 27: 17–21.

Hartwig, Frederick, and Brian E. Dearing. 1979. Exploratory Data Analysis. 16. Sage.

Matejka, Justin, and George Fitzmaurice. 2017. “Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics Through Simulated Annealing.” In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 1290–94.

Tufte, Edward R. 1983. The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press.

———. 1989. The Visual Display of Quantitative Information. Vol. 13–14. Graphics Press.

Tukey, John W. 1977. Exploratory Data Analysis. Vol. 2.

Wickham, Hadley. 2010. “A Layered Grammar of Graphics.” Journal of Computational and Graphical Statistics 19 (1): 3–28.

———. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer.

Wilkinson, Leland. 1999. The Grammar of Graphics. Springer.

Wong, Bang. 2010. “Points of View: Design of Data Figures.” Nature Methods 7 (9): 665.