This week’s coding goals

The goal this week was to work through the ‘Data Visualization’ series by Professor Navarro and to become familiar with plotting and formatting data in graphs.

Achieving the goals

To do this, I worked through the exercises Professor Navarro provides in her lectures. For example, one exercise involved translating emojis back into the correct code to plot this dinosaur picture.

First, to load the tidyverse packages

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.1.2     ✓ dplyr   1.0.6
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

Then to read in the data, fix the code, and plot the picture

dino <- read_csv("data_dino.csv")
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   horizontal = col_double(),
##   vertical = col_double()
## )
print(dino)
## # A tibble: 142 x 2
##    horizontal vertical
##         <dbl>    <dbl>
##  1       55.4     97.2
##  2       51.5     96.0
##  3       46.2     94.5
##  4       42.8     91.4
##  5       40.8     88.3
##  6       38.7     84.9
##  7       35.6     79.9
##  8       33.1     77.6
##  9       29.0     74.5
## 10       26.2     71.4
## # … with 132 more rows
picture <- ggplot(data = dino) + geom_point(mapping = aes(x = horizontal, y = vertical))

plot(picture)
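
The dinosaur plot above names every argument. As a minimal sketch of how the same call can be shortened, the example below builds the plot both ways and checks that they produce the same layer data. The `toy` data frame is a stand-in I made from the first four rows printed above, since the full `data_dino.csv` file is not included here, and the names `named` and `unnamed` are only for illustration.

```r
library(ggplot2)

# Stand-in for the dino data: the first four rows shown in the printout above
toy <- data.frame(
  horizontal = c(55.4, 51.5, 46.2, 42.8),
  vertical   = c(97.2, 96.0, 94.5, 91.4)
)

# Fully named version, as in the exercise
named <- ggplot(data = toy) +
  geom_point(mapping = aes(x = horizontal, y = vertical))

# Shortened version: data is the first argument of ggplot(),
# mapping is the first argument of geom_point(),
# and x and y are the first two arguments of aes()
unnamed <- ggplot(toy) +
  geom_point(aes(horizontal, vertical))

# Compare the data each plot computes for its first (and only) layer
same_layer <- isTRUE(all.equal(layer_data(named), layer_data(unnamed)))
print(same_layer)
```

Dropping the names only works when the arguments are given in the order the function expects, so less common arguments such as `fill` are still safer to name.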

For further practice, I went on to plot and format the handwriting data set provided in the lecture, trying not to refer to my notes.

Writing the code and shortening named arguments into unnamed arguments

forensic <- read_csv("data_forensic.csv")
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   participant = col_double(),
##   handwriting_expert = col_character(),
##   us = col_character(),
##   condition = col_character(),
##   age = col_double(),
##   forensic_scientist = col_character(),
##   forensic_specialty = col_character(),
##   handwriting_reports = col_double(),
##   confidence = col_double(),
##   familiarity = col_double(),
##   feature = col_character(),
##   est = col_double(),
##   true = col_double(),
##   band = col_character()
## )
print(forensic)
## # A tibble: 5,700 x 14
##    participant handwriting_expert us     condition         age forensic_scienti…
##          <dbl> <chr>              <chr>  <chr>           <dbl> <chr>            
##  1           1 HW Expert          Non-US Non-US HW Expe…    52 Yes              
##  2           1 HW Expert          Non-US Non-US HW Expe…    52 Yes              
##  3           1 HW Expert          Non-US Non-US HW Expe…    52 Yes              
##  4           1 HW Expert          Non-US Non-US HW Expe…    52 Yes              
##  5           1 HW Expert          Non-US Non-US HW Expe…    52 Yes              
##  6           1 HW Expert          Non-US Non-US HW Expe…    52 Yes              
##  7           1 HW Expert          Non-US Non-US HW Expe…    52 Yes              
##  8           1 HW Expert          Non-US Non-US HW Expe…    52 Yes              
##  9           1 HW Expert          Non-US Non-US HW Expe…    52 Yes              
## 10           1 HW Expert          Non-US Non-US HW Expe…    52 Yes              
## # … with 5,690 more rows, and 8 more variables: forensic_specialty <chr>,
## #   handwriting_reports <dbl>, confidence <dbl>, familiarity <dbl>,
## #   feature <chr>, est <dbl>, true <dbl>, band <chr>
picture <- ggplot(forensic) + 
  geom_boxplot(aes(band, est, fill = band)) +
  facet_wrap(vars(handwriting_expert)) + 
  theme_get() + 
  scale_x_discrete(name = NULL, labels = NULL) + 
  scale_y_continuous(name = "Estimate") +
  ggtitle(label = "Experts vs novices in handwriting feature probability estimate") +
  scale_fill_viridis_d()

plot(picture)
## Warning: Removed 4 rows containing non-finite values (stat_boxplot).
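
The warning above suggests that four rows have a missing or otherwise non-finite `est`, which the boxplot drops silently. A minimal sketch of how such rows could be found and removed before plotting is below. It uses a small made-up data frame called `toy` rather than the real `forensic` data, so the band labels and values are hypothetical.

```r
library(dplyr)

# Hypothetical miniature of the forensic data; labels and values are made up
toy <- data.frame(
  band = c("A", "A", "B", "B"),
  est  = c(10, NA, 55, 60)
)

# Rows the boxplot would drop: est is NA, NaN, or infinite
dropped <- filter(toy, !is.finite(est))

# Keep only the plottable rows, so the removal is an explicit step
clean <- filter(toy, is.finite(est))

print(nrow(dropped))  # 1 row has a non-finite est
print(nrow(clean))    # 3 rows remain
```

Running the same two `filter()` calls on `forensic` should show which four rows triggered the warning, assuming `est` is the only plotted column with missing values.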

Challenges and successes

Compared to last week, this week was more complicated. I had some trouble remembering the order in which the functions and aesthetics are added. I also had to revise how to export and import data sets in order to insert the dinosaur picture and its code above. I struggled to remember all the code for the functions taught in the lectures, so even though I’ve shown the finished graph for the forensic data set (above), in reality I had to refer back to my notes several times. For example, I put whitespace in the wrong parts of the code and spent a long time figuring out why it produced an error.

Despite the struggles, I had lots of fun playing around with the aesthetics!

The next stage

For next week, I aim to work through the last series on R. The series is quite long, so I want to stay on top of the lectures by working through them across the week.