Learning Log 2

Victor Bian

6/14/2021

What were the goals for this week?

This week, I focused on expanding my knowledge of R by keeping up with the course material. As this week’s content was on using ggplot2 to visualize data, I looked for the online ggplot2 documentation to help me complete this week’s coding workshop.

How did I go about achieving these goals?

This week introduced data visualization with the ggplot2 package. First, we need to load ggplot2, which is attached as part of the tidyverse (this assumes the tidyverse is already installed, e.g. with install.packages("tidyverse")):

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.1.2     ✓ dplyr   1.0.6
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

We can then display a dataset such as the fuel economy of various car models in the form of a table:

mpg
## # A tibble: 234 x 11
##    manufacturer model    displ  year   cyl trans   drv     cty   hwy fl    class
##    <chr>        <chr>    <dbl> <int> <int> <chr>   <chr> <int> <int> <chr> <chr>
##  1 audi         a4         1.8  1999     4 auto(l… f        18    29 p     comp…
##  2 audi         a4         1.8  1999     4 manual… f        21    29 p     comp…
##  3 audi         a4         2    2008     4 manual… f        20    31 p     comp…
##  4 audi         a4         2    2008     4 auto(a… f        21    30 p     comp…
##  5 audi         a4         2.8  1999     6 auto(l… f        16    26 p     comp…
##  6 audi         a4         2.8  1999     6 manual… f        18    26 p     comp…
##  7 audi         a4         3.1  2008     6 auto(a… f        18    27 p     comp…
##  8 audi         a4 quat…   1.8  1999     4 manual… 4        18    26 p     comp…
##  9 audi         a4 quat…   1.8  1999     4 auto(l… 4        16    25 p     comp…
## 10 audi         a4 quat…   2    2008     4 manual… 4        20    28 p     comp…
## # … with 224 more rows

We can go one step further and make a simple scatterplot:

example <- ggplot(data = mpg) +
  geom_point(mapping = aes(
    x = displ,
    y = hwy
  ))
plot(example)

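We can reduce the number of lines of code by passing arguments by position (unnamed args) and by defining the aesthetic mapping once in ggplot(), where it applies globally to every layer of the plot: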

example <- ggplot(mpg, aes(displ, hwy)) +
  geom_point()
plot(example)
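
To check that the shorter form really describes the same plot, one option is to compare the data that each version hands to its point layer. This is only a sketch: the names `verbose` and `concise` are illustrative, and it relies on ggplot2's layer_data() helper to extract the computed layer data.

```r
library(ggplot2)

# Long form: named arguments, mapping defined inside the geom
verbose <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

# Short form: positional arguments, mapping defined once in ggplot()
concise <- ggplot(mpg, aes(displ, hwy)) +
  geom_point()

# Compare the x/y positions computed for the first (and only) layer
same_points <- identical(
  layer_data(verbose)[c("x", "y")],
  layer_data(concise)[c("x", "y")]
)
print(same_points)
```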

We can make our plot fancier by introducing colours:

example <- ggplot(mpg, aes(displ, hwy, color = cyl)) +
  geom_point()
plot(example)

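Making those colours discrete by converting cyl to a factor, and adding a smoothed trend line (geom_smooth() uses a loess smoother by default here, not a straight regression line):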

example <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = factor(cyl))) +
  geom_smooth()
plot(example)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
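
The message above shows that geom_smooth() picked a loess smoother. If a straight least-squares regression line is wanted instead, a minimal sketch is to request it with method = "lm" (the name `linear` is just illustrative):

```r
library(ggplot2)

linear <- ggplot(mpg, aes(displ, hwy)) +
  # Colour the points by number of cylinders, treated as discrete
  geom_point(aes(color = factor(cyl))) +
  # Fit one linear model of hwy on displ across all points
  geom_smooth(method = "lm", formula = y ~ x)
plot(linear)
```

Because the colour mapping sits inside geom_point() rather than ggplot(), the smoother is not split by cylinder count and a single line is drawn.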

If we want to, we could display the data as a boxplot as well:

example <- ggplot(mpg, aes(trans, hwy)) +
  geom_boxplot()
plot(example)

We can also change the labels on the graph and fancy it up with a theme:

example <- ggplot(mpg, aes(trans, hwy)) +
  geom_boxplot() +
  theme_minimal() +
  scale_x_discrete("Transmission type") +
  scale_y_continuous("Highway miles per gallon") +
  ggtitle("My fancy plot")
plot(example)
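
Axis labels and the title can also be set in a single labs() call, which avoids having to pick a scale function that matches each variable's type. A sketch of the same plot written that way (the name `labelled` is only illustrative):

```r
library(ggplot2)

labelled <- ggplot(mpg, aes(trans, hwy)) +
  geom_boxplot() +
  theme_minimal() +
  # labs() sets the axis titles and plot title without touching the scales
  labs(
    x = "Transmission type",
    y = "Highway miles per gallon",
    title = "My fancy plot"
  )
plot(labelled)
```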

Overall, this week’s workshop was very interesting and didn’t present too much difficulty whilst introducing lots of useful content! The last part of the workshop was also interesting: most scientists and statisticians aren’t the greatest graphic designers, yet the design of our visualizations is one of the most important elements in conveying information simply, quickly, and attractively.

What are the next steps?

My goals for the future are to continue to find new ways of using ggplot2 by exploring the documentation online at https://ggplot2.tidyverse.org/. I also want to keep working towards tidy, minimal code through the use of unnamed arguments and plot-level aesthetic mappings, whilst using whitespace appropriately to keep the code readable. One thing I need to work on is commenting my code more often, to reduce confusion when I return to code I have previously written.