Today is the last day of our basic R introduction. We will be plotting our data using ggplot2!

Load Packages and Read Data

Similar to the last few weeks, let’s begin by loading the packages that we will need today.

library(tidyverse)
library(here)

Great. Now I will read in the data the same way we’ve done before.

#list files
files <- list.files(here::here(),full.names = TRUE)[1:3]

#read in data from files
worms <- purrr::map_dfr(files, ~readr::read_csv(.x))

First 4 rows of data

X1  Label                      Area  Angle   Length
1   p01-growth-H01-2X_B01.TIF  81    0.000   80.435
2   p01-growth-H01-2X_B01.TIF  5     0.000   3.875
3   p01-growth-H01-2X_B01.TIF  77    0.000   76.811
4   p01-growth-H01-2X_B01.TIF  6     36.027  5.101
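
The read-in step above takes the first three files in the project folder, whatever they are. As a minimal sketch of the same pattern that only picks up CSV files, the block below writes three tiny files to a temporary folder and reads them back. The folder, the file names, and the values are all made up for illustration.

```r
library(purrr)
library(readr)

# Hypothetical setup: write three tiny CSV files to a temporary folder
tmp <- file.path(tempdir(), "worm_demo")
dir.create(tmp, showWarnings = FALSE)
for (i in 1:3) {
  demo <- data.frame(Label = paste0("p01-growth-H0", i, "-2X_B01.TIF"),
                     Length = c(80.4, 3.9))
  readr::write_csv(demo, file.path(tmp, paste0("demo_", i, ".csv")))
}

# Only list files ending in .csv, so other files in the folder are ignored
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# Read each file and stack the rows into one data frame
worms_demo <- purrr::map_dfr(files, ~ readr::read_csv(.x, show_col_types = FALSE))
```

With a pattern in place, all matching files can be read without hard-coding how many there are.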

Tidy and Process Data

And now I’m going to tidy and process the data into an appropriate format. Remember that last week we did this in six steps, but as I mentioned, we can use pipes (%>%) to accomplish this in one large block of code. This is what I did below.

tidydata <- worms %>%
  # number the rows and keep only the columns we need
  dplyr::mutate(row_num = row_number()) %>%
  dplyr::select(row_num, Label, Length) %>%
  # split the file name into its parts and keep the two-digit hour
  tidyr::separate(Label, into=c("Plate", "Experiment", "Hour", "Magnification", "Well"), sep="[[:punct:]]") %>%
  dplyr::mutate(Hour = stringr::str_extract(Hour, pattern = "[:digit:]{2}")) %>%
  # every two consecutive rows belong to one animal
  dplyr::group_by(Animal = rep(row_number(), length.out = n(), each = 2)) %>%
  # label each measurement as Width or Length, then spread them into columns
  dplyr::mutate_at(vars(row_num), ~dplyr::case_when(Length < 60 ~ "Width",
                                                    Length >= 60 ~ "Length")) %>%
  tidyr::pivot_wider(names_from = row_num, values_from = Length) %>%
  # split the well into its plate row and column
  tidyr::separate(Well, into=c("Row","Column"), sep=c("(?<=[A-Za-z])(?=[0-9])")) %>%
  # model each animal as a cylinder
  dplyr::mutate(Radius = Width/2,
                Volume = pi*Radius^2*Length,
                Area = 2*pi*Radius*Length + 2*pi*Radius) %>%
  # rescale the measurements by the conversion factor 3.2937
  dplyr::mutate(Length = 3.2937*Length,
                Width = 3.2937*Width,
                Radius = 3.2937*Radius,
                Volume = 3.2937*Volume,
                Area = 3.2937*Area)

First 4 rows of data

Plate  Experiment  Hour  Magnification  Row  Column  Animal  Length    Width     Radius    Volume    Area
p01    growth      01    2X             B    01      1       264.9288  12.76309  6.381544  3124.370  3265.252
p01    growth      01    2X             B    01      2       252.9924  16.80116  8.400582  5170.208  4107.052
p01    growth      01    2X             B    01      3       276.7070  16.73200  8.365998  5608.381  4468.613
p01    growth      01    2X             B    01      4       231.9127  13.76108  6.880539  3179.445  3087.219
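
A note on the cylinder math in the block above: the surface area of a closed cylinder is 2*pi*r*L + 2*pi*r^2, and when lengths are rescaled by a factor, areas scale with its square and volumes with its cube. The pipeline instead adds 2*pi*Radius and multiplies every column by the same 3.2937. A minimal sketch of a dimensionally consistent alternative is to convert Length and Width first and compute everything else from the converted values. It assumes 3.2937 is the number of um per pixel, which the text does not state. It is shown for the first animal only, and it gives Volume and Area values that differ from those in the table above.

```r
# Assumption: 3.2937 is the number of micrometers (um) per pixel
um_per_px <- 3.2937

# Pixel measurements of the first animal, taken from the raw data shown earlier
length_px <- 80.435
width_px  <- 3.875

# Convert the linear measurements first...
length_um <- um_per_px * length_px
radius_um <- um_per_px * width_px / 2

# ...then compute volume and surface area from the converted values,
# so the results come out in um^3 and um^2
volume_um3 <- pi * radius_um^2 * length_um
area_um2   <- 2 * pi * radius_um * length_um + 2 * pi * radius_um^2
```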

Doesn’t using pipes make things so much more streamlined?
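
To see what the pipe is doing, here is a small sketch that runs the same three steps twice, once with intermediate objects and once with pipes. The toy data frame and the step names are made up for illustration.

```r
library(dplyr)

# Made-up toy data: four measurements
toy <- data.frame(Length = c(80.4, 3.9, 76.8, 5.1))

# Step by step, saving an intermediate object each time
step1    <- dplyr::mutate(toy, row_num = dplyr::row_number())
step2    <- dplyr::filter(step1, Length >= 60)
stepwise <- dplyr::select(step2, row_num, Length)

# The same three steps chained with pipes
piped <- toy %>%
  dplyr::mutate(row_num = dplyr::row_number()) %>%
  dplyr::filter(Length >= 60) %>%
  dplyr::select(row_num, Length)

# Check that both approaches give the same result
identical(stepwise, piped)
```

Each pipe passes the result on its left as the first argument of the function on its right, so no intermediate names are needed.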

Plotting with ggplot2

Now we will start plotting. This is a relatively straightforward process.

Setting up an empty plot

The first thing we need to do is tell R that we are setting up to plot. We do this by calling ggplot2::ggplot(). This will open up a blank plot to the right of your RStudio session (under the Plots tab).

ggplot2::ggplot()

Designating data and x & y

Now we will add components to this blank canvas by using +.

The first things I want to add are the data and the aesthetics. These tell R what information you want to plot.
Let’s tell R which data we want to plot. I want to plot the data held in the variable tidydata.

ggplot2::ggplot(tidydata)

I will not actually execute this block, but give it a try yourself. Notice that this does not actually add anything to our blank plot. We must first add aesthetics.

Let’s start by plotting Length. In this case we want x to be Hour and y to be Length.

ggplot2::ggplot(tidydata) + aes(x = Hour, y = Length)

Notice that we now have axis labels!
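
One way to see what has and has not been added so far is to save the plot to a variable and look inside it. This is only a sketch: the toy data frame is made up to stand in for tidydata, and poking at the plot object like this is for exploration rather than something you need for plotting.

```r
library(ggplot2)

# Made-up toy data standing in for tidydata (values are illustrative only)
toy <- data.frame(
  Hour   = rep(c("01", "02", "03"), each = 4),
  Column = rep(c("01", "02"), times = 6),
  Length = c(250, 262, 271, 240, 300, 310, 295, 305, 350, 362, 341, 355)
)

# Data and aesthetics, but no geometric object yet
p <- ggplot2::ggplot(toy) + aes(x = Hour, y = Length)

names(p$mapping)   # the aesthetics that have been mapped
length(p$layers)   # the number of geometric layers added so far
```

The mapping is there, which is why the axes are labeled, but there are no layers yet, which is why nothing is drawn.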

Adding geometric objects

The last thing we need to do is tell R what type of geometric object we want to plot. Let’s try plotting just simple points. We will use geom_point().

ggplot2::ggplot(tidydata) + aes(x = Hour, y = Length) + geom_point()

And there we have it. Now that we have the basics, we can play around with the aesthetics and the geometric object.

Adding more aesthetics

Let’s start by adding to the aesthetics. Let’s try changing the size of the points.

ggplot2::ggplot(tidydata) + aes(x = Hour, y = Length, size = 4) + geom_point()

Because we put size inside aes(), R treats it as a mapping and adds it to the plot legend. To avoid this, we can set size directly in geom_point() instead.

ggplot2::ggplot(tidydata) + aes(x = Hour, y = Length) + geom_point(size = 4)
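
The difference between the two plots above can also be seen in how ggplot2 stores them. The sketch below uses a made-up toy data frame in place of tidydata and inspects the plot objects, which is purely for illustration.

```r
library(ggplot2)

# Made-up toy data standing in for tidydata (values are illustrative only)
toy <- data.frame(
  Hour   = rep(c("01", "02", "03"), each = 4),
  Length = c(250, 262, 271, 240, 300, 310, 295, 305, 350, 362, 341, 355)
)

# size inside aes(): treated as a mapping, so it gets a legend
p_mapped <- ggplot2::ggplot(toy) + aes(x = Hour, y = Length, size = 4) + geom_point()

# size inside geom_point(): a fixed setting for that layer, no legend
p_fixed <- ggplot2::ggplot(toy) + aes(x = Hour, y = Length) + geom_point(size = 4)

"size" %in% names(p_mapped$mapping)   # is size part of the mapping?
p_fixed$layers[[1]]$aes_params$size   # the fixed size stored on the layer
```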

Typically we place information in aes() when we want to use information that’s in our data frame (tidydata). For example, let’s say we want to color the points based on the Column they are from. In this case we would place color in aes(). If we instead passed color = Column directly to geom_point(), R would throw an error. Feel free to try it out.

ggplot2::ggplot(tidydata) + aes(x = Hour, y = Length, color = Column) + geom_point(size = 4)
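
The error mentioned above can be tried out safely by wrapping the call in tryCatch(), which captures the error message instead of stopping. This is a sketch on a made-up toy data frame. It assumes there is no separate object called Column in your workspace.

```r
library(ggplot2)

# Made-up toy data standing in for tidydata (values are illustrative only)
toy <- data.frame(
  Hour   = rep(c("01", "02", "03"), each = 4),
  Column = rep(c("01", "02"), times = 6),
  Length = c(250, 262, 271, 240, 300, 310, 295, 305, 350, 362, 341, 355)
)

# Outside aes(), R looks for an object named Column rather than a column
# of the data frame, so this call fails; tryCatch() keeps the message
result <- tryCatch(
  ggplot2::ggplot(toy) + aes(x = Hour, y = Length) + geom_point(color = Column),
  error = function(e) conditionMessage(e)
)
result

# A fixed color that does not come from the data works fine outside aes()
p_ok <- ggplot2::ggplot(toy) + aes(x = Hour, y = Length) + geom_point(color = "steelblue")
```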

Layering geometric objects

Not only can we change the aesthetics, but we can also change the geometric object we are plotting. Let’s try making a boxplot rather than points. This is as simple as changing the last layer from geom_point() to geom_boxplot().

ggplot2::ggplot(tidydata) + aes(x = Hour, y = Length, color = Column) + geom_boxplot()

Because we still have the color defined in aes(), notice that there is a separate boxplot for each Column. Try removing the color from aes() and see what happens.

For our purposes, I like to use a geometric object that is similar to geom_point() but won’t result in points lying directly on top of each other. I like to use geom_jitter()…

ggplot2::ggplot(tidydata) + aes(x = Hour, y = Length) + geom_jitter()

When using geom_jitter(), I also like to specify how much wiggle (or jitter) the points have. I like to keep their jitter pretty narrow. This can be changed by adding a width argument…

ggplot2::ggplot(tidydata) + aes(x = Hour, y = Length) + geom_jitter(width = 0.2)
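
One thing to be aware of is that geom_jitter() moves points vertically as well as horizontally unless told otherwise. Since the y-axis here is a real measurement, it may be preferable to jitter only sideways. Below is a minimal sketch of that, using a made-up toy data frame in place of tidydata. layer_data() is used only to check where the points end up.

```r
library(ggplot2)

# Made-up toy data standing in for tidydata (values are illustrative only)
toy <- data.frame(
  Hour   = rep(c("01", "02", "03"), each = 4),
  Length = c(250, 262, 271, 240, 300, 310, 295, 305, 350, 362, 341, 355)
)

# width controls the sideways jitter; height = 0 switches off vertical jitter
p <- ggplot2::ggplot(toy) + aes(x = Hour, y = Length) +
  geom_jitter(width = 0.2, height = 0)

# Look at the positions ggplot2 will actually draw
jittered <- ggplot2::layer_data(p)
head(jittered[, c("x", "y")])
```

The x positions change a little each time the plot is built, while the y values stay at the measured lengths.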

We can also layer two geometric shapes on top of each other! Let’s try adding geom_boxplot() to geom_jitter(). This is as simple as tacking it on the end:

ggplot2::ggplot(tidydata) + aes(x = Hour, y = Length) + geom_boxplot() + geom_jitter(width = 0.2) 

However, the order in which you add components to the plot is the order in which they are drawn, so later layers sit on top of earlier ones. As a personal preference I like having points in front of boxplots, which is why I add geom_boxplot() first and geom_jitter() second.
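
The layer order can be checked by looking at the layers stored in the plot. The sketch below uses a made-up toy data frame in place of tidydata. It also sets outlier.shape = NA, an optional extra not used above, which hides the boxplot's own outlier points so they are not drawn a second time under the jittered points.

```r
library(ggplot2)

# Made-up toy data standing in for tidydata (values are illustrative only)
toy <- data.frame(
  Hour   = rep(c("01", "02", "03"), each = 4),
  Length = c(250, 262, 271, 240, 300, 310, 295, 305, 350, 362, 341, 355)
)

p <- ggplot2::ggplot(toy) + aes(x = Hour, y = Length) +
  geom_boxplot(outlier.shape = NA) +   # added first, so drawn at the back
  geom_jitter(width = 0.2)             # added second, so drawn on top

# Layers are stored in the order they were added
layer_order <- sapply(p$layers, function(layer) class(layer$geom)[1])
layer_order
```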

Adding plot labels

The last thing I want to talk about is adding axis labels and titles to the plot. For this we use the function labs().

ggplot2::ggplot(tidydata) + aes(x = Hour, y = Length) + geom_boxplot() + geom_jitter(width = 0.2) + 
  labs(x = "Time (Hours)", y = "Animal Length (um)", title = "Animal Length over Time")
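
labs() can also rename a legend: use the name of the aesthetic (for example color) as the argument. Here is a small sketch on a made-up toy data frame standing in for tidydata. The legend title "Plate column" is just an illustrative choice.

```r
library(ggplot2)

# Made-up toy data standing in for tidydata (values are illustrative only)
toy <- data.frame(
  Hour   = rep(c("01", "02", "03"), each = 4),
  Column = rep(c("01", "02"), times = 6),
  Length = c(250, 262, 271, 240, 300, 310, 295, 305, 350, 362, 341, 355)
)

p <- ggplot2::ggplot(toy) + aes(x = Hour, y = Length, color = Column) +
  geom_jitter(width = 0.2) +
  labs(x = "Time (Hours)", y = "Animal Length (um)",
       color = "Plate column",              # legend title for the color aesthetic
       title = "Animal Length over Time")
```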

And that’s really about it. You can play around with what you want to plot in x and y, as well as aesthetics like size, color, and alpha.
The alpha setting controls transparency: the smaller the alpha, the more transparent the object.

ggplot2::ggplot(tidydata) + aes(x = Hour, y = Volume) + geom_boxplot(size = 0.5) + geom_jitter(size = 0.6, alpha = 0.8, width = 0.2) + 
  labs(x = "Time (Hours)", y = "Animal Volume", title = "Animal Volume over Time")

Putting it all together

And with these basics and a few extra things, that’s how I can make plots like this, with all of your data put together.