Graphs with ggplot2

install.packages("tidyverse")
install.packages("RColorBrewer")

Start by attaching the packages you will use:

library(tidyverse)
## Registered S3 methods overwritten by 'tibble':
##   method     from  
##   format.tbl pillar
##   print.tbl  pillar
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.0.3     ✓ dplyr   1.0.6
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(readxl)
library(dplyr)
library(ggplot2)
library(RColorBrewer)

Reading Excel files

library(readxl)
datatable <- read_excel("testdata.xlsx", na = "nan")

Replacing NAs in the mean signal column with zeros

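In NuncSpotA, a signal with a value of zero is written to the exported Excel file as "nan". Because read_excel() was called with na = "nan", those cells are read in as NA, and the code below replaces each NA in mean_signal with zero.

The rest of the code uses %>%, the pipe operator. It passes the result on its left as the first argument to the function on its right, so the steps linked by %>% run in order as one chain.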

datatable<-datatable %>% 
  mutate (mean_signal=replace_na(mean_signal, replace = 0))
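
To make the pipe concrete, here is a minimal sketch on made-up data. The `toy` table and its values are assumptions for illustration only, not part of the real export. It runs the same replacement once with %>% and once as an ordinary nested call, then checks that the two results match.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the imported spreadsheet (values are invented)
toy <- tibble(
  coloc_zone  = c("Background", "1", "2"),
  mean_signal = c(10, NA, 4)
)

# Piped form: toy becomes the first argument of mutate()
piped <- toy %>%
  mutate(mean_signal = replace_na(mean_signal, 0))

# Equivalent form without the pipe
nested <- mutate(toy, mean_signal = replace_na(mean_signal, 0))

# Both approaches should give the same table
identical(piped, nested)
```

The piped form reads top to bottom in the order the steps happen, which is why it is used for the longer chains later on.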

Creating column for Background

The first mutate() call creates a new column named background that holds the background mean signal. The case_when() call reads: "when coloc_zone is "Background", use that row's mean signal; otherwise (TRUE) use NA". Rows where coloc_zone is a point (1, 2, 3, etc.) therefore get NA instead of a background value. The fill() function then carries the background mean signal down into those NA cells, and the last mutate() divides each mean signal by its background to give norm_signal.

For this to work, it is important that your exported data lists the background value above its test points (1, 2, 3, etc.).
datatable<-datatable %>%
  mutate(background=case_when(coloc_zone=="Background"~mean_signal,
                              TRUE~NA_real_) ) %>% 
  fill(background,.direction="down") %>% 
  mutate(norm_signal=mean_signal/background)
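
The sketch below runs the same chain on a small invented table so each intermediate value can be checked by hand. The `toy` data (two images, each with a background row followed by its points) is an assumption for illustration; only the column names coloc_zone and mean_signal come from the code above.

```r
library(dplyr)
library(tidyr)

# Hypothetical export: each background row sits above its own points
toy <- tibble(
  coloc_zone  = c("Background", "1", "2", "Background", "1"),
  mean_signal = c(10, 5, 20, 4, 2)
)

toy_norm <- toy %>%
  # Copy mean_signal only on background rows, NA everywhere else
  mutate(background = case_when(coloc_zone == "Background" ~ mean_signal,
                                TRUE ~ NA_real_)) %>%
  # Carry each background value down to the point rows beneath it
  fill(background, .direction = "down") %>%
  # Normalise every signal to its background
  mutate(norm_signal = mean_signal / background)

toy_norm$background   # 10 10 10 4 4
toy_norm$norm_signal  # 1.0 0.5 2.0 1.0 0.5
```

If a background row were placed below its points, fill() would pair those points with the previous image's background, which is why the row order matters.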

Filtering just to look at Point Signals

This step removes every row with "Background" in the coloc_zone column, leaving only the point signals. The != operator means "does not equal".

datatable<-datatable %>% 
  filter(coloc_zone!="Background")

Creating a column that defines cell number

The separate() function splits file_name at each "-" into four new columns. This lets you group by any of those pieces later; for example, the next step groups by Time_Point. You can give the new columns whatever names suit you best.

datatable<-datatable %>% 
  separate(file_name,into = c("nine","Stain","Time_Point","Cell_Number"),sep = "-")
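
As an illustration of the split, the sketch below applies the same separate() call to two invented file names. The names are hypothetical and only assume the four-part, dash-separated pattern implied by the column list above.

```r
library(dplyr)
library(tidyr)

# Hypothetical file names following a four-part, dash-separated pattern
toy <- tibble(file_name = c("9-H3K27me3-1DPI-1", "9-H3K27me3-18HPR-2"))

toy_split <- toy %>%
  separate(file_name,
           into = c("nine", "Stain", "Time_Point", "Cell_Number"),
           sep = "-")

names(toy_split)       # the four new column names; file_name itself is dropped
toy_split$Time_Point   # third piece of each file name
```

All four new columns are character columns, so Cell_Number would need as.integer() before being used as a number.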

Grouping by Timepoint

datatable %>% 
  group_by(Time_Point) %>% 
  summarize(MeanSig=mean(norm_signal),
            (norm_signal))
## `summarise()` has grouped output by 'Time_Point'. You can override using the `.groups` argument.
## # A tibble: 104 x 3
## # Groups:   Time_Point [5]
##    Time_Point MeanSig `(norm_signal)`
##    <chr>        <dbl>           <dbl>
##  1 18HPR        0.825           0.922
##  2 18HPR        0.825           0.899
##  3 18HPR        0.825           0.424
##  4 18HPR        0.825           0.447
##  5 18HPR        0.825           0.830
##  6 18HPR        0.825           0.631
##  7 18HPR        0.825           0.811
##  8 18HPR        0.825           0.706
##  9 18HPR        0.825           0.955
## 10 18HPR        0.825           0.610
## # … with 94 more rows
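
In the output above, the unnamed second argument (norm_signal) returns every value in each group, which is why the result has 104 rows with MeanSig repeated rather than one row per time point. If a single summary row per time point is what you want, one possible sketch is shown below on invented data. The choice of sd() and n() as extra summaries and the column names SDSig and n are assumptions, not something taken from the original analysis.

```r
library(dplyr)

# Hypothetical normalised signals for two time points
toy <- tibble(
  Time_Point  = c("1DPI", "1DPI", "6DPI", "6DPI"),
  norm_signal = c(1, 3, 2, 4)
)

toy_summary <- toy %>%
  group_by(Time_Point) %>%
  summarise(MeanSig = mean(norm_signal),  # group mean
            SDSig   = sd(norm_signal),    # group standard deviation
            n       = n())                # number of points in the group

toy_summary  # one row per Time_Point
```

Every argument to summarise() here collapses a group to a single value, so the result has exactly one row per group.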

Creating a Graph

First I set the order of the x axis.
Then I make a box plot with every point drawn on top of it, and finally I relabel the x and y axes.

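# Desired left-to-right order of the time points on the x axis
level_order <- c('1DPI', '6DPI', '8DPI', '6HPR', '18HPR')
g <- ggplot(datatable, aes(x = factor(Time_Point, levels = level_order), y = norm_signal,
                           fill = Time_Point)) +
  geom_boxplot() + geom_point() +          # boxes with the individual points on top
  scale_fill_brewer(palette = "Set3") +    # RColorBrewer palette for the box fill
  xlab("Time Point") + ylab("Signal Intensity") +
  labs(title = "Signal Intensity of H3K27me3 at Viral Genome")
g  # print the plot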
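
One possible variation, sketched on invented data: convert Time_Point to a factor in the table itself rather than inside aes(). The axis and the fill legend then share the same order, because both read the factor levels. The `toy` values are assumptions for illustration; level_order is the same vector used above.

```r
library(dplyr)
library(ggplot2)

level_order <- c("1DPI", "6DPI", "8DPI", "6HPR", "18HPR")

# Hypothetical normalised signals, deliberately out of order
toy <- tibble(
  Time_Point  = c("18HPR", "1DPI", "6HPR", "6DPI", "8DPI", "1DPI"),
  norm_signal = c(0.8, 1.2, 0.9, 1.1, 1.0, 1.3)
)

# Store the ordering in the data so every layer and legend uses it
toy <- toy %>%
  mutate(Time_Point = factor(Time_Point, levels = level_order))

levels(toy$Time_Point)  # same order as level_order

p <- ggplot(toy, aes(x = Time_Point, y = norm_signal, fill = Time_Point)) +
  geom_boxplot() +
  geom_point() +
  xlab("Time Point") + ylab("Signal Intensity")
p
```

Without the factor step, a character column is ordered alphabetically, which would place 18HPR first.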