Assignment information: (delete this when you submit) In this assignment you will re-build the pie graphs shown in the paper “Genomics is failing on diversity” by Popejoy and Fullerton (https://www.nature.com/articles/538161a). Delete all instructions and replace with short explanatory text about all code chunks. Be sure to change the title in the YAML header.

If possible, save this file to your Teams folder.

Introduction

Write a brief introduction about the data being plotted, including the information

  1. who collected it,
  2. how it was collected,
  3. why the process was repeated in 2016

This should be about 4-5 sentences.

The data being plotted were collected and funded by the US National Institutes of Health (NIH) in 2009. Samples go through extraction and validation before they are added to the GWAS Catalog, which is produced by the US National Human Genome Research Institute in partnership with the European Bioinformatics Institute. The 2016 study used the same GWAS Catalog approach, gathering more samples to analyze. The process from 2009 was repeated in 2016 because the samples of European ancestry used in GWAS could come from a number of actual individuals, and if European-ancestry data sets are resampled more often than others, that would reflect population-specific differences in research effort.

Create data

Create vectors to contain the data and labels to make the pie graphs at the top of figures.

Each vector has 3 elements: European ancestry, Asian ancestry, and other non-European ancestry.

DO NOT name your vector for the labels “labels”, since this is the name of an existing R function.

Include new line characters in the text as needed to improve spacing.

euro_non_euro1 <- c(96, 3, 1)
euro_non_euro2 <- c(81, 14, 5)

labels1 <- c("European ancestry", "Asian Ancestry", "Other Non-European\nAncestry")
labels2 <- c("European\nancestry", "Asian\nAncestry", "Other\nNon-European\nAncestry")
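Before plotting, it can help to confirm that the values and labels line up. The sketch below is one possible check, not part of the assignment: it repeats the two data vectors from above and uses a hypothetical label vector named `ancestry_labels` (an assumed name, chosen only to avoid `labels`).

```r
# Values copied from the chunk above
euro_non_euro1 <- c(96, 3, 1)
euro_non_euro2 <- c(81, 14, 5)

# Hypothetical label vector for this check (the name is an assumption)
ancestry_labels <- c("European ancestry", "Asian ancestry",
                     "Other non-European\nancestry")

# Each vector holds percentages, so it should sum to 100
stopifnot(sum(euro_non_euro1) == 100, sum(euro_non_euro2) == 100)

# pie() pairs values and labels by position, so the lengths must match
stopifnot(length(ancestry_labels) == length(euro_non_euro1))

# Preview the pairing; newlines are replaced by spaces for readable printing
pairing <- data.frame(label = gsub("\n", " ", ancestry_labels),
                      y2009 = euro_non_euro1,
                      y2016 = euro_non_euro2)
print(pairing)
```

If either `stopifnot()` call fails, the pie slices and their labels would not match up, so it is worth fixing the vectors first.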

Pro Tip: adding a newline character in front of or behind the text in your labels can help you adjust spacing. E.g. “European” or “” (note: if you don’t delete this instruction, the preceding text will have some weird features.)
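To see what a newline character does inside a label, the short sketch below prints a wrapped label and counts its lines. The variable names (`label_wrapped`, `n_lines`, `padded`) are made up for illustration only.

```r
# "\n" inside a string starts a new line wherever the label is drawn
label_wrapped <- "Other\nNon-European\nAncestry"

# cat() shows the label the way it will be laid out, one piece per line
cat(label_wrapped, "\n")

# Count the lines by splitting on the newline character
n_lines <- length(strsplit(label_wrapped, "\n", fixed = TRUE)[[1]])

# A leading newline adds an empty first line, which nudges the text downward
padded <- paste0("\n", "European")
cat(padded, "\n")
```

Note that `"\n"` counts as a single character, so `padded` is one character longer than `"European"`.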

Pie graphs

  1. Create a 1 x 2 grid using the command par(mfrow = c(1,2))
  2. Plot the 2009 data on the left and 2016 data on the right.
  3. This will require calling the pie command twice
  4. Use the argument main = … to add a title above each plot
  5. Set the argument init.angle = … to -82. Experiment with how this affects the plot.
  6. Set the argument radius = … to 1. Experiment with how this affects the plot.
  7. Set the argument col = … to c(1,2,3), then experiment with different numbers. Try to make it ugly.
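# set up par(): a 1 x 2 grid so both pies are drawn side by side;
# mar sets the margins in the order bottom, left, top, right
par(mfrow = c(1, 2), mar = c(2, 3, 1, 5))

# pie graph 1 (2009)
# main adds the title, init.angle rotates the starting point,
# radius sets the size, and col sets the slice colors
pie(x = euro_non_euro1, labels = labels1, main = "2009",
    init.angle = -82, radius = 1, col = c(1, 2, 3))

# pie graph 2 (2016)
# same main, init.angle, radius, and col arguments as pie graph 1
pie(x = euro_non_euro2, labels = labels2, main = "2016",
    init.angle = -82, radius = 1, col = c(1, 2, 3))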

par(mfrow = c(1, 2), mar = c(1, 1, 1, 1)) # 1 x 2 grid with narrow margins for any plots drawn after this line
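The effect of init.angle can be worked out by hand. By default pie() draws slices counter-clockwise, starting at init.angle degrees, where 0 points to 3 o'clock, and each slice sweeps an angle proportional to its share of the total. The sketch below reproduces that arithmetic for the 2009 data; the names `sweep_deg`, `slice_start`, and `slice_end` are illustrative and not part of pie().

```r
euro_non_euro1 <- c(96, 3, 1)   # 2009 values from above
init_angle <- -82               # same value passed to init.angle

# Degrees swept by each slice: its share of the total times 360
sweep_deg <- 360 * euro_non_euro1 / sum(euro_non_euro1)

# Each slice ends where the running total of sweeps has reached
slice_end <- init_angle + cumsum(sweep_deg)

# Each slice starts where the previous one ended
slice_start <- c(init_angle, head(slice_end, -1))

slices <- data.frame(group = c("European", "Asian", "Other non-European"),
                     start = slice_start,
                     end   = slice_end)
print(slices)
```

With init.angle = -82, the two small slices run from 263.6 to 278 degrees, which straddles 270 degrees (6 o'clock), so they sit at the bottom of the pie. Changing init.angle rotates the whole pie by the same amount.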

Bar graphs

If you want, you can examine the code below to see how stacked bar graphs are made.

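# data (only 5 values are given here, while labels_x below has 7 entries)
dat2009 <- c(3, 0.57, 0.15, 0.06, 0.06)
dat2009_rev <- rev(dat2009)
# a one-column matrix makes barplot() draw a single stacked bar
barplotdata2009 <- matrix(dat2009_rev)

# labels (reversed to match the reversed data)
labels_x <- rev(c("Asian", "African", "Mixed", "Hispanic &\nLatin American",
                  "Pacific Islander", "Arab & Middle East", "Native peoples"))

# back to a single plot per page
par(mfrow = c(1, 1))

barplot(barplotdata2009,
        width = 0.01,
        xlim = c(0, 0.1),
        axes = FALSE,
        col = c(1, 2, 3, 4, 5, 6, 7),
        legend.text = labels_x)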

# data
dat2016 <- c(14, 3, 1, 0.54, 0.28, 0.08, 0.05)
dat2016_rev <- rev(dat2016)
# a one-column matrix makes barplot() draw a single stacked bar
barplotdata2016 <- matrix(dat2016_rev)

# labels (reversed to match the reversed data)
labels_x <- rev(c("Asian", "African", "Mixed", "Hispanic &\nLatin American",
                  "Pacific Islander", "Arab & Middle East", "Native peoples"))

# back to a single plot per page
par(mfrow = c(1, 1))

barplot(barplotdata2016,
        width = 0.01,
        xlim = c(0, 0.1),
        axes = FALSE,
        col = c(1, 2, 3, 4, 5, 6, 7),
        legend.text = labels_x)
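The stacking itself comes from passing a matrix to barplot(): with the default beside = FALSE, each column becomes one bar and the rows are stacked within it. As a minimal sketch of that idea, the code below reuses the pie graph values from earlier instead of the bar graph data, so both years fit in one matrix. The names `ancestry`, `segment_tops`, and `bar_mids` are illustrative assumptions.

```r
# Pie graph values from earlier, one column per year
euro_non_euro1 <- c(96, 3, 1)
euro_non_euro2 <- c(81, 14, 5)
ancestry <- cbind("2009" = euro_non_euro1, "2016" = euro_non_euro2)
rownames(ancestry) <- c("European", "Asian", "Other non-European")

# Height at which each stacked segment ends: a running total down each column
segment_tops <- apply(ancestry, 2, cumsum)

# One stacked bar per column; barplot() returns the bar midpoints on the x axis
bar_mids <- barplot(ancestry,
                    col = c(1, 2, 3),
                    legend.text = rownames(ancestry),
                    ylab = "Percent")
```

Because each column sums to 100, both bars reach the same height, and the change between 2009 and 2016 shows up as a shift in where the segments divide.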