Assignment information: (delete this when you submit) In this assignment you will re-build the pie graphs shown in the paper “Genomics is failing on diversity” by Popejoy and Fullerton (https://www.nature.com/articles/538161a). Delete all instructions and replace with short explanatory text about all code chunks. Be sure to change the title in the YAML header.
If possible, save this file to your Teams folder.
Write a brief introduction about the data being plotted, including the information
This should be about 4-5 sentences. The data being plotted were collected and funded by the US National Institutes of Health (NIH) in 2009. Samples go through extraction and validation before they are added to the GWAS Catalog, which is produced by the US National Human Genome Research Institute in partnership with the European Bioinformatics Institute. The 2016 study used the same GWAS Catalog approach and gathered more samples for analysis. One caveat applies to both years: the samples of European ancestry used in GWAS could come from a smaller number of actual individuals, and if European-ancestry data sets are resampled more often than others, the result would partly reflect population-specific differences in research effort.
Create vectors to contain the data and labels to make the pie graphs at the top of figures.
Each vector has 3 elements: European ancestry, Asian ancestry, and other non-European ancestry.
DO NOT name your vector for the labels “labels”, since this is the name of an existing R function.
Include new line characters in the text as needed to improve spacing.
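# percentages by ancestry: European, Asian, other non-European
euro_non_euro1 <- c(96, 3, 1)   # 2009
euro_non_euro2 <- c(81, 14, 5)  # 2016
# labels for each pie; "\n" starts a new line inside a label
labels1 <- c("European ancestry", "Asian ancestry", "Other non-European\nancestry")
labels2 <- c("European\nancestry", "Asian\nancestry", "Other\nnon-European\nancestry")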
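Before plotting, it can help to confirm that each data vector adds up to 100 percent and has one label per slice. The following is a minimal sketch of such a check; the names `pct_2009`, `pct_2016`, and `slice_names` are hypothetical stand-ins for the vectors above, not part of the assignment.

```r
# Values copied from the vectors above; names are hypothetical stand-ins
pct_2009 <- c(96, 3, 1)
pct_2016 <- c(81, 14, 5)
slice_names <- c("European", "Asian", "Other")  # shortened placeholder labels

# Each set of percentages should account for the whole pie
total_2009 <- sum(pct_2009)
total_2016 <- sum(pct_2016)

# pie() pairs values and labels by position, so the lengths must agree
lengths_match <- length(pct_2009) == length(slice_names) &&
  length(pct_2016) == length(slice_names)

# Stop with an error if any check fails
stopifnot(total_2009 == 100, total_2016 == 100, lengths_match)
```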
Pro Tip: adding a new line character in front of the text or behind it in your labels can help you adjust spacing. E.g. “European” or “” (note - if you don’t delete this instruction the preceding text will have some weird features.)
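To see what a new line character does to a label, the sketch below compares a one-line label with a wrapped version. Both strings are illustrative examples rather than required label text, and `count_lines` is a hypothetical helper written for this demonstration.

```r
# Illustrative labels (not required text for the assignment)
label_plain <- "Other non-European ancestry"
label_wrapped <- "Other\nnon-European\nancestry"

# cat() renders "\n" as an actual line break
cat(label_wrapped, "\n")

# Hypothetical helper: count how many lines a label occupies
count_lines <- function(label) {
  length(strsplit(label, "\n", fixed = TRUE)[[1]])
}

lines_plain <- count_lines(label_plain)
lines_wrapped <- count_lines(label_wrapped)
```

A wrapped label is narrower but taller, so it may need a larger margin above or below the pie rather than to the side.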
# set up par()
par(mfrow = c(1,2), mar = c(2,3,1,5))
# pie graph 1
# add main, init.angle, radius, and col
pie(x = euro_non_euro1, init.angle = -82, radius = 1, col = c(1, 2, 3), labels = labels1, main = "2009")
# pie graph 2
# add main, init.angle, radius, and col
# init.angle, radius, and col mirror the 2009 chart so the two pies are comparable
pie(x = euro_non_euro2, init.angle = -82, radius = 1, col = c(1, 2, 3), labels = labels2, main = "2016")
par(mfrow = c(1, 2), mar = c(1, 1, 1, 1)) # two plots side by side; affects only plots drawn after this call
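Because `par()` settings stay in effect for every later plot, one common pattern is to save the previous settings and restore them when the side-by-side figure is finished. The sketch below assumes an off-screen device (`pdf(NULL)`) so that it runs without opening a plot window; `old_par` and `layout_after` are hypothetical names.

```r
# Off-screen device so the sketch runs anywhere without creating a file
pdf(NULL)

# par() returns the previous values of the settings it changes
old_par <- par(mfrow = c(1, 2), mar = c(2, 3, 1, 5))

# Two pies drawn side by side in the 1 x 2 layout
pie(c(96, 3, 1), main = "2009")
pie(c(81, 14, 5), main = "2016")

# Restore the earlier layout and margins
par(old_par)
layout_after <- par("mfrow")

invisible(dev.off())
```

With this pattern, later plots such as a single bar graph do not need their own `par(mfrow = c(1,1))` call.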
If you want, you can examine the code below to see how stacked bar graphs are made.
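# data: 2009 percentages for the non-European groups
dat2009 <- c(3, 0.57, 0.15, 0.06, 0.06)
dat2009_rev <- rev(dat2009)
# a one-column matrix makes barplot() stack the values in a single bar
barplotdata2009 <- matrix(dat2009_rev)
# labels
# seven labels but only five values in dat2009, so two legend entries have no matching bar segment
labels_x <- rev(c("Asian", "African", "Mixed", "Hispanic &\nLatin American",
                  "Pacific Islander", "Arab & Middle East", "Native peoples"))
par(mfrow = c(1, 1))
barplot(barplotdata2009,
        width = 0.01,
        xlim = c(0, 0.1),
        axes = FALSE,
        col = c(1, 2, 3, 4, 5, 6, 7),
        legend.text = labels_x)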
# data: 2016 percentages for the non-European groups
dat2016 <- c(14, 3, 1, 0.54, 0.28, 0.08, 0.05)
dat2016_rev <- rev(dat2016)
# a one-column matrix makes barplot() stack the values in a single bar
barplotdata2016 <- matrix(dat2016_rev)
# labels
labels_x <- rev(c("Asian", "African", "Mixed", "Hispanic &\nLatin American",
                  "Pacific Islander", "Arab & Middle East", "Native peoples"))
par(mfrow = c(1, 1))
barplot(barplotdata2016,
        width = 0.01,
        xlim = c(0, 0.1),
        axes = FALSE,
        col = c(1, 2, 3, 4, 5, 6, 7),
        legend.text = labels_x)
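The stacking in the bar graphs above comes from the shape of the data: `barplot()` draws one bar per element of a plain vector, but one stacked bar per column of a matrix. The sketch below illustrates the difference using the 2016 values. It assumes an off-screen device (`pdf(NULL)`), and `vals`, `mid_separate`, and `mid_stacked` are hypothetical names.

```r
# Off-screen device so the sketch runs without opening a plot window
pdf(NULL)

vals <- c(14, 3, 1, 0.54, 0.28, 0.08, 0.05)  # 2016 values from above

# A plain vector gives one bar per value;
# barplot() returns the midpoint of each bar it draws
mid_separate <- barplot(vals)

# A one-column matrix gives a single bar with the values stacked
mid_stacked <- barplot(matrix(vals))

invisible(dev.off())

# Seven midpoints for the vector, one for the stacked bar
n_separate <- length(mid_separate)
n_stacked <- length(mid_stacked)
```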