Introduction

This report recreates the pie charts and bar graph from the paper “Genomics is failing on diversity” by Alice B. Popejoy and Stephanie M. Fullerton (https://www.nature.com/articles/538161a). The paper examined which ancestries were represented among participants in genome-wide association studies (GWAS) and found that the vast majority were of European ancestry. It compares figures from 2009 with an updated analysis from 2016 to show how the proportions changed over time. By 2016 the share of European-ancestry participants had fallen from 96% to 81%, while Asian representation rose from 3% to 14% and all other non-European groups combined rose from 1% to 5%.

Create data

This chunk stores the data from the paper in two vectors, one for 2009 and one for 2016. Each vector gives the percentage of GWAS participants in three ancestry groups: European, Asian, and other non-European. A third vector holds the labels for those groups.

# percentage of participants, in the order European, Asian, Other
data2009 <- c(96, 3, 1)
data2016 <- c(81, 14, 5)
labels_ancestry <- c("European", "Asian", "Other")
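
As a quick sanity check, the percentages for each year should add up to 100. The sketch below repeats the three vectors so that it runs on its own. The names() assignment and the change vector are illustrative additions, not part of the original analysis.

```r
# Values repeated from the chunk above so this block runs on its own
data2009 <- c(96, 3, 1)
data2016 <- c(81, 14, 5)
labels_ancestry <- c("European", "Asian", "Other")

# Attach the ancestry labels as names (an illustrative convenience)
names(data2009) <- labels_ancestry
names(data2016) <- labels_ancestry

# Each year's percentages should total 100; stopifnot() halts if they do not
stopifnot(sum(data2009) == 100, sum(data2016) == 100)

# Change in percentage points from 2009 to 2016 (hypothetical helper vector)
change <- data2016 - data2009
print(change)
```

A negative value in change means that the group's share fell between the two years.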

Pie graphs

The first chunk in this section sets the graphical parameters so that the two pie charts are drawn side by side in one row (mfrow) with adjusted margins (mar). The next two chunks create the pie chart for each year. Each pie() call is given the data, the labels, a title (main), a starting angle for the first slice (init.angle), the radius of the chart, and a color for each group (col).

# set up a 1 x 2 plotting grid and adjust the margins
par(mfrow = c(1, 2), mar = c(2, 3, 1, 5))

# pie graph 1: 2009
pie(data2009, labels = labels_ancestry, main = "2009", init.angle = -82, radius = 1, col = c(1, 2, 3))

# pie graph 2: 2016
pie(data2016, labels = labels_ancestry, main = "2016", init.angle = -82, radius = 1, col = c(1, 2, 3))
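
One possible refinement, shown here only as a sketch, is to print the percentage next to each ancestry label so the values can be read directly from the chart. The vector name labels_pct is hypothetical. The data are repeated so the block runs on its own, and the pie() arguments are the same as above.

```r
# Values repeated from the data chunk so this block runs on its own
data2016 <- c(81, 14, 5)
labels_ancestry <- c("European", "Asian", "Other")

# Build labels of the form "European (81%)"; labels_pct is a hypothetical name
labels_pct <- paste0(labels_ancestry, " (", data2016, "%)")

# Same call as above, with the percentage labels swapped in
pie(data2016, labels = labels_pct, main = "2016",
    init.angle = -82, radius = 1, col = c(1, 2, 3))
```

The same approach should work for the 2009 chart by substituting data2009. Longer labels may need wider margins than the ones set with par() above.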

Bar graphs

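This section draws a single stacked bar showing how the non-European participants in the 2016 data break down by ancestry. The values are percentages of all GWAS participants, so together they make up the roughly 19% that was not of European ancestry.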

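# data: percentage of all 2016 GWAS participants in each non-European group
dat2016 <- c(14, 3, 1, 0.54, 0.28, 0.08, 0.05)
# reverse the order so that the largest group (Asian) is stacked on top
dat2016_rev <- rev(dat2016)
# a one-column matrix makes barplot() draw a single stacked bar
barplotdata2016 <- matrix(dat2016_rev)

# labels, reversed to match the data
labels_x <- rev(c("Asian", "African", "Mixed", "Hispanic &\nLatin American",
                  "Pacific Islander", "Arab & Middle East", "Native peoples"))

# reset to a single plot per figure
par(mfrow = c(1, 1))

barplot(barplotdata2016,
        width = 0.01,
        xlim = c(0, 0.1),
        axes = FALSE,
        col = c(1, 2, 3, 4, 5, 6, 7),
        legend.text = labels_x)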
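
As a rough consistency check, the seven values in the bar graph should add up to the non-European share shown in the 2016 pie chart (14% Asian plus 5% other). The sketch below repeats the data so that it runs on its own. The names total_non_european and share_within are hypothetical.

```r
# Values repeated from the chunk above so this block runs on its own
dat2016 <- c(14, 3, 1, 0.54, 0.28, 0.08, 0.05)

# Total non-European share of all participants, in percent
total_non_european <- sum(dat2016)
print(total_non_european)         # about 18.95
print(round(total_non_european))  # 19, matching 14 + 5 from the pie chart data

# Each group's share of the non-European participants only, in percent
share_within <- round(100 * dat2016 / total_non_european, 1)
print(share_within)
```

The share_within values describe the bar on its own scale, which can help when comparing the height of each segment.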