Introduction

The data plotted here come from a study of how different populations are represented in genome-wide association studies (GWAS). The study, “Genomics is failing on diversity,” was conducted by Alice Popejoy and Stephanie Fullerton. It reports the percentage of GWAS participants from each ancestry group in 2009 and in 2016. In both years, most participants were of European ancestry, but representation of non-European groups increased over time, especially for those of Asian ancestry. The 2016 analysis repeats the 2009 one to show how the proportions changed.

Create data

This creates three vectors: one with the percentage of participants in each ancestry category in 2009, one with the same percentages for 2016, and one with labels for the three ancestry categories specified in the paper. The \n characters in the labels insert line breaks, which shifts and wraps the label text on the plot.

data_2009 <- c(96,3,1)
data_2016 <- c(81,14,5)
ancestry <- c("European Ancestry","\n\nAsian\nAncestry","\n\nNon-European\nAncestry")
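Before plotting, it can help to check that each vector of percentages sums to 100 and to look at the change between the two years. The following is a minimal sketch, not part of the original analysis; the short column names in groups and the object names pct and change are introduced here only for illustration.

```r
# Percentages from the section above
data_2009 <- c(96, 3, 1)
data_2016 <- c(81, 14, 5)

# Each year should account for 100% of participants
stopifnot(sum(data_2009) == 100, sum(data_2016) == 100)

# Hypothetical short labels, used only to name the columns
groups <- c("European", "Asian", "Other non-European")

# One row per year, one column per ancestry category
pct <- rbind("2009" = data_2009, "2016" = data_2016)
colnames(pct) <- groups

# Change in percentage points from 2009 to 2016
change <- pct["2016", ] - pct["2009", ]
print(pct)
print(change)
```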

Pie graphs

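The par() call sets the graphical parameters: mfrow = c(1,2) places the two charts side by side, and mar sets the plot margins. The next lines create the two pie charts, one for the 2009 data and the other for the 2016 data. The pie() calls specify the data, the slice labels (labels), the title (main), the starting angle (init.angle), the radius of the chart (radius), and the slice colors (col).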

# set up par(): one row, two columns; margins are bottom, left, top, right
par(mfrow = c(1,2), mar = c(2,3,1,5))

# pie graph 1: 2009 data
pie(data_2009, labels = ancestry, main = "2009", init.angle = -82, radius = 1, col = c(1,5,4))

# pie graph 2: 2016 data
pie(data_2016, labels = ancestry, main = "2016", init.angle = -55, radius = 1, col = c(1,5,4))
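The pie charts above show only the category names. One possible variation, sketched here as an assumption rather than as part of the original figure, appends the percentage to each label with paste0() and restores the previous graphical parameters afterwards. The names labels_2009, labels_2016, and old_par are hypothetical.

```r
# Data and labels as defined earlier in the document
data_2009 <- c(96, 3, 1)
data_2016 <- c(81, 14, 5)
ancestry <- c("European Ancestry", "\n\nAsian\nAncestry", "\n\nNon-European\nAncestry")

# Append the percentage to each label, e.g. "European Ancestry (96%)"
labels_2009 <- paste0(ancestry, " (", data_2009, "%)")
labels_2016 <- paste0(ancestry, " (", data_2016, "%)")

# par() returns the previous settings, so they can be restored later
old_par <- par(mfrow = c(1, 2), mar = c(2, 3, 1, 5))

pie(data_2009, labels = labels_2009, main = "2009",
    init.angle = -82, radius = 1, col = c(1, 5, 4))
pie(data_2016, labels = labels_2016, main = "2016",
    init.angle = -55, radius = 1, col = c(1, 5, 4))

# Restore the settings that were in place before this chunk
par(old_par)
```

The longer labels may need wider margins (the mar values) to avoid being clipped.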

Bar graphs

Here, a stacked bar graph shows the non-European participants in the 2016 data broken down into more detailed ancestry groups.

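# data: percentage of 2016 participants in each non-European group
dat2016 <- c(14, 3, 1, 0.54, 0.28, 0.08, 0.05)
dat2016_rev <- rev(dat2016)
# a one-column matrix makes barplot() draw a single stacked bar
barplotdata2016 <- matrix(dat2016_rev)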

# labels
labels_x <- rev(c("Asian","African","Mixed", "Hispanic &\nLatin American",
                        "Pacific Islander","Arab & Middle East","Native peoples"))

par(mfrow = c(1,1))

barplot(barplotdata2016,
        width = 0.01,
        xlim = c(0, 0.1),
        axes = FALSE,
        col = 1:7,
        legend.text = labels_x)
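As a check on the bar graph data, the seven values should roughly account for the non-European share shown in the 2016 pie chart (14 + 5 = 19). The sketch below, which is an addition and not part of the original analysis, sums the values (18.95) and expresses each group as a share of that total. The names groups, total_non_european, and share_within are hypothetical.

```r
# Values from the sections above
data_2016 <- c(81, 14, 5)
dat2016 <- c(14, 3, 1, 0.54, 0.28, 0.08, 0.05)
groups <- c("Asian", "African", "Mixed", "Hispanic & Latin American",
            "Pacific Islander", "Arab & Middle East", "Native peoples")

# Total percentage of participants covered by the stacked bar
total_non_european <- sum(dat2016)

# The total should match the non-European slices of the 2016 pie chart
# once rounded to a whole percentage
stopifnot(round(total_non_european) == sum(data_2016[2:3]))

# Each group's share of the non-European total, in percent
share_within <- round(100 * dat2016 / total_non_european, 1)
names(share_within) <- groups

print(total_non_european)
print(share_within)
```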