Introduction

The data plotted in these pie graphs show the ancestry breakdown of all samples from genome-wide association studies (GWAS). This breakdown was first determined in 2009 in a study by A.C. Need and D.B. Goldstein. GWAS are funded by the US National Institutes of Health (NIH), and the information, including descriptions of ancestry, is made public in the GWAS Catalog in PubMed. In the 2009 study, the authors analyzed the available ancestry data and found an extreme lack of diversity in GWAS samples, with 96% collected from people of European ancestry. This lack of diversity holds back the advancement of genomic medicine specific to people of non-European ancestry. In 2016, Alice B. Popejoy and Stephanie M. Fullerton repeated the analysis to determine whether the ancestry of GWAS samples had diversified since 2009 to include more subjects of non-European ancestry.

Create data

Here we create vectors to represent the ancestry percentages for both years being studied, as well as the labels for the pie graphs.

percent_2009 <- c(96, 3, 1)
percent_2016 <- c(81, 14, 5)

label_2009 <- c("European\nancestry\n", "\nAsian\nancestry", "\n\nOther\nnon-European\nancestry")
label_2016 <- c("European\nancestry\n", "\nAsian\nancestry", "\n\nOther\nnon-European\nancestry")
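
Before plotting, it can help to confirm that each year's percentages sum to 100 and to see the change between the two years side by side. The following is a minimal sketch that reuses the percentages above; the names `groups` and `ancestry` are hypothetical helpers introduced only for this check.

```r
# Percentages as defined above
percent_2009 <- c(96, 3, 1)
percent_2016 <- c(81, 14, 5)

# Plain group names for a table (hypothetical; the plot labels carry extra "\n" for spacing)
groups <- c("European", "Asian", "Other non-European")

# Each year should account for all samples
stopifnot(sum(percent_2009) == 100, sum(percent_2016) == 100)

# Side-by-side comparison with the change in percentage points
ancestry <- data.frame(group = groups,
                       pct_2009 = percent_2009,
                       pct_2016 = percent_2016)
ancestry$change <- ancestry$pct_2016 - ancestry$pct_2009
print(ancestry)
```

The `change` column shows the European share falling by 15 percentage points, with most of that difference (11 points) going to the Asian ancestry group.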

Pie graphs

Next, we create the two pie graphs using the pie() function. We can change the radius, colors, and layout of each graph to maximize readability. I chose the settings below because they made the graphs easiest to interpret and kept the labels from overlapping.

# set up par(): one row, two columns; margins are bottom, left, top, right
par(mfrow = c(1,2), mar = c(2,3,1,5))

# pie graph 1: 2009 data
pie(x = percent_2009,
    labels = label_2009,
    main = "2009",
    init.angle = -85,
    radius = 1,
    col = c(7,4,10))

# pie graph 2: 2016 data
pie(x = percent_2016,
    labels = label_2016,
    main = "2016",
    init.angle = -70,
    radius = 1,
    col = c(7,4,10))
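
Because the two pie() calls differ only in their data, title, and starting angle, the shared settings could be collected in a small wrapper. This is a sketch only, not the original approach: `draw_ancestry_pie` and `ancestry_labels` are hypothetical names, and the named colors are my own substitution for the numeric color indices, so the result will not match the original colors exactly.

```r
# Hypothetical helper: wraps the pie() settings shared by both graphs
draw_ancestry_pie <- function(percent, labels, title, angle,
                              colors = c("gold", "steelblue", "firebrick")) {
  # One label per slice
  stopifnot(length(percent) == length(labels))
  pie(x = percent,
      labels = labels,
      main = title,
      init.angle = angle,
      radius = 1,
      col = colors)
}

percent_2009 <- c(96, 3, 1)
percent_2016 <- c(81, 14, 5)

# Both years use the same labels, so a single vector is enough
ancestry_labels <- c("European\nancestry\n",
                     "\nAsian\nancestry",
                     "\n\nOther\nnon-European\nancestry")

# Same layout as above: two graphs side by side
par(mfrow = c(1, 2), mar = c(2, 3, 1, 5))
draw_ancestry_pie(percent_2009, ancestry_labels, "2009", angle = -85)
draw_ancestry_pie(percent_2016, ancestry_labels, "2016", angle = -70)
```

Named colors make it easier to see at a glance which color each slice gets, and keeping the angle as an argument preserves the per-graph adjustment that stops the labels from overlapping.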