Introduction

Alice B. Popejoy and Stephanie M. Fullerton studied the lack of ancestral diversity in genetic studies. Far more data has been collected from people of European descent than from every other population around the world. They gathered their data from the sample descriptions of the studies catalogued as GWAS (genome-wide association studies) and recorded the ancestry of the participants. Their 2016 analysis covered about 35 million samples. It repeats a 2009 analysis, so the two sets of results can be compared to see whether genetic studies have become more diverse.

Assigning the data and labels to variables

diversity_data_2009 <- c(96, 3, 1)
diversity_label_2009 <- c("European ancestry\n", "\n\nAsian\n ancestry", "\n\nother\n non-European")
# "other non-European" is 5: the non-Asian groups in dat2016 sum to 4.95
diversity_data_2016 <- c(81, 14, 5)
diversity_label_2016 <- c("European ancestry\n", "\n\nAsian\n ancestry", "\n\nother\n non-European")

Persistent Bias

These pie charts show the persistent bias in genetic studies. Representation improved slightly from 2009 to 2016, but we still have a long way to go before genetic studies represent the full range of human ancestry.

# set up par()
par(mfrow = c(1,2), mar = c(2,3,1,5))

# pie graph 1
# add main, init.angle, radius, and col
pie(diversity_data_2009,
    labels = diversity_label_2009,
    main = "2009",
    init.angle = -82,
    radius = 1,
    col = c(1, 5, 2))

# pie graph 2
# add main, init.angle, radius, and col
pie(diversity_data_2016,
    labels = diversity_label_2016,
    main = "2016",
    init.angle = -58,
    radius = 1,
    col = c(1, 5, 2))
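
The "slight improvement" between the two charts can be expressed as a number. The following is a minimal sketch, not part of the original analysis: the names european_share, non_european_share, and change_pp are introduced here for illustration, and it assumes the shares are percentages of all samples, so that whatever is not European is non-European.

```r
# European-ancestry shares in percent, copied from the pie chart data above.
european_share <- c("2009" = 96, "2016" = 81)

# Assumption: shares are percentages of all samples,
# so the remainder of each year is the non-European share.
non_european_share <- 100 - european_share

# Change in the non-European share between the two years, in percentage points.
change_pp <- unname(diff(non_european_share))

print(non_european_share)
print(change_pp)
```

Under that assumption, the non-European share rises by 15 percentage points, and European ancestry still accounts for roughly four in five samples in 2016.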

Cultural Distribution

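This stacked bar gives a more specific breakdown of the non-European part of the 2016 data. Each segment is one ancestry group, measured as a percentage of all samples.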

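# data: percent of all 2016 samples per non-European ancestry group
dat2016 <- c(14, 3, 1, 0.54, 0.28, 0.08, 0.05)
dat2016_rev <- rev(dat2016)
# a one-column matrix makes barplot() draw a single stacked bar
barplotdata2016 <- matrix(dat2016_rev, ncol = 1)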

# labels
labels_x <- rev(c("Asian", "African", "Mixed", "Hispanic &\nLatin American",
                  "Pacific Islander", "Arab & Middle East", "Native peoples"))

par(mfrow = c(1,1))

barplot(barplotdata2016,
        width = 0.01,
        xlim = c(0, 0.1),
        axes = FALSE,
        col = 1:7,
        legend.text = labels_x)
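
Because the bar is drawn without axes, it can help to check how the segments add up. This is a minimal sketch under the assumption that the values are percentages of all 2016 samples. The names breakdown_2016, total_non_european, and other_non_european are illustrative and do not appear in the original code.

```r
# Percent of all 2016 samples per group, copied from dat2016 above.
breakdown_2016 <- c("Asian" = 14,
                    "African" = 3,
                    "Mixed" = 1,
                    "Hispanic & Latin American" = 0.54,
                    "Pacific Islander" = 0.28,
                    "Arab & Middle East" = 0.08,
                    "Native peoples" = 0.05)

# Total height of the stacked bar: the whole non-European share.
total_non_european <- sum(breakdown_2016)

# Share that remains once Asian ancestry is set aside.
is_asian <- names(breakdown_2016) == "Asian"
other_non_european <- sum(breakdown_2016[!is_asian])

print(total_non_european)
print(other_non_european)
```

The second value is the total of the six smaller groups, which is the part of the data that the pie charts fold into a single "other non-European" slice.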