There are several options for graphically comparing proportions. Many of these, such as pie charts and stacked bar charts, make it difficult to compare proportions between two different populations. In this document, I present a proportional circle graph built with ggplot2 that makes such comparisons easier.

First, load the required packages.

library(ggplot2)      # Primary plotting package
library(cowplot)      # Used for combining ggplot objects
library(RColorBrewer) # For additional colors
library(grid)         # Draw and arrange grobs (grid.draw)
library(gridExtra)    # Arrange grobs
library(plotrix)      # Cluster plotting

As an example, I have made a dataset of 6 source contributors to ambient particulate matter (PM) at 10 different sites. The data frame is in “long” format:

head(df)
##     Site   Source Proportion
## 1 Site 1 Source 1 0.02429485
## 2 Site 1 Source 2 0.05554860
## 3 Site 1 Source 3 0.14355765
## 4 Site 1 Source 4 0.16731885
## 5 Site 1 Source 5 0.24071562
## 6 Site 1 Source 6 0.36856443
tail(df)
##       Site   Source Proportion
## 55 Site 10 Source 1 0.21148834
## 56 Site 10 Source 2 0.14156087
## 57 Site 10 Source 3 0.04936692
## 58 Site 10 Source 4 0.31050098
## 59 Site 10 Source 5 0.16606038
## 60 Site 10 Source 6 0.12102251
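
The code that generated df is not shown here. As a rough sketch only, a data frame with the same shape could be built as follows, assuming random weights rescaled so that each site's six proportions sum to 1. The seed, the helper names n_sites, n_sources and props, and the use of runif are all assumptions for illustration, so the values will not match the output above.

```r
# Hypothetical reconstruction: the original data-generation code is not shown.
set.seed(1)  # arbitrary seed, for reproducibility only

n_sites   <- 10
n_sources <- 6

# Draw random positive weights for each site and rescale them to sum to 1.
# The result is a matrix with one column per site and one row per source.
props <- replicate(n_sites, {
  w <- runif(n_sources)
  w / sum(w)
})

# Long format: one row per site/source pair.
# as.vector() reads the matrix column by column, matching the Site ordering.
df <- data.frame(
  Site       = rep(paste("Site", 1:n_sites), each = n_sources),
  Source     = rep(paste("Source", 1:n_sources), times = n_sites),
  Proportion = as.vector(props)
)
```

Any data frame with Site, Source and Proportion columns in this layout should work with the plotting code that follows.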

Here is an example of a proportional circle plot showing the source distribution for a single site (Site 1):

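# Plot the six sources for Site 1; circle area is proportional to each source's share
ggplot(aes(fill=Source, size=Proportion,
           y=c(.8,1.2,.9,1.1,.8,1.2), x=1:6),    # staggered y positions reduce overlap
       data=df[df$Site=="Site 1",])+
  geom_jitter(shape=21, width=0, height=.1, alpha=1)+  # jitter vertically only
  scale_size_area(limits=c(0,1), max_size=32)+         # area scales with proportion
  scale_fill_manual(values=brewer.pal(6,"Set1"))+
  ylim(.6,1.4)+xlim(0,7)+
  xlab("Site 1")+                                       # site name as the x-axis title
  theme(
      axis.ticks=element_blank(),
      axis.title.y=element_blank(),
      axis.text=element_blank()
  )+
  guides(fill="none", size="none")                      # the legend is added separately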

If you replace geom_jitter with geom_point, you can specify the locations of your circles precisely and place them closer together. I personally prefer the slight randomness of geom_jitter. The above may be automated for each site:

# Build one plot per site and store it as Figure_1, Figure_2, ..., Figure_10
for(i in 1:10){
  site_name <- paste("Site",i)
  temp <- ggplot(aes(fill=Source, size=Proportion,
                     y=c(.8,1.2,.9,1.1,.8,1.2), x=1:6),
                 data=df[df$Site==site_name,])+
    geom_jitter(shape=21, width=0, height=.1, alpha=1)+
    scale_size_area(limits=c(0,1), max_size=32)+
    scale_fill_manual(values=brewer.pal(6,"Set1"))+
    ylim(.6,1.4)+xlim(0,7)+
    xlab(site_name)+                     # a single label, not the whole Site column
    theme(
        axis.ticks=element_blank(),
        axis.title.y=element_blank(),
        axis.text=element_blank()
    )+
    guides(fill="none", size="none")
  assign(x = paste("Figure",i,sep="_"), temp)
}
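
For the geom_point variant mentioned earlier, a minimal sketch might look like the following. It uses the Site 1 proportions from the head(df) output, while the x and y coordinates are hand-picked for illustration and are not taken from the original analysis. The names site1 and p_fixed are likewise hypothetical.

```r
library(ggplot2)
library(RColorBrewer)

# Site 1 proportions as printed by head(df); x and y are assumed, hand-picked positions
site1 <- data.frame(
  Source     = paste("Source", 1:6),
  Proportion = c(0.02429485, 0.05554860, 0.14355765,
                 0.16731885, 0.24071562, 0.36856443),
  x = c(1.5, 2.2, 3.0, 3.8, 4.6, 5.6),
  y = c(0.9, 1.1, 0.9, 1.1, 0.9, 1.1)
)

# geom_point draws each circle exactly at its (x, y), with no random offset
p_fixed <- ggplot(site1, aes(x = x, y = y, fill = Source, size = Proportion)) +
  geom_point(shape = 21) +
  scale_size_area(limits = c(0, 1), max_size = 32) +
  scale_fill_manual(values = brewer.pal(6, "Set1")) +
  xlim(0, 7) + ylim(0.6, 1.4) +
  xlab("Site 1") +
  theme(
    axis.ticks   = element_blank(),
    axis.title.y = element_blank(),
    axis.text    = element_blank()
  ) +
  guides(fill = "none", size = "none")
```

Because the positions are fixed, the plot is identical on every run, at the cost of having to tune the coordinates by hand.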

Since the legend is the same for each plot, we can extract just the legend and avoid redundancy when combining plots, per this article:

leg<-ggplot(data = df, aes(x = Source,y=Proportion,fill=Source)) + geom_point(shape=21,size=8) +
  scale_fill_manual("",labels=paste("Source",1:6),values=brewer.pal(6,"Set1"))+
  theme(
    legend.title=element_blank(),
    legend.text=element_text(size=15)
  )

# Extract the legend grob ("guide-box") from a ggplot object
g_legend <- function(a.gplot) {
  tmp <- ggplot_gtable(ggplot_build(a.gplot))
  leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
  legend <- tmp$grobs[[leg]]
  return(legend)
}

legend <- g_legend(leg) 
grid.draw(legend)
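
Since cowplot is already loaded, its get_legend() function is a possible alternative to the hand-written helper. The sketch below uses a small stand-in data frame (demo, leg_plot and legend_alt are hypothetical names). How get_legend() behaves can depend on the installed ggplot2 and cowplot versions, so treat this as an option to try rather than a guaranteed drop-in replacement.

```r
library(ggplot2)
library(cowplot)

# Stand-in data with the same columns the legend plot needs (illustrative values)
demo <- data.frame(
  Source     = paste("Source", 1:6),
  Proportion = (1:6) / 21
)

# A plot whose only purpose is to carry the legend
leg_plot <- ggplot(demo, aes(x = Source, y = Proportion, fill = Source)) +
  geom_point(shape = 21, size = 8) +
  theme(legend.title = element_blank())

# get_legend() returns the legend as a grob that can be passed to draw_plot()
legend_alt <- get_legend(leg_plot)
```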

Now we just need to arrange these ggplot objects using cowplot.

ggdraw()+
  geom_rect(aes(xmin = 0, xmax =1, ymin = 0, ymax = 1),
        colour = "black", fill = "white",size=.8)+
  draw_plot(Figure_1,.33,.75,.33,.25)+
  draw_plot(Figure_2,.66,.75,.33,.25)+
  draw_plot(Figure_3,0,.5,.33,.25)+
  draw_plot(Figure_4,.33,.5,.33,.25)+
  draw_plot(Figure_5,.66,.5,.33,.25)+
  draw_plot(Figure_6,0,.25,.33,.25)+
  draw_plot(Figure_7,.33,.25,.33,.25)+
  draw_plot(Figure_8,.66,.25,.33,.25)+
  draw_plot(Figure_9,0.12,0,.33,.25)+
  draw_plot(Figure_10,0.33+.21,0,.33,.25)+
  draw_plot(legend,0,0.805,.33,.15)
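
If exact placement is not needed, cowplot's plot_grid() can lay the panels out on a regular grid without hand-computed coordinates. The sketch below uses simple stand-in plots (demo_plots and grid_layout are hypothetical names) and is a different layout from the one above: the panels fill a 3-column grid row by row, with no reserved slot for the legend.

```r
library(ggplot2)
library(cowplot)

# Stand-in plots for illustration; in this document the real panels are
# Figure_1 ... Figure_10, which could be collected with
# mget(paste("Figure", 1:10, sep = "_"))
demo_plots <- lapply(1:10, function(i) {
  ggplot(data.frame(x = 1, y = 1), aes(x, y)) +
    geom_point() +
    xlab(paste("Site", i))
})

# Arrange the ten panels in three columns; rows are filled left to right
grid_layout <- plot_grid(plotlist = demo_plots, ncol = 3)
```

The manual draw_plot() calls remain the better choice when the legend must sit in a specific empty cell or the last row must be centered.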