There are several options for graphically comparing proportions. Many of these, such as pie charts and stacked bar charts, make it difficult to compare proportions between two different populations. In this document, I present a proportional circle graph built with ggplot2 that makes such comparisons easier.
First, load the required packages.
library(ggplot2) # Primary plotting package
library(cowplot) # Used for combining ggplot objects
library(RColorBrewer) # For additional colors
library(grid) # arrange grobs
library(gridExtra) # arrange grobs
library(plotrix) # cluster plotting
As an example, I have made a dataset of 6 source contributors to ambient particulate matter (PM) at 10 different sites. The data frame is in “long” format, with one row per site and source:
head(df)
## Site Source Proportion
## 1 Site 1 Source 1 0.02429485
## 2 Site 1 Source 2 0.05554860
## 3 Site 1 Source 3 0.14355765
## 4 Site 1 Source 4 0.16731885
## 5 Site 1 Source 5 0.24071562
## 6 Site 1 Source 6 0.36856443
tail(df)
## Site Source Proportion
## 55 Site 10 Source 1 0.21148834
## 56 Site 10 Source 2 0.14156087
## 57 Site 10 Source 3 0.04936692
## 58 Site 10 Source 4 0.31050098
## 59 Site 10 Source 5 0.16606038
## 60 Site 10 Source 6 0.12102251
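The code that generated df is not shown here. As a rough sketch of how a comparable long-format data frame could be built, the snippet below draws random weights and rescales them so each site's proportions sum to 1. The name df_example, the seed, and the use of uniform random draws are all assumptions made for illustration; they will not reproduce the values printed above.

```r
# Hypothetical reconstruction of a long-format dataset like df.
# The seed and the uniform draws are illustrative assumptions only.
set.seed(42)
n_sites <- 10
n_sources <- 6

# One column per site, one row per source
raw <- matrix(runif(n_sites * n_sources), nrow = n_sources)

# Rescale each column so the proportions within a site sum to 1
props <- sweep(raw, 2, colSums(raw), "/")

# as.vector() unrolls the matrix column by column, i.e. site by site
df_example <- data.frame(
  Site = rep(paste("Site", 1:n_sites), each = n_sources),
  Source = rep(paste("Source", 1:n_sources), times = n_sites),
  Proportion = as.vector(props)
)

head(df_example)
```

Each site contributes six rows, one per source, which is the layout the plotting code below expects.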
Here is an example of a proportional circle plot showing the source distribution for a single site (Site 1):
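# Plot one site: the area of each circle encodes that source's proportion
ggplot(aes(fill = Source, label = Site, size = Proportion,
           y = c(.8, 1.2, .9, 1.1, .8, 1.2), x = 1:6),
       data = df[df$Site == "Site 1", ]) +
  geom_jitter(shape = 21, width = 0, height = .1, alpha = 1) + # vertical jitter only
  scale_size_area(limits = c(0, 1), max_size = 32) + # area proportional to value
  scale_fill_manual(values = brewer.pal(6, "Set1")) +
  ylim(.6, 1.4) + xlim(0, 7) +
  xlab("Site 1") + # a single label, not the whole Site column
  theme(
    axis.ticks = element_blank(),
    axis.title.y = element_blank(),
    axis.text = element_blank()
  ) +
  guides(fill = "none", size = "none") # hide both legends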
If you replace geom_jitter with geom_point, you can specify the locations of your circles precisely and place them closer together. I personally prefer the slight randomness of geom_jitter. The plot above can be generated automatically for each site:
for (i in 1:10) {
  site_name <- paste("Site", i)
  temp <- ggplot(aes(fill = Source, label = Site, size = Proportion,
                     y = c(.8, 1.2, .9, 1.1, .8, 1.2), x = 1:6),
                 data = df[df$Site == site_name, ]) +
    geom_jitter(shape = 21, width = 0, height = .1, alpha = 1) +
    scale_size_area(limits = c(0, 1), max_size = 32) +
    scale_fill_manual(values = brewer.pal(6, "Set1")) +
    ylim(.6, 1.4) + xlim(0, 7) +
    xlab(site_name) + # one label per plot
    theme(
      axis.ticks = element_blank(),
      axis.title.y = element_blank(),
      axis.text = element_blank()
    ) +
    guides(fill = "none", size = "none")
  # Store the plot as Figure_1, Figure_2, ..., Figure_10
  assign(x = paste("Figure", i, sep = "_"), temp)
}
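As mentioned above, swapping geom_jitter for geom_point gives full control over where each circle sits. The following is a minimal sketch of that variant for a single site. The data frame site1, its x and y columns, and the proportion values are hypothetical stand-ins chosen for illustration, not rows taken from df.

```r
library(ggplot2)
library(RColorBrewer)

# Hypothetical stand-in for one site's rows (illustrative values summing to 1)
site1 <- data.frame(
  Source = paste("Source", 1:6),
  Proportion = c(0.02, 0.06, 0.14, 0.17, 0.24, 0.37),
  x = 1:6,                             # horizontal slot for each circle
  y = c(0.8, 1.2, 0.9, 1.1, 0.8, 1.2)  # fixed heights, no random jitter
)

p_exact <- ggplot(site1, aes(x = x, y = y, fill = Source, size = Proportion)) +
  geom_point(shape = 21) +             # exact positions instead of jitter
  scale_size_area(limits = c(0, 1), max_size = 32) +
  scale_fill_manual(values = brewer.pal(6, "Set1")) +
  ylim(0.6, 1.4) + xlim(0, 7) +
  xlab("Site 1") +
  theme(
    axis.ticks = element_blank(),
    axis.title.y = element_blank(),
    axis.text = element_blank(),
    legend.position = "none"
  )

p_exact
```

Because the positions are stored as columns, moving circles closer together is a matter of editing x and y, and the plot is identical every time it is drawn.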
Since the legend is the same for every plot, we can extract it once and avoid repeating it when combining the plots, per this article:
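# Throwaway plot whose only purpose is to produce the shared legend
leg <- ggplot(data = df, aes(x = Source, y = Proportion, fill = Source)) +
  geom_point(shape = 21, size = 8) +
  scale_fill_manual("", labels = paste("Source", 1:6), values = brewer.pal(6, "Set1")) +
  theme(
    legend.title = element_blank(),
    legend.text = element_text(size = 15)
  )
# Pull the legend grob out of a ggplot object.
# This matches the grob named "guide-box"; newer ggplot2 releases may name
# legend grobs differently, in which case the match needs adjusting.
g_legend <- function(a.gplot) {
  tmp <- ggplot_gtable(ggplot_build(a.gplot))
  leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
  legend <- tmp$grobs[[leg]]
  return(legend)
}
legend <- g_legend(leg)
grid.draw(legend)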
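Now we just need to arrange these ggplot objects using cowplot. Each draw_plot call takes the object followed by the x and y position of its lower-left corner and its width and height, all as fractions of the canvas.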
ggdraw() +
  # Border around the whole canvas
  geom_rect(aes(xmin = 0, xmax = 1, ymin = 0, ymax = 1),
            colour = "black", fill = "white", size = .8) +
  # Top row: legend in the left cell, then Sites 1 and 2
  draw_plot(Figure_1, .33, .75, .33, .25) +
  draw_plot(Figure_2, .66, .75, .33, .25) +
  draw_plot(Figure_3, 0, .5, .33, .25) +
  draw_plot(Figure_4, .33, .5, .33, .25) +
  draw_plot(Figure_5, .66, .5, .33, .25) +
  draw_plot(Figure_6, 0, .25, .33, .25) +
  draw_plot(Figure_7, .33, .25, .33, .25) +
  draw_plot(Figure_8, .66, .25, .33, .25) +
  # Bottom row: two plots, shifted inward
  draw_plot(Figure_9, 0.12, 0, .33, .25) +
  draw_plot(Figure_10, 0.33 + .21, 0, .33, .25) +
  draw_plot(legend, 0, 0.805, .33, .15)
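If exact placement is not needed, a regular grid is a possible alternative to positioning each panel by hand. The sketch below keeps the plots in a list built with lapply and passes that list to cowplot's plot_grid. The data frame toy, its x and y columns, the smaller max_size, and the three-column layout are assumptions made for illustration. Unlike the layout above, it leaves out the shared legend.

```r
library(ggplot2)
library(cowplot)

# Hypothetical stand-in data: 10 sites x 6 sources, random proportions
set.seed(1)
sites <- paste("Site", 1:10)
toy <- data.frame(
  Site = rep(sites, each = 6),
  Source = rep(paste("Source", 1:6), times = 10),
  Proportion = runif(60)
)
# Rescale so each site's proportions sum to 1
toy$Proportion <- toy$Proportion / ave(toy$Proportion, toy$Site, FUN = sum)
# Fixed circle positions, repeated for every site
toy$x <- rep(1:6, times = 10)
toy$y <- rep(c(0.8, 1.2, 0.9, 1.1, 0.8, 1.2), times = 10)

# Build one plot per site and keep them in a named list
plots <- lapply(sites, function(s) {
  ggplot(toy[toy$Site == s, ],
         aes(x = x, y = y, fill = Source, size = Proportion)) +
    geom_point(shape = 21) +
    scale_size_area(limits = c(0, 1), max_size = 12) +
    xlim(0, 7) + ylim(0.6, 1.4) + xlab(s) +
    theme(
      axis.ticks = element_blank(),
      axis.title.y = element_blank(),
      axis.text = element_blank(),
      legend.position = "none"
    )
})
names(plots) <- sites

# Lay the ten panels out in a grid with three columns
grid_plot <- plot_grid(plotlist = plots, ncol = 3)
grid_plot
```

Keeping the plots in a list avoids creating ten separate variables with assign, at the cost of the custom offsets used for the bottom row in the manual layout.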