The first time you try to plot a bar chart in ggplot with two bars side by side, it may not be immediately obvious how to do it.
The trick is to use “long” format data: one column holds the values for both bars, and a second column says which bar each value belongs to.
library(tidyr) # For converting our data to long format
library(ggplot2) # For creating the bar chart
df <- read.csv("data.csv") # read the data
df # Take a look at the data
## Date X Y
## 1 Jan 16 43 24
## 2 Feb 16 25 35
## 3 Mar 16 36 27
## 4 Apr 16 37 45
## 5 May 16 39 27
## 6 Jun 16 37 51
## 7 Jul 16 43 37
## 8 Aug 16 44 41
## 9 Sep 16 36 30
## 10 Oct 16 43 33
## 11 Nov 16 35 18
## 12 Dec 16 23 22
We can see that the data has a Date column followed by two columns of counts, X and Y.
The Date column is read in as text (a character vector, or a factor in R versions before 4.0), so before we start to tidy the data, let's add a column of actual Date objects. ggplot will then order the x axis chronologically rather than alphabetically.
df$Date2 <- as.Date(x=paste("1 ",df$Date, sep=""), format="%d %b %y") #convert to date format.
df
## Date X Y Date2
## 1 Jan 16 43 24 2016-01-01
## 2 Feb 16 25 35 2016-02-01
## 3 Mar 16 36 27 2016-03-01
## 4 Apr 16 37 45 2016-04-01
## 5 May 16 39 27 2016-05-01
## 6 Jun 16 37 51 2016-06-01
## 7 Jul 16 43 37 2016-07-01
## 8 Aug 16 44 41 2016-08-01
## 9 Sep 16 36 30 2016-09-01
## 10 Oct 16 43 33 2016-10-01
## 11 Nov 16 35 18 2016-11-01
## 12 Dec 16 23 22 2016-12-01
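To see why the extra column matters, here is a minimal sketch of the same conversion on three made-up labels. It assumes an English locale, because `%b` matches abbreviated month names in whatever locale R is running in; the vector names are only for illustration.

```r
# Month-year labels in the same style as the Date column
labels <- c("Jan 16", "Feb 16", "Dec 16")

# Prepend a day of the month so as.Date() has a complete date to parse
# (assumes English month abbreviations for %b)
dates <- as.Date(paste("1", labels), format = "%d %b %y")

# Text labels sort alphabetically, Date objects sort chronologically
sorted_text  <- sort(labels)
sorted_dates <- sort(dates)

sorted_text
sorted_dates
```

If the months come out as `NA`, the locale is the first thing to check.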
Now we are ready to tidy our data and convert the table to long format. To do this we will use the gather function from the tidyr package.
df <- gather(df, event, total, X:Y) #Create long format
df
## Date Date2 event total
## 1 Jan 16 2016-01-01 X 43
## 2 Feb 16 2016-02-01 X 25
## 3 Mar 16 2016-03-01 X 36
## 4 Apr 16 2016-04-01 X 37
## 5 May 16 2016-05-01 X 39
## 6 Jun 16 2016-06-01 X 37
## 7 Jul 16 2016-07-01 X 43
## 8 Aug 16 2016-08-01 X 44
## 9 Sep 16 2016-09-01 X 36
## 10 Oct 16 2016-10-01 X 43
## 11 Nov 16 2016-11-01 X 35
## 12 Dec 16 2016-12-01 X 23
## 13 Jan 16 2016-01-01 Y 24
## 14 Feb 16 2016-02-01 Y 35
## 15 Mar 16 2016-03-01 Y 27
## 16 Apr 16 2016-04-01 Y 45
## 17 May 16 2016-05-01 Y 27
## 18 Jun 16 2016-06-01 Y 51
## 19 Jul 16 2016-07-01 Y 37
## 20 Aug 16 2016-08-01 Y 41
## 21 Sep 16 2016-09-01 Y 30
## 22 Oct 16 2016-10-01 Y 33
## 23 Nov 16 2016-11-01 Y 18
## 24 Dec 16 2016-12-01 Y 22
Here we have combined our X and Y columns into a column called event, and the counts for X and Y have been put into a column called total.
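In tidyr 1.0.0 and later, gather is superseded by pivot_longer, so the same reshaping might be sketched as follows. The small `wide` data frame is a stand-in for the data above, not the real file. Note that pivot_longer returns a tibble and orders the rows by month (X then Y for each date) rather than all of X followed by all of Y, which makes no difference to the plot.

```r
library(tidyr)

# Small wide-format data frame with the same shape as the data above
wide <- data.frame(
  Date2 = as.Date(c("2016-01-01", "2016-02-01", "2016-03-01")),
  X = c(43, 25, 36),
  Y = c(24, 35, 27)
)

# Stack X and Y: the column names go into `event`, the counts into `total`
long <- pivot_longer(wide, cols = c(X, Y),
                     names_to = "event", values_to = "total")
long
```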
Now we are ready to plot.
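We use geom_bar for the bar chart, with stat = "identity" so that the bar heights come straight from the total column, and position = 'dodge' so the bars sit side by side instead of being stacked.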
plot <- ggplot(df, aes(Date2, total, fill=event))
plot <- plot + geom_bar(stat = "identity", position = 'dodge')
plot
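As a variation, recent versions of ggplot2 provide geom_col, which is shorthand for geom_bar(stat = "identity"). The sketch below uses it on a small made-up data frame and adds scale_x_date to label every month; the `long` data frame and the label format are assumptions for illustration, not part of the original data.

```r
library(ggplot2)

# Small long-format data frame standing in for the tidied data above
long <- data.frame(
  Date2 = rep(as.Date(c("2016-01-01", "2016-02-01", "2016-03-01")), times = 2),
  event = rep(c("X", "Y"), each = 3),
  total = c(43, 25, 36, 24, 35, 27)
)

# geom_col() takes bar heights directly from the y aesthetic
p <- ggplot(long, aes(Date2, total, fill = event)) +
  geom_col(position = "dodge") +
  # one axis tick per month, labelled like "Jan 16"
  scale_x_date(date_breaks = "1 month", date_labels = "%b %y")
p
```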
Sometimes I like to use themes, so here is the full code using the FiveThirtyEight theme from the ggthemes package.
library(tidyr) # For converting our data to long format
library(ggplot2) # For creating the bar chart
library(ggthemes) # Five Thirty Eight theme - Nate Silver goodness :-)
df <- read.csv("data.csv") # read the data
df$Date2 <- as.Date(x=paste("1 ",df$Date, sep=""), format="%d %b %y") #convert to date format.
df <- gather(df, event, total, X:Y) #Create long format
plot <- ggplot(df, aes(Date2, total, fill=event))
plot <- plot + geom_bar(stat = "identity", position = 'dodge', colour="black")
plot <- plot + guides(fill=guide_legend(title=NULL)) # remove legend title
plot <- plot + ggtitle("Number of Things per Month")
plot <- plot + theme_fivethirtyeight() + scale_fill_fivethirtyeight() # Nate Silver FTW
plot <- plot + labs(x="",y="# per Month")
plot