One of the most common plots in science shows the mean of a sample of data along with some measure of variability drawn as error bars. Typically, the error bars are defined as +/- 1 standard error (SE) or as a 95% confidence interval (95% CI).
There are several ways to build this type of plot in R, depending on your starting point and the exact approach you choose to use. I’ll walk you through just one approach.
I have calculated some means and standard errors (SEs) from the “mammals” data.
(Note that body size is in kg and brain size is in grams! Because these 2 quantities are measured on very different scales, it’s a bit silly to put them on the same plot like this; this exercise is for illustration only!)
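For reference, a standard error of the mean is the sample standard deviation divided by the square root of the sample size. The sketch below shows one way summary values like these could be computed. The helper name std.err is made up for this example, and the code assumes that the “mammals” data are the mammals data set from the MASS package (with columns body and brain), which is an assumption rather than something stated here.

```r
#hypothetical helper: standard error of the mean
std.err <- function(x) {
  sd(x) / sqrt(length(x))
}

#assumption: the data come from MASS::mammals
if (requireNamespace("MASS", quietly = TRUE)) {
  mammals <- MASS::mammals
  print(sapply(mammals, mean))    #mean of each column
  print(sapply(mammals, std.err)) #SE of each column
}
```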
Let’s take this data and turn it into a simple dataframe. First, we’ll put the 2 mean measurements together into a “vector” using the c() command.
#assign means to a vector
#this is a numeric vector
mean.values <- c(198.79, 283.13) #must have comma!
Then we’ll put the two SE (standard error) values together.
#Assign SE to vector
SE.values <- c(114.1932, 118.15) #must have comma!
We’ll also make a set of labels.
#this is a character vector
body.part <- c("Body","Brain")
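If you want to check the vectors built so far, class() reports what type of vector each one is and length() reports how many elements it holds. This is a small self-contained sketch that repeats the assignments from above; the vectors need to be the same length before they can be combined into a dataframe.

```r
mean.values <- c(198.79, 283.13)
SE.values   <- c(114.1932, 118.15)
body.part   <- c("Body","Brain")

class(mean.values) #"numeric"
class(body.part)   #"character"

#each vector should have one element per group (2 here)
length(mean.values)
length(SE.values)
length(body.part)
```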
And, so we remember that the units are different, let’s make a vector for that info.
#another character vector
units <- c("kg","g") #body mass is in kg, brain mass is in g
Now, let’s turn these vectors into a simple dataset called “mam.df” (“df” for “data.frame”). We’ll use the data.frame() command.
mam.df <- data.frame(body.part, mean.values, SE.values, units)
See what this dataframe looks like.
mam.df
## body.part mean.values SE.values units
## 1 Body 198.79 114.1932 kg
## 2 Brain 283.13 118.1500 g
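Error bars are drawn between a lower and an upper value, so it can help to look at those values before plotting. The sketch below works on a copy of the summary data (mam.ci, a name made up for this example) and adds columns for the mean +/- 1 SE. It also shows how an approximate 95% CI could be built instead, assuming a t-based interval and a sample size of n = 62 species for both variables. The new column names are only illustrative.

```r
#copy of the summary data, repeated so this example runs on its own
mam.ci <- data.frame(body.part   = c("Body","Brain"),
                     mean.values = c(198.79, 283.13),
                     SE.values   = c(114.1932, 118.15))

#bounds for +/- 1 SE error bars
mam.ci$SE.lower <- mam.ci$mean.values - mam.ci$SE.values
mam.ci$SE.upper <- mam.ci$mean.values + mam.ci$SE.values

#bounds for an approximate 95% CI (assumes n = 62)
n <- 62
t.crit <- qt(0.975, df = n - 1) #close to 2
mam.ci$CI.lower <- mam.ci$mean.values - t.crit*mam.ci$SE.values
mam.ci$CI.upper <- mam.ci$mean.values + t.crit*mam.ci$SE.values

mam.ci
```

Because the t multiplier is close to 2, the 95% CI is roughly twice as wide as the +/- 1 SE bars, which is why a plot should always state which of the two it shows.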
We’ll plot these data using a function called errbar() (for “error bar”) in the Hmisc package. This will require a bit of code, but it will be good practice for reading a somewhat dense piece of R code.
Key to this code are 2 arguments: yplus and yminus. These give the positions on the y-axis of the top and the bottom of each error bar. Because our error bars are +/- 1 SE, we calculate them as the mean plus the SE and the mean minus the SE.
library(Hmisc)
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## Loading required package: ggplot2
##
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
##
## format.pval, round.POSIXt, trunc.POSIXt, units
## Make the plot using errbar()
##plot means with Hmisc::errbar
errbar(x = c(1,2), #x = sets the position of each group on the x axis
y = mam.df$mean.values, #y = sets the location of the means
yplus = mam.df$mean.values + mam.df$SE.values, #location of top of the error bar
yminus = mam.df$mean.values - mam.df$SE.values #location of the bottom of the error bar
)
This plot is, frankly, not very pretty. So we’ll tweak it to make it look better.
We’ll do 2 things. First, we’ll remove the tick marks and labels from the x-axis, which are being shown as numbers even though we have 2 discrete categories. This is done with the errbar argument xaxt = "n", which means “x-axis type = none; don’t draw it”.
We’ll also change the scale of the x-axis to better center the points, using the xlim argument to set the lower and upper limits of the x-axis. (Don’t worry: at this stage, it’s not essential that you understand exactly what is going on here.)
errbar(x = c(1,2), #x = ... sets the position of each group on the x axis
y = mam.df$mean.values, #y = ... sets the location of the means
yplus = mam.df$mean.values + mam.df$SE.values, #location of top of the error bar
yminus = mam.df$mean.values - mam.df$SE.values, #location of the bottom of the error bar
xlim = c(0.5,2.5), #set x-axis limits
xaxt = "n") #don't draw the x-axis
This looks a little better. As before, we can label the axes using the xlab and ylab arguments.
#make the plot
errbar(x = c(1,2),
y = mam.df$mean.values,
yplus = mam.df$mean.values + mam.df$SE.values,
yminus = mam.df$mean.values-mam.df$SE.values,
xlab = "Body Part",
ylab = "Mass",
xlim=c(0.5,2.5),
xaxt="n")
Finally, we need to label what the 2 points are. This is done with a separate function, axis(), which is called after the plot has been drawn. Again, don’t worry about all of the details right now.
#make the plot
errbar(x = c(1,2),
y = mam.df$mean.values,
yplus = mam.df$mean.values + mam.df$SE.values,
yminus = mam.df$mean.values-mam.df$SE.values,
xlab = "Body Part",
ylab = "Mass",
xlim=c(0.5,2.5),
xaxt="n")
#define the axis and labels
axis(side=1, #1 = the bottom
at=c(1,2), #where on x-axis
labels=mam.df$body.part) #what to print
To make things a bit fancier, we can add some annotations to the plot using the legend() command.
errbar(x = c(1,2),
y = mam.df$mean.values,
yplus = mam.df$mean.values + mam.df$SE.values,
yminus = mam.df$mean.values-mam.df$SE.values,
xlab = "Body Part",
ylab = "Mass",
xlim=c(0.5,2.5),
xaxt="n")
axis(side=1, #1 = the bottom
at=c(1,2), #where on x-axis
labels=mam.df$body.part) #what to print
legend("topleft", legend = "Error bars = SE",bty = "n")
legend("bottomleft", legend = "n = 62 spp",bty = "n")
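As mentioned at the start, there are several ways to build this kind of plot. For comparison, here is a rough sketch of a similar figure using only base R graphics, where arrows() with angle = 90 and code = 3 draws the capped error bars. This is an alternative to errbar(), not a description of how errbar() works internally, and the objects lower and upper are names made up for this example.

```r
#same summary data as above, repeated so this example runs on its own
mam.df <- data.frame(body.part   = c("Body","Brain"),
                     mean.values = c(198.79, 283.13),
                     SE.values   = c(114.1932, 118.15))

#top and bottom of each error bar
lower <- mam.df$mean.values - mam.df$SE.values
upper <- mam.df$mean.values + mam.df$SE.values

#plot the means, leaving room on the y-axis for the bars
plot(x = c(1,2),
     y = mam.df$mean.values,
     pch = 16,
     xlab = "Body Part",
     ylab = "Mass",
     xlim = c(0.5,2.5),
     ylim = range(lower, upper),
     xaxt = "n")

#draw the error bars; angle = 90 and code = 3 give flat caps at both ends
arrows(x0 = c(1,2), y0 = lower,
       x1 = c(1,2), y1 = upper,
       angle = 90, code = 3, length = 0.05)

#label the groups and annotate the plot
axis(side = 1, at = c(1,2), labels = mam.df$body.part)
legend("topleft", legend = "Error bars = SE", bty = "n")
```

Setting ylim from the bar endpoints matters here: plot() on its own only makes room for the means, so the bars could otherwise run off the top or bottom of the figure.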