Plotting means

One of the most common plots in science shows the mean value of a sample of data, along with some measure of variability drawn as error bars. Typically, the error bars are defined as +/- 1 standard error (SE) or as 95% confidence intervals (95% CI).

There are several ways to build this type of plot in R, depending on your starting point and the exact approach you choose to use. I’ll walk you through just one approach.











Mean values for mammal data

I have calculated some means and standard errors (SEs) from the “mammals” data.

Mammal data parameters

  • 198.79 = Mean body size of mammals in dataset (in kg)
  • 283.13 = mean brain weight (in grams!)
  • 114.19 = SE for body size
  • 118.15 = SE for brain
  • The data come from 62 species
  • SE = standard error, a measure of variation
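
For readers who want to see where numbers like these come from, here is a minimal sketch of the calculation. It assumes the “mammals” data are the mammals dataset that ships with the MASS package (body in kg, brain in g); the text above does not say which dataset was used, so treat that as a guess. The object names n, mammal.means and mammal.SEs are made up for this illustration.

```r
# Assumption: the "mammals" data are MASS::mammals (body in kg, brain in g)
data(mammals, package = "MASS")

n <- nrow(mammals)                           # number of species (rows)
mammal.means <- sapply(mammals, mean)        # mean of each column
mammal.SEs <- sapply(mammals, sd) / sqrt(n)  # SE = standard deviation / sqrt(n)

# print the results rounded to 2 decimal places
round(mammal.means, 2)
round(mammal.SEs, 2)
```

If the assumption about the dataset holds, the output can be compared directly against the values in the list above.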


(Note that the “body size” is in kg, and the brain is in grams! Because these 2 body parts are very different in size, it is a bit silly to make a plot like this; this exercise is for illustration only!)


Let’s take this data and turn it into a simple dataframe. First, we’ll put the 2 mean measurements together into a “vector” using the c() command.

#assign means to a vector
#this is a numeric vector
mean.values <- c(198.79, 283.13) #must have comma!


Then we’ll put the two SE (standard error) values together

#Assign SE to vector
SE.values <- c(114.1932, 118.15)#must have comma!


We’ll also make a set of labels

#this is a character vector
body.part <- c("Body","Brain")


And, so we remember that the units are different, let’s make a vector for that info. The order has to match body.part: body size is in kg and brain weight is in g.

#another character vector
units <- c("kg","g") #same order as body.part



Now, let’s turn these vectors into a simple data frame called “mam.df” (“df” is short for “data.frame”). We’ll use the data.frame() command.

mam.df <- data.frame(body.part,mean.values, SE.values,units)


See what this dataframe looks like

mam.df
##   body.part mean.values SE.values units
## 1      Body      198.79  114.1932    kg
## 2     Brain      283.13  118.1500     g
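
The plotting code later on pulls columns out of the dataframe with the $ operator, so it may help to see that in isolation first. This is a small sketch, not part of the original workflow: it rebuilds a cut-down copy of the dataframe so the block runs on its own, and brain.row is a name invented for the example.

```r
# rebuild a small copy of the dataframe so this block is self-contained
mam.df <- data.frame(body.part   = c("Body", "Brain"),
                     mean.values = c(198.79, 283.13),
                     SE.values   = c(114.1932, 118.15))

str(mam.df)            # compact summary: column names and types

mam.df$mean.values     # $ extracts one column as a vector
mam.df$mean.values[2]  # [2] picks the 2nd element of that vector

# select the whole row for the brain using a logical test
brain.row <- mam.df[mam.df$body.part == "Brain", ]
brain.row
```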


Plotting means using errbar()

We’ll plot these data using a function called errbar() (for “error bar”) in the Hmisc package. This will require a bit of code, but it will be good practice for reading a somewhat dense piece of R code.
Key to this code are 2 arguments: yplus and yminus. These give the y-axis positions of the top and the bottom of each error bar, so we calculate them as the mean plus the SE and the mean minus the SE.
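
Because yplus and yminus are just numbers you compute yourself, the same function can draw 95% confidence intervals instead of +/- 1 SE. The sketch below shows one common way to do that, using a t-based interval (mean +/- t * SE); it assumes both means are based on all 62 species. The names t.crit, SE.upper, SE.lower, CI.upper and CI.lower are invented for this example, and the plot is only drawn if Hmisc is installed.

```r
mean.values <- c(198.79, 283.13)
SE.values   <- c(114.1932, 118.15)
n <- 62  # number of species (assumed to apply to both means)

# error bar limits for +/- 1 SE
SE.upper <- mean.values + SE.values
SE.lower <- mean.values - SE.values

# error bar limits for a t-based 95% confidence interval
t.crit   <- qt(0.975, df = n - 1)  # critical t value, close to 2 for n = 62
CI.upper <- mean.values + t.crit * SE.values
CI.lower <- mean.values - t.crit * SE.values

# draw the 95% CI version, but only if Hmisc is available
if (requireNamespace("Hmisc", quietly = TRUE)) {
  Hmisc::errbar(x = c(1, 2),
                y = mean.values,
                yplus = CI.upper,
                yminus = CI.lower)
}
```

With n = 62 the critical t value is close to 2, so the 95% CI bars are roughly twice as long as the SE bars.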

Basic plot of mammal means

Load the Hmisc package

library(Hmisc)
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## Loading required package: ggplot2
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
## 
##     format.pval, round.POSIXt, trunc.POSIXt, units


## Make the plot using errbar()

##plot means with Hmisc::errbar
errbar(x = c(1,2), #x = sets the x-axis position of each group
       y = mam.df$mean.values, #y = sets the location of the means
       yplus = mam.df$mean.values + mam.df$SE.values, #location of top of the error bar
       yminus = mam.df$mean.values-mam.df$SE.values #location of the bottom of the error bar
       )



This plot is, frankly, not very pretty. So we’ll tweak it to make it look better.

We’ll do 2 things. First, we’ll remove the labels from the x-axis, which are being shown as numbers even though we have 2 discrete categories. This is done with the errbar() argument xaxt="n", which means “x-axis = no, turn it off”.

We’ll also change the scale of the x-axis to better center the points, using the xlim argument to set the lower and upper limits of the x-axis. (Don’t worry: at this stage, it’s not essential that you understand exactly what is going on here.)

errbar(x = c(1,2), #x = ... sets the x-axis position of each group
       y = mam.df$mean.values, #y = ... sets the location of the means
       yplus = mam.df$mean.values + mam.df$SE.values, #location of top of the error bar
       yminus = mam.df$mean.values-mam.df$SE.values, #location of the bottom of the error bar
       xlim=c(0.5,2.5), #set x-axis limits
       xaxt="n")


This looks a little better. As before, we can label the axes using the xlab and ylab arguments.

#make the plot
errbar(x = c(1,2), 
       y = mam.df$mean.values,
       yplus = mam.df$mean.values + mam.df$SE.values, 
       yminus = mam.df$mean.values-mam.df$SE.values, 
       xlab = "Body Part",
       ylab = "Mass",
       xlim=c(0.5,2.5),
       xaxt="n")



Label the axes of the errbar plot

Finally, we need to label what the 2 points are. This is done with a special command, “axis”. Again, don’t worry about all of the details right now.

#make the plot
errbar(x = c(1,2), 
       y = mam.df$mean.values, 
       yplus = mam.df$mean.values + mam.df$SE.values, 
       yminus = mam.df$mean.values-mam.df$SE.values, 
       xlab = "Body Part",
       ylab = "Mass",
       xlim=c(0.5,2.5),
       xaxt="n")

#define the axis and labels
axis(side=1,     #1 = the bottom
     at=c(1,2),  #where on x-axis
     labels=mam.df$body.part)  #what to print


Annotate plot w/ legend() command

To make things a bit more fancy, we can add some annotations to the plot using the legend() command.

errbar(x = c(1,2), 
       y = mam.df$mean.values, 
       yplus = mam.df$mean.values + mam.df$SE.values, 
       yminus = mam.df$mean.values-mam.df$SE.values, 
       xlab = "Body Part",
       ylab = "Mass",
       xlim=c(0.5,2.5),
       xaxt="n")

axis(side=1,     #1 = the bottom
     at=c(1,2),  #where on x-axis
     labels=mam.df$body.part)  #what to print

legend("topleft", legend = "Error bars = SE",bty = "n") 
legend("bottomleft", legend = "n = 62 spp",bty = "n")
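
As mentioned at the start, there are several ways to build this type of plot. For comparison, here is a rough sketch of one alternative that uses only base R graphics, so no extra package is needed: plot() draws the means and arrows() draws the error bars as double-headed arrows with flat (90 degree) heads. This is just one possible approach, and the names means, ses, part.labels, x.pos, upper and lower are invented for the example.

```r
means <- c(198.79, 283.13)
ses   <- c(114.1932, 118.15)
part.labels <- c("Body", "Brain")

x.pos <- seq_along(means)  # x-axis positions: 1, 2
upper <- means + ses       # top of each error bar
lower <- means - ses       # bottom of each error bar

# plot the means; ylim must be wide enough to fit the error bars
plot(x = x.pos, y = means,
     pch = 16,
     xlim = c(0.5, 2.5),
     ylim = range(lower, upper),
     xaxt = "n",
     xlab = "Body Part",
     ylab = "Mass")

# draw error bars: code = 3 puts a head on both ends, angle = 90 makes them flat
arrows(x0 = x.pos, y0 = lower,
       x1 = x.pos, y1 = upper,
       angle = 90, code = 3, length = 0.05)

# label the 2 groups and annotate the plot
axis(side = 1, at = x.pos, labels = part.labels)
legend("topleft", legend = "Error bars = SE", bty = "n")
```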