Playing around with the beavers datasets

I like beavers, so I thought I’d take a look at the beavers data in R’s datasets package. I’m just going to do some very preliminary sanity checks: looking at the raw data and its structure, and plotting it.

Here’s the description of the dataset in the documentation:

Reynolds (1994) describes a small part of a study of the long-term temperature dynamics of beaver Castor canadensis in north-central Wisconsin. Body temperature was measured by telemetry every 10 minutes for four females, but data from a one period of less than a day for each of two animals is used there.

To make sure everyone remembers just how cute they are, I’ll include an image I found on the web:



Load data

So let’s load the data into the global environment.

data(beavers)

It looks like two datasets were loaded. You can check with the function ls(), which returns the names of all objects currently in the global environment.

ls()
## [1] "beaver1" "beaver2"
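
If your global environment already holds other objects, ls() will list those as well, which makes it harder to see what data() actually created. Here’s a minimal sketch of one way around that, assuming you don’t mind loading into a scratch environment first (the name beaver_env is my own invention for this example):

```r
# Load the beavers data into a scratch environment instead of the global one.
# `beaver_env` is a made-up name used only for this illustration.
beaver_env <- new.env()
data(beavers, package = "datasets", envir = beaver_env)

# Only the objects created by data() show up here
ls(beaver_env)
```

The objects can then be pulled out with beaver_env$beaver1 and beaver_env$beaver2 if you want to keep the global environment tidy.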



Look at the data

For starters I’d like to look at the data and, hopefully, combine it into one dataset. I want it nicely formatted, so I’ll use the kable function from knitr to render the two data.frames as HTML tables. I only need a small subset of each dataset, so I’ll use head to show the first six rows of each.

library(knitr)
kable(head(beaver1), caption = "beaver1 dataset")
beaver1 dataset
day time temp activ
346 840 36.33 0
346 850 36.34 0
346 900 36.35 0
346 910 36.42 0
346 920 36.55 0
346 930 36.69 0
kable(head(beaver2), caption = "beaver2 dataset")
beaver2 dataset
day time temp activ
307 930 36.58 0
307 940 36.73 0
307 950 36.93 0
307 1000 37.15 0
307 1010 37.23 0
307 1020 37.24 0
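
Comparing the two headers by eye works for four columns, but the same check can be done programmatically. This is just a quick sketch using base R, not something from the dataset documentation:

```r
data(beavers, package = "datasets")

# Same column names, in the same order?
identical(names(beaver1), names(beaver2))

# Same column types? rbind() is happiest when these agree too
identical(sapply(beaver1, class), sapply(beaver2, class))
```

If either check came back FALSE, I’d want to sort out the mismatch before stacking the two data.frames.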

The names of the variables in both datasets match so I’ll go ahead and combine them into a single data.frame and have a look at the structure.

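# stringsAsFactors = TRUE keeps id a factor on R >= 4.0, matching the output below
beaver <- rbind(data.frame(beaver1, id = "beaver1", stringsAsFactors = TRUE),
                data.frame(beaver2, id = "beaver2", stringsAsFactors = TRUE))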

str(beaver)
## 'data.frame':    214 obs. of  5 variables:
##  $ day  : num  346 346 346 346 346 346 346 346 346 346 ...
##  $ time : num  840 850 900 910 920 930 940 950 1000 1010 ...
##  $ temp : num  36.3 36.3 36.4 36.4 36.5 ...
##  $ activ: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ id   : Factor w/ 2 levels "beaver1","beaver2": 1 1 1 1 1 1 1 1 1 1 ...



Summarise the data

As a quick sanity check I’ll also have a look at the summary for the dataset:

summary(beaver)
##       day             time           temp           activ       
##  Min.   :307.0   Min.   :   0   Min.   :36.33   Min.   :0.0000  
##  1st Qu.:307.0   1st Qu.:1030   1st Qu.:36.85   1st Qu.:0.0000  
##  Median :346.0   Median :1475   Median :36.99   Median :0.0000  
##  Mean   :327.9   Mean   :1375   Mean   :37.21   Mean   :0.3178  
##  3rd Qu.:346.0   3rd Qu.:1920   3rd Qu.:37.64   3rd Qu.:1.0000  
##  Max.   :347.0   Max.   :2350   Max.   :38.35   Max.   :1.0000  
##        id     
##  beaver1:114  
##  beaver2:100  
##               
##               
##               
## 
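
The pooled summary mixes the two animals together, so any differences between them are hidden. Here’s a quick sketch of a per-beaver breakdown using base R’s aggregate and table. The choice of statistics is mine and nothing more principled than that:

```r
data(beavers, package = "datasets")
beaver <- rbind(data.frame(beaver1, id = "beaver1"),
                data.frame(beaver2, id = "beaver2"))

# Mean, minimum and maximum temperature for each beaver
temp_by_id <- aggregate(
  temp ~ id, data = beaver,
  FUN = function(x) c(mean = mean(x), min = min(x), max = max(x))
)
temp_by_id

# Number of observations per beaver in each activity state (0 or 1)
activ_by_id <- table(beaver$id, beaver$activ)
activ_by_id
```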

Plot some data

Finally, I’ll plot the data to have a look at it. I’m using the ggplot2 package to plot temperature as a function of time, with the points coloured by beaver identity.

library(ggplot2)
ggplot(beaver, aes(x = time, y = temp, colour = id)) + geom_point()
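
One caveat with this plot: time is a clock reading (the dataset documentation gives the form 0330 for 3:30am), not a number of minutes, and the recordings run past midnight (time has a minimum of 0 in the summary). On the raw axis the after-midnight points land on the far left, and there is an artificial gap between xx50 and the next full hour. Here’s a rough sketch of one way to get a continuous axis, assuming the rows for each beaver are in chronological order; elapsed_h is my own column name, not part of the dataset:

```r
library(ggplot2)

data(beavers, package = "datasets")
beaver <- rbind(data.frame(beaver1, id = "beaver1"),
                data.frame(beaver2, id = "beaver2"))

# Split the hhmm clock reading into hours and minutes
hours   <- beaver$time %/% 100
minutes <- beaver$time %% 100

# First recording day for each beaver, repeated along the rows
first_day <- ave(beaver$day, beaver$id, FUN = min)

# Hours since midnight at the start of each beaver's first recording day
beaver$elapsed_h <- (beaver$day - first_day) * 24 + hours + minutes / 60

ggplot(beaver, aes(x = elapsed_h, y = temp, colour = id)) +
  geom_point() +
  labs(x = "Hours since midnight of first recording day", y = "Temperature")
```

Note that the two beavers were recorded on different days of the year, so sharing an x-axis this way lines them up by time of day rather than by calendar date.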