I like beavers, so I thought I’d take a look at the beavers dataset in R’s datasets package. I’m just going to do some very preliminary sanity checks: looking at the raw data and its structure, and plotting it.
Here’s the description of the dataset in the documentation:
Reynolds (1994) describes a small part of a study of the long-term temperature dynamics of beaver Castor canadensis in north-central Wisconsin. Body temperature was measured by telemetry every 10 minutes for four females, but data from a one period of less than a day for each of two animals is used there.
Just to make sure everyone remembers just how cute they are, I’ll include an image I found on the web:
So let’s load the data into the global environment.
data(beavers)
It looks like two datasets were loaded. You can check with the function ls(), which returns the names of all objects currently in the global environment.
ls()
## [1] "beaver1" "beaver2"
For starters I’d like to look at the data and, hopefully, combine it into one dataset. I’d like to see it nicely formatted, so I’ll use the kable function from knitr to print the two data.frames as HTML tables. I only need a small subset of each dataset, so I’ll use head to show the first six rows of each.
library(knitr)  # library() stops with an error if knitr is missing; require() only warns
kable(head(beaver1), caption = "beaver1 dataset")
| day | time | temp | activ |
|---|---|---|---|
| 346 | 840 | 36.33 | 0 |
| 346 | 850 | 36.34 | 0 |
| 346 | 900 | 36.35 | 0 |
| 346 | 910 | 36.42 | 0 |
| 346 | 920 | 36.55 | 0 |
| 346 | 930 | 36.69 | 0 |
kable(head(beaver2), caption = "beaver2 dataset")
| day | time | temp | activ |
|---|---|---|---|
| 307 | 930 | 36.58 | 0 |
| 307 | 940 | 36.73 | 0 |
| 307 | 950 | 36.93 | 0 |
| 307 | 1000 | 37.15 | 0 |
| 307 | 1010 | 37.23 | 0 |
| 307 | 1020 | 37.24 | 0 |
The names of the variables in both datasets match so I’ll go ahead and combine them into a single data.frame and have a look at the structure.
beaver <- rbind(data.frame(beaver1, id = "beaver1"),
                data.frame(beaver2, id = "beaver2"))
str(beaver)
## 'data.frame': 214 obs. of 5 variables:
## $ day : num 346 346 346 346 346 346 346 346 346 346 ...
## $ time : num 840 850 900 910 920 930 940 950 1000 1010 ...
## $ temp : num 36.3 36.3 36.4 36.4 36.5 ...
## $ activ: num 0 0 0 0 0 0 0 0 0 0 ...
## $ id : Factor w/ 2 levels "beaver1","beaver2": 1 1 1 1 1 1 1 1 1 1 ...
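The rbind() call above relies on both data frames having the same column names. Here is a minimal sketch of how that could be checked explicitly rather than by eye. One assumption worth flagging: since R 4.0.0, data.frame() no longer converts strings to factors by default, so on a recent R the id column comes out as character rather than the Factor shown in the str() output. The explicit factor() call is my addition to reproduce that output.

```r
# beaver1 and beaver2 ship with the datasets package, which is attached by default
stopifnot(identical(names(beaver1), names(beaver2)))

beaver <- rbind(data.frame(beaver1, id = "beaver1"),
                data.frame(beaver2, id = "beaver2"))

# Assumption: convert explicitly so id is a factor on R >= 4.0.0 as well
beaver$id <- factor(beaver$id)

str(beaver)
```

If the names ever differed, stopifnot() would halt before rbind() had a chance to fail with a less obvious message.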
As a quick sanity check I’ll also have a look at the summary for the dataset:
summary(beaver)
##       day             time           temp           activ
##  Min.   :307.0   Min.   :   0   Min.   :36.33   Min.   :0.0000
##  1st Qu.:307.0   1st Qu.:1030   1st Qu.:36.85   1st Qu.:0.0000
##  Median :346.0   Median :1475   Median :36.99   Median :0.0000
##  Mean   :327.9   Mean   :1375   Mean   :37.21   Mean   :0.3178
##  3rd Qu.:346.0   3rd Qu.:1920   3rd Qu.:37.64   3rd Qu.:1.0000
##  Max.   :347.0   Max.   :2350   Max.   :38.35   Max.   :1.0000
## id
## beaver1:114
## beaver2:100
##
##
##
##
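The summary above pools both animals, and because activ only takes the values 0 and 1, its mean (0.3178) is simply the share of readings flagged as active. A rough sketch of a per-animal breakdown, using base R only, might look like this. It rebuilds beaver so it can run on its own, and the names activity_counts and mean_temp are mine.

```r
beaver <- rbind(data.frame(beaver1, id = "beaver1"),
                data.frame(beaver2, id = "beaver2"))

# Number of readings per animal, split by the activity indicator (0 or 1)
activity_counts <- table(beaver$id, beaver$activ)
print(activity_counts)

# Mean body temperature for each animal and activity state
mean_temp <- aggregate(temp ~ id + activ, data = beaver, FUN = mean)
print(mean_temp)
```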
Finally, I’ll plot the data to have a look at it. I’m using the ggplot2 package to plot temperature as a function of time, and I’ll colour the points according to beaver identity.
library(ggplot2)
ggplot(beaver, aes(x = time, y = temp, colour = id)) + geom_point()
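One thing to watch in this plot: the dataset documentation gives time in HHMM form (840 is 08:40), and the recordings run past midnight (in the summary above, day reaches 347 while time drops to 0). Plotting the raw time therefore puts the after-midnight readings at the far left, and stretches the step from 1050 to 1100 to five times the width of the other 10-minute steps. A minimal sketch of one way around this converts day and time into hours elapsed since midnight of each animal's first day. The column name hours and the helper variables are my own, not part of the dataset.

```r
library(ggplot2)

beaver <- rbind(data.frame(beaver1, id = "beaver1"),
                data.frame(beaver2, id = "beaver2"))

# Split HHMM into hours and minutes, then express as minutes since midnight
minutes_of_day <- (beaver$time %/% 100) * 60 + beaver$time %% 100

# First recorded day for each animal, repeated for every one of its rows
first_day <- ave(beaver$day, beaver$id, FUN = min)

# Hours elapsed since midnight at the start of each animal's first day
beaver$hours <- (beaver$day - first_day) * 24 + minutes_of_day / 60

ggplot(beaver, aes(x = hours, y = temp, colour = id)) + geom_point()
```

With this axis the two series read left to right in the order they were recorded, which makes any rise in temperature over the recording easier to judge.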